Monday, May 24, 2010

Videos – Prototype 4 (2nd Demo)

    The video below demonstrates the two processing layers I built on top of the VPF for better language understanding. An early version of the algorithm can be found here. Please note that the delay of a few minutes at the beginning of the video is due to the initialization of the tagger.

    This system uses shallow parsing and deep syntactic processing to match the user’s input against the database. In particular, the following steps are taken to find the database phrase closest to the input:
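    Before the step-by-step description, here is a minimal, self-contained Python sketch of the overall control flow. Both stages are stubbed out here (each is sketched in more detail after its list below), and the function names and the respond wrapper are my own placeholders, not the actual VPF API:

    def stage1_shallow_match(user_input, database):
        return None  # stub; Stage 1 is sketched after its list below

    def stage2_deep_match(user_input, database):
        return None  # stub; Stage 2 is sketched after its list below

    def respond(user_input, database):
        # Try the cheap shallow comparison first, fall back to deep parsing.
        script = stage1_shallow_match(user_input, database)
        if script is None:
            script = stage2_deep_match(user_input, database)
        if script is not None:
            print("executing VPF script:", script)  # animations, speech, etc.
        else:
            print("Could you rephrase, or ask another question?")

    respond("Does the castle have any other gates?", database=[])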

    Stage 1: Shallow Parsing

  • Replace contractions with their full forms:

[didn't → did not, 'll → will, 're → are, lets → let us, let's → let us, 've → have, 'm → am, won't → will not, 'd → would, 's → is, n't → not]

  • Remove unnecessary filler words and uninformative POS categories:

[ok, yes, no, hmm, yeah, uh, huh, to, um, oh, alas, eh, er, uh huh, well]

[Article, Preposition, Conjunction, Determiner, Modal, Interjection, Numeral, Punctuation]

  • Tag the user’s input with its part-of-speech (POS) labels.

  • Tag the VPF match of the user’s input with its POS labels.

  • Filter both the input and the VPF match against the list of global keywords returned by the VPF Web Service.

  • Compare what is left for POS tags and values. For example, for my question “Does the castle have any other gates?” only the keywords (gates, castle) are returned.

  • If the comparison succeeds, allow the output of the VPF service (i.e., a script containing all the synchronized animations, speech, etc.) to be executed by the system.

  • If the comparison fails, pass the input to Stage 2 for deep syntactic processing.
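    Here is a minimal Python sketch of the Stage 1 comparison, filling in the stage1_shallow_match stub above. The contraction table and filler list are taken from the steps above; the POS-based removal is omitted (the real system uses a trained tagger), and shallow_compare, normalize, and the keyword argument are my own placeholders — in the full pipeline the global keywords come from the VPF Web Service:

    import re

    CONTRACTIONS = {
        "didn't": "did not", "won't": "will not", "let's": "let us",
        "lets": "let us", "n't": " not", "'ll": " will", "'re": " are",
        "'ve": " have", "'m": " am", "'d": " would", "'s": " is",
    }
    FILLERS = {"ok", "yes", "no", "hmm", "yeah", "uh", "huh", "to",
               "um", "oh", "alas", "eh", "er", "uh huh", "well"}

    def normalize(text):
        text = text.lower()
        # Longest-first so "won't" is expanded before the generic "n't".
        for short in sorted(CONTRACTIONS, key=len, reverse=True):
            text = text.replace(short, CONTRACTIONS[short])
        return text

    def shallow_compare(user_input, vpf_phrase, keywords):
        # The real system also drops whole POS categories (articles,
        # prepositions, ...) via a trained tagger; this sketch
        # approximates that by keeping only the keywords themselves.
        def content_words(text):
            tokens = re.findall(r"[a-z']+", normalize(text))
            return {t for t in tokens if t not in FILLERS and t in keywords}
        return content_words(user_input) == content_words(vpf_phrase)

    # For "Does the castle have any other gates?" only {gates, castle} survive:
    print(shallow_compare("Does the castle have any other gates?",
                          "Are there any other gates in the castle?",
                          {"gates", "castle"}))  # -> True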

    Stage 2: Deep Syntactic Processing

  • Fully parse the user’s input and extract its predicates and Deep Dependencies (e.g., Subject, DirectObject, etc.). For example, my phrase “I would like a brief description about all walls!” fails in the first stage of processing and parses as:

like(Subject: I, DirectObject: description, SpaceComplement: about walls)

  • If it is a single-predicate sentence, conduct 10 similarity tests between the parsed input and the pre-parsed sentences in the database; if it is a double-predicate sentence, conduct 9 similarity tests instead.

  • If a match is found, query the VPF with the match.

  • Return the output (i.e., a script containing all the synchronized animations, speech, etc.) and execute it in the system.

  • If this stage also fails, ask the user to rephrase or move on to another question.
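    Since the post gives the number of similarity tests but not the tests themselves, the following is only a hedged sketch of Stage 2 under the assumption that each test compares one slot of the predicate–argument structure. The Parse class, the scoring, and the threshold are my own toy constructs, not Antelope’s output format or the actual VPF API:

    from dataclasses import dataclass, field

    @dataclass
    class Parse:
        predicate: str                              # e.g. "like"
        roles: dict = field(default_factory=dict)   # e.g. {"Subject": "I"}

    def similarity(a, b):
        # One "test" per compared slot: the head predicate plus each
        # Deep Dependency (Subject, DirectObject, SpaceComplement, ...).
        score = 1 if a.predicate == b.predicate else 0
        for role in set(a.roles) | set(b.roles):
            if a.roles.get(role) == b.roles.get(role):
                score += 1
        return score

    def stage2_deep_match(parse, database, threshold=2):
        # `database` holds (pre-parsed sentence, VPF query) pairs; the raw
        # input would be parsed (e.g., by Antelope) before reaching here.
        pre_parsed, query = max(database, key=lambda e: similarity(parse, e[0]))
        return query if similarity(parse, pre_parsed) >= threshold else None

    # The example from the Stage 2 list above:
    user = Parse("like", {"Subject": "I", "DirectObject": "description",
                          "SpaceComplement": "about walls"})
    db = [(Parse("like", {"Subject": "you", "DirectObject": "description",
                          "SpaceComplement": "about walls"}), "DESCRIBE_WALLS")]
    print(stage2_deep_match(user, db))  # -> DESCRIBE_WALLS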

    I also experimented with semantic parsing, but Antelope’s semantic parser is still experimental. I plan to add a third stage for full semantic processing to the current algorithm once the parser matures.
