Monday, September 28, 2009

Current Developments - MGUIDE Prototypes

Prototype 1:

 

1. A 3D agent with more than 2,000 gestures and several facial expressions. The agent uses this body and facial language to augment the location presentations and navigation instructions it provides.
2. A 3D agent that is aware of its environment and can dynamically call for the user’s attention during a location presentation. For example, if the user has been wandering around for a certain period of time, the agent can ask the user to return their attention to the presentation.
3. A 3D agent with fully dynamic 3D clothing whose textures the user can change during system configuration.
4. A 3D agent capable of using additional multimedia information (on a 3D board) to further enhance the information it conveys (mainly in presentation mode).
5. A Finite State Machine (FSM) dialogue manager capable of dynamically displaying questions based on the user’s selection and the current context. The questions cover a very broad range of the questions and clarifications a user might ask after a location presentation.
6. Twelve information scenarios based on what the castle has to offer the potential visitor, both culturally and historically. The total content (presentations and questions) is more than 10 hours long.
7. Customization of the agent’s voice (designed but not yet implemented).
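
To give a feel for how an FSM dialogue manager like the one in item 5 can decide which questions to display, here is a minimal Python sketch. It is not the MGUIDE implementation: the class name, the state names and the questions are all hypothetical, and the "context" is reduced to the current state.

```python
class DialogueFSM:
    """A tiny finite state machine: each state offers a fixed set of questions."""

    def __init__(self, transitions, start):
        # transitions[state][question] -> next state
        self.transitions = transitions
        self.state = start

    def available_questions(self):
        """Questions the user may ask in the current state, in sorted order."""
        return sorted(self.transitions.get(self.state, {}))

    def ask(self, question):
        """Follow the transition for the chosen question and return the new state."""
        options = self.transitions.get(self.state, {})
        if question not in options:
            raise ValueError(f"{question!r} is not available in state {self.state!r}")
        self.state = options[question]
        return self.state


# Hypothetical states and questions, for illustration only.
DEMO_TRANSITIONS = {
    "after_presentation": {
        "Who built it?": "history",
        "How do I get to the next location?": "navigation",
    },
    "history": {
        "Tell me more": "history_detail",
        "Go back": "after_presentation",
    },
}

fsm = DialogueFSM(DEMO_TRANSITIONS, start="after_presentation")
```

Calling `fsm.available_questions()` lists what can be shown to the user right now, and `fsm.ask(...)` moves to the next state, which in turn changes the list of questions on offer.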

Screenshot 1: The animated agent points to an image on the 3D board


Screenshot 2: The animated agent gestures as she speaks

Prototype 2:

Similar to the first system, but with one additional feature: QR code-based navigation. A QR code is a two-dimensional barcode capable of storing up to 4,296 alphanumeric characters in a simple geometric pattern. The system uses a QR code recognition algorithm to identify the user’s current location (the user must photograph the QR code for the system to process it).
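
As a rough illustration of the step that follows recognition, the Python sketch below maps an already decoded QR payload to a location. The payload format (`MGUIDE:LOC:<id>`), the location table and the function name are assumptions made for this example; the actual encoding used by the prototype is not described here, and decoding the photograph itself is left to whatever recognition library is in use.

```python
# Hypothetical payload prefix; the real prototype may encode locations differently.
PREFIX = "MGUIDE:LOC:"

# Hypothetical location table, for illustration only.
DEMO_LOCATIONS = {
    "gate": "Main Gate",
    "chapel": "Chapel",
}


def resolve_location(payload, locations):
    """Map a decoded QR payload to a location name, or None if unrecognized."""
    if not payload.startswith(PREFIX):
        return None  # not one of our codes
    location_id = payload[len(PREFIX):].strip()
    return locations.get(location_id)
```

With this table, `resolve_location("MGUIDE:LOC:gate", DEMO_LOCATIONS)` gives `"Main Gate"`, while a foreign or unknown code gives `None` so the system can ask the user to try again.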
 

Prototype 3:

Similar to the other systems, but it focuses only on providing navigation instructions. At the moment the system uses only photographs of landmarks, but other, more automated methods (e.g., GPS positioning) have also been considered.
 

Prototype 4:

It features only one information scenario, along with the characteristics of the first prototype, plus:
2) Dynamically changing voice recognition grammars that allow the user to interact with the system using only his or her voice.
3) Natural language processing abilities using the Stanford/Link Parser. The system utilizes a novel algorithm that allows it to conduct predicate analysis and, if that first stage fails, score-based keyword matching. The second stage of analysis is conducted by a secondary web system (Virtual People Factory).
4) A highly experimental search/comparison algorithm utilizing semantic interpretation of the user’s input.
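
The two-stage idea in item 3 (try a structured analysis first, fall back to keyword scoring if it fails) can be sketched in Python as follows. This is a generic illustration, not the prototype's algorithm: the first stage is just a callable passed in as a stand-in for parser-based predicate analysis, and the scoring rule, threshold, function names and topics are all assumptions.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def keyword_score(user_input, keywords):
    """Fraction of a topic's keywords that appear in the user's input."""
    if not keywords:
        return 0.0
    return len(tokenize(user_input) & set(keywords)) / len(keywords)


def best_keyword_match(user_input, topics, threshold=0.5):
    """Return the best-scoring topic id, or None if nothing reaches the threshold."""
    best_id, best = None, 0.0
    for topic_id, keywords in topics.items():
        score = keyword_score(user_input, keywords)
        if score > best:
            best_id, best = topic_id, score
    return best_id if best >= threshold else None


def interpret(user_input, predicate_stage, topics):
    """Try the first stage; fall back to keyword scoring if it returns None."""
    result = predicate_stage(user_input)
    if result is not None:
        return result
    return best_keyword_match(user_input, topics)


# Hypothetical topics and keywords, for illustration only.
DEMO_TOPICS = {
    "built": ["when", "built", "castle"],
    "owner": ["who", "owned", "castle"],
}
```

Passing a first stage that always fails, e.g. `interpret("When was the castle built?", lambda text: None, DEMO_TOPICS)`, exercises the fallback path and picks the topic whose keywords overlap most with the input.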

Screenshot 3: The system preferences of prototype 3 (Natural Language Processing version)

