Monday, August 30, 2010

iPad – Multitouch Web (Part B)


 Research

There is already a usability study of the iPad from the Nielsen Norman Group here. A summary of the study can be found here. As Nielsen admits, the study is preliminary, but the resulting usability insights serve as a good foundation for the design of the myriad of applications that will follow the release of the device on the global market.

The study is very generic – it tested several applications and web sites running on the iPad. As every digital project requires a unique testing context that takes its own range of parameters into account, more focused studies are necessary. Again, existing research methods and techniques must be tailored accordingly to take the multi-touch style of interaction into consideration.

Take m-commerce as an example, which according to many is the next “big” thing in the mobile world. In my opinion, multi-touch web pages, if done correctly, have a high potential to make our transactions easier than even our desktop computers do. How would you design an m-commerce web site to achieve such a goal? The guidelines discussed in Part A of this post are a good place to start when constructing some initial prototypes. However, as these guidelines are far from best practices, an iterative cycle of participatory design workshops over a few days is necessary in order to agree on a final prototype.

Gathering user insights after the web site is released currently seems challenging. The usability studies I’ve read so far use lab-based testing with one-to-one sessions. However, testing with real users in lab conditions is always expensive. Cheaper techniques, like remote usability testing, currently seem very hard to implement. And do existing tools for split or multivariate testing (e.g., Google Website Optimizer) work on a multi-touch interface? These tools are optimized for a mouse-based environment, and I am not convinced that they can be used effectively on multi-touch. For instance, does Google Website Optimizer register multi-touch gestures, as well as clicks, when it comes to measuring the success of a web site? Nevertheless, it will be very interesting to see how these research techniques are adapted to serve the new environment in the years to come.
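As an illustration of the kind of comparison such tools automate, here is a minimal Python sketch (with entirely hypothetical counts, not tied to any specific tool) that compares the conversion rates of two page variants, regardless of whether each conversion was triggered by a mouse click or a touch gesture:

from scipy.stats import chi2_contingency

# rows: variant A, variant B; columns: converted, did not convert (hypothetical counts)
counts = [[48, 452],   # variant A: 48 conversions out of 500 sessions
          [71, 429]]   # variant B: 71 conversions out of 500 sessions

chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
if p < 0.05:
    print("The difference between the variants is statistically significant.")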


Sunday, August 29, 2010

iPad – Multitouch Web (Part A)

Image: iPad with dock and wireless keyboard (via Wikipedia)

I finally had the chance to test the new iPad. I spent some time trying to figure out whether it is really worth spending £400 on this device. Here are my findings:

1) The device is excellent for gaming. It is perhaps one of the best gaming devices I’ve ever used. The integrated gyroscope means that there are no annoying arrow keys to use while playing games (all you have to do is turn the device).

2) For e-reading, although the screen is very clear, the absence of an integrated back stand (like on Samsung’s UMPC devices) makes the device very hard to hold for a long time.

3) Of all the applications I tested, I found only one of particular interest: an application that shows you the star constellations based on your geographical position.

4) The device has no Flash support, which means that the majority of WWW content is out of reach. Advocates of the device say that, as the Web progressively moves to the HTML 5 standard, this will soon stop being an issue. Advocates of Flash say that Flash cannot die, as it is an integral part of the web. Who is right and who is wrong, only time will tell. For now, all I can say is that if I buy the iPad, my favourite episodes of “Eureka” on Hulu are out of reach for good.

5) The device is multi-touch, which means that it is very hard to operate mouse-based web pages. As the majority of mobile platforms are now moving towards multi-touch, what does this mean for designers, information architects, researchers and other stakeholders? Below I attempt to present some of the possible implications.

Design

  • Size of Finger

Web pages on touch-sensitive devices are not navigated using a mouse. They are controlled with human fingers, many of which are much fatter than a typical mouse pointer. No matter what Apple says about an “ultimate browsing experience” on the iPad, clicking on small text links with your finger is painful, and sometimes practically impossible. As touch-sensitive devices become more popular, this could mean the end of traditional text links and their replacement by big touchable buttons.

  • Secondary Functions

The “fat finger” problem discussed above, and the limited screen real estate, also mean that we cannot cram thousands of features (or ads) into a tight frame as we would on a desktop web page. The design of web pages should focus on the essential elements, and it should avoid wasting user attention on processing secondary functions.

  • Hover effects and Right Clicks

Without a mouse-based interface, you can’t use any mouse-over effects. Elements that we are so used to interacting with on mouse-driven interfaces, like menus that pop up when you hover your mouse over a link, or right-clicks, do not exist. Apple has a number of replacements in place, like holding your finger down on a link to get a pop-up menu, but they only make clicking itself more complex.

  • Horizontal Vs. Vertical Styling

Due to the ability to easily switch between vertical and horizontal orientations, web sites will have to automatically adapt their styling to look right in both orientations. Seamless presentation in both landscape and portrait mode is one of the most fundamental guidelines when it comes to designing for the iPad/iPhone.

  • 3D Objects

Apple is trying to push designs that imitate tangible things – real-world interfaces that are easy to understand and familiar in their use. If you create a magazine application, make it look like a real magazine; if you make a word processor, make it look like a typewriter. Could the iPad be a significant milestone towards a more three-dimensional WWW? Some 3D web applications, like Second Life, could certainly benefit from the mouse-less interface, as touching and tilting make it much easier to interact with 3D worlds than mousing and keyboarding. In mainstream websites, 3D elements (e.g., material surfaces and SFX) will probably be used widely as an “invitation to touch”, but never as a basic metaphor.

Information Architecture

Multi-touch presents a unique set of challenges for information architects. The limited screen size, the size of the human fingertip, and the limited number of actions users will tolerate to complete a task (it is tiresome to swipe and touch too often) push the IA to create a dead simple architecture with a minimal number of actions. Information aggregation will play a very important role in creating architectures that minimize input and maximize output.

Under the above rule, several more “human-like” modalities of communication, such as speech recognition, text-to-speech processing, emotion recognition, natural language processing, etc., are likely to find their place in the multi-touch web. Multi-touch seems to me the perfect vehicle in the right direction: away from the dominance of GUI interfaces and towards a more natural way of interacting with computers.

Continues in the next post


Tuesday, August 17, 2010

Cognitive Walkthrough - ICT Virtual Human Toolkit

As part of the MGUIDE project, I had to complete a cognitive walkthrough of the ICT Virtual Human Toolkit. This toolkit is a collection of state-of-the-art technologies including speech recognition, automatic gesture generation, text-to-speech synthesis, 3D interfaces and dialogue model creation, to name but a few. Current users of the toolkit include the CSI/UCB Vision Group at UC Berkeley, the Component Analysis Lab at Carnegie Mellon University, the Affective Computing Research group at the MIT Media Lab, and Microsoft Research.

An ensemble of some of the characters created with the toolkit.

Source: University of Southern California Institute for Creative Technologies

The main idea behind the evaluation was to provide usability insights on what is perhaps the most advanced platform for multimodal creation today. The process was completed successfully with two experts, and revealed a number of insights that were carefully documented. These insights will feed into the design of the Talos toolkit – among the MGUIDE deliverables was an authoring toolkit to aid the rapid prototyping of multimodal applications with virtual humans. Talos is currently just an architecture (see here), but the walkthrough of the ICT toolkit provided some valuable insights that should guide its actual design. The MGUIDE project has now been completed, however, with the development of Talos set among the project’s future goals.

I applied the cognitive walkthrough exactly as I would apply it in any other project. I performed a task analysis first (i.e., I established the tasks I wanted to perform with the toolkit and broke them into actions) and then asked the following questions at each step (a minimal sketch of how the answers could be recorded follows the list):

1) Will the customer realistically be trying to do this action?

2) Is the control for the action visible?

3) Is there a strong link between the control and the action?

4) Is feedback appropriate?   
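For illustration, here is a minimal Python sketch of how the answers to these four questions could be recorded per action; the task, action and notes shown are hypothetical examples, not findings from the ICT walkthrough:

QUESTIONS = [
    "Will the user realistically be trying to do this action?",
    "Is the control for the action visible?",
    "Is there a strong link between the control and the action?",
    "Is feedback appropriate?",
]

def record_step(task, action, answers, notes=""):
    """Pair each walkthrough question with its yes/no answer and keep free-text notes."""
    assert len(answers) == len(QUESTIONS)
    return {
        "task": task,
        "action": action,
        "answers": dict(zip(QUESTIONS, answers)),
        "notes": notes,
    }

step = record_step(
    task="Create a new dialogue model",          # hypothetical task
    action="Open the dialogue editor from the main menu",
    answers=[True, False, True, True],
    notes="The editor is buried two levels deep; evaluators missed it at first.",
)
print(step["answers"])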

Monday, August 9, 2010

Ultimate – A prototype search engine

Lately, I have been experimenting again with Axure on a prototype search engine called Ultimate. The engine is based on actual user requirements collected through a focus group study. I decided to prototype Ultimate in order to perfect my skills in Axure. The tool enabled me to construct a high-fidelity and fully functional prototype within a few hours. Some of the features of the engine are:

  • It relies on full natural language processing to understand the user’s input.
  • The search algorithm is based on a complex network of software agents – automated software robots programmed to complete tasks (e.g., monitoring the prices of 200 airlines, getting ratings from tripadvisor.co.uk, etc.) – in order to deliver accurate and user-tailored results (a purely illustrative sketch of this idea follows the list).
  • Some of the engine’s functionalities are discussed in the user journey shown below. The full functionality is well documented, but for obvious reasons I cannot discuss it in this post.
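For illustration only, here is a minimal Python sketch of the agent idea; the agent names, sources and values are hypothetical and no real APIs are called:

from dataclasses import dataclass

@dataclass
class Result:
    source: str
    item: str
    score: float  # e.g., a price or a rating, normalised by the engine

class PriceAgent:
    """Would monitor airline prices for a given query (stubbed here)."""
    def run(self, query):
        return [Result("airline-prices", f"{query}: LHR-ATH return", 180.0)]

class RatingsAgent:
    """Would fetch destination ratings from a review site (stubbed here)."""
    def run(self, query):
        return [Result("reviews", f"{query}: Athens hotels", 4.3)]

def aggregate(query, agents):
    """Run every agent and merge their results into one tailored answer."""
    results = []
    for agent in agents:
        results.extend(agent.run(query))
    return results

for r in aggregate("Athens in May", [PriceAgent(), RatingsAgent()]):
    print(r)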

Screenshots:

 
User Journey:
 
 
 Research:

Run a usability test of the above design against a more “conventional” search engine (e.g., Skyscanner), and I am confident that the results would show the clear superiority of Ultimate. Of course, a careful usability study is still needed to compare the designs properly, but my expectation is that Ultimate would come out ahead on every usability metric.

 

  

Friday, August 6, 2010

Bespoke Research Solutions

The research methodologies discussed in previous posts provide an excellent basis to start from, but the web is changing rapidly. Rich Internet Applications (Flash, Silverlight, Ajax, etc.) and new interaction methods (e.g., multi-touch, gesture recognition) are already here and will flood the web. Can we apply what we know about user research in these new environments? Consider as examples a corporate web site, a mobile artificial-intelligence assistant, and an RIA application. Conducting a usability study in the first scenario is perhaps straightforward. But what techniques and measures are most relevant to the second and third scenarios? What can we apply in order to ensure that we gather rich user insights? What are the most relevant tools to deploy? (Siri is a mobile application, while the others are desktop applications.) Unfortunately, I don’t have the answers, as I have never attempted a similar study before. I can only imagine that existing techniques would have to be tailored to each project’s unique and complex range of variables. Therefore, the ability to adapt your research matters far more than the domain itself.

Present:

Sources: Halcrow; a Siri blog; Microsoft

Future (Aurora) ??:

I like thinking about the future, a lot! Aurora, from Mozilla Labs, is a project that aims to redesign the way we browse the web. Currently it is merely a concept, but it gives a pretty good idea of what the future might look like. There is an excellent critique of the Aurora project here. If the future turns out to be similar to Aurora (i.e., the user interface is reinvented), what does this mean for user research? We will need to fundamentally rethink the way we conduct it. New techniques will have to be invented and existing ones revisited. Innovation and creativity will distinguish the companies that survive from the ones that go out of business.

Aurora (Part 1) from Adaptive Path on Vimeo.

Thursday, August 5, 2010

User Research Deliverables

Different institutes require user research deliverables to be formatted differently. In MGUIDE I had to provide both of the following, within the allocated budget and timeframes:

Industry: Usability reports with actionable recommendations were delivered to each of the companies involved in the MGUIDE project. Each company wanted user insights on the particular piece of technology it contributed to the project. For example, the following recommendation was of particular interest to the text-to-speech company:

Actionable Recommendations:

  • Provide a visible and easy-to-use method for users to decrease/increase the rate of the text-to-speech output while the application speaks.

- Users will likely find the output more natural and easier to understand.

Note: I cannot release any detailed research findings as they are the property of the institutes that supported the project.

Personas:  Personas are a technique used to summarize user-research findings. In my understanding, personas are made-up people used to represent major segments of a product’s target audience. There is an excellent explanation of personas here. Personas are easy to construct, and a great way to distil research findings into a simple and accessible form.


Academia: In academia, statistical significance is of major importance. I presented deliverables in a similar format to those above, but accompanied by statistics in the proper format (e.g., F(1, 14) = 7.956; p < 0.05). Statistics appear to be of little interest to industry, though.
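For readers unfamiliar with the F(df1, df2) notation, here is a minimal Python sketch of how such a statistic is produced: a one-way ANOVA on two hypothetical groups of eight task scores, which happens to yield the same degrees of freedom as the example above.

from scipy.stats import f_oneway

# hypothetical task scores for two groups of 8 participants each
group_a = [7.2, 6.8, 8.1, 7.5, 6.9, 7.7, 8.0, 7.1]
group_b = [5.9, 6.1, 6.4, 5.7, 6.6, 6.0, 6.3, 5.8]

f_stat, p_value = f_oneway(group_a, group_b)
df_between = 1                                   # number of groups - 1
df_within = len(group_a) + len(group_b) - 2      # total participants - number of groups
print(f"F({df_between}, {df_within}) = {f_stat:.3f}; p = {p_value:.3f}")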

Transferable Research Skills (Part B)

Remote Usability Testing:

Remote testing is about conducting usability testing without having participants come into the lab. Although there are several tools and web services on the market, I prefer to work with Userfeel because of the low cost and their massive network of testers from all over the globe.

A/B and Multivariate Testing:

A/B and multivariate testing is about testing different versions of the same design in order to see which performs best. I use this technique in all of my usability tests, either by differentiating my designs across one variable (i.e., A/B testing) or across more (i.e., multivariate testing).
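A minimal Python sketch of the difference: in a multivariate test every combination of the variables becomes a variant, whereas an A/B test varies a single variable. The variables and levels below are hypothetical.

from itertools import product

headline = ["short", "long"]
button   = ["text link", "large touch button"]
layout   = ["single column", "two columns"]

variants = list(product(headline, button, layout))
for i, (h, b, l) in enumerate(variants, start=1):
    print(f"Variant {i}: headline={h}, button={b}, layout={l}")
print(f"{len(variants)} variants in total")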

Co-Discovery Learning:

My approach to co-discovery learning is as follows: I usually ask two or more users to perform a task together while I observe them. I encourage them to converse and interact with each other to create a “team spirit”. In some cases, I also allow note-taking (e.g., when the content is technical or complex). The technique can yield some really powerful results, as it is more natural for users to verbalise their thoughts during the test.

Participatory Design:


Participatory design is about involving users in the design and decision-making process through an iterative cycle of design and evaluation. I usually conduct a short participatory design session prior to all of my usability evaluations. In these sessions, the usability issues of a prototype system are determined, and the changes needed to address these issues are made. The refined system is then used in the actual usability evaluation.


A4.0 Inspection Methods

Cognitive Walkthrough:


The cognitive walkthrough is a “quick and dirty” inspection method requiring a number of expert evaluators. A list of tasks and the actions needed to complete them is created. The evaluators step through each task, action by action, noting down problems and difficulties as they go. I can use cognitive walkthroughs on a number of digital interfaces, ranging from web sites to complex authoring toolkits.

Heuristic Evaluation:


Heuristic evaluation is about judging the compliance of an interface against a number of recognized usability principles (i.e., the Heuristics). I used this method extensively in the evaluation of e-learning prototypes during my teaching at Middlesex University.


A5.0 Advanced Usability Techniques (in training)

Eye Tracking:


Eye tracking is a technique that pinpoints where users look on a system and for how long. I am currently talking with Middlesex University about getting training in eye tracking as a usability testing technique. We plan to conduct a series of eye-tracking sessions in Middlesex’s state-of-the-art usability labs, using the MGUIDE prototypes.

Emotion Recognition & Eye Tracking:

This is a technique I developed during the MGUIDE project. I discuss it in detail here. It was developed with avatar-based interfaces/presentation systems in mind, but it is universal in nature. It is based on the hypothesis that the perceived accessibility of a system’s content is evident in the user's emotional expressions. The combined “Emotion Recognition and Eye-tracking” technique will be validated in a lab-based study that will be performed at Middlesex University.


A6.0 Audits

Accessibility Audit:

In an accessibility audit, an expert checks the compliance of a web site against established guidelines and metrics. The W3C WAI guidelines are the most widely used in accessibility audits. My approach to accessibility evaluation is framework-based (see here), but a) I haven’t applied my framework with disabled users, and b) the W3C WAI guidelines are very well established. Although I have a good knowledge of the W3C WAI guidelines, I have never performed an accessibility audit before.

 


Wednesday, August 4, 2010

Transferable Research Skills (Part A)

MGUIDE is my most up-to-date research work, and I am very proud of what I have accomplished. However, I’ve become eager to outgrow the domain and transfer my research skills to the digital media world. I am interested in any form of digital interactive application (websites, social networks, interactive TV, search engines, games, etc.). I am highly experienced in using the following techniques for user research:

A: User Research

A1.0 Quantitative Research

Surveys/Questionnaires (Online and Offline):

Post-test and pre-test questionnaires provide real insights into user needs, wants and thoughts. I use powerful statistics (e.g., Cronbach’s alpha) to ensure that the questionnaires I create are both reliable and valid. I can apply these skills to any domain with minimal adaptation time.
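For illustration, a minimal Python sketch of Cronbach’s alpha computed from a hypothetical respondents-by-items score matrix:

import numpy as np

def cronbach_alpha(scores):
    """scores: 2D array, rows = respondents, columns = questionnaire items."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    sum_item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (n_items / (n_items - 1)) * (1 - sum_item_variances / total_variance)

# hypothetical 1-5 ratings: 5 respondents x 4 items
answers = [[4, 5, 4, 4],
           [3, 3, 4, 3],
           [5, 5, 5, 4],
           [2, 3, 2, 3],
           [4, 4, 5, 4]]
print(f"alpha = {cronbach_alpha(answers):.2f}")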

Performance Measures:

Performance measures – for example, time to complete a task, the number of errors made, scores in retention tests, etc. – provide a strong indication of how easily people can achieve tasks with a system. If these data are correlated with other objective or subjective measures, they can provide deeper user insights than surveys/questionnaires alone.
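For example, a minimal Python sketch correlating an objective measure (time on task) with a subjective one (a satisfaction rating), using hypothetical paired values:

from scipy.stats import pearsonr

time_to_complete = [95, 120, 80, 150, 110, 70, 130, 100]     # seconds per task (hypothetical)
satisfaction     = [4.5, 3.5, 5.0, 2.5, 3.0, 4.5, 3.0, 4.0]  # 1-5 rating (hypothetical)

r, p = pearsonr(time_to_complete, satisfaction)
print(f"r = {r:.2f}, p = {p:.3f}")  # a negative r would suggest slower tasks feel less satisfying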

Log File Analysis:

A log is a file that lists actions that have occurred. Both quantitative and qualitative data can be stored in a log file for later analysis. I use device/system logs to automatically collect data such as time to complete a task, items selected on the interface, keys pressed, etc.
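A minimal Python sketch of this kind of analysis, using a hypothetical log format (not the actual MGUIDE logs):

from datetime import datetime

log_lines = [
    "2010-07-02 14:00:05 TASK_START route_1",
    "2010-07-02 14:00:12 KEY_PRESS pause",
    "2010-07-02 14:01:40 ITEM_SELECTED landmark_3",
    "2010-07-02 14:03:27 TASK_END route_1",
]

events = []
for line in log_lines:
    date, time, event, *rest = line.split()
    events.append((datetime.fromisoformat(f"{date} {time}"), event, rest))

start = next(t for t, e, _ in events if e == "TASK_START")
end = next(t for t, e, _ in events if e == "TASK_END")
key_presses = sum(1 for _, e, _ in events if e == "KEY_PRESS")
print(f"time to complete: {(end - start).total_seconds()} s, key presses: {key_presses}")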


A2.0 Qualitative Research:

Focus Groups:

I mainly use focus groups for requirements gathering, either through the introduction and discussion of new ideas and/or the evaluation of low-fidelity prototypes.

Direct Observation:

One of the most common techniques for collecting data in an ethnographic study is direct, first-hand observation of participants. I am experienced in using direct observation for note-taking in both indoor and outdoor environments. I find gaining an understanding of users through first-hand observation of their behaviour while they use a digital system genuinely exciting. During my work on MGUIDE, direct observation was used to uncover a number of interesting user insights that were then correlated with user views collected from the questionnaires.

User Interviews & Contextual inquiry:

Other common ethnographic techniques are user interviews and contextual inquiry. I make extensive use of open-ended interviews, i.e., interviews where the interviewees are all asked the same open-ended questions, in both field and lab conditions. I like this particular style as it is faster, and the results can be more easily analysed and correlated with other data.

Think Aloud Protocol:

Think-aloud is a technique for gathering data during a usability testing session. It involves participants thinking aloud as they perform a set of specified tasks. I have used think-aloud very successfully in navigation tasks, where participants had to verbalise their answers to navigation problems as these were presented by two interactive systems.


A3.0 Quantitative & Qualitative Research

Usability and Accessibility testing:


Lab-based and Field-based testing are the most effective ways of revealing usability and accessibility issues. I am experienced in conducting and managing lab and field testing. I use scenario-based quantitative and qualitative methods for my research.

Continues in the next post

Monday, August 2, 2010

Universality of Research Methods & Techniques

I thought that the universality of research methods was a fundamental fact of modern science. Isn’t it obvious that having successfully applied quantitative/qualitative research in one domain means that your skills can be applied to any other domain with minimal adaptation time? Is there a real difference between applying qualitative research in a complex avatar system like MGUIDE and in an e-commerce web site? For example, if you apply techniques like unstructured interviews, wouldn’t you follow the same principles to design the interviews in both domains?

Or take more complex techniques like eye tracking and emotion recognition – aren’t these domain-independent? Consider, for instance, my combined emotion recognition + face detection technique for accessibility research, described in the previous post. The technique was developed with avatar-based interfaces/presentation systems in mind. Adapting the technique to different domains is a matter of defining the aspects of the interface you wish to research. The quantitative data that you collect are the same (emotion intensities, etc.); the qualitative data will of course differ because the interfaces are different. In general, once you establish the objectives/goals of the research, deciding which techniques to use (and modifying them if necessary to suit your needs) is easy, and the process is domain-independent.


Eye tracking used in completely different contexts: a) a 3D avatar-based world and b) a web page.

I am not sure why some people insist otherwise and focus so much on the subject matter. I have to agree that having expertise in a certain area means you can produce results fairly quickly. However, domain knowledge is easily learnt. Is domain expertise the most important quality a user researcher should have? Or should they instead have a solid set of research skills to start from, and the willingness to learn more about established techniques and explore new ones?

Saturday, July 31, 2010

Video Games & Online Games

This post is an attempt to disambiguate the domain of virtual humans. Most people have never heard the term “virtual human” before, but they all play games (online or offline), and they have all interacted with some limited form of a VH on the web.

Computer games (online and offline) are the closest thing to the domain of Virtual humans.

Online games (e-gaming)

You could argue that online games are much simpler than video games, but they are progressively getting more complicated. As in video games, fully fleshed-out avatars are widely used to immerse the player in the scenario. Below is an example of a poker game from a company called PKR. Notice the use of body language, facial expressions, etc. to create a fully realistic poker simulation.

Video Games:

Below is a screenshot from my favourite game Mass Effect:

Source: http://www.jpbrown.co.uk/reviews.html

Notice the use of dialogue wheels to simulate dialogues between the avatars. There is an excellent analysis of this particular style of conversation here.

However, in contrast to most current video games, virtual humans engage players in actual dialogue, using speech recognition, dialogue system technology, and emotional modelling to deepen the experience and make it more entertaining. Such technologies have only recently started to find their way into video games. Tom Clancy’s EndWar uses speech recognition to allow players to give commands to their armies.

Source: http://www.the-chiz.com/

Some games go as far as using full natural language processing:

Virtual Humans on the Web:

There are a lot of very superficial virtual humans on the web. This is perhaps one of the main reasons they have so far failed to become a mainstream interface. Virtual humans should be about the whole package: emotion modelling, cognition, speech, dialogue, domain strategies and knowledge, gestures, etc. Avatars like IKEA’s Anna are mere drawings with very limited dialogue abilities, and are simply there to create a more interesting FAQ (Frequently Asked Questions) section. There is still some way to go before we see full-scale avatars on the web, but we will get there.

 

Source: http://www.ikeafans.com/forums/swaps-exchanges/1178-malm-bed-help.html

 

Friday, July 30, 2010

E-Learning Prototype

Below is the prototype of an e-learning system that I was asked to produce for a company. As I cannot draw, I decided to use Microsoft Word to communicate my ideas. There must be a good storyboarding tool out there that could help me streamline the process.

The design below is based on existing and proven technologies that can be easily integrated into existing e-learning platforms. CodeBaby, a company in Canada, has already been using avatars (such as those shown in my design) [1] in e-learning very successfully for several years. The picture in the last screen of the design is a virtual classroom [2] created in the popular Second Life platform.

 
Compare my solution with the “conventional” e-learning platform shown below. Although I do include several GUI (Graphical User Interface) elements in my work, it is obvious that: a) my interface is minimalistic, with fewer elements on the screen; and b) accessibility is greater, as instead of clicking on multiple links to accomplish tasks, you can simply “ask” the system using the most natural method you know – natural language. The benefits of avatar-assisted e-learning will become more evident as the web progresses from its current form to Web 2.0 and ultimately to Web 3.0. For now, such solutions should at least be offered as an augmentation to “conventional” GUI-based interfaces. All companies want something more – for example, something that would add easier access to module contents and the “WOW” factor to their products. They just don’t know what it is until you show it to them.
 
 
Although the proposed design is based on mature and well-tested technologies, I can understand if someone wants a purely GUI solution. In fact, I would be more than happy to assist them. I have been working with GUI interfaces for several years, long before I developed an interest in avatar technologies. I developed my first e-learning tool back in 1998 (12 years ago). It was an educational CD-ROM about the robotic telescope platform of Bradford University.
 

[1] http://www.codebaby.com/showcase/

[2] http://horizonproject.wikispaces.com/Virtual+Worlds

Heuristics vs. User Research

People keep asking me about the W3C accessibility guidelines – a set of heuristics that should aid designers towards more accessible web sites. Of course, these are not the only guidelines out there; the BBC has its own accessibility guidelines, and there are several for web usability as well. Although I am familiar with the W3C guidelines, I didn’t use them in my MGUIDE work because I didn’t find them relevant: they are written specifically for web content and not for multimodal content. The research in the area of virtual humans provides more relevant heuristics, but there is still room for massive additions and improvements. Instead of heuristic evaluation, I decided to build my own theoretical framework to guide my research efforts. The framework is based on the relevant literature in the area and on well-documented theories of human cognition. It provides all the necessary tools for iterative user testing and design refinement.

There is no doubt that relying on user testing is costly and lengthy. This becomes even more difficult when you have to deal with large groups of people, as I did in MGUIDE. However, the cost and time can be minimised with the use of proper tools. For example, the Global Lab project has created a virtual lab (on the popular Second Life platform) in which projects are accessible to anybody, anytime, and from anywhere. New research methods like eye tracking and emotion recognition can reveal user insights with a relatively small group of people and with minimal effort. Soon enough, perhaps, tools will include routines that calculate deep statistics with minimal intervention. User testing definitely has some way to go before it becomes mainstream, but I am sure we will get there.

Until then, inspection methods (e.g., cognitive walkthroughs, expert surveys of the design, heuristic evaluations, etc.) are used to replace user testing. In such a process, some high-level requirements are usually prototyped and then judged by an expert against established guidelines. A major problem with this approach, though, is that there are over 1,000 documented design guidelines [1]. How do you choose which one is appropriate in a given context? It is my understanding that each institute/professional uses a set of best-practice guidelines, adapted from the relevant literature and from years of experience. However, even if these guidelines have worked in the past, it doesn’t mean they will work again. Technology is progressing extremely fast, and people become more knowledgeable and more accustomed to technology every single second. Therefore, even when inspection methods are used, some form of user testing is necessary. A focus group, for example, with a couple of users can provide enough user insights to amend a design as necessary.

[1]http://www.nngroup.com/events/tutorials/usability.html 

 

 

Wednesday, July 28, 2010

Emotion Recognition for Accessibility Research

There are a number of quantitative techniques that can be used in the user research of avatar-based interfaces. Apart from the “usual” techniques for gathering subjective impressions (through questionnaires, tests, etc) and performance data, I also considered a more objective technique based on emotion recognition. In particular, I thought of evaluating the accessibility of the content presented by my systems through the use of emotion expression recognition. The main hypothesis is that the perceived accessibility of the systems' content is evident in the user's emotional expressions. 

If you think about it for a while, the human face is the strongest indicator of our cognitive state and hence of how we perceive a stimulus (information content, image, etc.). Emotion measures (both quantitative and qualitative) can provide data that augment any traditional technique for accessibility evaluation (e.g., questionnaires, retention tests, etc.). For example, with careful logging you can see which part of your content is more confusing, which part requires the users to think more intensively, and so on. In addition to the qualitative data, the numeric intensities can be used for some very interesting statistical comparisons. Manual coding of the video streams is no longer necessary, as there are a number of tools that automate the analysis of facial expressions. To my knowledge the following tools are currently fully functional:

1) Visual Recognition


2)   SHORE


The idea is fully developed, and I am planning to release a paper on it very soon. Finally, if we combine this technique with eye tracking, we can reveal even more user insights about avatar-based interfaces. We could try, for instance, to identify what aspect of the interface makes the user produce a particular facial expression (positive or negative). For example, one of the participants in my experiments mentioned that she couldn’t pay attention to the information provided by the system, because she was looking at the guide’s hair waving. To such a stimulus humans usually have a calm expression. This comment is just an indication of the user insights that can be revealed if these techniques are successfully combined.
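A minimal, purely hypothetical Python sketch of the combination described above – aligning time-stamped emotion intensities with the screen region fixated at the same moment, then reporting the dominant expression per region (all region names, labels and values are invented):

from collections import defaultdict

# (timestamp in seconds, fixated region, emotion label, intensity 0..1) - hypothetical samples
samples = [
    (1.0, "guide_face", "neutral", 0.7),
    (2.0, "guide_hair", "calm", 0.8),
    (3.0, "text_panel", "confusion", 0.6),
    (4.0, "text_panel", "confusion", 0.7),
    (5.0, "guide_face", "happiness", 0.5),
]

by_region = defaultdict(list)
for _, region, emotion, intensity in samples:
    by_region[region].append((emotion, intensity))

for region, observations in by_region.items():
    dominant = max(observations, key=lambda o: o[1])   # strongest expression seen at that region
    print(f"{region}: dominant expression = {dominant[0]} ({dominant[1]:.1f})")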

Tuesday, July 27, 2010

Accessibility/Universal Access

I recently found a good resource [1] on accessibility from a company called Cimex, which says what most designers and UX specialists fail to see: when you design for accessibility you do not cater only for less able users. You are making sure that your content is open and accessible to a variety of people and machines, using whatever browser or method they choose.

Now, by catering for a variety of people of different physical, cognitive, emotional and language backgrounds, and for the methods they choose to use, you end up with Universal Access.

With traditional interfaces it is difficult to achieve the goals of Universal Access. Virtual humans as interfaces hold high potential for achieving the goals of UA, as the modalities used in such interfaces (e.g., natural language, gestures, facial expressions and others) are the ones our brains have been trained to understand over thousands of years. Virtual humans can speak several languages with minimal effort (see the Charamel showroom). Their body and facial language can easily be adjusted to highlight the importance of a message. Sign language can be used to communicate with less able users (no other interface can currently accomplish that). Accurate simulation of interpersonal scenarios (e.g., a sales scenario) can ensure that your message gets across as effectively as it would if a real person were speaking it.

In my work I went as far as universal accessibility, comparing the effects of virtual human interfaces on the cognitive accessibility of information under simulated mobile conditions, using groups of users with different cultural and language backgrounds. In order to make the information easier to access, I used a variety of methods found in VH interfaces (e.g., natural language processing, gestures, interpersonal scenario simulation and others). By making the main functions of your system easier to access, you ultimately make the interface easier to use, and hence it was natural to investigate some usability aspects as well (e.g., ease of use, effectiveness, efficiency, user satisfaction, etc.). These are all aspects of the user experience (UX), i.e., the quality of experience a person has when interacting with a product. I cannot release any more information at this stage, as the necessary publications have not yet been made.

In the future, I believe that the existing technologies will merge into two mainstream platforms: a) robotic assistants on the one hand, and b) software assistants/virtual human interfaces on the other. Accessing the services these systems will offer will be as easy as our daily interactions with other people. The barriers that exist today (cognitive, physical, etc.) will become a thing of the past.

Monday, July 26, 2010

MGUIDE Development Process

I thought it would be a good idea to try to explain the methodologies followed in the development of the MGUIDE prototypes. With the focus mainly on the research outcomes, the development methodology followed was of little concern to the stakeholders involved. Trying to create interpersonal simulations like the ones found in real life is a process mostly compatible with the Scrum development methodology (shown below). I am planning to write a paper on the topic, and hence I will not say much in this post.


Source: Wikipedia

Gathering user requirements can be done in a variety of ways. I followed a combined literature-and-user-evaluation approach. One of my earliest prototypes was developed using guidelines found in the literature. The prototype was then evaluated with actual users, and a set of new requirements was developed. These requirements are what Scrum refers to as the “product backlog”. In each sprint (in my case usually 1-3 months), a set of the requirements was developed and tested, and then replaced by a new set of requirements. Doing simulations of interpersonal scenarios gives you the freedom to augment the product backlog with new requirements quite easily. Using research methods like direct observation and note-taking, you can take notes on the interactions found in the scenarios that you want to simulate. My scenario was a guide agent, so I went on a number of tours where I made a number of interesting observations. Most of my findings were actually implemented in the MGUIDE prototypes, but others still remain in the product backlog. These requirements, and the work done in MGUIDE, are enough to inform artificial-intelligence models of behaviour in order to create completely automated systems.
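For illustration, a minimal Python sketch of the backlog-and-sprint bookkeeping described above; the requirement texts are invented examples, not the actual MGUIDE backlog:

product_backlog = [
    {"id": 1, "requirement": "Guide points at landmarks while speaking", "priority": 1},
    {"id": 2, "requirement": "User can interrupt the presentation by voice", "priority": 2},
    {"id": 3, "requirement": "Guide adapts route length to remaining time", "priority": 3},
]

def plan_sprint(backlog, capacity):
    """Take the highest-priority items that fit into the sprint."""
    return sorted(backlog, key=lambda item: item["priority"])[:capacity]

sprint_backlog = plan_sprint(product_backlog, capacity=2)
for item in sprint_backlog:
    print(f"Sprint item {item['id']}: {item['requirement']}")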

This iterative process was then repeated prior to the actual user research stage, where the full set-up of the MGUIDE evaluation was tested. I used a small group of people who tried to find bugs in the software, problems with the data-gathering tools, and so on. The problems were normally corrected on site and the process was repeated. Once I had ensured that all my instruments were free of problems, the official evaluation stage of the prototypes started.

Closing this post, I must highlight the need for future research into gathering data about the different situations in which interpersonal scenarios occur. In reality, different situations produce different reactions in people, and this should be researched further. Only through detailed empirical experimentation can we ensure that future avatar-based systems will guarantee superior user experiences.

 

Friday, July 23, 2010

Speech Recognition

In order to successfully simulate an interpersonal scenario with a virtual human, you need speech recognition (in real life we speak to each other; we do not click buttons or type text). For this reason, I have been following the evolution of the speech recognition industry closely for some time now.

During the MGUIDE project I successfully integrated speech recognition into one of my prototypes. I used the Microsoft Speech Recognition Engine 6.1 (SAPI 5.1) with dictation grammars, which I developed using the Chant GrammarKit, in pure XML. The grammars look like this:

<!-- Top-level rule: the user asks to begin the tour -->
<RULE name="Q1" TOPLEVEL="ACTIVE">
  <!-- list of alternative phrases; "want_phrases" is a separately defined rule -->
  <l>
    <P><RULEREF NAME="want_phrases"/>to begin</P>
    <P><RULEREF NAME="want_phrases"/>to start</P>
    <P><RULEREF NAME="want_phrases"/>to start immediately</P>
  </l>
  <!-- optional trailing words -->
  <opt>the tour </opt>
  <opt>the tour ?then</opt>
</RULE>

I also voice-enabled the control of my system’s interface, so if you said “Pause” the virtual guide would pause its presentation. I briefly tested both modes with one participant in the lab. In dictation mode, with just a couple of minutes of training, Microsoft’s engine performed with 100% accuracy within the constraints of the grammar. For completely unknown input, the engine performed with less than 40% accuracy. In CnC mode, the engine worked with 100% accuracy without any training. Of course, SAPI 5.4 in Windows 7 offers much better recognition rates in both dictation and CnC modes. I haven’t tried SAPI 5.4, but it is in my plans for the future. I think that true speaker-independent (i.e., training-free) recognition in indoor environments is only about 5 years away, at least for the English language.
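One common way to quantify such accuracy figures is word error rate (WER): the edit distance between a reference transcript and the recogniser’s output, divided by the number of reference words. A minimal, generic Python sketch of the idea (the test phrases are made up):

def word_error_rate(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i reference words into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("i would like to start the tour", "i would like to start the tour"))      # 0.0
print(word_error_rate("i would like to start the tour", "i would like to stop the tour now"))   # about 0.29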

In mobile environments, Siri appears to be the only solution out there that realises the idea of a virtual assistant on the go using speech recognition. Siri uses dynamic grammar recognition, similar to my approach. If you say something within the constraints of the grammar, the accuracy of recognition reaches 100%. However, as in the case of my prototype, if you say something outside the grammar files, the recognition results can be really funny.

Statement to Siri: Tell my husband I’ll be late

Reply: Tell my Husband Ovulate (he he he)


Source: http://siri.com/v2/assets/Web3.0Jan2010.pdf

Terminology:

Dictation Speech Recognition: Refers to the type of speech recognition where the computer tries to translate what you say into text

Command and Control mode (CnC): This type of speech recognition is used to control applications

MGUIDE Project Funding

As people keep asking me about the funding of the MGUIDE project, I thought I would post this in an attempt to clarify the situation further. MGUIDE was a large and very sophisticated project, and money came from a variety of sources. The project started in 2007, and until 2008 Middlesex University was the main funding body and my last official employer. The package from Middlesex University covered my project expenses for that year and required me to perform a maximum of 15 hours of teaching per week. Two other universities and six companies also provided support in the form of know-how, and funding for tools and hardware. From 2008 until June 2010, I was able to secure funding from an angel investor and, thankfully, the continued support of the companies and universities. The idea with MGUIDE was, and still remains, to develop a commercial product out of it. However, because of the bad economic climate, my investor decided not to proceed any further. I still hope that this work will appeal to a company and that I will be able to see MGUIDE as an application for the iPad or any other tablet-based computer.

Wednesday, July 21, 2010

Project Management - E-Learning Projects

This post is not related to the MGUIDE project, but to the work I did at Middlesex University. Most of the modules I taught there were project-based. Usually I had to guide several groups of students (20-30 students) through the design and development of projects. One particular project had to do with the design and implementation of e-learning games for autistic children. Each of my students was given a case study describing the requirements of particular autistic children (as these were captured by the children’s teachers – for instance, that a child needs help in understanding emotional expressions), and had to produce a game under my guidance. The game was then sent to the relevant school for full-scale evaluation. Each semester I usually ended up marking 100-200 games, with at least 80% of them being top class. Below is an example of the projects produced under my guidance. All material is copyrighted by Middlesex University, so please ask before you copy anything:

All multimedia elements (including designs) were produced by my students using Adobe Photoshop. The tools needed for the game development, along with best-practice techniques, were discussed in detail in class.

Copyright by Middlesex University – Please do not copy

Each game was evaluated in class (by myself and the students). The games were then sent to the schools for formal evaluation by the children and their teachers. Below is a sample of the heuristic evaluation that was performed in class:

Evaluation criteria (negative and positive comments are recorded against each):

  • Background – First scene using a suitable background, clearly stating your own title for the topic, and giving clear instructions.

  • Text – A varied and clear use of text (spelling and grammar?) – too small, too large, not clear, or inappropriate words used.

Monday, July 19, 2010

Art Assets

Below are samples of the artwork I produced for the MGUIDE project. Although I have several years of experience designing with Adobe Photoshop, I do not consider myself a designer. Design is interesting, but I prefer programming and user-based research. However, if a project requires me to produce art assets, I am perfectly capable of doing that as well.

 

Friday, July 2, 2010

Experiment 4 setup

Due to demand, I decided to provide some information on the set-up of my experimental work during the evaluation stage of MGUIDE. The information below is the briefing participants had to read for experiment 4. Please note that the main technique for data collection in experiment 4 is the think-aloud protocol. I conducted the testing with two user groups of 6 participants each.

The purpose of this experiment is to investigate the possible effects of two mobile path-finding systems of variable complexity on your ability to find your way in the castle. You will have to use the systems to navigate along two different routes, visiting a number of landmarks in turn (10 to a maximum of 18), using system A on one route and system B on the other. The total duration of the experiment does not exceed 20 minutes per route. For the purpose of the experiment I have created two video applications representing each route of the castle in detail. At each video clip you will hear the question: “What would you do at this particular point, if you were in the castle?” You must answer the question based on the visual (i.e., gestures and landmarks) and/or audio instructions delivered by the system.

For example:

Given this instruction: From where you are, if you look to your right, you will see two chimneys. Opposite the chimneys there is a path that leads to another square. Please follow this path until you come across a house with a black front door!


And this clip:
[Video clip]

You will have to answer: “I will follow the path on the right of the tree until I see the house with the black front door.” The next video will show the result of this action (i.e., that you have moved towards the house with the black door) and will pose a new navigational challenge.

Wednesday, June 30, 2010

User Research – General Setup

User research in the MGUIDE project can be divided into the following stages:

  • Requirements gathering and specification: A prototype system was constructed based on recommendations from the literature on animated agents. The prototype was evaluated in the actual castle of Monemvasia with a number of participants. You can find more details here. The lessons learned were presented during the M.Phil. presentation and influenced the design of the final five (5) prototypes.

 

  • Design and prototypes: Based on the requirements gathered from the pilot in the castle of Monemvasia, five (5) prototypes were developed. A number of novelties were achieved during the development of the prototypes (e.g., an algorithm for natural language understanding, the design of Talos – a toolkit for system prototyping and research – and others). Although the initial idea was to continue the evaluation in the castle, due to lack of resources and time it was decided to simulate the conditions in the lab. There is an ongoing debate on whether mobile applications should be tested in the lab or in the field. For instance, in 2009, 70% of the developed systems were evaluated under lab conditions using a variety of techniques.

 

  • Evaluation: The evaluations carried out in Greece and the UK followed the same setup. I used detailed panoramic photography and high-definition video clips to represent in high detail all locations and attractions of the castle. The lab was a simple room in which each user participated individually. In general, the approach was successful, as participants could follow the same routes and watch the same presentations about the locations of the castle from the comfort of their chair.

Examples of the panoramas used in the evaluations are shown below: