Wednesday, September 30, 2009

Natural Language Processing

 
VPF (Virtual People Factory) is a web service where "you can create virtual people for a variety of uses. Currently the most common use of the Virtual People Factory is to create Virtual Patients for Medical and Pharmacy education."
Other free VH hosting services are the following:


Although I am sure you can cite more advanced systems than these, unlike systems that exist only as claims, these three are: a) publicly available, b) fully functional, and c) free to use. None of the three actually processes language, but VPF is by far the most effective. As the VPF people were kind enough to provide me with their script-matching algorithm and a fully functional API, I was able to address this limitation.
 
VPF currently relies on a score-matching algorithm that estimates how likely a user's input is to match a trigger (i.e., a phrase entered into the system by the content developer). In my approach, simple keyword matching is the last stage of processing, used only if all other stages fail. In particular, the algorithm works as follows:

Stage 1.

Compares the user's input and the trigger returned by VPF for common tokens and POS (part-of-speech) tags. If the comparison is successful (all tokens match), it returns the answer associated with the VPF trigger. If the comparison fails (the keywords are not equal), it moves on to the next stage.

Stage 2.

Conducts a series of predicate tests on the input against a DB of predefined phrases. If the comparison is successful, it passes the matched phrase to VPF for a 100% match. If the comparison fails, it moves on to the final stage, where the system reverts to simple VPF keyword matching.

Stage 3.

VPF keyword matching.

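
The fall-through between the three stages can be sketched roughly as follows. This is only an illustration of the control flow, not the Talos code: the `MatchStage` delegate, the three stand-in stage functions, the sample phrases and the `Answer` helper are all assumptions made for the example, and no VPF API is called.

```vbnet
Imports System

Module StagedMatchingSketch
    ' Hypothetical stage signature: returns an answer, or Nothing to fall through.
    Delegate Function MatchStage(ByVal userInput As String) As String

    ' Stand-in for Stage 1: succeeds only on an exact (case-insensitive) phrase match.
    Function SyntacticStage(ByVal userInput As String) As String
        If userInput.ToLowerInvariant() = "do you smoke" Then Return "No, I quit last year."
        Return Nothing
    End Function

    ' Stand-in for Stage 2: a single predicate test against a predefined phrase.
    Function PredicateStage(ByVal userInput As String) As String
        If userInput.ToLowerInvariant().StartsWith("are you a smoker") Then Return "No, I quit last year."
        Return Nothing
    End Function

    ' Stand-in for Stage 3: plain keyword matching, which always produces some reply.
    Function KeywordStage(ByVal userInput As String) As String
        If userInput.ToLowerInvariant().Contains("smok") Then Return "No, I quit last year."
        Return "Sorry, I did not understand that."
    End Function

    ' Runs the stages in order and returns the first non-empty answer.
    Function RunStages(ByVal userInput As String, ByVal stages As MatchStage()) As String
        For Each stage As MatchStage In stages
            Dim answer As String = stage(userInput)
            If Not String.IsNullOrEmpty(answer) Then Return answer
        Next
        Return Nothing
    End Function

    ' Convenience wrapper that wires the three illustrative stages together.
    Function Answer(ByVal userInput As String) As String
        Dim stages As MatchStage() = New MatchStage() { _
            AddressOf SyntacticStage, AddressOf PredicateStage, AddressOf KeywordStage}
        Return RunStages(userInput, stages)
    End Function

    Sub Main()
        Console.WriteLine(Answer("Do you smoke"))            ' answered by the Stage 1 stand-in
        Console.WriteLine(Answer("Are you a smoker, sir"))   ' answered by the Stage 2 stand-in
        Console.WriteLine(Answer("What about the weather"))  ' falls through to Stage 3
    End Sub
End Module
```

Keeping each stage behind the same signature means a fourth stage could be slotted into the array without touching the loop.
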
Semantic processing and comparison was also implemented as an additional stage in the algorithm, but as it has some problems that need to be addressed first, it was decided not to use it in the final prototype. The plan is to integrate the VPF script-matching algorithm into the existing one and create a four-stage approach (with semantic processing included) that will enable the NLU component of the Talos authoring tool to fully process the user's input before matching it with a trigger.
Another fully developed idea for Talos is the creation of a dialogue manager based on HTN (Hierarchical Task Networks), but as it currently exists only on paper, I would prefer not to discuss it any further.
 

Code for Stage 1

Sub Syntactic_Keyword_Processing(ByVal userinput As String)
        'we need to load the tagger first
        Try
            tagger_counter += 1
            If tagger_counter = 1 Then
                load_tagger()
            Else
                'tagger is already loaded, do not load it again
            End If
            'remove punctuation and contractions first
            Dim contractions As List(Of String) = New List(Of String)(New String() _
           {"didn't", "'ll", "'re", "lets", "let's", "'ve", "'m", "won't", "'d", "'s", "n't"})
            Dim word_contractions As List(Of String) = New List(Of String)(New String() _
        {"did not", "will", "are", "let us", "let us", "have", "am", "will not", "would", "is", "not"})
            Dim end_line As String = "[\,\?\!\.]|\s+$"
            Dim start_line As String = "[\,\?\!]|^\s+"
            Dim userinput2 As String = Regex.Replace(userinput, end_line, "")
            Dim userinput3 As String = Regex.Replace(userinput2, start_line, "")
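            'expand contractions (only the first contraction found is expanded)
            For Each item As String In contractions
                'test the cleaned string, since that is the one being rewritten
                If userinput3.Contains(item) Then
                    Dim cont_position As Integer = contractions.IndexOf(item)
                    Dim what_word As String = word_contractions.Item(cont_position)
                    'escape the contraction so it is treated as literal text, not as a pattern
                    userinput3 = Regex.Replace(userinput3, Regex.Escape(item), Space(1) & what_word)
                    Exit For
                End If
            Next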
            Dim ask_step As String = "step_1"
            Select Case ask_step
                Case "step_1"
                    hr.addUserInput(userinput)
                    Dim response As String = hr.findResponses(Current_Script)
                    Dim index As Integer = hr.findMostRelevantResponse()
                    VPF_trigger = hr.getResponseMatchedSentence(index)
                    If VPF_trigger <> "" Then
                        'Perform syntactic comparison between the input and the trigger
                        syntactic_keyword_comparison(userinput3)
                        If comparison = "Sucessful" Then
                            _answer = hr.getResponseMatchedSpeech(index)
                            input_list.Clear()
                            trigger_list.Clear()
                        Else
                            ask_step = "step_2"
                        End If
                    End If
            End Select
            Select Case ask_step
                Case "step_2"
                    If new_question <> "" Then
                        hr.addUserInput(new_question)
                        Dim response As String = hr.findResponses(Current_Script)
                        Dim index As Integer = hr.findMostRelevantResponse()
                        _answer = hr.getResponseMatchedSpeech(index)
                        input_list.Clear()
                        trigger_list.Clear()
                    End If
            End Select
        Catch ex As Exception
            output.Clear()
            output.Text += ex.Message + Environment.NewLine
        End Try
    End Sub
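
The listing above calls `syntactic_keyword_comparison`, which is not shown in this post. As a rough idea of what an "all tokens match" test could look like, here is a minimal sketch. It is an assumption for illustration only, not the actual routine: the `Tokenize` and `AllTokensMatch` names are hypothetical, and the POS comparison is left out because it depends on the tagger.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module TokenComparisonSketch
    ' Lowercases the text, strips , ? ! . and splits it on spaces.
    Function Tokenize(ByVal text As String) As String()
        Dim cleaned As String = Regex.Replace(text.ToLowerInvariant(), "[,?!.]", "")
        Return cleaned.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
    End Function

    ' True only when both phrases yield the same tokens in the same order.
    Function AllTokensMatch(ByVal userInput As String, ByVal trigger As String) As Boolean
        Dim inputTokens As String() = Tokenize(userInput)
        Dim triggerTokens As String() = Tokenize(trigger)
        If inputTokens.Length <> triggerTokens.Length Then Return False
        For i As Integer = 0 To inputTokens.Length - 1
            If inputTokens(i) <> triggerTokens(i) Then Return False
        Next
        Return True
    End Function

    Sub Main()
        Console.WriteLine(AllTokensMatch("Do you smoke?", "do you smoke"))  ' True
        Console.WriteLine(AllTokensMatch("Do you smoke?", "do you drink"))  ' False
    End Sub
End Module
```

A stricter version would also compare the POS tag of each token pair, so that the same word used as a noun in one phrase and as a verb in the other would not count as a match.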
