Please be patient and read my current scenario. My question is below.
My application takes speech input and successfully groups matching words into either a single word or a group of words, called phrases: a name, an action, a pet, or a time frame.
I have a master list of the phrases that are allowed, stored in their respective arrays: validNamesArray, validActionsArray, validPetsArray, and validTimeFramesArray.
A new array of phrases is returned each and every time the user stops speaking.
NSArray *phrasesBeingFedIn = @[@"CHARLIE", @"EAT", @"AT TEN O CLOCK",
                               @"CAT",
                               @"DOG", @"URINATE",
                               @"CHILDREN", @"ITS TIME TO", @"PLAY"];
I know that the following combinations are allowed to create a command:
COMMAND 1: NAME + ACTION + TIME FRAME
COMMAND 2: PET + ACTION
COMMAND n: n + n, .. + n
//In the example above, only the groups of phrases 'Charlie eat at ten o clock' and 'dog urinate'
//would be valid commands; the phrase 'cat' would not match any of the commands
//and will therefore be ignored
Question
What is the best way for me to parse through the phrases being fed in and determine which combinations of phrases satisfy my list of commands?
POSSIBLE solution I’ve come up with
One way is to step through the array with if and else statements that look ahead at the following phrases and check whether they satisfy any valid command pattern from the list. However, this solution is not dynamic: I would have to add a new set of if and else statements for every new command permutation I create.
My solution is not efficient. Any ideas on how I could build something that works and stays dynamic, no matter how many new command sequences or phrase combinations I add?
I have minimal experience with Objective-C, so I'm going to run through the solution I would build using C#, but will explain as I go. It should be easy to replicate in any similar language.
I would start by defining an enum to specify the phrase types:
enum PhraseTypes {
name = 0,
action = 1,
pet = 2,
timeFrames = 3
}
I would then define a Rule as having the following structure:
class Rule {
    public List<PhraseTypes> components;  //phrase types, in the order they must appear
    public Action<List<String>> action;   //or whatever makes sense for your app
}
The user's command could then be represented as:
List<String> command;
The valid phrases could be:
List<String> validNamesArray;
List<String> validActionsArray;
// ...
And then compiled into:
List<List<String>> validPhrases = new List<List<String>>( new List<String>[]
{validNamesArray, validActionsArray, validPetsArray, validTimeFramesArray}
);
Performing a check against the rules, and calling the correct action could then be done like this:
bool CheckCommandAgainstRules(List<Rule> rules, List<String> command) {
    foreach (Rule rule in rules) {
        if (CheckCommandAgainstRule(rule, command)) {
            rule.action(command); //or whatever makes sense for your app
            return true;
        }
    }
    return false;
}
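bool CheckCommandAgainstRule(Rule rule, List<String> command) {
    //a command with fewer phrases than the rule has components cannot match
    if (command.Count < rule.components.Count) {
        return false;
    }
    for (int i = 0; i < rule.components.Count; i++) {
        PhraseTypes phraseType = rule.components[i];
        String phrase = command[i];
        //the enum value doubles as the index into validPhrases
        if (!validPhrases[(int)phraseType].Contains(phrase)) {
            return false;
        }
    }
    return true;
}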
This overall structure should easily scale to any number of rules and/or phrase types.
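The check above compares a rule against the start of a command. The example input in the question is a longer stream that contains several commands with unrelated phrases in between, so here is a rough sketch of how the same rule check could be slid along the whole array. The names CommandScanner, MatchesAt, Scan and Example are made up for illustration, the contents of the valid phrase sets are assumed, and I have used a dictionary of hash sets instead of the list of lists purely for convenience:

```csharp
using System;
using System.Collections.Generic;

enum PhraseTypes { name, action, pet, timeFrames }

class Rule {
    public List<PhraseTypes> components;
    public Action<List<string>> action;
}

static class CommandScanner {
    // True if the phrases starting at index 'start' match every component of the rule, in order.
    static bool MatchesAt(Rule rule, List<string> phrases, int start,
                          Dictionary<PhraseTypes, HashSet<string>> validPhrases) {
        if (start + rule.components.Count > phrases.Count) return false;
        for (int i = 0; i < rule.components.Count; i++) {
            if (!validPhrases[rule.components[i]].Contains(phrases[start + i])) return false;
        }
        return true;
    }

    // Walks the phrase stream, fires the action of the first rule that matches at each
    // position, and skips phrases that start no command. Returns the number of commands found.
    public static int Scan(List<Rule> rules, List<string> phrases,
                           Dictionary<PhraseTypes, HashSet<string>> validPhrases) {
        int found = 0;
        int pos = 0;
        while (pos < phrases.Count) {
            Rule matched = null;
            foreach (Rule rule in rules) {
                if (MatchesAt(rule, phrases, pos, validPhrases)) { matched = rule; break; }
            }
            if (matched == null) { pos++; continue; }   // e.g. a stray "CAT": ignore it
            matched.action(phrases.GetRange(pos, matched.components.Count));
            found++;
            pos += matched.components.Count;            // continue after the consumed phrases
        }
        return found;
    }
}

static class Example {
    // Runs the scanner on the phrases from the question with assumed valid-phrase sets
    // and returns each recognised command as a single string.
    public static List<string> Run() {
        var validPhrases = new Dictionary<PhraseTypes, HashSet<string>> {
            { PhraseTypes.name,       new HashSet<string> { "CHARLIE" } },
            { PhraseTypes.action,     new HashSet<string> { "EAT", "URINATE", "PLAY" } },
            { PhraseTypes.pet,        new HashSet<string> { "CAT", "DOG" } },
            { PhraseTypes.timeFrames, new HashSet<string> { "AT TEN O CLOCK" } }
        };
        var recognised = new List<string>();
        Action<List<string>> record = cmd => recognised.Add(string.Join(" ", cmd));
        var rules = new List<Rule> {
            new Rule { components = new List<PhraseTypes> {
                PhraseTypes.name, PhraseTypes.action, PhraseTypes.timeFrames }, action = record },
            new Rule { components = new List<PhraseTypes> {
                PhraseTypes.pet, PhraseTypes.action }, action = record }
        };
        var phrases = new List<string> {
            "CHARLIE", "EAT", "AT TEN O CLOCK", "CAT", "DOG", "URINATE",
            "CHILDREN", "ITS TIME TO", "PLAY" };
        CommandScanner.Scan(rules, phrases, validPhrases);
        return recognised;
    }
}
```

Because the first matching rule wins, it is probably wise to list longer rules before shorter ones that share the same opening phrase types. Adding a new command is then just a matter of adding another Rule to the list.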
Edit:
1) Yeah, I figured that once you have solved the problem of determining whether a command is valid and identified which rule it matches, the next issue is triggering some kind of action. One way is to let the Rule know which action is required. In C# I would use a delegate, or a class with a known interface so I could call its .run() method. The best approach is likely to vary from language to language.
2) My understanding was that you are using phrase to mean an atomic group of one or more words, e.g. "AT TEN O CLOCK", "DOG" or "PLAY". I have used command to represent a list of phrases as derived from user input. Rereading your question, it looks like you have used command to refer to a valid combination of phrase types; this is something I have been calling a Rule.
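To make point 1 a bit more concrete, the two options might look roughly like this in C#. All of the type names here (DelegateRule, ICommandAction, RecordingAction, InterfaceRule) are invented for the sketch, and what the action actually does is entirely up to your app:

```csharp
using System;
using System.Collections.Generic;

// Option A: the rule stores a delegate that is invoked with the matched phrases.
class DelegateRule {
    public Action<List<string>> action;
}

// Option B: the rule stores an object with a known interface and calls its Run method.
interface ICommandAction {
    void Run(List<string> command);
}

// A stand-in action that just remembers which commands it was asked to run.
class RecordingAction : ICommandAction {
    public List<string> Seen = new List<string>();
    public void Run(List<string> command) {
        Seen.Add(string.Join(" ", command));
    }
}

class InterfaceRule {
    public ICommandAction action;
}
```

The delegate version is shorter when the action is a one-liner; the interface version is easier to carry over to languages without delegates, where an object with a known method (or, in Objective-C, something like a block or selector) plays the same role.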