Sunday, 18 January 2015

Automated Spellings Test Practice with SL4A

A busy morning a week or so ago gave me an idea for a Geek Dad project.  I was rushing around, getting chores done before work when my daughter reminded me I needed to test her on her spellings ahead of a test that day at school.  If only we had an automatic way of doing this that allowed her to practise on her own....

A detailed ethnographic* study of how we conduct spelling test practice led to these basic requirements:
  • It's possible to "load" this week's spellings into the system.
  • The system will read out the spellings to my daughter.
  • The system will take voice commands to manage the practice (e.g. repeat existing word, move on to next word, explain a word).
(*I had a quick think about it!).

First a video of it, then an explanation:


I chose to use SL4A (Scripting Layer for Android), my favourite scripting environment for Android phones, to prototype this.  It provides a decent Python environment and can access the Android API for text-to-speech and speech-to-text capabilities.  One day I'll build a full-blown application...

Loading the spellings is done by sending a text message to my phone.  The message format has to be "spellings,<child's name>,word 1,word 2,word n".  I then use the smsGetMessages method to read the SMS inbox, find the first message that starts with "spellings" and split it on the commas into a string array.
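The parsing step can be tried out away from the phone.  Here's a minimal sketch, assuming the message bodies have already been pulled out of the inbox as plain strings.  The function names parse_spellings_sms and find_spellings are made up for this illustration, and the whitespace trimming and case-insensitive keyword check are extras that the prototype itself doesn't do.

```python
def parse_spellings_sms(body):
    """Return (child_name, words) for a spellings message, or None otherwise."""
    # Split on commas and trim stray spaces around each field (an addition
    # for robustness; the prototype splits without trimming).
    fields = [field.strip() for field in body.split(',')]
    # A usable message needs the keyword, a name and at least one word.
    if len(fields) < 3 or fields[0].lower() != 'spellings':
        return None
    words = [word for word in fields[2:] if word]
    if not fields[1] or not words:
        return None
    return fields[1], words


def find_spellings(bodies):
    """Return the parsed form of the first spellings message, or None."""
    for body in bodies:
        parsed = parse_spellings_sms(body)
        if parsed is not None:
            return parsed
    return None
```

On the phone, the bodies would come from the 'body' field of each entry that smsGetMessages returns, as in the full listing.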

The prototype starts by reading a welcome message to the user (using the child's name from the second field of the SMS) and then reading out the first spelling.  I use the ttsSpeak method to do the text-to-speech conversion.  The SpeakSomething function takes an Android object and a string array as parameters.  The reason I use a string array rather than one single string is that it allows a short pause between the elements as the system speaks them.  This makes it sound a little more natural.
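The "list of phrases with a pause between them" idea is easy to experiment with off the phone.  This is only a sketch: speak_chunks and build_word_prompt are names invented here, and the speak argument is a stand-in for whatever actually does the talking (on the phone that would be droid.ttsSpeak).

```python
import time


def speak_chunks(speak, chunks, pause_seconds=1.0):
    """Pass each chunk to speak(text) in turn, pausing after each one."""
    for chunk in chunks:
        speak(chunk)
        time.sleep(pause_seconds)


def build_word_prompt(word_number, word):
    """Build the two-part prompt used to announce a spelling."""
    return ['Word number ' + str(word_number), word]


# Dry run: collect the phrases in a list instead of speaking them.
spoken = []
speak_chunks(spoken.append, ['Hi Anna', 'Welcome to your spellings test.'], pause_seconds=0)
speak_chunks(spoken.append, build_word_prompt(1, 'because'), pause_seconds=0)
print(spoken)
```

Passing the speaking function in as a parameter means the same code can be tested at a desk and then pointed at the real text-to-speech call later.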

The prototype then gets into a while loop, prompting the user to speak and interpreting the speech using the recognizeSpeech method, which communicates with Google's most excellent speech-to-text capability.
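The shape of that loop can be sketched without any speech recognition at all.  In the version below, run_practice is a hypothetical helper: heard_commands stands in for repeated recognizeSpeech calls, and say stands in for SpeakSomething.  It assumes at least one word, covers every command except "define", and adds a guard for "next" on the last word, which is my own addition.

```python
def run_practice(words, heard_commands, say):
    """Drive a practice session and return the index of the final word.

    words          -- non-empty list of spellings
    heard_commands -- iterable of recognised strings (None if nothing heard)
    say            -- callable that takes a list of phrases to speak
    """
    index = 0
    say(['Word number 1', words[0]])
    for heard in heard_commands:
        # Treat a missing result as an empty, unknown command.
        command = (heard or '').strip().lower()
        if command == 'next':
            if index + 1 < len(words):
                index += 1
                say(['Word number ' + str(index + 1), words[index]])
            else:
                say(['That was the last word.'])
        elif command == 'repeat':
            say(['Word number ' + str(index + 1), words[index]])
        elif command == 'cheat':
            # One phrase per letter, so the word is spelled out.
            say(list(words[index]))
        elif command == 'exit':
            say(['Thank you very much and goodbye!'])
            break
        else:
            say(['That was an unknown command!'])
    return index
```

Feeding it a scripted list such as ['repeat', 'next', 'cheat', 'exit'] and a list's append method as say gives a transcript of what would have been spoken.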

The following voice commands have been implemented:
  • The "next" command is achieved by maintaining an index of which word in the string array is the current word.  If "next" is heard, the index is incremented and the new word is played.
  • The "repeat" command just maintains the index and re-plays the current word.
  • The "cheat" command assigns the current word directly to the string array variable (rather than appending it).  Because SpeakSomething steps through whatever it is given one element at a time, and the elements of a Python string are its characters, the net result is that each letter of the word is spelled out individually.
  • The "define" command makes use of a Dictionary API published by Cambridge University Press.  It's free to sign up for an evaluation key and it offers a lightweight JSON API that takes a word as an argument and provides a response with several definitions, examples and links to online media files.  I simply take the first definition and example from the API response and add them to the string array.  The API provides a bunch of other methods that may be useful for future Geek Dad projects...
  • The "exit" command simply ends the script.
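For the "define" command, pulling the first definition and example out of the response boils down to finding the first <def> and <eg> elements and stripping any markup inside them.  Here's a rough sketch of that step on its own.  The helper first_tag_text is invented for this post, the sample string is made up rather than real API output, and the optional backslash before the closing slash is an assumption based on the '<\/def>' search in the listing.

```python
import re


def first_tag_text(response_text, tag):
    """Return the plain text inside the first <tag>...</tag>, or None."""
    # Accept both '</tag>' and the escaped form '<\/tag>' as the closer.
    pattern = ('<' + re.escape(tag) + r'(?:\s[^>]*)?>(.*?)<\\?/'
               + re.escape(tag) + '>')
    match = re.search(pattern, response_text, re.DOTALL)
    if match is None:
        return None
    # Drop any nested tags, then collapse runs of whitespace.
    inner = re.sub(r'<[^>]*>', '', match.group(1))
    return ' '.join(inner.split())


# Made-up sample in the assumed shape, not a real API response.
sample = ('{"entryContent": "<def>to <x>form<\\/x> a word letter by letter'
          '<\\/def><eg>How do you spell it?<\\/eg>"}')

print(first_tag_text(sample, 'def'))
print(first_tag_text(sample, 'eg'))
```

Returning None when a tag is missing lets the caller say something sensible instead of reading out half a response.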

Limitations
  • The recognizeSpeech method is not 100% reliable (not surprising really, what it actually does is pretty epic).  For example it's not great at recognising my 7-year-old daughter's voice, most likely because it's too "young".  It works better for my 10-year-old's voice; I'm guessing because it's more mature.
  • I'd like to be able to pause program execution while the system is speaking.  The API provides a ttsIsSpeaking() method.  However, if I test this using while (DroidObject.ttsIsSpeaking() == True): then it doesn't pause at all, and if I do while DroidObject.ttsIsSpeaking(): it just pauses forever, even after speech has stopped.  More investigation required....
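One likely explanation for the ttsIsSpeaking() behaviour, which I haven't verified on a device: as far as I can tell, SL4A's Python calls return a small (id, result, error) tuple rather than a bare value, which is also why the listing reads result.result and speech[1].  A tuple never equals True and a non-empty tuple is always truthy, which would match both symptoms.  The sketch below imitates that tuple with a namedtuple; Result and wait_until_silent are names used here for illustration only.

```python
import collections

# Assumed shape of what an SL4A call hands back.
Result = collections.namedtuple('Result', 'id,result,error')

not_speaking = Result(id=1, result=False, error=None)

# Comparing the whole tuple with True is always False,
# so "while x == True" never enters the loop.
equals_true = (not_speaking == True)

# A non-empty tuple is always truthy, even when result is False,
# so "while x:" never leaves the loop.
truthy = bool(not_speaking)


def wait_until_silent(is_speaking, sleep, poll_seconds=0.1, max_polls=600):
    """Poll is_speaking().result until it is False; return the poll count."""
    polls = 0
    while is_speaking().result and polls < max_polls:
        sleep(poll_seconds)
        polls += 1
    return polls
```

On the phone this would be called as wait_until_silent(droid.ttsIsSpeaking, time.sleep).  The max_polls cap is there so a misbehaving call can't hang the script forever.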

Full code listing:

#Import statements
import android
import time
import sys
import urllib2
import re

#Constants
DictionaryURLStart ='https://dictionary.cambridge.org/api/v1/dictionaries/british/entries/'
DictionaryURLEnd = '/?format=xml'

#The API Key.  Get this from http://dictionary-api.cambridge.org/
ApiKey = '<Your Key Here>'

#Actually does the speaking
def SpeakSomething(DroidObject,TheMessage):
  for i in range(len(TheMessage)):
    DroidObject.ttsSpeak(TheMessage[i])
    #The call returns an (id, result, error) tuple, so test the result field rather than the tuple itself
    while DroidObject.ttsIsSpeaking().result:
      time.sleep(0.1)
      print '.'
    time.sleep(1)

#Used to trim tags from HTML, XML or similar.  Nabbed this from a Stack Overflow page
def TrimTags(InStr):
  try:
    #Uses regular expression module
    cleanr =re.compile('<.*?>')
    cleantext = re.sub(cleanr,'', InStr)
    return cleantext
  except:
    return 'Error trimming tags from a string\r\n'


#################Main Part of the code
print 'starting'
#time.sleep(5)

#Create a Droid object
droid = android.Android()

#Get the messages - the argument means "unread only", so False fetches every message in the inbox
result = droid.smsGetMessages(False)

#Used to check if we found a spellings message
MessageFound = False

#Outer loop, look at each message
for i in result.result:
  #Get the message we'll manipulate
  CurrentMessage = (i['body']).encode('utf-8')

  #print CurrentMessage
  #Split in a delimited way  
  MessageStr = CurrentMessage.split(',')

  #See if the first word is "spellings"
  if (MessageStr[0] == "spellings"):
    #See if we have already found a spellings text
    if MessageFound:
      #Do nothing
      print 'message already found'
    else:
      MessageFound = True
      SpellingsMessage = MessageStr
      print SpellingsMessage

print 'end of spellings search loop'

#At this point we should have found a message
if MessageFound:
  #Play a welcome message
  print 'Playing the welcome message'
  MessageToSpeak = []
  MessageToSpeak.append('Hi ' + SpellingsMessage[1])
  MessageToSpeak.append('.  Welcome to your spellings test.')
  print MessageToSpeak
  SpeakSomething(droid,MessageToSpeak)   
  
  #Speak the first spelling
  SpellingNumber = 2  
  MessageToSpeak = []
  MessageToSpeak.append('Word number ' + str(SpellingNumber - 1))
  MessageToSpeak.append(SpellingsMessage[SpellingNumber])
  SpeakSomething(droid,MessageToSpeak)    

  #Now get in a loop  awaiting voice commands
  EndLooping = False  
  while (EndLooping == False):  
    speech = droid.recognizeSpeech('Command',None,None)  
    #See what the user said
    if speech[1] == 'next':
      #Increment the spelling number and speak, unless we are already on the last word
      print 'I heard next'
      MessageToSpeak = []
      if SpellingNumber + 1 < len(SpellingsMessage):
        SpellingNumber = SpellingNumber + 1
        MessageToSpeak.append('Word number ' + str(SpellingNumber - 1))
        MessageToSpeak.append(SpellingsMessage[SpellingNumber])
      else:
        MessageToSpeak.append('That was the last word.')
      SpeakSomething(droid,MessageToSpeak)
    elif speech[1] == 'cheat':
      #Pass the word itself rather than a list, so SpeakSomething reads it out letter by letter
      MessageToSpeak = SpellingsMessage[SpellingNumber]
      SpeakSomething(droid,MessageToSpeak)
    elif speech[1] == 'define':
      #Get a definition from the Cambridge dictionary API
      DictionaryURL = DictionaryURLStart + SpellingsMessage[SpellingNumber] + DictionaryURLEnd
      print DictionaryURL
      #Do the API request
      request = urllib2.Request(DictionaryURL, headers={"accessKey" : ApiKey})
      APIResponse = urllib2.urlopen(request).read()
      #Start a string array to hold our definition
      MessageToSpeak = []

      #First of all - Get the first definition in the API response
      StartPos = APIResponse.find('<def>')
      EndPos = APIResponse.find('<\/def>',StartPos)
      PartString = APIResponse[StartPos:EndPos]
      PrintString = TrimTags(PartString)
      MessageToSpeak.append('Definition. ' + PrintString)

      #Now get the first example
      StartPos = APIResponse.find('<eg>')
      EndPos = APIResponse.find('<\/eg>',StartPos)
      PartString = APIResponse[StartPos:EndPos]
      PrintString = TrimTags(PartString)
      MessageToSpeak.append('Example. ' + PrintString)
      SpeakSomething (droid,MessageToSpeak)
    elif speech[1] == 'exit':
      print 'I heard exit'
      MessageToSpeak = []
      MessageToSpeak.append('Thank you very much and goodbye!')
      SpeakSomething(droid,MessageToSpeak)    
      EndLooping = True
    elif speech[1] == 'repeat':
      #Don't increment the spelling number, just speak the current word again
      print 'I heard repeat'
      MessageToSpeak = []
      MessageToSpeak.append('Word number ' + str(SpellingNumber - 1))
      MessageToSpeak.append(SpellingsMessage[SpellingNumber])
      SpeakSomething(droid,MessageToSpeak)
    else:
      print 'Unknown command'
      MessageToSpeak = []
      MessageToSpeak.append('That was an unknown command!')
      SpeakSomething(droid,MessageToSpeak)