
Friday, 4 February 2022

Cheating at Wordle with Python, NLTK and a Raspberry Pi

At the time of writing, the game Wordle is taking the world by storm, and quite rightly: it's genius, and so is the guy who invented it.

Here's one I did earlier:


Two things to know about me:

  • I'm rubbish with word games.
  • I like to trick my children into thinking I'm cleverer than I am.

So, having struggled a bit with Wordle, I wondered if I could write some code to more efficiently (for that, read cheat) find Wordle answers.  For this I used my trusty Raspberry Pi, Python and the Natural Language ToolKit (NLTK) module.  I'd used NLTK previously for some online data science courses so I knew it could give me a "corpus" (i.e. a set) of words to play with.

First I installed NLTK for use with Python 3 on the Raspberry Pi using this command: sudo pip3 install nltk

Then I looked at which word corpora NLTK has that I could use.  There are some details here, and a few places pointed me to the "Brown" corpus as a good place to start.  To make this corpus available for NLTK in a Python script I opened a Python 3 shell and ran these commands:

Python 3.7.3 (default, Jan 22 2021, 20:04:44)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.download('brown')

So that sets up a corpus of words ready to use in a Python script.  I won't go into the detail of how Wordle is played but overall you get 6 goes to guess a 5 letter word.  With each guess, the game tells you for each letter:
  • Whether it is in the final word in the exact position you guessed it in.  I call these exact matches.
  • Whether it is in the final word but not in the exact position you guessed it in.  I call these partial matches.
  • Whether it is not in the final word.  I call these non-matches.
I then played around with snippets of code to examine words from the corpus and rule them in or out based upon whether they contained the exact matches and avoided the non-matches.  That led me overall to an algorithm of:

-Load word corpora (at the time of writing I use Brown, Webtext and Gutenberg)
-Build a dictionary of 5 letter words and their frequency of occurrence in the corpora
-Set up data structures to log exact, partial and non-matches.

-Loop at least six times doing:
1-Make a prediction (which I then enter into Wordle) based on the data structures
2-Input the result of the prediction from Wordle
3-Update some data structures

(Full code is at the end of this post)

Taking step 2 above first, I simply enter the result of each Wordle round in a coded string.  So take this result:

I enter this as S_E,U_N,G_N,A_P,R_P, where _E is for exact, _N is for non-matched and _P is for partial.
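
As a minimal sketch, pulling one of these coded strings apart looks like this (the full listing at the end does the same inside the game loop):

round_result = "S_E,U_N,G_N,A_P,R_P"
for position, result in enumerate(round_result.split(",")):
  letter = result[0]     #The guessed letter, e.g. 'S'
  outcome = result[2]    #'E' = exact, 'P' = partial, 'N' = non-match
  print(position, letter, outcome)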

Looking at step 3, I update 3 data structures based upon the input the user provides.  The data structures are:
  • A list of tuples for exact matches, where each tuple contains the letter and the exact match position.  So from the above it would be [('S', 0)].
  • A dictionary of partial matches where the key is the letter and the value is a list of positions that the letter is not in.  So from the above it would be {'A': [3], 'R': [4]}.
  • A list of non-matches.  So from the above it would be ['U', 'G'].
If it's an exact match I do two things:
1)Update the tuple of exact matches with the new matched letter/position combination.
2)Update the partial match dictionary, as this is a) a new position a partially matched letter can't be in and b) it may require removal of an entry, as what was previously partially matched is now fully matched.

If it's a partial match I do two things:
1)Add a new partial (with the letter position) or update the list of positions for an existing partial
2)If it's a new partial, don't just add the position it is in, add the positions of all the existing exact matches as the letter can't be in those positions either.

If it's a non-match I just add the letter to the list of non-matches.
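
Condensed into one function, those update rules look something like this (a sketch; the full listing at the end does the same inside the game loop):

#Sketch of the three update rules
exact_matches = []      #List of (letter, position) tuples
partial_matches = {}    #Letter -> list of positions the letter is not in
non_matches = []        #Letters not in the answer

def process_result(letter, outcome, position):
  if outcome == "E":
    if (letter, position) not in exact_matches:
      exact_matches.append((letter, position))
    if letter in partial_matches:
      partial_matches.pop(letter)        #Previously partial, now fully matched
    else:
      for partial in partial_matches:    #No partial letter can sit in a solved position
        if position not in partial_matches[partial]:
          partial_matches[partial].append(position)
  elif outcome == "P":
    if letter in partial_matches:
      if position not in partial_matches[letter]:
        partial_matches[letter].append(position)
    else:
      #New partial: not in this position, nor in any already-solved position
      partial_matches[letter] = [position] + [pos for (_, pos) in exact_matches]
  elif outcome == "N":
    if letter not in non_matches:
      non_matches.append(letter)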

So for the game above, the log output showed this:
Enter the result of that round in format A_E,B_P,C_N where _E for exact, _P for partial, _N for non matched:S_E,U_N,G_N,A_P,R_P
########Processing an exact match for letter S
########Processing a non match for letter U
########Processing a non match for letter G
########Processing a partial match for letter A
########Processing a partial match for letter R
########Exact matches [('S', 0)]
########Partial matches {'A': [0, 1, 3], 'R': [1, 2, 4]}
########Non matches ['O', 'E', 'P', 'T', 'U', 'G']

Finally, looking at step 1, for round 1 I recommend an initial word based upon the following logic (sketched in the snippet after this list):
1)Do a letter frequency count across all the 5 letter words in the corpus.
2)Find a word in the corpus that has each of the 5 most common letters.  This leads me to use "AROSE".
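
Here's a rough sketch of that logic, assuming the five_letters dictionary built in the full listing below:

#Sketch: find candidate start words containing the 5 most common letters
from collections import Counter

letter_counts = Counter()
for word in five_letters:    #five_letters maps 5 letter words to frequencies (see full listing)
  letter_counts.update(word)
top_five = [letter for letter, count in letter_counts.most_common(5)]
start_words = [word for word in five_letters
               if all(letter in word for letter in top_five)]
print(start_words)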

Then for subsequent rounds I do the following:
1)Eliminate any words from the corpus that have the letters in the non-matching list.
2)Eliminate any words from the corpus that don't have the exact matching letters in the exact matching position.
3)Eliminate any words from corpus that don't have the partial matching letters or, if they do, have the partial matching letters in the positions logged that they can't be in.

...which results in a list of words which I augment with the frequency of occurrence in the corpora.  I then print this and let the user choose the word to enter next.  (The three elimination steps are sketched below.)
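
As a rough sketch, those three steps can be written as list comprehensions over the data structures described earlier (the full listing at the end spells them out as explicit loops):

#Assumes five_letters, non_matches, exact_matches and partial_matches as described above
candidates = [word for word in five_letters
              if not any(letter in word for letter in non_matches)]
candidates = [word for word in candidates
              if all(word[pos] == letter for (letter, pos) in exact_matches)]
candidates = [word for word in candidates
              if all(letter in word and
                     all(word[pos] != letter for pos in positions)
                     for letter, positions in partial_matches.items())]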

The story so far is that I have played 4 games with this code and:
1)I have got the right answer every time, but
2)My 17 year old daughter has got the right answer in fewer guesses 3 times out of 4!

Full code listing:




#!/usr/bin/env python3
from nltk.corpus import brown
from nltk.corpus import webtext
from nltk.corpus import gutenberg
import sys

#Our main 5 letter word list
five_letters = {}    #A dictionary of 5 letter words with frequencies

#Gets a list of 5 letter words
def add_to_list(in_word_list):
  for word in in_word_list:
    if word.isalpha():
      if len(word) == 5:
        if word.upper() not in five_letters:    #Upper case everything as the same word could appear in a variety of cases
          five_letters[word.upper()] = 1
        else:
          five_letters[word.upper()] += 1

#Compares our 5 letter word corpus with the list of letter frequencies to find the best start word
#The best word has all the highest frequency letters just once
def get_start_words(letter_frequencies, start_pos):
  words_found = 0               #Counts when our set of 5 high frequency letters matches a word from our corpus  
  list_start_pos  = start_pos   #Used to track where we are in our letter frequency list
  word_list = []

  #Loop until we've found words from the word corpus or we've exhausted the letter frequency list
  while words_found == 0 and list_start_pos < 22:    #It's 22 as this will mean we've got to positions 21,22,23,24,25 in the character frequency list
    #Loop for each of the words in the corpus
    for my_word in five_letters:
      letter_count = 0     #Incremented if we find a letter in the word
      #Then for each of the five letters identified by the outer while loop
      for i in range (list_start_pos, list_start_pos + 5):
        my_letter = letter_frequencies[i][0]
        if my_letter in my_word:
          letter_count += 1 #Count if the letter is in the word
      #See if we found all our letters in the word; a word containing one of the letters twice won't reach a count of 5
      if letter_count == 5:
        word_list.append(my_word)
        words_found += 1
    list_start_pos += 1

  #Return our word list
  return word_list


print ("########Building five letter word list")
add_to_list(brown.words())
print ("Added Brown corpus and word list has {} entries".format(len(five_letters)))
add_to_list(webtext.words())
print ("Added Webtext corpus and word list has {} entries".format(len(five_letters)))
add_to_list(gutenberg.words())
print ("Added Gutenberg corpus and word list has {} entries".format(len(five_letters)))

#Calculate letter frequencies
print ("########Calculating letter frequencies")
freq_dict = {}
for my_word in five_letters:
  #Build letter frequencies
  for my_letter in my_word:
    #See if the letter is a key in the dictionary
    if my_letter in freq_dict.keys():
      freq_dict[my_letter] += 1    #Increment value
    else:
      freq_dict[my_letter] = 1     #Add value

#Sort the dictionary to get the highest probability letters and show the user
sorted_values = sorted(freq_dict.items(), key=lambda x:x[1], reverse=True)
print (sorted_values)

#Holds the round number
round_number = 1

#Holds the exact matches.  A list of tuples
exact_matches = []
#Holds the partial matches.  A dictionary of key is letter, value is list of positions letter is not in
partial_matches = {}
#Holds the non-matched letters.  A list
non_matches = []

#Loop for all the rounds
while round_number < 7 and len(exact_matches) < 5:
  print ("######################################################")
  print ("########This is round number {}".format(round_number))
  #Special case for round 1
  if round_number == 1:
    #Get a list of possible starting words based upon the letter frequencies
    print ("########Getting a list of words to start off with")

    #Imagine we get 6 wrong starting words!  Account for each
    for k in range (0,3):
      start_words = get_start_words(sorted_values, k)
      print ("Where we start at position {} of the letter frequencies the start words are: {}".format(k, start_words))
  else:
    #Step 1, rule out a bunch of words that have eliminated letters in them
    print ("########Assessing data from previous round to make a recommendation")
    print ("########First rule out words based on letters found not to exist in the answer")
    after_non_match_check = []
    for my_word in five_letters:
      has_ruled_out = False
      for letter in non_matches:
        if letter in my_word:
          has_ruled_out = True
      if not has_ruled_out:
        after_non_match_check.append(my_word)
    print ("########At the end of this we are down to {} words".format(len(after_non_match_check)))

    #Step 2, rule in a set of words based on the matched list
    print ("########Second rule in words based on exact matched letters")
    after_full_match_check = []
    for my_word in after_non_match_check:
      has_ruled_out = False
      for match_tuple in exact_matches:
        if my_word[match_tuple[1]] != match_tuple[0]:
          has_ruled_out = True
      if not has_ruled_out:
        after_full_match_check.append(my_word)
    print ("########At the end of this we are down to {} words".format(len(after_full_match_check)))
    #print (after_full_match_check)

    #Step 3, rule out a set of words where letters are in partial match positions
    print ("########Third rule in words based on partial matched letters")
    after_partial_match_check = []
    for my_word in after_full_match_check:
      has_ruled_out = False
      for my_partial in partial_matches:  #Loop through each dictionary entry
        #First simply check the partial is in the word
        if my_partial in my_word:
          #Now check for the partial positions
          for my_partial_pos in partial_matches[my_partial]:    #Loop through each item in the partial match
            if my_partial == my_word[my_partial_pos]:
              has_ruled_out = True
        else:
          has_ruled_out = True

      if not has_ruled_out:
        after_partial_match_check.append(my_word)
    print ("########At the end of this we are down to {} words.  Recommendation:".format(len(after_partial_match_check)))

    #Form an ordered list of words based on overall word frequency from the corpus that was built right at the start
    suggestion_dict = {}
    ordered_suggestion = {}
    for word_suggestion in after_partial_match_check:
      suggestion_dict[word_suggestion] = five_letters[word_suggestion]

    #Order the suggestions
    ordered_suggestion = sorted(suggestion_dict.items(), key=lambda x:x[1], reverse=True)
    #Tell the user
    if len(ordered_suggestion) < 11:
      print (ordered_suggestion)
    else:
      #Print just the first 10
      print (list(ordered_suggestion)[:10])

  #Get input from the user as to what happened in that round
  round_result = input("Enter the result of that round in format A_E,B_P,C_N where _E for exact, _P for partial, _N for non matched:")
  #Pull round result apart and process each
  result_list = round_result.split(",")
  #Loop for each result
  letter_pos = 0        #Holds which letter result position we're looking at
  for result in result_list:
    #Get the entered letter and the result
    entered_letter = result[0]
    letter_result = result[2]
    if letter_result == "E":
      print("########Processing an exact match for letter {}".format(entered_letter))
      #See if we already have this exact match.  If not, add it
      letter_found = False
      for my_tuple in exact_matches:
        if my_tuple[0] == entered_letter and my_tuple[1] == letter_pos:    #So we could have double letters so this checks for existence of the letter in the given position
          letter_found = True
      if not letter_found:
        exact_matches.append((entered_letter,letter_pos))
      #Update existing partial matches as well, i.e. 1)They can't be in the position of the found letter.  Also, if there is already a partial for what is now exact, remove it
      if entered_letter in partial_matches:
        partial_matches.pop(entered_letter)
      else:
        for partial in partial_matches:
          if letter_pos not in partial_matches[partial]:
            partial_matches[partial].append(letter_pos)

    elif letter_result == "P":
      print("########Processing a partial match for letter {}".format(entered_letter))
      if entered_letter in partial_matches:
        #Look at the partial record, seeing if the current position is in the position list.  If not, add it
        if letter_pos not in partial_matches[entered_letter]:
          partial_matches[entered_letter].append(letter_pos)
      else:
        #Entered letter not in partial matches dictionary.  Add it together with the position
        partial_list = []
        partial_list.append(letter_pos)
        #But as this is a new partial we also need to add all existing exact matches which the letter can also not be in that position
        for exact in exact_matches:
          partial_list.append(exact[1])
        #Final update of the partial list
        partial_matches[entered_letter] = partial_list

    elif letter_result == "N":
      print("########Processing a non match for letter {}".format(entered_letter))
      #See if we already have this non-match.  If not, add it
      if entered_letter not in non_matches:
        non_matches.append(entered_letter)
    #Update so we get the next letter position
    letter_pos +=1

  #Show what the structures are at the end of this round
  print ("########Exact matches {}".format(exact_matches))
  print ("########Partial matches {}".format(partial_matches))
  print ("########Non matches {}".format(non_matches))

  #End of turn Update for next loop
  round_number += 1

#See how we came out of the main loop
if len(exact_matches) == 5:
  print ("########You won!  Way to go/cheat")
else:
  print ("########All rounds completed, you lost!")

Wednesday, 19 April 2017

Half Marathon Comparison Using Azure, Google Maps, Python, MongoDB and Javascript

This time last year I blogged about a half-marathon I had run where I paced it badly and slowed down massively at the end.  I did the same race this year and ran a faster time but more importantly paced it more consistently and so enjoyed the experience more.

The run was over the same course and the weather was similar so this provides a good opportunity to compare and contrast both years.  At a superficial level, as part of the results that are provided you get to see your split after every 5K.  Hence it was possible to compare the splits of last year with this year:

So simply put:
  • After 5K, in 2017 I was 38 seconds behind where I was in 2016.
  • After 10K I was 44 seconds behind and after 15K I was a massive 80 seconds behind.
  • However after 20K in 2017 I had turned this around and was 14 seconds ahead of 2016.
  • Then I ran the final 1.1K 27 seconds faster in 2017 than in 2016 to finish 41 seconds up.
Note that none of this was down to significantly better fitness; I just paced the run more sensibly in 2017.  (Put differently, I was a lot more stupid in 2016!)

As a Geek I wanted to go further in this analysis so I thought it would be fun to visually compare 2016 versus 2017 on a map.  i.e. See my 2016 self zoom past my 2017 self then see my 2017 self catch up and pass 2016.  Having tinkered with AWS and Bluemix it was time to drive a different cloud computing offering so I decided to take up Microsoft's kind offer of £150 of credit.  

Here's the result.  The "6" marker is 2016, the "7" marker is 2017.



So you can see:
  • Me starting further up the road in 2017.
  • The 2016 me catching up and passing the 2017 me around the University.
  • 2016 me staying ahead for a long period of time.
  • 2017 me catching up and quickly passing 2016 me on the final straight stretch to the finish.

So a fascinating profile!

Here's a diagram of what I put together.  Full description and code then follows.



The above diagram shows the following key steps:
  • Garmin Sports watch syncs with Garmin Connect (standard activity)
  • GPX files downloaded from Garmin Connect and uploaded to Azure Virtual Machine (covered in Step 1 below).
  • Python script to parse GPX files and load them in a MongoDB instance (Step 2)
  • Apache webserver and Python cgi-bin to extract data from the MongoDB instance and offer a simple API (step 3)
  • HTML, CSS and Javascript to access API and present animated map markers using the Google Maps Javascript API (step 4)
Step 1 - Getting an Azure Linux Virtual Machine
Microsoft Azure is very easy and intuitive to use.  I already had a Microsoft account for Outlook.com so just used this to go through the Azure free trial sign up process.  This gave me £150 worth of free credit on the Azure platform.

After quickly reviewing tutorials I requested a Linux Virtual Machine using the steps New - Compute - Ubuntu Server 16.04TS and then providing some basic configuration details.  Within roughly a minute the server had been set up and I could get details as to how to SSH onto the VM using PuTTY.  The size of the platform was Standard DS1 v2 (1 core, 3.5 GB memory), which was suitable for my needs.

A tile on the Azure dashboard gave me access to all manner of information and configuration options for the VM.  Example below:



Take a step back now - for an olde skool technology guy such as myself, cloud computing capabilities are still super impressive.  No massive forms to fill out, no tetchy administrators to haggle with, no IP networking to organise - just BOOM! and you've got a machine to play with.

The final part of this step was to use an FTP client (WinSCP) to upload the Garmin GPX files to the VM.

Step 2 - MongoDB, GPX File Parsing and Database Loading
The plan was to use the GPX files recorded by my Garmin sports watch in 2016 and 2017 to allow map markers to be animated.  So what's a GPX file?  Here's a definition:

GPX, or GPS Exchange Format, is an XML schema designed as a common GPS data format for software applications. It can be used to describe waypoints, tracks, and routes. The format is open and can be used without the need to pay license fees.

Here's the top section of one of my half-marathon GPX files:

<?xml version="1.0" encoding="UTF-8"?>
<gpx creator="Garmin Connect" version="1.1"
  xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/11.xsd"
  xmlns="http://www.topografix.com/GPX/1/1"
  xmlns:ns3="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
  xmlns:ns2="http://www.garmin.com/xmlschemas/GpxExtensions/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <metadata>
    <link href="connect.garmin.com">
      <text>Garmin Connect</text>
    </link>
    <time>2016-04-03T09:19:19.000Z</time>
  </metadata>
  <trk>
    <name>Whitley Ward Running</name>
    <type>running</type>
    <trkseg>
      <trkpt lat="51.42623794265091419219970703125" lon="-0.992680527269840240478515625">
        <ele>45.40000152587890625</ele>
        <time>2016-04-03T09:19:19.000Z</time>
        <extensions>
          <ns3:TrackPointExtension>
            <ns3:hr>82</ns3:hr>
          </ns3:TrackPointExtension>
        </extensions>
      </trkpt>
      <trkpt lat="51.42622503452003002166748046875" lon="-0.9927202574908733367919921875">
        <ele>45.40000152587890625</ele>
        <time>2016-04-03T09:19:20.000Z</time>
        <extensions>
          <ns3:TrackPointExtension>
            <ns3:hr>82</ns3:hr>
          </ns3:TrackPointExtension>
        </extensions>
      </trkpt>

So some metadata and then a <trk> section with a <trkseg> subsection which is a container for a bunch of <trkpt> elements.  Within each of these you can see:

  • The position logged (latitude and longitude)
  • Elevation
  • Date and time
  • Heart rate 

I wanted to store all the data in a database and I chose to use MongoDB on Azure as I enjoyed using it for a Raspberry Pi project last year (and so also had cracked using Python to write to and read from the database).

Getting the database was super easy.  Within the Azure console I did New - Databases - "Database as a Service for MongoDB", entered a few details and a minute or two later had a MongoDB instance.

Remembering that the structure of a document database is different to a relational database as follows:

Relational Database Term    Document Database Term
Database                    Database
Table                       Collection
Row                         Document

...I set up a database called "geekmongo" and created a collection within it called "Test".  Hence the task at hand was to parse the GPX files, create JSON documents and write them to the Test collection in the geekmongo database.

#Import statements
import xml.etree.ElementTree as ET
from datetime import datetime
from pymongo import MongoClient

#File to process
FileOne = '/home/map/rhm/activity_1109977624_2016.gpx'

#Database related - Got all these from "Connection String" area on Azure for the database instance
#Created the collection test myself manually on Azure
dBAddress = 'Your Connection String'

#Start parsing that XML
tree = ET.parse(FileOne)
root = tree.getroot()

#Set up for the mongo instance, the database then the collection
#Connect to the database
client = MongoClient(dBAddress)
db = client.geekmongo
collection = db.Test

#Get the first timestamp as we'll reference all subsequent ones to this in order to be able to calculate elapsed time
FirstTimeStamp = root[1][2][0][1].text
#Turn it into a time object we can use
TStart = datetime.strptime(FirstTimeStamp[:-5],"%Y-%m-%dT%H:%M:%S")

#Loop through the XML file picking out the fields and writing a document to the database for each <trkpt>
#Example is {"elapsed": 0, "lat":51.4566827,"lng":-0.9690389}
for myTrkpt in root[1][2]:
  #Calculate the elapsed time
  TimeNow = myTrkpt[1].text
  TNow = datetime.strptime(TimeNow[:-5],"%Y-%m-%dT%H:%M:%S")
  TimeElapsed = abs(TNow - TStart).seconds

  #Build a Python dictionary that we'll write to the MongoDB
  MongoDoc = {}
  MongoDoc["elapsed"] = TimeElapsed
  MongoDoc["lat"] =  myTrkpt.attrib.get('lat')
  MongoDoc["lng"] =  myTrkpt.attrib.get('lon')
  MongoDoc["elevation"] =  myTrkpt[0].text
  MongoDoc["timestamp"] =  TimeNow[:-5]
  MongoDoc["heart"] = myTrkpt[2][0][0].text
  MongoDoc["cadence"] = 0

  #Write the document to the Test collection
  collection.insert_one(MongoDoc)

print("Done")


So here I use the Python XML and pymongo modules to parse the GPX files and write to the database respectively.

With pymongo you create an object and then connect to the database using a "Connection String".  You get this from the Azure management console for the MongoDB instance under the "Connection String" settings area.  This string contains the database address and the credentials required to access it.  You then can create a collection object which you write documents to using the insert_one() method.

Using the XML module you parse the file and get a "root" object, then use indices to access the different parts of the GPX structure.  So for example root[0] will be the first part of the GPX file (here, the <metadata> element).
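
For instance, with the GPX file shown earlier (a sketch; the exact indices depend on the file's layout):

#Sketch: index-based access into the GPX tree shown earlier
import xml.etree.ElementTree as ET

tree = ET.parse('/home/map/rhm/activity_1109977624_2016.gpx')
root = tree.getroot()
#root[0] is <metadata> and root[1] is <trk>
#root[1][2] is <trkseg>, whose children are the <trkpt> elements
first_trkpt = root[1][2][0]
print(first_trkpt.attrib.get('lat') + " " + first_trkpt.attrib.get('lon'))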

The code then loops through the <trkpt> elements of the GPX file, picks out all the relevant data and then creates a Python dictionary which will be written to the database.  I also calculate an "elapsed" field which is the difference in seconds between the first <trkpt> elements and the element in question.  I foresee this being useful later...
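
The elapsed value itself is just datetime arithmetic; for example, using the first two timestamps from the GPX extract above:

from datetime import datetime

#The first two <time> values shown earlier, with the trailing ".000Z" stripped
t_start = datetime.strptime("2016-04-03T09:19:19", "%Y-%m-%dT%H:%M:%S")
t_now = datetime.strptime("2016-04-03T09:19:20", "%Y-%m-%dT%H:%M:%S")
print(abs(t_now - t_start).seconds)    #1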

It was interesting to look at the Azure console as I ran the scripts to parse the GPX files and write documents to the database.  Here's what was shown:

Here you can see peaks of "insert" requests as the data was being inserted.  The two peaks represent the two separate files being parsed and loaded.

Step 3 - A Web Server and an API
I wanted to create an API such that Javascript running within a browser could make an AJAX request to extract the data.  At some point I'll explore an Azure Web App for this sort of thing but for now I decided to use an Apache web server running on the Azure Linux VM and a cgi-bin Python script to provide the functionality of the API.  I simply ran the sudo apt-get install apache2 command to install Apache and used this guide to get cgi-bin working for Python scripts.

To get the web server to work I had to do some configuration within the Azure console.  Specifically I had to configure a rule to enable HTTP traffic (port 80) on the platform.  To do this I selected the VM from the console, selected "Network Interfaces" then selected the Network Security Group.  I then configured the "HTTPRule" shown below:


To create the API I wrote the following Python script:



#!/usr/bin/env python

#Import statements
from pymongo import MongoClient
import cgi
import cgitb

#Enable error logging
cgitb.enable()

#Database related - Got all these from "Connection String" area on Azure for the database instance
dBAddress = 'Your Connection String' 
CollectionID = 'Test'

#Get the query string parameters provided.  'name' field is the mongo name, 'value' field is the value
arguments = cgi.FieldStorage()
MongoName = arguments['name'].value
MongoValue = arguments['value'].value

#Form the document to use for the database access.  We will do a Regex because we may be searching on a partial date
MongoRegex = {}
MongoRegex['$regex'] = MongoValue
MongoDoc = {}
MongoDoc[MongoName] = MongoRegex

#Connect to the database, get a database object and get a collection
client = MongoClient(dBAddress)
db = client.geekmongo
collection = db.Test

print ('content-type: application/json\n\n')

#Get the total number of documents returned and set up a counter variable
TotalDocCount = collection.count(MongoDoc)
DocCounter = 0

#Start the output string
OutString = '{"markers":['

#Do a database find based upon the parameters provided.  Use this to form the output.  Need elapsed (integer), lat (long 4dp), lng (long 4dp)
#elevation (1dp), timestamp (string) and heart rate (integer)
for rhmDoc in collection.find(MongoDoc):
  OutString = OutString + '{"elapsed":' + str(rhmDoc["elapsed"]) + ','
  OutString = OutString + '"lat":' + str(round(float(rhmDoc["lat"]),4)) + ',' 
  OutString = OutString + '"lng":' + str(round(float(rhmDoc["lng"]),4)) + ','
  OutString = OutString + '"elevation":' + str(round(float(rhmDoc["elevation"]),1)) + ','
  OutString = OutString + '"timestamp":' + chr(34) + rhmDoc["timestamp"] + '",'
  OutString = OutString + '"heart":' + str(rhmDoc["heart"]) + '}'

  #See how many documents we've dealt with and whether we need to add a , to the end of the document
  DocCounter += 1
  #If we're on the last document then we add the ] to close the JSON array
  if (DocCounter == TotalDocCount):
    OutString = OutString + '],'
  else:
    OutString = OutString + ','

#Add the total items part
OutString = OutString + '"TotalItems":' + str(TotalDocCount) + '}'

#Stream the output to the client
print OutString

Here I use the CGI module to read query string parameters provided by the client.  So for example the URL:

http://<server URL>/GetGPXData.py?name=heart&value=82


...will result in a database query being made for all documents that contain the heart rate value of 82.

The script then takes the response of the database query and forms a string JSON document with all the values to pass back to the client.


The "regex" component of the MongoDB query document means you can do a "contains" search on the database.  i.e. "contains 2016" to return all the values for 2016.

Step 4 - Web Page, Javascript and Google Maps API
So the final part of the project was to write a web page that could use Javascript to a) download data from the API I just created and b) plot it on a map using the Google Maps API.

The HTML, CSS and Javascript is below.  Highlights:
  • Uses the Google Maps Javascript API to bring up a map, place and move markers.  I started with this tutorial.
  • Uses the API previously described to acquire the position data to plot (function startRace).
  • Uses a Javascript interval to "fire" and cause an assessment of position and the map to be updated.
  • Calculates the straight line distance between the markers.
<!DOCTYPE html>
<html>
<head>
<style>
#map {
height: 500px;
width: 100%;
}
</style>

</head>
<body>
<h3>Reading Half Marathon Analysis</h3>
<div id="map"></div>
<input type='button' id='btnLoad' value='Load One' onclick='loadLocsOne();'>
<input type='button' id='btnLoad' value='Load Two' onclick='loadLocsTwo();'>
<p id="Distance"></p>
<input type='button' id='btnLoad' value='Race' onclick='startRace();'>
<input type='button' id='btnLoad' value='Stop' onclick='stopRace();'>
<input type='button' id='btnLoad' value='Heart Chart' onclick='heartChart();'>
<script type="text/javascript">
//This is V10 that adds getting the data from an 'API'
//Some nasty global variables. Discovered needed to use setInterval to control a marker and these were needed for that.
var map //Enables us to reference the map in all parts of the code
var markerOne //A marker entity
var markerTwo //A marker entity
var timeElapsed //A variable to hold how many seconds have elapsed
var maxElapsed //Defines the maximum elapsed time we'll have across the two JSON structures
var locsOne = {}; //A position array
var locsTwo = {}; //A position array
var intervalVar //Use for the setInterval thingy
var raceStarted //Boolean that defines whether the race has started
//Initialise the map and put a marker on it
function initMap() {
var readingOne = {lat: 51.4366827, lng: -0.9680389};
var readingTwo = {lat: 51.4466827, lng: -0.9780389};
map = new google.maps.Map(document.getElementById('map'), {
zoom: 13,
center: readingOne
});
markerOne = new google.maps.Marker({
position: readingOne,
map: map,
label: "6"
});
markerTwo = new google.maps.Marker({
position: readingTwo,
map: map,
label: "7"
//color: 0xFFFFFF
});
//This is a load or reload so state that the race is not started
raceStarted = false;
}
//Just move a marker
function positionMarker(inMarker, inLat, inLng)
{
//Set the position of the marker
var newPos = {lat: inLat, lng: inLng};
//Set the position of the marker
inMarker.setPosition(newPos);
}
//Initialises matters when user presses "Race"
function startRace()
{
//What we do first depends on whether the race is started!
if (raceStarted == false)
{
//Initialise the position number
timeElapsed = 0;
//We need to find out the max elapsed time across the two structures. In this way we'll increment the elapsed time every time the interval handler
//fires. If we find a position we update the marker. If not we leave the marker where it is. When we've exhausted all possible elapsed times then
//we know to stop the handler
var maxElapsedOne = locsOne.markers[locsOne.TotalItems - 1].elapsed;
var maxElapsedTwo = locsTwo.markers[locsTwo.TotalItems - 1].elapsed;
//Set up the max elapsed value
if (maxElapsedOne > maxElapsedTwo)
{
maxElapsed = maxElapsedOne;
}
else
{
maxElapsed = maxElapsedTwo;
}
raceStarted = true;
}
//Set up to move the marker
intervalVar = setInterval(function(){ assessMarkerMove()}, 10);
}
//Stops the race
function stopRace()
{
clearInterval(intervalVar);
}
//Handles assessing whether to move the markers and if required doing so
//locs.markers[posNumber].lat,locs.markers[posNumber].lng
function assessMarkerMove()
{
//Variables
var i
//See if we can find a marker associated with the current elapsed time
for (i in locsOne.markers)
{
if (locsOne.markers[i].elapsed == timeElapsed)
{
positionMarker(markerOne, locsOne.markers[i].lat, locsOne.markers[i].lng);
}
}
for (i in locsTwo.markers)
{
if (locsTwo.markers[i].elapsed == timeElapsed)
{
positionMarker(markerTwo, locsTwo.markers[i].lat, locsTwo.markers[i].lng)
}
}
//Calculate the straight line distance between the markers. Only do this every 10 iterations else it looks messy
if (Number.isInteger(timeElapsed / 10) == true){
distanceBetween = Math.round(google.maps.geometry.spherical.computeDistanceBetween(markerOne.position, markerTwo.position));
document.getElementById("Distance").innerHTML = distanceBetween + ' metres between!';}
//Increment the counter of how many times this has been called
timeElapsed++;
//See whether we've reached the end of the array
if (timeElapsed > maxElapsed)
{
clearInterval(intervalVar);
raceStarted = false;
}
}
//Called when the load data button is pressed
function loadData()
{
loadLocsOne();
loadLocsTwo();
document.getElementById("Distance").innerHTML = 'Data Loaded!!';
}
//Load the first array
function loadLocsOne() {
var xhttp = new XMLHttpRequest();
xhttp.onreadystatechange = function() {
if (this.readyState == 4 && this.status == 200) {
//alert(this.responseText);
locsOne = JSON.parse(this.responseText);
document.getElementById("Distance").innerHTML = 'locsOne Loaded';
}
};
xhttp.open("GET", "http://a.b.c.d/cgi-bin/GetGPXData.py?name=timestamp&value=2016", true);

}
//Load the second array
function loadLocsTwo() {
var xhttp = new XMLHttpRequest();
xhttp.onreadystatechange = function() {
if (this.readyState == 4 && this.status == 200) {
locsTwo = JSON.parse(this.responseText);
document.getElementById("Distance").innerHTML = 'locsTwo Loaded';}
};
xhttp.open("GET", "http://a.b.c.d/cgi-bin/GetGPXData.py?name=timestamp&value=2017", true);
xhttp.send();
}
</script>
<script async defer
src="https://maps.googleapis.com/maps/api/js?key=<Your Key Here>&callback=initMap&libraries=geometry">
</script>
</body>
</html>


Wednesday, 29 March 2017

Giving Alexa a new Sense - Vision! (Using Microsoft Cognitive APIs)

Amazon Alexa is just awesome so I thought it would be fun to let her "see".  Here's a video of this in action:



...and here's a diagram of the "architecture" for this solution:



The clever bit is the Microsoft Cognitive API so let's look at that first!  You can get a description of the APIs here and sign up for a free account.  To give Alexa "vision" I decided to use the Computer Vision API, which takes an image URL or an uploaded image, analyses it and provides a description.

Using the Microsoft Cognitive API developer console I used the API to analyse the image of a famous person shown below and requested a "Description":



...and within the response JSON I got:

"captions": [ { "text": "Elizabeth II wearing a hat and glasses", "confidence": 0.28962254803103227 } ]

...now that's quite some "hat" she's wearing there (!) but it's a pretty good description.

OK - So here's a step-by-step guide as to how I put it all together.

Step 1 - An Apache Webserver with Python Scripts
I needed AWS Lambda to be able to trigger a picture to be taken and a Cognitive API call to be made so I decided to run this all from a Raspberry Pi + camera in my home.

I already have an Apache webserver running on my Raspberry Pi 2 and there are plenty of descriptions on the internet of how to do it (like this one).

I like a bit of Python so I decided to use Python scripts to carry out the various tasks.  Enabling Python for cgi-bin is very easy; here's an example of how to do it.

So to test it I created the following script:

#!/usr/bin/env python
print "Content-type: text/html\n\n"
print "<h1>Hello World</h1>"

...and saved it as /usr/lib/cgi-bin/hello.py.  I then tested it by browsing to http://192.168.1.3/cgi-bin/hello.py (where 192.168.1.3 is the IP address on my home LAN that my Pi is sitting on).  I saw this:



Step 2 - cgi-bin Python Script to Take a Picture
The first script I needed was one to trigger my Pi to take a picture with the Raspberry Pi camera.  (More here on setting up and using the camera).

After some trial and error I ended up with this script:

#!/usr/bin/env python
from subprocess import call
import cgi

def picture(PicName):
  call("/usr/bin/raspistill -o /var/www/html/" + PicName + " -t 1s -w 720 -h 480", shell=True)

#Get arguments
ArgStr = ""
arguments = cgi.FieldStorage()
for i in arguments.keys():
 ArgStr = ArgStr + arguments[i].value

#Call a function to get a picture
picture(ArgStr)

print "Content-type: application/json\n\n"

print "{'Response':'OK','Arguments':" + "'" + ArgStr + "'}"

So what does this do?  The ArgStr and for i in arguments.keys() etc. code section makes the Python script analyse the URL entered by the user and extract any query strings.  The query string can be used to specify the file name of the photo that is taken.  So for example this URL:

http://192.168.1.3/cgi-bin/take_picture_v1.py?name=hello.jpg

...will mean a picture is taken and saved as hello.jpg.

The "def Picture" function then uses the "call" module to run a command line command to take a picture with the Raspberry pi camera and save it in the root directory for the Apache 2 webserver.

Finally the script responds with a simple JSON string that can be rendered in a browser or used by AWS Lambda.  The response looks like this in a browser:


Step 3 - Microsoft Cognitive API for Image Analysis
So now we've got a picture, we need to analyse it.  For this task I leaned heavily on the code published here, so all plaudits and credit to chsienki and none to me.  I used most of the code but removed the lines that overlaid the results on top of the image and showed it on screen.

#!/usr/bin/env python
import time
from subprocess import call
import requests
import cgi

# Variables
#_url = 'https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze'
_url = 'https://westus.api.cognitive.microsoft.com/vision/v1.0/describe'

_key = "your key"   #Here you have to paste your primary key
_maxNumRetries = 10

#Does the actual results request
def processRequest( json, data, headers, params ):

    """
    Helper function to process the request to Project Oxford

    Parameters:
    json: Used when processing images from its URL. See API Documentation
    data: Used when processing image read from disk. See API Documentation
    headers: Used to pass the key information and the data type request
    params: Query string parameters sent with the request
    """

    retries = 0
    result = None

    while True:
        print("This is the URL: " + _url)
        response = requests.request( 'post', _url, json = json, data = data, headers = headers, params = params )

        if response.status_code == 429:

            print( "Message: %s" % ( response.json()['error']['message'] ) )

            if retries <= _maxNumRetries:
                time.sleep(1)
                retries += 1
                continue
            else:
                print( 'Error: failed after retrying!' )
                break

        elif response.status_code == 200 or response.status_code == 201:

            if 'content-length' in response.headers and int(response.headers['content-length']) == 0:
                result = None
            elif 'content-type' in response.headers and isinstance(response.headers['content-type'], str):
                if 'application/json' in response.headers['content-type'].lower():
                    result = response.json() if response.content else None
                elif 'image' in response.headers['content-type'].lower():
                    result = response.content
        else:
            print( "Error code: %d" % ( response.status_code ) )
            #print( "Message: %s" % ( response.json()['error']['message'] ) )
            print (str(response))

        break

    return result

#Get arguments from the query string sent
ArgStr = ""
arguments = cgi.FieldStorage()
for i in arguments.keys():
 ArgStr = ArgStr + arguments[i].value

# Load raw image file into memory
pathToFileInDisk = r'/var/www/html/' + ArgStr

with open( pathToFileInDisk, 'rb' ) as f:
    data = f.read()

# Computer Vision parameters
params = { 'visualFeatures' : 'Color,Categories'}

headers = dict()
headers['Ocp-Apim-Subscription-Key'] = _key
headers['Content-Type'] = 'application/octet-stream'

json = None

result = processRequest( json, data, headers, params )

#Turn to a string
JSONStr = str(result)

#Change single to double quotes
JSONStr = JSONStr.replace(chr(39),chr(34))

#Get rid of preceding u in string
JSONStr = JSONStr.replace("u"+chr(34),chr(34))


if result is not None:
  print "content-type: application/json\n\n"

  print JSONStr

So here I take arguments as before to know which file to process, "read" the file and then use the API to get a description of it.  I had to play a bit with the response to get it into a format that could be parsed by the Python json module.  This is where I turn single quotes to double quotes and get rid of preceding "u" characters.  There's maybe a more Pythonic way to do this, please let me know if you know a way....
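
For what it's worth, one tidier option might be to let the json module do the serialising instead of the string replacements (a sketch, replacing the last few lines above):

#Possibly more Pythonic: serialise the parsed response directly
import json

if result is not None:
  print "content-type: application/json\n\n"
  print json.dumps(result)    #Handles quoting and unicode for us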

When you call the script via a browser you get:


Looking at the JSON structure in more detail you can see a "description" element which is how the Microsoft Cognitive API has described the image.

Step 4 - Alexa Skills Kit Configuration and Lambda Development
The next step is to configure the Alexa Skills kit and write the associated AWS Lambda function.  I've covered how to do this previously (like here) so won't cover all that again here.

The invocation name is "yourself"; hence you can say "Alexa, ask yourself...".

There is only one utterance which is:
AlexaSeeIntent what can you see

...so what you actually say to Alexa is "Alexa, ask yourself what can you see".  

This then maps to the intent structure below:

{
  "intents": [
    {
      "intent": "AlexaSeeIntent"
    },
    {
      "intent": "AMAZON.HelpIntent"
    },
    {
      "intent": "AMAZON.StopIntent"
    },
    {
      "intent": "AMAZON.CancelIntent"
    }
  ]
}

Here we have a boilerplate intent structure with the addition of AlexaSeeIntent, which is what will be passed to the AWS Lambda function.

I won't list the whole AWS Lambda function here, but here are the relevant bits:

#Some constants
TakePictureURL = "http://<URL or IP Address>/cgi-bin/take_picture_v1.py?name=hello.jpg"
DescribePictureURL = "http://<URL or IP Address>/cgi-bin/picture3.py?name=hello.jpg"

Then the main Lambda function to handle the AlexaSeeIntent:

def handle_see(intent, session):
  session_attributes = {}
  reprompt_text = None
  
  #Call the Python script that takes the picture
  APIResponse = urllib2.urlopen(TakePictureURL).read()
  
  #Call the Python script that analyses the picture.  Strip newlines
  APIResponse = urllib2.urlopen(DescribePictureURL).read().strip()
  
  #Turn the response into a JSON object we can parse
  JSONData = json.loads(APIResponse)
  
  PicDescription = str(JSONData["description"]["captions"][0]["text"])
  
  speech_output = PicDescription
  should_end_session = True
  
  # Setting reprompt_text to None signifies that we do not want to reprompt
  # the user. If the user does not respond or says something that is not
  # understood, the session will end.
  return build_response(session_attributes, build_speechlet_response(
        intent['name'], speech_output, reprompt_text, should_end_session))

So super, super simple.  Call the API to take the picture, call the API to analyse it, pick out the description and read it out.

Here's the image that was analysed for the Teddy Bear video:



Here's another example:


The image being:


...and another:


...based upon this image:
  

Now to think about what other senses I can give to Alexa...