Wednesday 19 April 2017

Half Marathon Comparison Using Azure, Google Maps, Python, MongoDB and Javascript

This time last year I blogged about a half-marathon I had run where I paced it badly and slowed down massively at the end.  I did the same race this year and ran a faster time but more importantly paced it more consistently and so enjoyed the experience more.

The run was over the same course and the weather was similar so this provides a good opportunity to compare and contrast both years.  At a superficial level, as part of the results that are provided you get to see your split after every 5K.  Hence it was possible to compare the splits of last year with this year:

So simply put:
  • After 5K, in 2017 I was 38 seconds behind where I was in 2016.
  • After 10K I was 44 seconds behind and after 15K I was a massive 80 seconds behind.
  • However after 20K in 2017 I had turned this around and was 14 seconds ahead of 2016.
  • Then I ran the final 1.1K 27 seconds faster in 2017 than in 2016 to finish 41 seconds up.
Note that none of this was down to significantly better fitness; I just paced the run more sensibly in 2017.  (Put differently, I was a lot more stupid in 2016!)
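The cumulative gaps in the list above can also be turned into per-segment changes, which shows where the time was actually won and lost.  Here's a minimal sketch using only the figures quoted above (the variable names are just for illustration, and positive means 2017 was behind 2016):

```python
# Cumulative gap of 2017 relative to 2016 at each checkpoint, in seconds.
# Positive = 2017 behind 2016, negative = 2017 ahead.
checkpoints = ["5K", "10K", "15K", "20K", "Finish"]
cumulative_gap = [38, 44, 80, -14, -41]

# Change over each segment = this checkpoint's gap minus the previous one
segment_change = [
    gap - previous
    for previous, gap in zip([0] + cumulative_gap[:-1], cumulative_gap)
]

for name, change in zip(checkpoints, segment_change):
    verb = "lost" if change > 0 else "gained"
    print("Up to %s: 2017 %s %d seconds" % (name, verb, abs(change)))
```

Read this way, the stretch between 15K and 20K stands out: that is where the 80 second deficit became a 14 second lead.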

As a Geek I wanted to go further in this analysis so I thought it would be fun to visually compare 2016 versus 2017 on a map.  i.e. See my 2016 self zoom past my 2017 self then see my 2017 self catch up and pass 2016.  Having tinkered with AWS and Bluemix it was time to test-drive a different cloud computing offering so I decided to take up Microsoft's kind offer of £150 of credit.  

Here's the result.  The "6" marker is 2016, the "7" marker is 2017.



So you can see:
  • Me starting further up the road in 2017.
  • The 2016 me catching up and passing the 2017 me around the University.
  • 2016 me staying ahead for a long period of time.
  • 2017 me catching up and quickly passing 2016 me on the final straight stretch to the finish.

So a fascinating profile!

Here's a diagram of what I put together.  Full description and code then follows.



The above diagram shows the following key steps:
  • Garmin Sports watch syncs with Garmin Connect (standard activity)
  • GPX files downloaded from Garmin Connect and uploaded to Azure Virtual Machine (covered in Step 1 below).
  • Python script to parse GPX files and load them in a MongoDB instance (Step 2)
  • Apache webserver and Python cgi-bin to extract data from the MongoDB instance and offer a simple API (step 3)
  • HTML, CSS and Javascript to access API and present animated map markers using the Google Maps Javascript API (step 4)
Step 1 - Getting an Azure Linux Virtual Machine
Microsoft Azure is very easy and intuitive to use.  I already had a Microsoft account for Outlook.com so just used this to go through the Azure free trial sign up process.  This gave me £150 worth of free credit on the Azure platform.

After quickly reviewing tutorials I requested a Linux Virtual Machine using the steps New - Compute - Ubuntu Server 16.04 LTS and then providing some basic configuration details.  Within roughly a minute the server had been set up and I could get details as to how to SSH onto the VM using PuTTY.  The size of the platform was Standard DS1 v2 (1 core, 3.5 GB memory) which was suitable for my needs.

A tile on the Azure dashboard gave me access to all manner of information and configuration options for the VM.  Example below:



Take a step back now - for an olde skool Technology guy such as myself I am still super impressed by cloud computing capabilities.  No massive forms to fill out, no tetchy administrators to haggle with, no IP networking to organise - just BOOM! and you've got a machine to play with.

The final part of this step was to use an FTP client (WinSCP) to upload the Garmin GPX files to the VM.

Step 2 - MongoDB, GPX File Parsing and Database Loading
The plan was to use the GPX files recorded by my Garmin sports watch in 2016 and 2017 to allow map markers to be animated.  So what's a GPX file?  Here's a definition:

GPX, or GPS Exchange Format, is an XML schema designed as a common GPS data format for software applications. It can be used to describe waypoints, tracks, and routes. The format is open and can be used without the need to pay license fees.

Here's the top section of one of my half-marathon GPX files:

<?xml version="1.0" encoding="UTF-8"?>
<gpx creator="Garmin Connect" version="1.1"
  xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/11.xsd"
  xmlns="http://www.topografix.com/GPX/1/1"
  xmlns:ns3="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
  xmlns:ns2="http://www.garmin.com/xmlschemas/GpxExtensions/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <metadata>
    <link href="connect.garmin.com">
      <text>Garmin Connect</text>
    </link>
    <time>2016-04-03T09:19:19.000Z</time>
  </metadata>
  <trk>
    <name>Whitley Ward Running</name>
    <type>running</type>
    <trkseg>
      <trkpt lat="51.42623794265091419219970703125" lon="-0.992680527269840240478515625">
        <ele>45.40000152587890625</ele>
        <time>2016-04-03T09:19:19.000Z</time>
        <extensions>
          <ns3:TrackPointExtension>
            <ns3:hr>82</ns3:hr>
          </ns3:TrackPointExtension>
        </extensions>
      </trkpt>
      <trkpt lat="51.42622503452003002166748046875" lon="-0.9927202574908733367919921875">
        <ele>45.40000152587890625</ele>
        <time>2016-04-03T09:19:20.000Z</time>
        <extensions>
          <ns3:TrackPointExtension>
            <ns3:hr>82</ns3:hr>
          </ns3:TrackPointExtension>
        </extensions>
      </trkpt>

So some metadata and then a <trk> section with a <trkseg> subsection which is a container for a bunch of <trkpt> elements.  Within each of these you can see:

  • The position logged (latitude and longitude)
  • Elevation
  • Date and time
  • Heart rate 
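As an aside, the same four values can be picked out by tag name rather than by position.  This is only a sketch, not the script I used: it relies on the namespace URIs from the GPX header above, a cut-down sample with shortened coordinates, and a helper name (read_trackpoints) that I've made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the GPX header shown above
NS = {
    "gpx": "http://www.topografix.com/GPX/1/1",
    "ns3": "http://www.garmin.com/xmlschemas/TrackPointExtension/v1",
}

# Cut-down sample with one track point (coordinates shortened for readability)
SAMPLE_GPX = """<gpx creator="Garmin Connect" version="1.1"
  xmlns="http://www.topografix.com/GPX/1/1"
  xmlns:ns3="http://www.garmin.com/xmlschemas/TrackPointExtension/v1">
  <trk>
    <trkseg>
      <trkpt lat="51.426237" lon="-0.992680">
        <ele>45.4</ele>
        <time>2016-04-03T09:19:19.000Z</time>
        <extensions>
          <ns3:TrackPointExtension>
            <ns3:hr>82</ns3:hr>
          </ns3:TrackPointExtension>
        </extensions>
      </trkpt>
    </trkseg>
  </trk>
</gpx>"""


def read_trackpoints(gpx_text):
    """Return one dict per <trkpt>, looked up by tag name rather than by index."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.iterfind(".//gpx:trkpt", NS):
        heart = trkpt.find(".//ns3:hr", NS)
        points.append({
            "lat": float(trkpt.get("lat")),
            "lng": float(trkpt.get("lon")),
            "elevation": float(trkpt.find("gpx:ele", NS).text),
            "timestamp": trkpt.find("gpx:time", NS).text,
            # Heart rate lives in a Garmin extension and may be absent
            "heart": int(heart.text) if heart is not None else None,
        })
    return points


print(read_trackpoints(SAMPLE_GPX))
```

Looking elements up by name should be a little more forgiving than fixed indices if a file has its elements in a different order or a track point has no heart rate.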

I wanted to store all the data in a database and I chose to use MongoDB on Azure as I enjoyed using it for a Raspberry Pi project last year (and so also had cracked using Python to write to and read from the database).

Getting the database was super easy.  Within the Azure portal I did New - Databases - "Database as a Service for MongoDB", entered a few details and a minute or two later had a MongoDB instance.

Remembering that the structure of a document database is different to a relational database as follows:

Relational Database Term | Document Database Term
Database                 | Database
Table                    | Collection
Row                      | Document

...I set up a database called "geekmongo" and created a collection within it called "Test".  Hence the task at hand was to parse the GPX files, create JSON documents and write them to the Test collection in the geekmongo database.

#Import statements
import xml.etree.ElementTree as ET
from datetime import datetime
from pymongo import MongoClient

#File to process
FileOne = '/home/map/rhm/activity_1109977624_2016.gpx'

#Database related - Got all these from "Connection String" area on Azure for the database instance
#Created the collection test myself manually on Azure
dBAddress = 'Your Connection String'

#Start parsing that XML
tree = ET.parse(FileOne)
root = tree.getroot()

#Set up for the mongo instance, the database then the collection
#Connect to the database
client = MongoClient(dBAddress)
db = client.geekmongo
collection = db.Test

#Get the first timestamp as we'll reference all subsequent ones to this in order to be able to calculate elapsed timestamp
FirstTimeStamp = root[1][2][0][1].text
#Turn it into a time object we can use
TStart = datetime.strptime(FirstTimeStamp[:-5],"%Y-%m-%dT%H:%M:%S")

#Loop through the XML file picking out the data for each track point and writing it to the database
#Example is {"elapsed": 0, "lat":51.4566827,"lng":-0.9690389},
LoopVar = 0
for myTrkpt in root[1][2]:
  #Calculate the elapsed time
  TimeNow = myTrkpt[1].text
  TNow = datetime.strptime(TimeNow[:-5],"%Y-%m-%dT%H:%M:%S")
  TimeElapsed = abs(TNow - TStart).seconds

  #Build a Python dictionary that we'll write to the MongoDB
  MongoDoc = {}
  MongoDoc["elapsed"] = TimeElapsed
  MongoDoc["lat"] =  myTrkpt.attrib.get('lat')
  MongoDoc["lng"] =  myTrkpt.attrib.get('lon')
  MongoDoc["elevation"] =  myTrkpt[0].text
  MongoDoc["timestamp"] =  TimeNow[:-5]
  MongoDoc["heart"] = myTrkpt[2][0][0].text
  MongoDoc["cadence"] = 0

  #Write the document to the Test collection
  collection.insert_one(MongoDoc)

print("Done")


So here I use the Python XML and pymongo modules to parse the GPX files and write to the database respectively.

With pymongo you create an object and then connect to the database using a "Connection String".  You get this from the Azure management console for the MongoDB instance under the "Connection String" settings area.  This string contains the database address and the credentials required to access it.  You then can create a collection object which you write documents to using the insert_one() method.
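A half marathon logged once a second is several thousand track points, and insert_one() makes a round trip for each.  One possible alternative is to batch the writes with pymongo's insert_many().  The sketch below is an assumption about how that could look rather than what I ran: load_in_batches and FakeCollection are made-up names, and the fake collection only exists so the example runs without a database.

```python
def load_in_batches(collection, documents, batch_size=500):
    """Write documents with insert_many() in chunks; returns the number written."""
    written = 0
    batch = []
    for doc in documents:
        batch.append(doc)
        if len(batch) == batch_size:
            collection.insert_many(batch)
            written += len(batch)
            batch = []
    if batch:  # flush whatever is left over
        collection.insert_many(batch)
        written += len(batch)
    return written


class FakeCollection(object):
    """Stand-in for a pymongo collection so the sketch runs without a database."""

    def __init__(self):
        self.calls = []

    def insert_many(self, docs):
        # Record only the size of each batch that was "written"
        self.calls.append(len(docs))


fake = FakeCollection()
docs = [{"elapsed": i, "lat": "51.4", "lng": "-0.99"} for i in range(1200)]
print(load_in_batches(fake, docs), fake.calls)
```

With a real database you would pass in the collection object obtained from MongoClient in place of FakeCollection.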

Using the XML module you create an object called "root" and can use indices to access the different parts of the GPX structure.  So for example root[0] will be the first part of the GPX file.

The code then loops through the <trkpt> elements of the GPX file, picks out all the relevant data and then creates a Python dictionary which will be written to the database.  I also calculate an "elapsed" field which is the difference in seconds between the first <trkpt> element and the element in question.  I foresee this being useful later...
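To make the "elapsed" calculation concrete, here it is in isolation.  This is a sketch: the function name is invented, and it parses the full Garmin timestamp (including the ".000Z" tail) instead of slicing it off as the script above does.

```python
from datetime import datetime

# Format of the <time> values in the Garmin GPX files shown earlier
GPX_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def elapsed_seconds(first_timestamp, this_timestamp):
    """Whole seconds between the first track point and this one."""
    start = datetime.strptime(first_timestamp, GPX_TIME_FORMAT)
    now = datetime.strptime(this_timestamp, GPX_TIME_FORMAT)
    return int((now - start).total_seconds())


# The first two track points from the file above are one second apart
print(elapsed_seconds("2016-04-03T09:19:19.000Z", "2016-04-03T09:19:20.000Z"))
```

Using total_seconds() rather than .seconds makes no difference for a race, but it would keep counting correctly if an activity ever ran past 24 hours.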

It was interesting to look at the Azure console as I ran the scripts to parse the GPX files and write documents to the database.  Here's what was shown:

Here you can see peaks in "insert" requests as the data was being inserted.  The two peaks represent the two separate files being parsed and loaded.

Step 3 - A Web Server and an API
I wanted to create an API such that Javascript running within a browser could make an AJAX request to extract the data.  At some point I'll explore an Azure Web App for this sort of thing but for now I decided to use an Apache web server running on the Azure Linux VM and use a cgi-bin Python script to provide the functionality of the API.  I simply ran the sudo apt-get install apache2 command to install Apache and used this guide to get cgi-bin working for Python scripts.

To get the web server to work I had to do some configuration within the Azure console.  Specifically I had to configure a rule to enable HTTP traffic (port 80) on the platform.  To do this I selected the VM from the console, selected "Network Interfaces" then selected the Network Security Group.  I then configured the "HTTPRule" shown below:


To create the API I wrote the following Python script:



#!/usr/bin/env python

#Import statements
from pymongo import MongoClient
import cgi
import re
import cgitb

#Enable error logging
cgitb.enable()

#Database related - Got all these from "Connection String" area on Azure for the database instance
dBAddress = 'Your Connection String' 
CollectionID = 'Test'

#Get the query string parameters provided.  'name' field is the mongo name, 'value' field is the value
arguments = cgi.FieldStorage()
MongoName = arguments['name'].value
MongoValue = arguments['value'].value

#Form the document to use for the database access.  We will do a Regex because we may be searching on a partial date
MongoRegex = {}
MongoRegex['$regex'] = MongoValue
MongoDoc = {}
MongoDoc[MongoName] = MongoRegex

#Connect to the database, get a database object and get a collection
client = MongoClient(dBAddress)
db = client.geekmongo
collection = db.Test

print ('content-type: application/json\n\n')

#Get the total number of documents returned and set up a counter variable
TotalDocCount = collection.count(MongoDoc)
DocCounter = 0

#Start the output string
OutString = '{"markers":['

#Do a database find based upon the parameters provided.  Use this to form the output.  Need elapsed (integer), lat (long 4dp), lng (long 4dp)
#elevation (1dp), timestamp (string) and heart rate (integer)
for rhmDoc in collection.find(MongoDoc):
  OutString = OutString + '{"elapsed":' + str(rhmDoc["elapsed"]) + ','
  OutString = OutString + '"lat":' + str(round(float(rhmDoc["lat"]),4)) + ',' 
  OutString = OutString + '"lng":' + str(round(float(rhmDoc["lng"]),4)) + ','
  OutString = OutString + '"elevation":' + str(round(float(rhmDoc["elevation"]),1)) + ','
  OutString = OutString + '"timestamp":' + chr(34) + rhmDoc["timestamp"] + '",'
  OutString = OutString + '"heart":' + str(rhmDoc["heart"]) + '}'

  #See how many documents we've dealt with and whether we need to add a , to the end of the document
  DocCounter += 1
  #If we're on the last document then we add the ] to close the JSON array
  if (DocCounter == TotalDocCount):
    OutString = OutString + '],'
  else:
    OutString = OutString + ','

#Add the total items part
OutString = OutString + '"TotalItems":' + str(TotalDocCount) + '}'

#Stream the output to the client
print(OutString)

Here I use the CGI module to read query string parameters provided by the client.  So for example the URL:

http://<server URL>/GetGPXData.py?name=heart&value=82


...will result in a database query being made for all documents that contain the heart rate value of 82.

The script then takes the response of the database query and forms a string JSON document with all the values to pass back to the client.


The "regex" component of the MongoDB query document means you can do a "contains" search on the database.  i.e. "contains 2016" to return all the values for 2016.
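For comparison, the query document and the response can also be built with plain dictionaries and the json module, which avoids having to count commas and brackets by hand.  This is a sketch of the same idea, not the script above: build_query and build_response are names I've invented, and the sample document just mimics what the loader writes.

```python
import json


def build_query(name, value):
    """MongoDB filter meaning: field `name` contains `value`."""
    return {name: {"$regex": value}}


def build_response(documents):
    """Serialise matching documents in the same shape the API returns."""
    markers = []
    for doc in documents:
        markers.append({
            "elapsed": int(doc["elapsed"]),
            "lat": round(float(doc["lat"]), 4),
            "lng": round(float(doc["lng"]), 4),
            "elevation": round(float(doc["elevation"]), 1),
            "timestamp": doc["timestamp"],
            "heart": int(doc["heart"]),
        })
    return json.dumps({"markers": markers, "TotalItems": len(markers)})


# One document shaped like those written by the loader script
sample = [{"elapsed": 0, "lat": "51.42623794", "lng": "-0.99268052",
           "elevation": "45.40000152", "timestamp": "2016-04-03T09:19:19",
           "heart": "82"}]

print(build_query("timestamp", "2016"))
print(build_response(sample))
```

A side benefit is that a query matching no documents still produces valid JSON (an empty "markers" array).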

Step 4 - Web Page, Javascript and Google Maps API
So the final part of the project was to write a web page that could use Javascript to a)download data from the API I just created and b)plot it on a map using the Google Maps API.
The HTML, CSS and Javascript is below.  Highlights:
  • Uses Google maps Javascript API to bring up a map, place and move markers.  I started with this tutorial.
  • Uses the API previously described to acquire the position data to plot (function startRace).
  • Uses a Javascript interval to "fire" and cause an assessment of position and the map to be updated
  • Calculates the straight line distance between the markers.
<!DOCTYPE html>
<html>
<head>
<style>
#map {
height: 500px;
width: 100%;
}
</style>

</head>
<body>
<h3>Reading Half Marathon Analysis</h3>
<div id="map"></div>
<input type='button' id='btnLoad' value='Load One' onclick='loadLocsOne();'>
<input type='button' id='btnLoad' value='Load Two' onclick='loadLocsTwo();'>
<p id="Distance"></p>
<input type='button' id='btnLoad' value='Race' onclick='startRace();'>
<input type='button' id='btnLoad' value='Stop' onclick='stopRace();'>
<input type='button' id='btnLoad' value='Heart Chart' onclick='heartChart();'>
<script type="text/javascript">
//This is V10 that adds getting the data from an 'API'
//Some nasty global variables. Discovered needed to use setInterval to control a marker and these were needed for that.
var map //Enables us to reference the map in all parts of the code
var markerOne //A marker entity
var markerTwo //A marker entity
var timeElapsed //A variable to hold how many seconds have elapsed
var maxElapsed //Defines the maximum elapsed time we'll have across the two JSON structures
var locsOne = {}; //A position array
var locsTwo = {}; //A position array
var intervalVar //Use for the setInterval thingy
var raceStarted //Boolean that defines whether the race has started
//Initialise the map and put a marker on it
function initMap() {
var readingOne = {lat: 51.4366827, lng: -0.9680389};
var readingTwo = {lat: 51.4466827, lng: -0.9780389};
map = new google.maps.Map(document.getElementById('map'), {
zoom: 13,
center: readingOne
});
markerOne = new google.maps.Marker({
position: readingOne,
map: map,
label: "6"
});
markerTwo = new google.maps.Marker({
position: readingTwo,
map: map,
label: "7"
//color: 0xFFFFFF
});
//This is a load or reload so state that the race is not started
raceStarted = false;
}
//Just move a marker
function positionMarker(inMarker, inLat, inLng)
{
//Set the position of the marker
var newPos = {lat: inLat, lng: inLng};
//Set the position of the marker
inMarker.setPosition(newPos);
}
//Initialises matters when user presses "Race"
function startRace()
{
//What we do first depends on whether the race is started!
if (raceStarted == false)
{
//Initialise the position number
timeElapsed = 0;
//We need to find out the max elapsed time across the two structures. In this way we'll increment the elapsed time every time the interval handler
//fires. If we find a position we update the marker. If not we leave the marker where it is. When we've exhausted all possible elapsed times then
//we know to stop the handler
var maxElapsedOne = locsOne.markers[locsOne.TotalItems - 1].elapsed;
var maxElapsedTwo = locsTwo.markers[locsTwo.TotalItems - 1].elapsed;
//Set up the max elapsed value
if (maxElapsedOne > maxElapsedTwo)
{
maxElapsed = maxElapsedOne;
}
else
{
maxElapsed = maxElapsedTwo;
}
raceStarted = true;
}
//Set up to move the marker
intervalVar = setInterval(function(){ assessMarkerMove()}, 10);
}
//Stops the race
function stopRace()
{
clearInterval(intervalVar);
}
//Handles assessing whether to move the markers and if required doing so
//locs.markers[posNumber].lat,locs.markers[posNumber].lng
function assessMarkerMove()
{
//Variables
var i
//See if we can find a marker associated with the current elapsed time
for (i in locsOne.markers)
{
if (locsOne.markers[i].elapsed == timeElapsed)
{
positionMarker(markerOne, locsOne.markers[i].lat, locsOne.markers[i].lng);
}
}
for (i in locsTwo.markers)
{
if (locsTwo.markers[i].elapsed == timeElapsed)
{
positionMarker(markerTwo, locsTwo.markers[i].lat, locsTwo.markers[i].lng)
}
}
//Calculate the straightline distance between the markers. Only do this every 10 iterations else it look messy
if (Number.isInteger(timeElapsed / 10) == true){
distanceBetween = Math.round(google.maps.geometry.spherical.computeDistanceBetween(markerOne.position, markerTwo.position));
document.getElementById("Distance").innerHTML = distanceBetween + ' metres between!';}
//Increment the counter of how many times this has been called
timeElapsed++;
//See whether we've reached the end of the array
if (timeElapsed > maxElapsed)
{
clearInterval(intervalVar);
raceStarted = false;
}
}
//Called when the load data button is pressed
function loadData()
{
loadLocsOne();
loadLocsTwo();
document.getElementById("Distance").innerHTML = 'Data Loaded!!';
}
//Load the first array
function loadLocsOne() {
var xhttp = new XMLHttpRequest();
xhttp.onreadystatechange = function() {
if (this.readyState == 4 && this.status == 200) {
//alert(this.responseText);
locsOne = JSON.parse(this.responseText);
document.getElementById("Distance").innerHTML = 'locsOne Loaded';
}
};
xhttp.open("GET", "http://a.b.c.d/cgi-bin/GetGPXData.py?name=timestamp&value=2016", true);
xhttp.send();
}
//Load the second array
function loadLocsTwo() {
var xhttp = new XMLHttpRequest();
xhttp.onreadystatechange = function() {
if (this.readyState == 4 && this.status == 200) {
locsTwo = JSON.parse(this.responseText);
document.getElementById("Distance").innerHTML = 'locsTwo Loaded';}
};
xhttp.open("GET", "http://a.b.c.d/cgi-bin/GetGPXData.py?name=timestamp&value=2017", true);
xhttp.send();
}
</script>
<script async defer
src="https://maps.googleapis.com/maps/api/js?key=<Your Key Here>&callback=initMap&libraries=geometry">
</script>
</body>
</html>
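For anyone curious what the "straight line distance" between the two markers involves, here is a rough Python equivalent using the haversine formula.  It is only a sketch: the function name is invented, and I'm assuming an Earth radius of 6378137 metres, which I believe is the default the Google geometry library uses, so treat small differences from the on-page figure as expected.

```python
import math

EARTH_RADIUS_M = 6378137.0  # assumed radius in metres (see note above)


def straight_line_metres(lat1, lng1, lat2, lng2):
    """Great-circle distance between two lat/lng points via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# The two starting marker positions used in initMap()
print(round(straight_line_metres(51.4366827, -0.9680389, 51.4466827, -0.9780389)))
```

Bear in mind this is the distance as the crow flies, not the distance along the course, so the gap can look small when the route doubles back on itself.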


Wednesday 29 March 2017

Giving Alexa a new Sense - Vision! (Using Microsoft Cognitive APIs)

Amazon Alexa is just awesome so I thought it would be fun to let her "see".  Here's a video of this in action:



...and here's a diagram of the "architecture" for this solution:



The clever bit is the Microsoft Cognitive API so let's look at that first!  You can get a description of the APIs here and sign up for a free account.  To give Alexa "vision" I decided to use the Computer Vision API which takes an image URL or an uploaded image, analyses it and provides a description.

Using the Microsoft Cognitive API developer console I used the API to analyse the image of a famous person shown below and requested a "Description":



...and within the response JSON I got:

"captions": [ { "text": "Elizabeth II wearing a hat and glasses", "confidence": 0.28962254803103227 } ]

...now that's quite some "hat" she's wearing there (!) but it's a pretty good description.

OK - So here's a step-by-step guide as to how I put it all together.

Step 1 - An Apache Webserver with Python Scripts
I needed AWS Lambda to be able to trigger a picture to be taken and a Cognitive API call to be made so I decided to run this all from a Raspberry Pi + camera in my home.

I already have an Apache webserver running on my Raspberry Pi 2 and there are plenty of descriptions on the internet of how to do it (like this one).

I like a bit of Python so I decided to use Python scripts to carry out the various tasks.  Enabling Python for cgi-bin is very easy; here's an example of how to do it.

So to test it I created the following script:

#!/usr/bin/env python
print "Content-type: text/html\n\n"
print "<h1>Hello World</h1>"

...and saved it as /usr/lib/cgi-bin/hello.py.  I then tested it by browsing to http://192.168.1.3/cgi-bin/hello.py (where 192.168.1.3 is the IP address on my home LAN that my Pi is sitting on).  I saw this:



Step 2 - cgi-bin Python Script to Take a Picture
The first script I needed was one to trigger my Pi to take a picture with the Raspberry Pi camera.  (More here on setting up and using the camera).

After some trial and error I ended up with this script:

#!/usr/bin/env python
from subprocess import call
import cgi

def picture(PicName):
  call("/usr/bin/raspistill -o /var/www/html/" + PicName + " -t 1s -w 720 -h 480", shell=True)

#Get arguments
ArgStr = ""
arguments = cgi.FieldStorage()
for i in arguments.keys():
 ArgStr = ArgStr + arguments[i].value

#Call a function to get a picture
picture(ArgStr)

print "Content-type: application/json\n\n"

print "{'Response':'OK','Arguments':" + "'" + ArgStr + "'}"

So what does this do?  The ArgStr and for i in arguments.keys() etc. code section makes the Python script analyse the URL entered by the user and extract any query strings.  The query string can be used to specify the file name of the photo that is taken.  So for example this URL:

http://192.168.1.3/cgi-bin/take_picture_v1.py?name=hello.jpg

...will mean a picture is taken and saved as hello.jpg.

The picture() function then uses "call" from the subprocess module to run a command line command to take a picture with the Raspberry Pi camera and save it in the root directory for the Apache 2 webserver.
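Because the file name comes straight from the query string, a slightly more defensive variant is to build the command as a list (which can be passed to subprocess.call() without shell=True) and to strip any directory parts from the name.  The sketch below only builds the command and the reply so it can run anywhere; the helper names are invented, and the "-t 1000" value assumes a one second delay given that raspistill's timeout is in milliseconds.

```python
import json
import os


def build_raspistill_command(pic_name, web_root="/var/www/html"):
    """Command list suitable for subprocess.call(cmd), with no shell involved."""
    # basename() drops any directory parts so the photo stays inside web_root
    safe_name = os.path.basename(pic_name)
    return ["/usr/bin/raspistill",
            "-o", os.path.join(web_root, safe_name),
            "-t", "1000",  # assumed 1 second delay, expressed in milliseconds
            "-w", "720", "-h", "480"]


def build_reply(pic_name):
    """Reply body as strict JSON (double quotes) so any JSON parser accepts it."""
    return json.dumps({"Response": "OK", "Arguments": pic_name})


print(build_raspistill_command("hello.jpg"))
print(build_reply("hello.jpg"))
```

On the Pi itself the list returned by build_raspistill_command() would be handed to call() in place of the concatenated string.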

Finally the script responds with a simple JSON string that can be rendered in a browser or used by AWS Lambda.  The response looks like this in a browser:


Step 3 - Microsoft Cognitive API for Image Analysis
So now we've got a picture we need to analyse it.  For this task I leaned heavily on the code published here so all plaudits and credit to chsienki and none to me. I used most of the code but removed the lines that overlaid on top of the image and showed it on screen.

#!/usr/bin/env python
import time
from subprocess import call
import requests
import cgi

# Variables
#_url = 'https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze'
_url = 'https://westus.api.cognitive.microsoft.com/vision/v1.0/describe'

_key = "your key"   #Here you have to paste your primary key
_maxNumRetries = 10

#Does the actual results request
def processRequest( json, data, headers, params ):

    """
    Helper function to process the request to Project Oxford

    Parameters:
    json: Used when processing images from its URL. See API Documentation
    data: Used when processing image read from disk. See API Documentation
    headers: Used to pass the key information and the data type request
    """

    retries = 0
    result = None

    while True:
        print("This is the URL: " + _url)
        response = requests.request( 'post', _url, json = json, data = data, headers = headers, params = params )

        if response.status_code == 429:

            print( "Message: %s" % ( response.json()['error']['message'] ) )

            if retries <= _maxNumRetries:
                time.sleep(1)
                retries += 1
                continue
            else:
                print( 'Error: failed after retrying!' )
                break

        elif response.status_code == 200 or response.status_code == 201:

            if 'content-length' in response.headers and int(response.headers['content-length']) == 0:
                result = None
            elif 'content-type' in response.headers and isinstance(response.headers['content-type'], str):
                if 'application/json' in response.headers['content-type'].lower():
                    result = response.json() if response.content else None
                elif 'image' in response.headers['content-type'].lower():
                    result = response.content
        else:
            print( "Error code: %d" % ( response.status_code ) )
            #print( "Message: %s" % ( response.json()['error']['message'] ) )
            print (str(response))

        break

    return result

#Get arguments from the query string sent
ArgStr = ""
arguments = cgi.FieldStorage()
for i in arguments.keys():
 ArgStr = ArgStr + arguments[i].value

# Load raw image file into memory
pathToFileInDisk = r'/var/www/html/' + ArgStr

with open( pathToFileInDisk, 'rb' ) as f:
    data = f.read()

# Computer Vision parameters
params = { 'visualFeatures' : 'Color,Categories'}

headers = dict()
headers['Ocp-Apim-Subscription-Key'] = _key
headers['Content-Type'] = 'application/octet-stream'

json = None

result = processRequest( json, data, headers, params )

#Turn to a string
JSONStr = str(result)

#Change single to double quotes
JSONStr = JSONStr.replace(chr(39),chr(34))

#Get rid of preceding u in string
JSONStr = JSONStr.replace("u"+chr(34),chr(34))


if result is not None:
  print "content-type: application/json\n\n"

  print JSONStr

So here I take arguments as before to know which file to process, "read" the file and then use the API to get a description of it.  I had to play a bit with the response to get it into a format that could be parsed by the Python json module.  This is where I turn single quotes to double quotes and get rid of preceding "u" characters.  There's maybe a more Pythonic way to do this, please let me know if you know a way....
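One candidate for that more Pythonic way is json.dumps(), which turns the parsed result straight back into a JSON string with no quote swapping or "u" stripping.  A minimal sketch, using the caption from earlier in this post as stand-in data (the real result carries more fields than this):

```python
import json

# Stand-in for the dictionary returned by processRequest()
result = {u"description": {u"captions": [
    {u"text": u"Elizabeth II wearing a hat and glasses",
     u"confidence": 0.28962254803103227}
]}}

# Serialise the dictionary directly instead of editing str(result)
JSONStr = json.dumps(result)
print(JSONStr)

# Round trip to show the string really is parseable JSON
assert json.loads(JSONStr) == result
```

This should also cope with captions that contain an apostrophe, which the single-to-double quote replacement would turn into broken JSON.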

When you call the script via a browser you get:


Looking at the JSON structure in more detail you can see a "description" element which is how the Microsoft Cognitive API has described the image.

Step 4 - Alexa Skills Kit Configuration and Lambda Development
The next step is to configure the Alexa Skills kit and write the associated AWS Lambda function.  I've covered how to do this previously (like here) so won't cover all that again here.

The invocation name is "yourself"; hence you can say "Alexa, ask yourself...".

There is only one utterance which is:
AlexaSeeIntent what can you see

...so what you actually say to Alexa is "Alexa, ask yourself what can you see".  

This then maps to the intent structure below:

{
  "intents": [
    {
      "intent": "AlexaSeeIntent"
    },
    {
      "intent": "AMAZON.HelpIntent"
    },
    {
      "intent": "AMAZON.StopIntent"
    },
    {
      "intent": "AMAZON.CancelIntent"
    }
  ]
}

Here we have a boilerplate intent structure with the addition of AlexaSeeIntent which is what will be passed to the AWS Lambda function.

I won't list the whole AWS Lambda function below, but here's the relevant bits:

#Some constants
TakePictureURL = "http://<URL or IP Address>/cgi-bin/take_picture_v1.py?name=hello.jpg"
DescribePictureURL = "http://<URL or IP Address>/cgi-bin/picture3.py?name=hello.jpg"

Then the main Lambda function to handle the AlexaSeeIntent:

def handle_see(intent, session):
  session_attributes = {}
  reprompt_text = None
  
  #Call the Python script that takes the picture
  APIResponse = urllib2.urlopen(TakePictureURL).read()
  
  #Call the Python script that analyses the picture.  Strip newlines
  APIResponse = urllib2.urlopen(DescribePictureURL).read().strip()
  
  #Turn the response into a JSON object we can parse
  JSONData = json.loads(APIResponse)
  
  PicDescription = str(JSONData["description"]["captions"][0]["text"])
  
  speech_output = PicDescription
  should_end_session = True
  
  # Setting reprompt_text to None signifies that we do not want to reprompt
  # the user. If the user does not respond or says something that is not
  # understood, the session will end.
  return build_response(session_attributes, build_speechlet_response(
        intent['name'], speech_output, reprompt_text, should_end_session))

So super, super simple.  Call the API to take the picture, call the API to analyse it, pick out the description and read it out.
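The one fragile spot is picking the caption out of the response: if the Pi is unreachable or the API returns no captions, the lookup raises an exception and Alexa has nothing to say.  Here's a sketch of a more forgiving lookup; the function name and the fallback phrase are my own inventions, and the sample response reuses the caption shown earlier.

```python
import json


def caption_from_response(api_response, fallback="I am not sure what I can see"):
    """Return the first caption text, or a fallback phrase if it is missing."""
    try:
        data = json.loads(api_response)
        return str(data["description"]["captions"][0]["text"])
    except (ValueError, KeyError, IndexError, TypeError):
        # Not JSON, or JSON without the expected description/captions structure
        return fallback


sample = ('{"description": {"captions": [{"text": '
          '"Elizabeth II wearing a hat and glasses", '
          '"confidence": 0.28962254803103227}]}}')

print(caption_from_response(sample))
print(caption_from_response(""))
```

In handle_see() the returned string would simply become speech_output.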

Here's the image that was analysed for the Teddy Bear video:



Here's another example:


The image being:


...and another:


...based upon this image:
  

Now to think about what other senses I can give to Alexa...


Using IBM Bluemix Watson APIs to Optimise my CV (Resume)

I kept seeing adverts for IBM Bluemix popping up on my social media feeds and online adverts so I thought I'd take a look and see what it was all about.  You can sign up for an account and get 30 days access for free so that was all good for a home hobbyist like me!

So what is Bluemix?  Here's a snippet from the IBM Bluemix website:

The IBM Bluemix cloud platform helps you solve real problems and drive business value with applications, infrastructure and services.


So it does what it says on the tin really.  A bunch of cloud based capabilities that lets you do interesting stuff!  So what interesting stuff to do with this?   Creating an account (free for 30 days - no payment card required) and browsing the Bluemix catalogue my eye was drawn to the Watson APIs.  Watson was made famous through winning the US Jeopardy gameshow and there's a bunch of exciting artificial intelligence capabilities like Natural Language Understanding and Personality Insights you can use.

As a starting point I decided to play with the "Tone Analyzer" API, the description of which is as follows:

People show various tones, such as joy, sadness, anger, and agreeableness, in daily communications. Such tones can impact the effectiveness of communication in different contexts. Tone Analyzer leverages cognitive linguistic analysis to identify a variety of tones at both the sentence and document level. This insight can then used to refine and improve communications.

At the moment I'm updating my CV (resume for you good people in the USA) and I'm told that when faced with an avalanche of CVs, recruiters will sometimes only read the very first "personal profile" section of the document to make their initial decision.  Additionally recruiters are more-and-more using AI tools to filter CVs.  I thought that if I could use the tone analyser to optimise that first section of my CV then this would be a good use of Bluemix.

To use the API you simply click "Create" and get some credentials to access the API.  IBM provide a lot of guidance as to how to use the API and provide SDKs for languages like Python and node.js.  I decided to use curl as all I wanted to do was throw some text at the API and see the result.

So here's a curl command to access the API:

curl -v -u "username":"password" -H "Content-Type: text/plain" -d "Some text" "https://gateway.watsonplatform.net/tone-analyzer/api/v3/tone?version=2016-05-19"

(Replace username and password with the ones you are provided by Bluemix).

The response is a JSON structure that looks like this (abridged):
{"document_tone":{"tone_categories":[{"tones":[{"score":0.135461,"tone_id":"anger","tone_name":"Anger"},{"score":0.045643,"tone_id":"disgust","tone_name":"Disgust"},{"score":0.71908,"tone_id":"fear","tone_name":"Fear"},{"score":0.232038,"tone_id":"joy","tone_name":"Joy"},{"score":0.524529,"tone_id":"sadness","tone_name":"Sadness"}],"category_id":"emotion_tone","category_name":"Emotion Tone"},

The structure provides numeric values (range 0 to 1) based upon a set of analysis criteria that IBM defines as follows:

It detects three types of tones, including emotion (anger, disgust, fear, joy and sadness), social propensities (openness, conscientiousness, extroversion, agreeableness, and emotional range), and language styles (analytical, confident and tentative) from text.

Numeric values are provided for the whole piece of text you provide plus it's broken down into sentences and each sentence is analysed.  My grand plan is to pick out those attributes that I deem important for the type of job I would like to get and then "tune" the text from my CV to improve those attributes.
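To make the structure concrete, here is a minimal sketch that flattens the document-level part of the response into a simple lookup of scores.  The sample is the abridged response shown above, closed off so that it parses as valid JSON; document_scores is a hypothetical helper name, not part of any IBM SDK.

```python
import json

# Abridged response from above, closed off so that it parses as valid JSON
sample_response = '''{"document_tone": {"tone_categories": [{"tones": [
  {"score": 0.135461, "tone_id": "anger", "tone_name": "Anger"},
  {"score": 0.045643, "tone_id": "disgust", "tone_name": "Disgust"},
  {"score": 0.71908, "tone_id": "fear", "tone_name": "Fear"},
  {"score": 0.232038, "tone_id": "joy", "tone_name": "Joy"},
  {"score": 0.524529, "tone_id": "sadness", "tone_name": "Sadness"}],
  "category_id": "emotion_tone", "category_name": "Emotion Tone"}]}}'''


def document_scores(response_text):
    """Return {tone_id: score} for every tone in the document-level analysis."""
    parsed = json.loads(response_text)
    scores = {}
    for category in parsed["document_tone"]["tone_categories"]:
        for tone in category["tones"]:
            scores[tone["tone_id"]] = tone["score"]
    return scores


scores = document_scores(sample_response)
strongest = max(scores, key=scores.get)
print(strongest, scores[strongest])  # prints: fear 0.71908
```

With the scores in a plain dictionary it becomes straightforward to pick out just the attributes you care about.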

First I needed to take the JSON API response and turn it into something I could read and interpret.  I decided to use Python to analyse the JSON (because Python rocks) and use an online charting capability called Plotly to visualise the data.  I used Plotly as it has an API that I thought would be fun to learn about.

Plotly provides a REST API that you can HTTP POST to and have Plotly render a chart in your online account that you can, for example, reference in another website. Plotly provide online descriptions of the REST API here but in simple terms you specify the data to plot and some formatting parameters and Plotly does the rest for you.

Here's an example POST message body:

un=chris& key=kdfa3d& origin=plot& platform=lisp& args=[[0, 1, 2], [3, 4, 5], [1, 2, 3], [6, 6, 5]]& kwargs={"filename": "plot from api", "fileopt": "overwrite", "style": { "type": "bar" }, "traces": [1], "layout": { "title": "experimental data" }, "world_readable": true }
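The body above is laid out with spaces for readability.  As a rough sketch of how such a body could be assembled safely in Python, assuming the endpoint accepts standard form encoding, urllib.parse.urlencode takes care of escaping the JSON in args and kwargs.  The field values are copied from the example; nothing is sent anywhere.

```python
import json
from urllib.parse import parse_qs, urlencode

# Values copied from the example body above
fields = {
    "un": "chris",
    "key": "kdfa3d",
    "origin": "plot",
    "platform": "lisp",
    "args": json.dumps([[0, 1, 2], [3, 4, 5], [1, 2, 3], [6, 6, 5]]),
    "kwargs": json.dumps({
        "filename": "plot from api",
        "fileopt": "overwrite",
        "style": {"type": "bar"},
        "traces": [1],
        "layout": {"title": "experimental data"},
        "world_readable": True,
    }),
}

# Form-encode the fields; spaces, quotes and brackets are escaped
body = urlencode(fields)

# Decoding the body again recovers the original JSON strings
decoded = parse_qs(body)
```

Round-tripping through parse_qs is a cheap way to check that nothing in the JSON was mangled by the encoding.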

Here's some Python I wrote to extract data from the JSON structure and use the Plotly API (replace the Watson API response and credentials with your values):

import json
import pprint
import urllib.error
import urllib.parse
import urllib.request

#Constants
#Baseline response  
APIResponse = 'Bluemix JSON Response Here'

PlotlyURL = "https://plot.ly/clientresp"
UserName = "username"
APIKey = "YourKey"

#Example simple arguments Strings
#NArgsString = '[["One", "Two", "Three"], [0.98, 0.87, 0.87]]'


#This is an arguments string for plotly formatting
KwargsJSON = {
    "filename": "plot from api",
    "fileopt": "overwrite",
    "style": {"type": "bar"},
    "traces": [0],
    "layout": {
        "title": "Less Anger!"
    },
    "world_readable": True
}

#First we extract all the data from the JSON from Watson.  Can pretty print if you want
ToneJSON = json.loads(APIResponse)
#pprint.pprint(ToneJSON)

#Initialise the sub components of the plotly argument string
XList = []
YList = []

#Iterate through the JSON structure picking up attributes and values
for MyTone in ToneJSON["document_tone"]["tone_categories"]:
  for TheTones in MyTone["tones"]: 
    #Build the x and y Python lists that we'll use for the plotly argument
    XList.append(str(TheTones["tone_name"]))
    YList.append(float(TheTones["score"]))
    
#This is the arguments string for plotly.  json.dumps guarantees valid JSON, with " rather than ' around the x axis values
NArgsString = json.dumps([XList, YList])

#Form the body for the HTTP POST, URL-encoding each field so that spaces, quotes and ampersands inside the JSON survive the trip
KwargsString = json.dumps(KwargsJSON)
PostBody = urllib.parse.urlencode({"un": UserName, "key": APIKey, "origin": "plot", "platform": "lisp", "args": NArgsString, "kwargs": KwargsString})

#Encode the whole post body as bytes
PostBody = PostBody.encode('utf-8')

#Form the request
MyRequest = urllib.request.Request(PlotlyURL, data=PostBody)

try:
  #Execute the request
  wp = urllib.request.urlopen(MyRequest)
  
  #Read the response, decode it from bytes and print it for the user
  TheResponse = wp.read().decode('utf-8')
  print(TheResponse)

#Handle pesky errors
except urllib.error.HTTPError as e:
  print("HTTP Error caught when making request " + str(e.code) + "\n")
except urllib.error.URLError as e:
  print("URL Error caught when making request " + str(e.reason) + "\n")

So we're ready to analyse some text.  First I used some made-up text to test how good the tone analyser API is.  Here's the text:

I am excellent at everything.  There is nothing I can not do.  Throw a challenge at me and I will succeed.  I have beaten every target ever set for me.  Employ me and you will employ a winner.

...and here's the resulting Plotly chart:


That seems about right, in particular the sky-high confidence score!

Here's my current CV profile statement:

A Solution Architect with a wide range of knowledge and experience in the Telecommunications and IT industry.  Has significant experience of leading cross-functional teams to deliver innovative solutions spanning IT, Network and TV systems.  A strong self-starter with proven analytical and problem solving skills.  Able to learn about new technologies quickly and apply this knowledge to design tasks.  Well-developed communication and presentation skills, both written and oral.

...which when analysed by Watson and charted by Plotly yields this:


So I would say that for the type of job I want I need to:

  • Reduce the anger and sadness
  • Maintain analytical
  • Have some confidence!
  • Improve conscientiousness

...but as another test of the API I analysed this version of my profile (the addition is the final sentence):

A Solution Architect with a wide range of knowledge and experience in the Telecommunications and IT industry.  Has significant experience of leading cross-functional teams to deliver innovative solutions spanning IT, Network and TV systems.  A strong self-starter with proven analytical and problem solving skills.  Able to learn about new technologies quickly and apply this knowledge to design tasks.  Well-developed communication and presentation skills, both written and oral.  I’m so afraid that if I don’t get this CV right then I won’t be employed by anyone; I’m really really scared, worried and frightened about this!

Watson and Plotly yield this:


There we go!  Fear increases from negligible to ~0.7 so there's a definite correlation between text and the analysis.
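Rather than reading shifts like this off two charts, one option is to diff the two sets of scores directly.  The sketch below assumes each analysis has already been reduced to a dictionary of tone scores; tone_changes is a hypothetical helper and the numbers are illustrative placeholders, not the values Watson actually returned.

```python
def tone_changes(before, after):
    """Return {tone: after - before} for tones present in both analyses."""
    return {
        tone: round(after[tone] - before[tone], 3)
        for tone in before
        if tone in after
    }


# Illustrative placeholder scores only, not the values Watson returned
baseline = {"fear": 0.05, "joy": 0.30, "confident": 0.00}
revised = {"fear": 0.70, "joy": 0.25, "confident": 0.00}

changes = tone_changes(baseline, revised)

# The tone that moved the most, in either direction
biggest = max(changes, key=lambda tone: abs(changes[tone]))
```

Sorting or filtering the changes dictionary then shows at a glance which edits to the text moved which attributes.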

Back to business.  Here's a modification to my profile to try and boost confidence:

A Solution Architect with a track record of successful delivery in the Telecommunications and IT industry.  Has significant experience of leading cross-functional teams to deliver innovative solutions spanning IT, Network and TV systems.  A strong self-starter with proven analytical and problem solving skills.  In a fast paced, ever changing technology world, is confident in his abilities to quickly learn and apply new skills.  Well-developed communication and presentation skills, both written and oral.

Watson and Plotly say:


Bingo!

Now to up the conscientiousness.  I actually had to play with the language a lot and even then I only managed to improve it by 0.1.  Here's what I wrote:

A Solution Architect with a track record of successful delivery in the Telecommunications and IT industry.  Has significant experience of leading cross-functional teams to deliver innovative solutions spanning IT, Network and TV systems.  A strong self-starter with proven analytical and problem solving skills.  In a fast paced, ever changing technology world, is confident in his abilities to quickly learn and apply new skills.  A conscientious, reliable individual who always who sets challenging goals, forms structured plans to achieve them and follows through until the job is complete.

Which results in:

Finally to drop the anger levels as anger is never a good look!  Here's what I wrote:

A Solution Architect with a track record of successful delivery in the Telecommunications and IT industry.  Has significant experience of leading cross-functional teams to deliver innovative solutions spanning IT, Network and TV systems.  A strong self-starter with proven analytical and problem solving skills that is never happier than when working with like-minded individuals to harmoniously collaborate and solve problems.  In a fast paced, ever changing technology world, is confident in his abilities to quickly learn and apply new skills.  A conscientious, reliable individual who always who sets challenging goals, forms structured plans to achieve them and follows through until the job is complete.

The net result being:

So finally I used Plotly to compare the initial (baseline) analysis with the final text.  Here's the result:

So less anger, more joy, less sadness, more confidence and more conscientiousness: all looking good here.  However, I've also dropped the analytical and openness scores, which isn't so good for the type of role I'd like, but I can live with that!  So now to use this for my real-life CV.  Wish me luck...