Friday, 2 December 2016

Garmin ConnectIQ - First Attempt at Using the SDK

The Garmin Forerunner 910XT that I have previously blogged about using died a death a couple of weeks ago so I decided to buy a Garmin Fenix 3 which has roughly the same feature set.  The big selling point for a geek like me was the Connect IQ capability that means you can write apps for the watch.  Geektastic!

Overall ConnectIQ lets you write:

  • Fully blown apps
  • Watch faces
  • Widgets
  • Data fields

...really customising your Garmin product.

Some resources I used to get myself set up:

In particular I spent a long time going through the Getting Started section of the Programmer's Guide.  I'm generally pretty gung ho and like to go it alone but it was worth going through this slowly step-by-step.  I chose to use Eclipse (Luna) and followed the tutorial to produce a simple watch face and load it into the emulator. I won't repeat the steps here as the ones Garmin provide are extremely good.

I then decided to modify the watch face in some way to learn more about ConnectIQ.  My idea was to provide a "countdown until parkrun" watch face.  Keen readers will know I like a bit of parkrun so I decided to do a watch face that both shows you the current time and, as time passes, counts down the days, hours, minutes and seconds until parkrun.

To modify the watch face I just had to change two files within the Eclipse project.  These are selected on the Project Explorer view below.

In simple terms, layout.xml defines the layout of the watch face and the view's source file contains the code required to modify aspects of the watch face.

For the layout I decided to have a simple one of:

  • The current time at the top
  • Then a count down until parkrun
  • Then some text to say "until parkrun"
  • Then some form of "motivational" slogan

The layout.xml for this looks like:

<layout id="WatchFace">
    <label id="TimeLabel" x="center" y="50" font="Gfx.FONT_LARGE" justification="Gfx.TEXT_JUSTIFY_CENTER" color="Gfx.COLOR_BLUE" />
    <label id="TimeToParkrunLabel" x="center" y="100" font="Gfx.FONT_SMALL" justification="Gfx.TEXT_JUSTIFY_CENTER" color="Gfx.COLOR_RED" />
    <label id="ParkrunTextLabel" x="center" y="125" font="Gfx.FONT_SMALL" justification="Gfx.TEXT_JUSTIFY_CENTER" color="Gfx.COLOR_RED" />
    <label id="SloganTextLabel" x="center" y="160" font="Gfx.FONT_SMALL" justification="Gfx.TEXT_JUSTIFY_CENTER" color="Gfx.COLOR_GREEN" />
</layout>

Here we see four labels with their name, position on screen, font size and colour defined.

Without changing anything in the file, this function references the layout file and tells ConnectIQ what to draw on screen:

function initialize() {
    WatchFace.initialize();
}

// Load your resources here
function onLayout(dc) {
    setLayout(Rez.Layouts.WatchFace(dc));
}

Where the WatchFace reference links to the layout.xml contents.

The first function I changed was the "onUpdate" one.  Here it is:

    // Update the view
    function onUpdate(dc) {
        // Get and show the current time
        var clockTime = Sys.getClockTime();
        var timeString = Lang.format("$1$:$2$:$3$", [clockTime.hour, clockTime.min.format("%02d"),clockTime.sec.format("%02d")]);
        var view = View.findDrawableById("TimeLabel");
        view.setText(timeString);
        //Time to park run
        var TimeToParkrunStr = CalcTimeToParkrun();
        var ParkrunTimeView = View.findDrawableById("TimeToParkrunLabel");
        ParkrunTimeView.setText(TimeToParkrunStr);
        //Added by Paul Weeks
        var ParkrunLabelStr = "...until next parkrun!";
        var ParkrunLabelView = View.findDrawableById("ParkrunTextLabel");
        ParkrunLabelView.setText(ParkrunLabelStr);
        //This is a slogan at the bottom.  Could change dynamically in future
        var SloganStr = "#DFYB";
        var SloganLabelView = View.findDrawableById("SloganTextLabel");
        SloganLabelView.setText(SloganStr);
        // Call the parent onUpdate function to redraw the layout
        View.onUpdate(dc);
    }

Here we can see some simple concepts that show how to update the screen and show the time.  Breaking it down:

1) Create a variable to hold the current time:
var clockTime = Sys.getClockTime();

2) Turn this into a string.  The Lang.format method can be used for this.  Here we see it used to create a string with three components (referenced as $1$:$2$:$3$) formed from the hour, minute and second parts of the time.  The %02d part simply puts a leading zero on to pad numbers less than 10 (only the minutes and seconds are padded here, not the hour):

var timeString = Lang.format("$1$:$2$:$3$", [clockTime.hour, clockTime.min.format("%02d"),clockTime.sec.format("%02d")]);

3) Create a "view variable" that links to the label in layout.xml and then update this with the time text:
var view = View.findDrawableById("TimeLabel");
view.setText(timeString);

But all the heavy lifting I did was related to the "countdown until parkrun" part.  Here you can see I call out to another function called "CalcTimeToParkrun".  It was really slow going writing this!

Unlike with other languages like Python there's not a mass of examples and tutorials on the internet or entries on Stack Overflow.  Instead it was a case of using the API reference and trial and error to get things working.  I learnt a lot!

Here's the function:

//Calculate time until parkrun
//Algorithm may be a little clunky.  Refine over time
function CalcTimeToParkrun() {
  //Some constants
  var parkrunDay = 7;     //Parkrun on Saturday.  Could be setting in future
  var parkrunTime = 9;    //Parkrun at 0900.  Could be setting in future
  //Need to calculate the next parkrun day at parkrun time and then work out the difference between then and now.  
  var now = Time.now();
  var info = Calendar.info(now, Time.FORMAT_SHORT);
  //Format_short means day of week is a number.  1 for Sunday, 2 for Monday etc.
  var dayStr = Lang.format("$1$", [info.day_of_week]);   
  var hourStr = Lang.format("$1$", [info.hour]);
  //Might be useful, shows how to format a date
  //var dateStr = Lang.format("$1$ $2$ $3$", [info.day_of_week, info.month, info.day]);
  //Have a look at the day of week.  The actual day is a special day as parkrun might either be that day or next week.  
  //Saturday is day 7.
  var dayNum = dayStr.toNumber();   //Turn day to an actual number
  var hourNum = hourStr.toNumber(); //Turn hour into actual number
  //What we need to do is calculate how many days to parkrun.  Saturday is the key case, i.e. assessing before or after
  //parkrun time.
  var daysToPR = -1;
  if ((dayNum < parkrunDay) || ((dayNum == parkrunDay) && (hourNum < parkrunTime))) {
     daysToPR = 7 - dayNum;
  } else {
     daysToPR = 7;
  }
  //Create a moment that represents midnight today
  var todayDict = {:day => info.day.toNumber(), :month => info.month.toNumber(), :year => info.year.toNumber()};
  var todayMoment = Calendar.moment(todayDict);
  //Create a duration of the number of days until parkrun + hours until parkrun
  var durDict = {:days => daysToPR, :hours => parkrunTime};          
  var myDuration = Calendar.duration(durDict);

  //Add the number of days and hours to midnight today to get a moment that represents parkrun start time
  var parkrunMoment = myDuration.add(todayMoment);
  //Subtract now from when parkrun is to get a duration until parkrun.  .value turns it into seconds
  var durationTillParkrun = parkrunMoment.subtract(now).value(); 
  //So now we have seconds until parkrun. Need to turn into days, hours and mins
  //This seems like hard yards but can't find a better way...
  var daysTillParkrun = durationTillParkrun / 86400;
  var hoursTillParkrun = (durationTillParkrun - (daysTillParkrun * 86400)) / 3600; 
  var minsTillParkrun = (durationTillParkrun - ((daysTillParkrun * 86400)+(hoursTillParkrun * 3600))) / 60;
  var secsTillParkrun = (durationTillParkrun - ((daysTillParkrun * 86400)+(hoursTillParkrun * 3600)+(minsTillParkrun * 60)));
  return daysTillParkrun.toString() + " days " + hoursTillParkrun.toString() + ":" + minsTillParkrun.toString() + ":" + secsTillParkrun.toString();
}

The algorithm is pretty simple.  It's as follows:

  • Get a number associated with the day of week.  So Sunday = 1, Monday = 2 etc.
  • Calculate "days until parkrun".  In general this is 7 - Day of week number.  However there's an extra decision to make on parkrun day as to whether it's before or after parkrun (and so mere minutes to go or several days).
  • Calculate a duration* which is from midnight today plus the number of full days until parkrun plus the number of hours to wait on parkrun day.
  • Calculate a moment** which is the actual date and time of parkrun.
  • Calculate the difference between now and the date and time of parkrun.

*A duration is a period of time in Garmin Connect IQ.  So these two lines of code define a duration:
var durDict = {:days => daysToPR, :hours => parkrunTime};          
var myDuration = Calendar.duration(durDict);

So the dictionary defines the number of days and hours for the duration, then the duration is calculated.

**A moment is a moment in time in Garmin Connect IQ.  These two lines of code define a moment:
  var todayDict = {:day => info.day.toNumber(), :month => info.month.toNumber(), :year => info.year.toNumber()};
  var todayMoment = Calendar.moment(todayDict); 

Again the dictionary defines the parameters for the moment and then the moment is created.

You can then do calculations based upon durations and moments:
var parkrunMoment = myDuration.add(todayMoment);
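For anyone who wants to check the countdown logic away from the watch, here is a minimal sketch of the same algorithm in Python 3 rather than Monkey C.  It is not the code running on the Fenix; the function names are my own and it assumes the Sunday = 1 ... Saturday = 7 day numbering described above.  Python's divmod also does the "seconds into days, hours, minutes and seconds" step in three lines.

```python
from datetime import datetime, timedelta

PARKRUN_DAY = 7    # Saturday, using Sunday = 1 ... Saturday = 7
PARKRUN_HOUR = 9   # parkrun starts at 09:00


def garmin_day_of_week(dt):
    """Map Python's Monday = 0 ... Sunday = 6 onto Sunday = 1 ... Saturday = 7."""
    return (dt.weekday() + 1) % 7 + 1


def time_to_parkrun(now):
    """Return (days, hours, minutes, seconds) until the next parkrun start."""
    day_num = garmin_day_of_week(now)
    before_start_today = day_num == PARKRUN_DAY and now.hour < PARKRUN_HOUR
    if day_num < PARKRUN_DAY or before_start_today:
        days_to_parkrun = PARKRUN_DAY - day_num
    else:
        days_to_parkrun = 7   # Saturday, but parkrun has already started

    # Midnight today plus whole days plus the start hour gives the parkrun "moment"
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    parkrun_moment = midnight + timedelta(days=days_to_parkrun, hours=PARKRUN_HOUR)

    # Break the remaining seconds down into days, hours, minutes and seconds
    total_seconds = int((parkrun_moment - now).total_seconds())
    days, remainder = divmod(total_seconds, 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    return days, hours, minutes, seconds


if __name__ == "__main__":
    # Friday lunchtime on the day this post was written
    print(time_to_parkrun(datetime(2016, 12, 2, 12, 0, 0)))
```

Passing the current time in as a parameter (rather than reading the clock inside the function) makes the Saturday edge case easy to test.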

..and so the big reveal, how does it look in the Eclipse simulator?

...hmmm, yes you're right, somewhere between awful and terrible!

Playing with the layout file I changed it to be:

<layout id="WatchFace">
  <label id="TimeLabel" x="center" y="15" font="Gfx.FONT_NUMBER_THAI_HOT" justification="Gfx.TEXT_JUSTIFY_CENTER" color="Gfx.COLOR_WHITE" />
  <label id="TimeToParkrunLabel" x="center" y="115" font="Gfx.FONT_MEDIUM" justification="Gfx.TEXT_JUSTIFY_CENTER" color="Gfx.COLOR_WHITE" />
  <label id="ParkrunTextLabel" x="center" y="140" font="Gfx.FONT_MEDIUM" justification="Gfx.TEXT_JUSTIFY_CENTER" color="Gfx.COLOR_WHITE" />
  <label id="SloganTextLabel" x="center" y="175" font="Gfx.FONT_SMALL" justification="Gfx.TEXT_JUSTIFY_CENTER" color="Gfx.COLOR_WHITE" />
</layout>

So all white text and bigger fonts.  The "THAI_HOT" font is a built in Garmin font.  I also played with the simulator settings to specify that a Fenix 3 simulator be used:

...and now it looks a lot better in the simulator:

...and it looks vaguely OK on my wrist after side-loading the application!

So what have I learnt:
  • Real estate is at a premium on a Garmin watch.  Just 218 x 218 pixels to play with!
  • You need to think very carefully as to what to put on the very small screen.
  • I need to explore sleep mode as the watch counts down the seconds for about 10 seconds then stops and just updates once a minute.  Lots of wrist waggling is required to get it to start counting down again.  I know this is to save battery but the default watch faces both have a seconds component so it must be possible to get a continuous countdown.

Saturday, 29 October 2016

Resting Heart Rate and Fitness

Previously I've done plenty of posts on using Fitbit Heart Rate data, Strava data and suchlike to assess my fitness.

Two things I've spotted recently:

  1. My resting heart rate seems to be decreasing, as shown on my Fitbit Charge HR.
  2. I seem to be running consistently faster at parkrun.

Conventional wisdom is that a lower heart rate represents improved fitness.  So to find out whether the two are linked...

The analysis was pretty simple.  First I just scraped monthly average resting heart rate data from my Fitbit app and noted it in Excel (I didn't think I'd need the power of R for this analysis).  This nicely smooths out day-to-day variations in heart rate and shows some decent trends.  Example:

I also scraped all parkrun results from the parkrun website.  I chose to use parkrun for this analysis as it's the same distance run at the same time every week in (almost) the same place.  There are some variables that could affect my time (e.g. if it's muddy underfoot, the odd bit of tourism) but these things should cancel themselves out if you take enough data points and allow trends to be spotted.  An example of the data:

With some Excel jiggery-pokery I managed to get resting heart rate (blue line) and parkrun times (orange dots) on the same chart.  Here it is:

I like this chart as it tells a real story of the correlation between heart rate and fitness (or low parkrun time).

  • On the left hand side you can see when I first got my Fitbit, my resting heart rate was between 65 and 70 BPM and my parkrun time was ~21 minutes.  
  • I then got injured in summer 2015 and there was a gap when I didn't run at all.  At the end of this my resting heart rate was over 70 BPM so I was relatively unfit.
  • I then made a comeback in Autumn 2015, started running regularly and by Spring 2016 had a sub 60 resting heart rate and was running sub 20 minute parkruns.
  • Then summer 2016 came and, for various reasons (holidays, kids' activities, doing cycling), my heart rate crept up to nearly 60 BPM and my parkrun time went back to the ~21 minute realm.
  • Then most recently I've done a strong block of focused running training, my heart rate is at 55 BPM (lowest ever recorded) and I'm back to 20 minute and sub 20 minute parkruns.

I love graphs like this!  My view is that running is an "honest" sport, the more you put in the more you get out, and this graph underlines this point.

Saturday, 25 June 2016

My Geek Dad "Survival" Tin

It was Father's Day in the UK and I was lucky enough to receive a fine tin from my youngest daughter.  Her cub pack all got one and they all inscribed them with personal messages for their Dads.

Here's a picture of the tin:

So what to do with such an awesome tin?  Some quick Googling showed me there's an online "movement" where people define the optimal set of kit that can be stored in a small tin (usually an Altoids tin) to be used in survival scenarios.

Now living in the leafy English suburbs and rarely venturing into the wilds I thought it unlikely that I'll ever need a full survival kit.  Instead I tried to think about all the things that a Geek who does a bit of sport and likes days out with the family would need.

Here's a picture of the kit I've selected initially.  I'm sure I'll evolve it over time:

From left to right, top to bottom you can see:
  • A shoe lace.  Useful for family walks if a lace breaks.
  • Fishin' line, hooks and a weight.  If we're out and about and want to do a  spot of fishin'
  • Zinc oxide tape.  Useful for strapping up painful places when doing sport.
  • An elastic band.  Useful for pinging at annoying people.
  • Two mini cable ties.  Ever useful for fixing stuff.
  • A pencil.  If I need to jot down a geek idea.
  • A Swiss Army Knife.  Loads of tools for multiple scenarios.
  • Sticking plasters (Band Aids for you in the U S of A).
  • A sewing kit.  If I get the urge to do a spot of embroidery.
  • Safety pins.  For pinning on race day numbers and fixing ripped clothes.
  • Paper clips.  Useful for clipping paper together.  Could be used to get a SIM out of a smartphone.
Here it all is packed tidily in the tin.  There's room for more and I'm sure to evolve it over time!

Sunday, 17 April 2016

Strava and Fitbit API Mash-up Using Raspberry Pi and Python

Previously I blogged on how I used data from the Fitbit API to look at cadence information gathered during runs I'd logged on Strava.

Now that was all very good and informative but:
  • I analysed the Strava results manually.  i.e. Stepped through the website and noted down dates, times and durations of runs.
  • I used the Fitbit OAUTH1.0 URL builder website.  Very manual, and OAUTH1.0 has since been deprecated in favour of OAUTH2.0.

So it was time to automate the process, upgrade to OAUTH2.0, break out the Raspberry Pi and get coding!
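The OAUTH2.0 token refresh in the full script boils down to two pieces: a "Basic" Authorization header built from the client ID and secret, and a form-encoded body carrying the refresh token.  Here is a minimal sketch of just those two pieces in Python 3 (the full script further down is Python 2, where base64 works directly on strings).  The function names and the "id" / "secret" / "abc" values are placeholders of mine, not real credentials, and no request is actually sent.

```python
import base64
from urllib.parse import urlencode


def basic_auth_header(client_id, client_secret):
    """Return 'Basic ' followed by base64 of 'client_id:client_secret'."""
    raw = (client_id + ":" + client_secret).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def refresh_body(refresh_token):
    """Form-encode the body of the token refresh POST."""
    return urlencode({"grant_type": "refresh_token",
                      "refresh_token": refresh_token})


if __name__ == "__main__":
    # Placeholder values only
    print(basic_auth_header("id", "secret"))
    print(refresh_body("abc"))
```

In Python 3 the encode/decode steps are needed because b64encode takes and returns bytes.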

Full code at the bottom of this post (to not interrupt the flow) but the algorithm is as follows:
  • Loop, pulling back activity data from the Strava API (method here).
  • Select each Strava run (i.e. filter out rides and swims) and log key details (start date/time, duration, name, distance)
  • Double check if the date of the run was after I got my Fitbit (the FitbitEpoch constant).  If it is, form the Fitbit API URL using date and time parameters derived from the Strava API output.
  • Call the Fitbit API using the OAUTH2.0 method.
  • Log the results for later processing.
...easy (the trickiest bit was date/time manipulation)!
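Since the date/time manipulation was the fiddly part, here is a minimal sketch of just that step in Python 3: checking a Strava start time against the Fitbit "epoch" and turning a start time plus elapsed seconds into the date, start minute and end minute that go into the Fitbit URL.  The function names are mine and the 1200 second duration in the example is made up; the date format and the 2015-01-26 epoch are the ones used in the full script below.

```python
from datetime import datetime, timedelta

STRAVA_FORMAT = "%Y-%m-%dT%H:%M:%SZ"   # format of Strava's start_date_local
FITBIT_EPOCH = "2015-01-26"            # the day I got my Fitbit


def is_after_fitbit_epoch(start_date_local):
    """True if the activity started on or after the Fitbit epoch."""
    start = datetime.strptime(start_date_local, STRAVA_FORMAT)
    epoch = datetime.strptime(FITBIT_EPOCH, "%Y-%m-%d")
    return start >= epoch


def fitbit_time_window(start_date_local, elapsed_seconds):
    """Return (date, start HH:MM, end HH:MM) covering the activity."""
    start = datetime.strptime(start_date_local, STRAVA_FORMAT)
    end = start + timedelta(seconds=int(elapsed_seconds))
    return (start.strftime("%Y-%m-%d"),
            start.strftime("%H:%M"),
            end.strftime("%H:%M"))


if __name__ == "__main__":
    # Start time taken from the log output below; the duration is hypothetical
    print(is_after_fitbit_epoch("2016-04-16T09:02:03Z"))
    print(fitbit_time_window("2016-04-16T09:02:03Z", 1200))
```

Like the full script, this assumes a run starts and finishes on the same day; a run that crosses midnight would give an end time earlier than the start time.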

This provides output like this:
pi@raspberrypi:~/Exercise/logs $ head steps_log_1.txt
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,1,09:02:00,100
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,2,09:03:00,169
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,3,09:04:00,170
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,4,09:05:00,171
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,5,09:06:00,172
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,6,09:07:00,170
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,7,09:08:00,170
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,8,09:09:00,170
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,9,09:10:00,168
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,10,09:11:00,170

So each line holds the run's start date/time and distance, the name of the run, the minute of the run, the time of day and the step count for that minute.

So easy to filter out interesting runs to compare:
pi@raspberrypi:~/Exercise/logs $ grep 20150516 steps_log_1.txt > parkrun_steps_1.txt
pi@raspberrypi:~/Exercise/logs $ grep 20160416 steps_log_1.txt >> parkrun_steps_1.txt
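If you'd rather stay in Python than use grep, the log is simple enough to read with the csv module.  The sketch below is an assumption-laden alternative rather than what I actually ran: the field names and function names are mine, and it assumes run names never contain a comma (the log isn't quoted).  It filters by run name and averages the steps per minute, skipping the first minute where I'm still getting going.

```python
import csv
import io

FIELDS = ["date_time_dist", "name", "minute", "time_of_day", "steps"]


def read_steps_log(text):
    """Parse steps log text into a list of dicts with numeric minute and steps."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue   # skip blank lines
        row = dict(zip(FIELDS, record))
        row["minute"] = int(row["minute"])
        row["steps"] = int(row["steps"])
        rows.append(row)
    return rows


def mean_cadence(rows, name_contains, skip_minutes=1):
    """Mean steps per minute for matching runs, ignoring the first minute(s)."""
    steps = [r["steps"] for r in rows
             if name_contains in r["name"] and r["minute"] > skip_minutes]
    return sum(steps) / len(steps) if steps else None


# First five lines of the log output shown above
SAMPLE = """2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,1,09:02:00,100
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,2,09:03:00,169
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,3,09:04:00,170
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,4,09:05:00,171
2016-04-16T09:02:03Z_4899.7,Parkrun 20160416,5,09:06:00,172
"""

if __name__ == "__main__":
    print(mean_cadence(read_steps_log(SAMPLE), "20160416"))
```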

Then import to R for post analysis and plotting:
> parkrun1 <- read.csv(file=file.choose(),head=FALSE,sep=",")

> colnames(parkrun1) <- c("DateTimeDist","Name","Minute","TimeOfDay","Steps") 
> head(parkrun1)
                 DateTimeDist                              Name Minute TimeOfDay Steps
1 2015-05-16T09:00:00Z_5000.0 Naked Winchester Parkrun 20150516      1  09:00:00    85
2 2015-05-16T09:00:00Z_5000.0 Naked Winchester Parkrun 20150516      2  09:01:00   105
3 2015-05-16T09:00:00Z_5000.0 Naked Winchester Parkrun 20150516      3  09:02:00   107
4 2015-05-16T09:00:00Z_5000.0 Naked Winchester Parkrun 20150516      4  09:03:00   136
5 2015-05-16T09:00:00Z_5000.0 Naked Winchester Parkrun 20150516      5  09:04:00   162
6 2015-05-16T09:00:00Z_5000.0 Naked Winchester Parkrun 20150516      6  09:05:00   168

> library(ggplot2)
> ggplot(data = parkrun1, aes(x = Minute, y = Steps, color = Name)) +
+   geom_point() + geom_line() +
+   labs(x="Minute", y="Steps") + ggtitle("Running Cadence - Parkrun")

Yielding this graph:

Interesting that I took longer to get going on the 2015 run, maybe there was congestion at the start.  The key thing I was looking for was the "steady state" cadence comparison between 2015 and 2016.  It's higher in 2016 which is exactly what I wanted to see as it's something I've worked on improving.

Using the same method I plotted the chart below which shows a long run prior to a half-marathon then the half-marathon itself:

Now this is interesting. The cadence was slightly higher for the whole of the training run (blue line) and much more consistent.  For the half-marathon itself (red line) my cadence really tailed off which is in tune with my last post where I analysed my drop off in speed over the final quarter of the run.

Here's all the code.  Modify for your API credentials, file system and Fitbit "Epoch" accordingly:

pi@raspberrypi:~/Exercise $ more
#here's a typical Fitbit API URL
#FitbitURL = ""

import urllib2
import base64
import json
from datetime import datetime, timedelta
import time
import urllib
import sys
import os

#The base URL we use for activities
BaseURLActivities = "<Strava_Token_Here>per_page=200&page="
StepsLogFile = "/home/pi/Exercise/logs/steps_log_1.txt"

#Start element of Fitbit URL
FitbitURLStart = ""

#Other constants
MyFitbitEpoch = "2015-01-26"

#Use this URL to refresh the access token
TokenURL = ""

#Get and write the tokens from here
IniFile = "/home/pi/Exercise/tokens.txt"

#From the developer site
OAuthTwoClientID = "FitBitClientIDHere"
ClientOrConsumerSecret = "FitbitSecretHere"

#Some constants defining API error handling responses
TokenRefreshedOK = "Token refreshed OK"
ErrorInAPI = "Error when making API call that I couldn't handle"

#Get the config from the config file.  This is the access and refresh tokens
def GetConfig():
  print "Reading from the config file"

  #Open the file
  FileObj = open(IniFile,'r')

  #Read first two lines - first is the access token, second is the refresh token
  AccToken = FileObj.readline()
  RefToken = FileObj.readline()

  #Close the file
  FileObj.close()

  #See if the strings have newline characters on the end.  If so, strip them
  if (AccToken.find("\n") > 0):
    AccToken = AccToken[:-1]
  if (RefToken.find("\n") > 0):
    RefToken = RefToken[:-1]

  #Return values
  return AccToken, RefToken

def WriteConfig(AccToken,RefToken):
  print "Writing new token to the config file"
  print "Writing this: " + AccToken + " and " + RefToken

  #Delete the old config file
  os.remove(IniFile)

  #Open and write to the file
  FileObj = open(IniFile,'w')
  FileObj.write(AccToken + "\n")
  FileObj.write(RefToken + "\n")
  FileObj.close()

#Make a HTTP POST to get a new access token
def GetNewAccessToken(RefToken):
  print "Getting a new access token"

  #RefToken = "e849e1545d8331308eb344ce27bc6b4fe1929d8f1f9f3a056c5636311ba49014"

  #Form the data payload
  BodyText = {'grant_type' : 'refresh_token',
              'refresh_token' : RefToken}
  #URL Encode it
  BodyURLEncoded = urllib.urlencode(BodyText)
  print "Using this as the body when getting access token >>" + BodyURLEncoded

  #Start the request
  tokenreq = urllib2.Request(TokenURL,BodyURLEncoded)
  #Add the headers, first we base64 encode the client id and client secret with a : inbetween and create the authorisation header
  tokenreq.add_header('Authorization', 'Basic ' + base64.b64encode(OAuthTwoClientID + ":" + ClientOrConsumerSecret))
  tokenreq.add_header('Content-Type', 'application/x-www-form-urlencoded')

  #Fire off the request
  try:
    tokenresponse = urllib2.urlopen(tokenreq)

    #See what we got back.  If it's this part of the code it was OK
    FullResponse = tokenresponse.read()

    #Need to pick out the access token and write it to the config file.  Use a JSON manipulation module
    ResponseJSON = json.loads(FullResponse)

    #Read the access token as a string
    NewAccessToken = str(ResponseJSON['access_token'])
    NewRefreshToken = str(ResponseJSON['refresh_token'])
    #Write the access token to the ini file
    WriteConfig(NewAccessToken,NewRefreshToken)

    print "New access token output >>> " + FullResponse
  except urllib2.URLError as e:
    #Getting to this part of the code means we got an error
    print "An error was raised when getting the access token.  Need to stop here"
    print e.code
    sys.exit()

#This makes an API call.  It also catches errors and tries to deal with them
def MakeAPICall(InURL,AccToken,RefToken):
  #Start the request
  req = urllib2.Request(InURL)

  #Add the access token in the header
  req.add_header('Authorization', 'Bearer ' + AccToken)

  print "I used this access token " + AccToken
  #Fire off the request
  try:
    #Do the request
    response = urllib2.urlopen(req)
    #Read the response
    FullResponse = response.read()

    #Return values
    return True, FullResponse
  #Catch errors, e.g. A 401 error that signifies the need for a new access token
  except urllib2.URLError as e:
    print "Got this HTTP error: " + str(e.code)
    HTTPErrorMessage = e.read()
    print "This was in the HTTP error message: " + HTTPErrorMessage
    #See what the error was.  A 401 with either message means the token needs refreshing
    if (e.code == 401) and (HTTPErrorMessage.find("Access token invalid or expired") > 0):
      GetNewAccessToken(RefToken)
      return False, TokenRefreshedOK
    elif (e.code == 401) and (HTTPErrorMessage.find("Access token expired") > 0):
      GetNewAccessToken(RefToken)
      return False, TokenRefreshedOK
    #Return that this didn't work, allowing the calling function to handle it
    return False, ErrorInAPI

#This function takes a date and time and checks whether it's after a given date
def CheckAfterFitbit(InDateTime):
  #See whether the Strava activity date falls on or after my first Fitbit date.
  StravaDate = datetime.strptime(InDateTime,"%Y-%m-%dT%H:%M:%SZ")    #Strava activity date as a Python date object
  FitbitDate = datetime.strptime(MyFitbitEpoch,"%Y-%m-%d")                   #First Fitbit date as a Python date object

  #See if the provided date is on or after the Fitbit date.  If so, return True, else return False
  if ((StravaDate - FitbitDate).days > -1):
    return True
  else:
    return False

#Forms the full URL to use for Fitbit.  Example:
def FormFitbitURL(URLSt,DtTmSt,Dur):
  #First we need to add the date component which should be the first part of the date and time string we got from Strava.  Add the next few static bits as well
  FinalURL = URLSt + DtTmSt[0:10] + "/1d/1min/time/"

  #Now add the first time part which is also provided as a parameter. This will take us back to the start of the minute Strava started which is what we want
  FinalURL = FinalURL + DtTmSt[11:16] + "/"

  #Now we need to compute the end time which needs a bit of maths as we need to turn the start date into a Python date object and then add on elapsed seconds,
  #turn back to a string and take the time part
  StravaStartDateTime = datetime.strptime(DtTmSt,"%Y-%m-%dT%H:%M:%SZ")

  #Now add elapsed time using time delta function
  StravaEndDateTime = StravaStartDateTime + timedelta(seconds=int(Dur))
  EndTimeStr = str(StravaEndDateTime.time())

  #Form the final URL
  FinalURL = FinalURL + EndTimeStr[0:5] + ".json"
  return FinalURL

#@@@@@@@@@@@@@@@@@@@@@@@@@@@This is the main part of the code
#Open the file to use
MyFile = open(StepsLogFile,'w')

#Loop extracting data.  Remember it comes in pages.  Initialise variables first, including the tokens to use
EndFound = False
LoopVar = 1
AccessToken = ""
RefreshToken = ""

#Get the tokens from the config file
AccessToken, RefreshToken = GetConfig()

#Main loop - Getting all activities
while (EndFound == False):
  #Do a HTTP Get - First form the full URL
  ActivityURL = BaseURLActivities + str(LoopVar)
  StravaJSONData = urllib2.urlopen(ActivityURL).read()

  if StravaJSONData != "[]":   #This checks whether we got an empty JSON response and so should end
    #Now we process the JSON
    ActivityJSON = json.loads(StravaJSONData)

    #Loop through the JSON structure
    for JSONActivityDoc in ActivityJSON:
      #See if it was a run.  If so we're interested!!
      if (str(JSONActivityDoc["type"]) == "Run"):
        #We want to grab a date, a start time and a duration for the Fitbit API.  We also want to grab a distance which we'll use as a graph legend
        print "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"
        StartDateTime = str(JSONActivityDoc["start_date_local"])
        StravaDuration = str(JSONActivityDoc["elapsed_time"])
        StravaDistance = str(JSONActivityDoc["distance"])

        StravaName = str(JSONActivityDoc["name"])

        #See if it's after 2015-01-26 which is when I got my Fitbit
        if CheckAfterFitbit(StartDateTime):
          #Tell the user what we're doing
          print "Strava Date and Time: " +  StartDateTime
          print "Strava Duration: " + StravaDuration
          print "Strava Distance: " + StravaDistance

          #Form the URL to use for Fitbit
          FitbitURL = FormFitbitURL(FitbitURLStart,StartDateTime,StravaDuration)
          print "Am going to call FitbitAPI with: " + FitbitURL

          #Make the API call
          APICallOK, APIResponse = MakeAPICall(FitbitURL, AccessToken, RefreshToken)
          #See how this came back.
          if not APICallOK:    #An error in the response.  If we refreshed tokens we go again.  Else we exit baby!
            if (APIResponse == TokenRefreshedOK):
              #Re-read the refreshed tokens from the config file, then just make the call again
              AccessToken, RefreshToken = GetConfig()
              APICallOK, APIResponse = MakeAPICall(FitbitURL, AccessToken, RefreshToken)
            else:
              print "An error occurred when I made the Fitbit API call.  Going to have to exit"
              sys.exit(0)

          #If we got to this point then we must have got an OK response.  We need to process this into the text file.  Format is:
          #print APIResponse
          ResponseAsJSON = json.loads(APIResponse)
          MinNum = 1    #Use this to keep track of the minute within the run, incrementing each time
          for StepsJSON in ResponseAsJSON["activities-steps-intraday"]["dataset"]:
            OutString = StartDateTime + "_" + StravaDistance + "," + StravaName + "," + str(MinNum) + "," + str(StepsJSON["time"]) + "," + str(StepsJSON["value"]) + "\r\n"
            #Write to file
            MyFile.write(OutString)
            #Increment the loop var
            MinNum += 1

    #Set up for next loop
    LoopVar += 1
  else:
    EndFound = True

#Close the log file
MyFile.close()

Friday, 8 April 2016

Half Marathon Analysis Using Python and R

I recently did a half-marathon run and, whilst I did a personal best, I paced it really badly meaning the last 5km were hard and slow.  This, coupled with some interesting results data that was published gave me a Geek idea!

Here's a snippet from the results PDF:

So for all 10,000 runners a whole plethora of data to compare and contrast.  In particular I was interested in how my 5k splits compared with everyone else's.  Mine were:
  • 0-5km = 22m0s
  • 5-10km = 21m56s
  • 10-15km = 22m03s
  • 15-20km = 23m37s
So really consistent across the first 15km then a bad fade over the last 5.  In fact a quick calculation shows I was 7.4% slower for the last 5km than the average for the first 15km.  However before I beat myself up over this I needed to know whether this was typical, better than average or worse than average.
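That "quick calculation" is easy to reproduce.  Here is a minimal Python 3 sketch using my four splits from the list above; the function names are mine.  It turns each split into seconds, averages the first three and expresses the last one as a fractional change from that average.

```python
def mmss_to_seconds(text):
    """Convert a 'mm:ss' split such as '21:56' into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def last_split_delta(splits):
    """Fractional change of the last split relative to the mean of the others."""
    *early, last = splits
    mean_early = sum(early) / len(early)
    return (last - mean_early) / mean_early


# My 0-5km, 5-10km, 10-15km and 15-20km splits
splits = [mmss_to_seconds(s) for s in ["22:00", "21:56", "22:03", "23:37"]]
delta = last_split_delta(splits)

if __name__ == "__main__":
    print(splits)
    print("{:.1%}".format(delta))
```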

All the results were in a PDF and what a pain that turned out to be to turn into something I could process.  I tried various online services, saving as text in Adobe Acrobat, avoided the paid for Adobe service and tried a Python module called pyPdf but none would allow me to turn the PDF file into a well formed text file for processing. 

In the end I opened the PDF in Adobe Acrobat, copied all the data then pasted it into Windows Notepad.  The data looked like this (abridged):

GunPos RaceNo Name Gender Cat Club GunTime ChipPos ChipTime Chip 5Km Split Chip 10Km Split Chip 15Km Split Chip 20Km Split 

5 Robert Mbithi M 01:03:57 1 01:03:57 00:15:08 00:29:39 00:44:44 01:00:24 
72 Scott Overall M 01:05:13 2 01:05:13 00:15:33 00:30:46 00:46:13 01:01:59 
81 Chris Thompson M ALDERSHOT FARNHAM & DISTRICT AC 01:05:14 3 01:05:14 00:15:33 00:30:46 00:46:13 01:01:59 
6 Paul Martelletti M RUN FAST 01:05:15 4 01:05:15 00:15:33 00:30:46 00:46:13 01:01:59 
82 Gary Murray M 01:06:12 5 01:06:12 00:15:33 00:30:46 00:46:18 01:02:37 

I then had to work pretty hard to turn this into a file that could be read into my favourite analysis package, R.  Looking at the data above you can see:
  • There's spaces between fields.
  • There's spaces within fields.
  • There's missing fields (e.g. age and club).
  • The PDF format means some long fields overlap with each other.
I actually had to write quite a lot of Python script to turn this into a CSV.  I've put this at the bottom of this post so as not to interrupt the flow (baby). In the script I also added some fields where the hh:mm:ss format times were turned into simple seconds to help with the maths.
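To give a flavour of the problem, here is a minimal sketch of one way to pick a pasted row apart with a regular expression.  To be clear, this is not the script I used; it's a simplified Python 3 illustration that only handles rows shaped like the abridged sample above (one leading number, a name, M or F, an optional category/club blob, then the times).  All the field and function names are my own, and it would trip over things like a middle initial of M or F.

```python
import re

TIME = r"\d{2}:\d{2}:\d{2}"
ROW = re.compile(
    r"^(?P<number>\d+) (?P<name>.+?) (?P<gender>[MF]) (?P<cat_and_club>.*?) ?"
    r"(?P<gun_time>" + TIME + r") (?P<chip_pos>\d+) (?P<chip_time>" + TIME + r")"
    r"(?P<splits>(?: " + TIME + r"){4})\s*$"
)


def to_seconds(hms):
    """Convert 'hh:mm:ss' into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_row(line):
    """Return a dict of fields for a results row, or None if it doesn't match."""
    match = ROW.match(line)
    if match is None:
        return None   # e.g. the header line
    row = match.groupdict()
    row["splits"] = row["splits"].split()
    row["split_secs"] = [to_seconds(t) for t in row["splits"]]
    row["chip_time_secs"] = to_seconds(row["chip_time"])
    return row


if __name__ == "__main__":
    sample = ("81 Chris Thompson M ALDERSHOT FARNHAM & DISTRICT AC 01:05:14 3 "
              "01:05:14 00:15:33 00:30:46 00:46:13 01:01:59 ")
    print(parse_row(sample))
```

Anchoring on the hh:mm:ss times at the end of the line is what makes this workable, as they're the one part of each row with a predictable shape.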

Eventually I had a CSV file to play with and I read it into R using this command:

> rhm1 <- read.csv(file=file.choose(),head=FALSE,sep=",")

I added some column names with:

> colnames(rhm1) <- c("GunPos","RaceNo","Gender","Name","AgeCat","Club","GunTime","GunTimeSecs","ChipPos","ChipTime","ChipTimeSecs","FiveKSplit","FiveKSplitSecs","TenKSplit","TenKSplitSecs","FifteenKSplit","FifteenKSplitSecs","TwentyKSplit","TwentyKSplitSecs")

I then computed the net 5k splits (from the elapsed time) with:
> rhm1$TenKSplitSecsNet <- rhm1$TenKSplitSecs - rhm1$FiveKSplitSecs
> rhm1$FifteenKSplitSecsNet <- rhm1$FifteenKSplitSecs - rhm1$TenKSplitSecs
> rhm1$TwentyKSplitSecsNet <- rhm1$TwentyKSplitSecs - rhm1$FifteenKSplitSecs

I then computed the mean time for the first 3 splits:
> rhm1$FirstFifteenKMean <- (rhm1$FiveKSplitSecs + rhm1$TenKSplitSecsNet + rhm1$FifteenKSplitSecsNet) / 3

...and used this to compute the difference in the last 5k split from the average of the first 3, expressed as a fraction of that average (so 0.07 means 7% slower):
> rhm1$Last5KDelta <- (rhm1$TwentyKSplitSecsNet - rhm1$FirstFifteenKMean) / rhm1$FirstFifteenKMean

Finding me in the data:
> rhm1[grep("Geek Dad",rhm1$Name),]
    GunPos RaceNo Gender       Name AgeCat Club  GunTime GunTimeSecs ChipPos 
869    869  13759      M   Geek Dad     40      01:39:12        5952     918 
ChipTime ChipTimeSecs FiveKSplit FiveKSplitSecs TenKSplit TenKSplitSecs 
01:34:54         5694   00:22:00           1320  00:43:56          2636      
    FifteenKSplitSecs TwentyKSplit TwentyKSplitSecs TenKSplitSecsNet 
                 3959     01:29:36             5376             1316             

FifteenKSplitSecsNet TwentyKSplitSecsNet FirstFifteenKMean Last5KDelta
                1323                1417          1319.667    0.073756
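
The numbers in that row can be checked by hand.  The sketch below (Python 3, with variable names invented for the example) repeats the same arithmetic as the R commands using the four cumulative split times shown above.

```python
# Cumulative split times in seconds, copied from the row above
five_k, ten_k, fifteen_k, twenty_k = 1320, 2636, 3959, 5376

# Net time for each 5k section
ten_k_net = ten_k - five_k            # 1316
fifteen_k_net = fifteen_k - ten_k     # 1323
twenty_k_net = twenty_k - fifteen_k   # 1417

# Mean of the first three 5k sections, then the fractional change in the last one
first_fifteen_k_mean = (five_k + ten_k_net + fifteen_k_net) / 3
last_5k_delta = (twenty_k_net - first_fifteen_k_mean) / first_fifteen_k_mean

print(round(first_fifteen_k_mean, 3))  # 1319.667
print(round(last_5k_delta, 6))         # 0.073756
```

So the last 5k was roughly 7.4% slower than the average of the first three.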

I could then plot all the data using ggplot.  I chose a density plot to look at the proportions of each "Last5KDelta" value.  Here's the command to create the plot (and add some formatting and labels).

> library(ggplot2)
qplot(rhm1$Last5KDelta, geom="density", main="Density Plot of Last 5K Delta",xlab="% Delta", ylab="Density", fill=I("blue"),col=I("red"), alpha=I(.2),xlim=c(-1,1))

So a nice looking chart, and I can see that there are more people who were slower in the final 5k (positive value) than faster.  Looking good!

However this didn't tell me whether I was better or worse than average.  For this I need a cumulative frequency plot.  This uses stat_ecdf (empirical cumulative distribution function) to create the plot.  The command below does this, tweaks the x axis to make it tighter and puts in an extra x axis tick at 7% so I can see where "I" sit on the graph.

> chart1 <- ggplot(rhm1,aes(Last5KDelta)) + stat_ecdf(geom = "step",colour="red")  + scale_x_continuous(limits=c(-0.3,0.6),breaks=c(-0.3,-0.2,-0.1,0,0.07,0.1,0.2,0.3,0.4,0.5,0.6))
chart1 + ggtitle("Cumulative Frequency of Last 5K Delta") + labs(y="Cumulative Frequency")

So get in!  7% sits at less than 50% cumulative frequency!  More people faded more than me over the last 5k.
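
Reading a value off a cumulative frequency curve just means asking "what fraction of runners had a delta at or below this one?".  Here is a minimal Python 3 sketch of that idea.  The function name ecdf_at and the ten sample deltas are invented purely for illustration; they are not taken from the race data.

```python
def ecdf_at(values, x):
    """Fraction of values that are less than or equal to x."""
    return sum(1 for v in values if v <= x) / len(values)


# Hypothetical deltas for ten runners (made up for this example)
sample_deltas = [-0.05, -0.01, 0.02, 0.04, 0.06, 0.09, 0.12, 0.15, 0.20, 0.35]

print(ecdf_at(sample_deltas, 0.07))  # 0.5
```

In this made-up sample half of the runners faded by 7% or less, so a runner at 0.07 would sit at 50% on the curve.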

However somewhere behind me was a man carrying a fridge!  I decided to look at just those who completed the run in under 2 hours by doing:

> rhmSubTwo <- rhm1[rhm1$ChipTimeSecs<7200,]

Which gives this chart:

Darn it.  About 68% of this cohort faded less than me.  Not looking so good now...

What about those who beat my chip time?
> rhmSubMe <- rhm1[rhm1$ChipTimeSecs<5694,]

Looking pretty poor now :-(.  I must pace myself better next time.

Here's all the Python code to create the .csv file:

InputFile = "/home/pi/Documents/RHM/VitalityReadingHalfMarathon_v2.txt"
OutputFile = "/home/pi/Documents/RHM/RHM.csv"

#Open the file
InFile = open(InputFile,'r')
OutFile = open(OutputFile,'w')

#This takes a time in h:m:s or similar and turns to seconds.
def TimeStringToSecs(InputString):
  #There are 2 cases
  #1)A proper time string hh:mm:ss
  #2)Something else with letters and numbers munged together
  if len(InputString) == 8:
    #Compute the time in seconds
    SecondsCount = (float(InputString[0:2]) * 3600) + (float(InputString[3:5]) * 60) + float(InputString[6:8])
    return str(SecondsCount)
  else:
    print("Got this weird string: " + InputString)
    return "-1"

for i in range(1,10981):
  #Initialise variables
  OutString = ""
  EndString = ""
  MidString = ""
  GenderFound = False

  #Read a line
  InString = InFile.readline().rstrip()

  print(InString)

  #Split the line based upon a space
  SplitStr1 = InString.split(" ")

  #We can rely on the first field which is gun position and second field which is race number.  But don't put Gun position as R ignores it!
  OutString = SplitStr1[1] + ","

  #We can also rely on the last 7 fields of the line which respectively are GunTime,ChipPos,ChipTime,5KSplit,10KSplit,15KSplit,20KSplit
  NumFields = len(SplitStr1)
  #Compute the end of the output string
  for z in range(7,0,-1):
    #print "z=" + str(z) + ".  Equates to" + SplitStr1[NumFields - z]
    EndString = EndString + SplitStr1[NumFields - z] + ","
    #Look up the time in seconds.  Not for case 6 which is the chip position (not a time)
    if (z != 6):
      EndString = EndString + TimeStringToSecs(SplitStr1[NumFields - z]) + ","
  #Hardest bit last.  Name, Gender, Age and Club.  Gender is reliably there, except for long names where it gets mangled.
  #Hence find it and you know everything before is the name
  for a in range(0,len(SplitStr1)):
    if (SplitStr1[a] == "M") or (SplitStr1[a] == "F"):
      #This is the position of the gender which is the "anchor" for everything else
      GenderPos = a
      #Add it to the middle part of the string.  No worries it's in different order to file.
      MidString = SplitStr1[a] + ","
      #Say we found the gender.
      GenderFound = True

  #Process for the case where gender was found
  if GenderFound:
    #Now we know everything before (excluding the first two numbers) was the name.  Add the parts of the name together.  The below code should handle
    #complex names
    for b in range(2,GenderPos):
      MidString = MidString + SplitStr1[b]
      #See if it's not the last part of the name.  If not add a space
      if (b < (GenderPos - 1)):
        MidString = MidString + " "
      else:
        MidString = MidString + ","

    #Now test the part after the gender position.  If it's "U23" or a number (but not 100 as clubs start with this!) then this is the age category
    if (SplitStr1[GenderPos + 1] == "U23"):
      MidString = MidString + SplitStr1[GenderPos + 1] + ","
      #Log where the club might start
      ClubStartPos = GenderPos + 2
    elif SplitStr1[GenderPos + 1].isdigit():
      if SplitStr1[GenderPos + 1] != "100":
        MidString = MidString + SplitStr1[GenderPos + 1] + ","
        #Log where the club might start
        ClubStartPos = GenderPos + 2
      else:
        #"100" is the start of a club name, so the age is missing
        MidString = MidString + ","
        #Log where the club might start
        ClubStartPos = GenderPos + 1
    else:
      #No age category, so leave the field empty
      MidString = MidString + ","
      #Log where the club might start
      ClubStartPos = GenderPos + 1

    #So now everything from ClubStartPos "might" be a club.  We can test this by seeing if what might be the club is actually gun
    #time which is 7th  from the end
    if (SplitStr1[ClubStartPos] == SplitStr1[NumFields - 7]):
      #No club, so leave the field empty
      MidString = MidString + ","
    else:
      #Loop adding elements of the club
      for c in range(ClubStartPos,NumFields - 7):
        MidString = MidString + SplitStr1[c]
        #See whether to add a space
        if (c < (NumFields - 8)):
          MidString = MidString + " "
        else:
          MidString = MidString + ","
  else: #Where there is no gender.  Add commas to represent Name, Age and Club and something to say it was a long name!!!
    MidString = ",Long Name,,,"

  #print OutString
  #print MidString
  #print EndString

  print(OutString + MidString + EndString)
  OutFile.write(OutString + MidString + EndString + '\r\n')

#Close the files so the output is flushed to disk
InFile.close()
OutFile.close()
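
For anyone who wants the "anchor on the gender" idea without the string building, here is a more compact sketch in Python 3.  It is not the script I ran: the function name parse_result_line, the leading_fields parameter and the dictionary keys are all invented for this example, and it takes the first M or F token it finds whereas the script above keeps the last one.  It assumes the last 7 fields are always present, as the script does.

```python
def parse_result_line(line, leading_fields=2):
    """Split one space-separated result line into named fields.

    leading_fields is the number of numeric columns before the name
    (2 for gun position + race number, which is what the script above assumes).
    Returns None when no gender token can be found.
    """
    tokens = line.split()
    head = tokens[:leading_fields]
    tail = tokens[-7:]                   # gun time, chip position, chip time, 4 splits
    middle = tokens[leading_fields:-7]   # name, gender, optional age, optional club

    # The gender token anchors the middle section: name before it, age/club after it
    gender_pos = next((i for i, t in enumerate(middle) if t in ("M", "F")), None)
    if gender_pos is None:
        return None  # gender was mangled, e.g. by a long name

    after = middle[gender_pos + 1:]
    age = ""
    # Age category is "U23" or a number; "100" is treated as the start of a club name
    if after and (after[0] == "U23" or (after[0].isdigit() and after[0] != "100")):
        age, after = after[0], after[1:]

    return {
        "head": head,
        "name": " ".join(middle[:gender_pos]),
        "gender": middle[gender_pos],
        "age": age,
        "club": " ".join(after),
        "tail": tail,
    }


# One of the raw lines shown earlier, which has a single number before the name
sample = ("81 Chris Thompson M ALDERSHOT FARNHAM & DISTRICT AC "
          "01:05:14 3 01:05:14 00:15:33 00:30:46 00:46:13 01:01:59")
row = parse_result_line(sample, leading_fields=1)
print(row["name"])  # Chris Thompson
print(row["club"])  # ALDERSHOT FARNHAM & DISTRICT AC
```

Returning a dictionary rather than a comma-joined string means the fields could be handed to Python's csv module, which would also cope with any commas inside club names.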