Sunday, 20 March 2016

What to do with 3.5 Million Heart Rate Monitor Readings?

Previously on Paul's Geek Dad blog I've written about how the heart rate monitor readings from my Fitbit Charge HR seem to be showing I'm getting fitter.  Here's the latest chart:

A nice trend but massively aggregated and smoothed.  I decided to play with the data in the rawest format possible to see what I could see.  Fitbit allow you access to their "intraday" data for personal projects if you ask them nicely.  The webpage explaining this is here and what they say is:

Access to the Intraday Time Series for all other uses is currently granted on a case-by-case basis. Applications must demonstrate necessity to create a great user experience. Fitbit is very supportive of non-profit research and personal projects. Commercial applications require thorough review and are subject to additional requirements. Only select applications are granted access and Fitbit reserves the right to limit this access. To request access, email

I've previously used this intraday data to look at my running cadence, using Fitbit API derived data to see whether attempts to change my running style were actually working.  Looking at the Fitbit API documentation I saw that heart rate data could be obtained at sub-minute granularity.  Whoopee!

An example URL to get 1 minute data from the Fitbit API using the OAUTH2.0 method I previously blogged about is:

...which yields at the start (abridged):

{"activities-heart":[{"dateTime":"2015-03-01","value":{"customHeartRateZones":[],"heartRateZones":[{"caloriesOut":2184.1542,"max":90,"min":30,"minutes":1169,"name":"Out of Range"},{"caloriesOut":891.10584,"max":126,"min":90,"minutes":167,"name":"Fat Burn"},{"caloriesOut":230.65056,"max":153,"min":126,"minutes":23,"name":"Cardio"},{"caloriesOut":133.98084,"max":220,"min":153,"minutes":11,"name":"Peak"}],"restingHeartRate":66}}],"activities-heart-intraday":{"dataset":[{"time":"00:00:00","value":75},{"time":"00:00:15","value":75},{"time":"00:00:30","value":75},{"time":"00:00:45","value":75},{"time":"00:01:00","value":75},{"time":"00:01:15","value":75},{"time":"00:01:30","value":75},{"time":"00:01:45","value":75},{"time":"00:02:00","value":75},{"time":"00:02:15","value":75},{"time":"00:02:30","value":75},{"time":"00:02:45","value":75},{"time":"00:03:00","value":75},{"time":"00:03:15","value":75},{"time":"00:03:30","value":74},{"time":"00:03:40","value":72}

...and then ends...


So it seems it's not actually per second data (one measurement per second), but rather a measurement every 10 to 15 seconds.  Which is enough I think!
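
To check the spacing rather than eyeball it, the gaps between consecutive readings can be computed from the "time" fields.  A minimal sketch, using only the last few readings from the abridged response above; SampleTimes and GapsInSeconds are names I've made up for illustration, not part of the Fitbit API:

```python
from datetime import datetime

#The last few "time" values from the abridged response above
SampleTimes = ["00:03:00", "00:03:15", "00:03:30", "00:03:40"]

#Hypothetical helper: seconds between each pair of consecutive readings
def GapsInSeconds(Times):
  Parsed = [datetime.strptime(t, "%H:%M:%S") for t in Times]
  return [int((b - a).total_seconds()) for a, b in zip(Parsed, Parsed[1:])]

print(GapsInSeconds(SampleTimes))   #prints [15, 15, 10]
```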

What I wanted was every single sub-one minute record for the whole time I've had my Fitbit Charge HR (since Jan-15, so ~14 months at the time of writing).  I found that stretching out the time period for which "1sec" data is requested results in it being summarised to daily data.  Hence I needed to write a script to call the API multiple times and log the results.  Bring on the Python (my favourite programming language) on my Raspberry Pi 2.

The full script is pasted in below (you'll need to work out your own secret keys etc using my OAUTH2.0 method).  The core of it is my Fitbit OAUTH2.0 API script from before but I've added elements that take a date range and make one API call per day.  Key elements:

  • Constants "StartDate" and "EndDate" that specify the range of dates to make API calls for.
  • Function "CountTheDays" that computes the number of days between the StartDate and EndDate constants.
  • A for loop that counts down in increments of 1 from the value returned by CountTheDays to 0.  This creates an index that is used for....
  • Function "ComputeADate" that takes the index and turns it back into a date string representing the number of days before EndDate.  This means we step from StartDate to EndDate, making....
  • Call to function "MakeAPICall" to actually make the call.
  • Code to take the API response JSON, strip out the key elements and write to a simple comma separated variable text file.
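
The date-stepping part of that list can be tried on its own before any API calls are made.  A minimal sketch, assuming the same yyyy-mm-dd strings as the script; DatesInRange is a hypothetical helper of mine that walks forwards from the start date rather than counting back from the end date, which should give the same dates whenever the start date is on or before the end date:

```python
from datetime import datetime, timedelta

#Hypothetical helper: every date from FirstDate to LastDate inclusive, as yyyy-mm-dd strings
def DatesInRange(FirstDate, LastDate):
  FirstDt = datetime.strptime(FirstDate, "%Y-%m-%d")
  LastDt = datetime.strptime(LastDate, "%Y-%m-%d")
  DayCount = (LastDt - FirstDt).days
  return [(FirstDt + timedelta(days=i)).strftime("%Y-%m-%d") for i in range(DayCount + 1)]

#Same range as the StartDate and EndDate constants in the script
for DateForAPI in DatesInRange("2016-03-10", "2016-03-16"):
  print(DateForAPI)
```

With the range above this makes seven passes through the loop, one per day, which is also how many API calls the script would make.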

#Gets the heart rate in per second format, parses it and writes it to file.

import base64
import urllib2
import urllib
import sys
import json
import os
from datetime import datetime, timedelta
import time

#Typical URL for heart rate data.  Date goes in the middle
FitbitURLStart = ""
FitbitURLEnd = "/1d/1sec.json"

#The date fields.  Start date and end date can be altered to deal with the period you want to deal with
StartDate = "2016-03-10"
EndDate = "2016-03-16"

#Use this URL to refresh the access token
TokenURL = ""

#Get and write the tokens from here
IniFile = "/home/pi/fitbit/tokens.txt"

#Here's where we log to
LogFile = "/home/pi/fitbit/persecheartlog.txt"

#From the developer site
OAuthTwoClientID = "<ClientIDHere>"
ClientOrConsumerSecret = "<SecretHere>"

#Some constants defining API error handling responses
TokenRefreshedOK = "Token refreshed OK"
ErrorInAPI = "Error when making API call that I couldn't handle"

#Determine how many days to process for.  First day I ever logged was 2015-01-27
def CountTheDays(FirstDate,LastDate):
  #See how many days there's been between today and my first Fitbit date.
  FirstDt = datetime.strptime(FirstDate,"%Y-%m-%d")    #First Fitbit date as a Python date object
  LastDt = datetime.strptime(LastDate,"%Y-%m-%d")      #Last Fitbit date as a Python date object

  #Calculate difference between the two and return it
  return abs((LastDt - FirstDt).days)

#Produce a date in yyyy-mm-dd format that is n days before the end date to be processed
def ComputeADate(DaysDiff, LastDate):
  #Parse the end date
  LastDt = datetime.strptime(LastDate,"%Y-%m-%d")      #Last Fitbit date as a Python date object

  #Step back from the end date by the day difference parameter passed
  DateResult = LastDt - timedelta(days=DaysDiff)
  return DateResult.strftime("%Y-%m-%d")

#Get the config from the config file.  This is the access and refresh tokens
def GetConfig():
  print "Reading from the config file"

  #Open the file
  FileObj = open(IniFile,'r')

  #Read first two lines - first is the access token, second is the refresh token
  AccToken = FileObj.readline()
  RefToken = FileObj.readline()

  #Close the file
  FileObj.close()

  #See if the strings have newline characters on the end.  If so, strip them
  if (AccToken.find("\n") > 0):
    AccToken = AccToken[:-1]
  if (RefToken.find("\n") > 0):
    RefToken = RefToken[:-1]

  #Return values
  return AccToken, RefToken

def WriteConfig(AccToken,RefToken):
  print "Writing new token to the config file"
  print "Writing this: " + AccToken + " and " + RefToken

  #Delete the old config file
  os.remove(IniFile)

  #Open and write to the file
  FileObj = open(IniFile,'w')
  FileObj.write(AccToken + "\n")
  FileObj.write(RefToken + "\n")
  FileObj.close()

#Make a HTTP POST to get a new access token
def GetNewAccessToken(RefToken):
  print "Getting a new access token"

  #Form the data payload
  BodyText = {'grant_type' : 'refresh_token',
              'refresh_token' : RefToken}
  #URL Encode it
  BodyURLEncoded = urllib.urlencode(BodyText)
  print "Using this as the body when getting access token >>" + BodyURLEncoded

  #Start the request
  tokenreq = urllib2.Request(TokenURL,BodyURLEncoded)

  #Add the headers, first we base64 encode the client id and client secret with a : inbetween and create the authorisation header
  tokenreq.add_header('Authorization', 'Basic ' + base64.b64encode(OAuthTwoClientID + ":" + ClientOrConsumerSecret))
  tokenreq.add_header('Content-Type', 'application/x-www-form-urlencoded')

  #Fire off the request
  try:
    tokenresponse = urllib2.urlopen(tokenreq)

    #See what we got back.  If it's this part of the code it was OK
    FullResponse = tokenresponse.read()

    #Need to pick out the access token and write it to the config file.  Use a JSON manipulation module
    ResponseJSON = json.loads(FullResponse)

    #Read the access token as a string
    NewAccessToken = str(ResponseJSON['access_token'])
    NewRefreshToken = str(ResponseJSON['refresh_token'])
    #Write the access token to the ini file
    WriteConfig(NewAccessToken,NewRefreshToken)

    print "New access token output >>> " + FullResponse
  #HTTPError (a subclass of URLError) is the one that carries a status code
  except urllib2.HTTPError as e:
    #Getting to this part of the code means we got an error
    print "An error was raised when getting the access token.  Need to stop here"
    print e.code
    sys.exit()

#This makes an API call.  It also catches errors and tries to deal with them
def MakeAPICall(InURL,AccToken,RefToken):
  #Start the request
  req = urllib2.Request(InURL)

  #Add the access token in the header
  req.add_header('Authorization', 'Bearer ' + AccToken)

  print "I used this access token " + AccToken
  #Fire off the request
  try:
    #Do the request
    response = urllib2.urlopen(req)
    #Read the response
    FullResponse = response.read()

    #Return values
    return True, FullResponse
  #Catch errors, e.g. A 401 error that signifies the need for a new access token
  #HTTPError (a subclass of URLError) is the one that carries a status code and a body
  except urllib2.HTTPError as e:
    print "Got this HTTP error: " + str(e)
    HTTPErrorMessage = e.read()
    print "This was in the HTTP error message: " + HTTPErrorMessage
    #See what the error was.  If the token has expired, get a new one
    if (e.code == 401) and (HTTPErrorMessage.find("Access token invalid or expired") > 0):
      GetNewAccessToken(RefToken)
      return False, TokenRefreshedOK
    elif (e.code == 401) and (HTTPErrorMessage.find("Access token expired") > 0):
      GetNewAccessToken(RefToken)
      return False, TokenRefreshedOK
    #Return that this didn't work, allowing the calling function to handle it
    return False, ErrorInAPI

#Main part of the code
#Declare these global variables that we'll use for the access and refresh tokens
AccessToken = ""
RefreshToken = ""

print "Fitbit API Heart Rate Data Getter"

#Get the config
AccessToken, RefreshToken = GetConfig()

#Get the number of days to process for
DayCount = CountTheDays(StartDate,EndDate)

#Open a file to log to
MyLog = open(LogFile,'a')

#Loop for the date range
#Process each one of these days stepping back in the for loop and thus stepping up in time
for i in range(DayCount,-1,-1):
  #Get the date to process
  DateForAPI = ComputeADate(i,EndDate)

  #Say what is going on
  print ("Processing for: " + DateForAPI)

  #Form the URL
  FitbitURL = FitbitURLStart + DateForAPI + FitbitURLEnd

  #Make the API call
  APICallOK, APIResponse = MakeAPICall(FitbitURL, AccessToken, RefreshToken)

  if APICallOK:
    #We got a response, let's deal with it
    ResponseAsJSON = json.loads(APIResponse)

    #Get the date from the JSON response just in case.  Then loop through the JSON getting the HR measurements. 
    JSONDate = str(ResponseAsJSON["activities-heart"][0]["dateTime"])

    #Loop through picking out values and forming a string
    for HeartRateJSON in ResponseAsJSON["activities-heart-intraday"]["dataset"]:
      OutString = JSONDate + "," + str(HeartRateJSON["time"]) + "," + str(HeartRateJSON["value"]) + "\r\n"

      #Write to file
      MyLog.write(OutString)
  else:  #Not sure I'm making best use of this logic.  Can tweak if necessary
    if (APIResponse == TokenRefreshedOK):
      print "Refreshed the access token.  Can go again"
    else:
      print ErrorInAPI

#Close the log file
MyLog.close()
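
The parsing step at the end of the script can be tested without touching the API.  A minimal sketch, run against a cut-down copy of the response shown earlier; ResponseToLines is a hypothetical helper of mine and the sample keeps only the two keys the script actually reads:

```python
import json

#Cut-down copy of the API response shown earlier (only the keys the script reads)
SampleResponse = '{"activities-heart":[{"dateTime":"2015-03-01"}],"activities-heart-intraday":{"dataset":[{"time":"00:00:00","value":75},{"time":"00:00:15","value":75}]}}'

#Hypothetical helper: turn one day's response into date,time,value strings
def ResponseToLines(APIResponse):
  ResponseAsJSON = json.loads(APIResponse)
  JSONDate = str(ResponseAsJSON["activities-heart"][0]["dateTime"])
  Lines = []
  for HeartRateJSON in ResponseAsJSON["activities-heart-intraday"]["dataset"]:
    Lines.append(JSONDate + "," + str(HeartRateJSON["time"]) + "," + str(HeartRateJSON["value"]))
  return Lines

for Line in ResponseToLines(SampleResponse):
  print(Line)
```

Each printed line has the same date,time,heart rate layout that ends up in the log file.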


The code does the job; maybe the error handling could be better.  One thing I ran into was that Fitbit rate limit their API to 150 calls per hour.  As I was grabbing nearly 14 months of data I found I hit the limit and had to wait for the hour to expire before I could re-start the script (after editing the start and end dates).
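
One way to avoid editing the dates by hand might be to split the list of days into batches that each fit within the limit, pausing between batches.  A minimal sketch, assuming the 150 calls per hour figure above and one call per day of data; SplitIntoBatches is a hypothetical helper, the list of days is a stand-in, and the pause itself is only described in a comment:

```python
#The hourly limit mentioned above
CallsPerHour = 150

#Hypothetical helper: chop a list into consecutive chunks of at most BatchSize items
def SplitIntoBatches(Items, BatchSize):
  return [Items[i:i + BatchSize] for i in range(0, len(Items), BatchSize)]

#Stand-in for roughly 14 months of date strings (415 placeholder entries)
DaysToFetch = ["day%d" % n for n in range(415)]

Batches = SplitIntoBatches(DaysToFetch, CallsPerHour)
print([len(b) for b in Batches])   #prints [150, 150, 115]

#In a real run you would make the calls for one batch, then wait (for example
#time.sleep(3600)) before starting the next one
```

Waiting a full hour between batches is the cautious choice; it trades speed for not having to babysit the script.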

The raw data output looks like:

pi@raspberrypi ~/fitbit $ head persecheartlog.txt


pi@raspberrypi ~/fitbit $ tail persecheartlog.txt

...and contained this many measurements:

pi@raspberrypi ~/fitbit $ wc -l persecheartlog.txt
3492490 persecheartlog.txt

So 3.5 million measurements to play with.  Mmmmmmmmmmmmmm.
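
As a rough cross-check against the spacing seen in the API response, the line count can be turned into an average gap between readings.  A back-of-envelope sketch only: it assumes the data runs from the first logged day mentioned in the script (2015-01-27) to the date of this post, and it ignores the periods when the Fitbit wasn't recording:

```python
from datetime import date

#Line count reported by wc -l
Measurements = 3492490

#Assumed date range: first logged day from the script comment, to the date of this post
FirstDay = date(2015, 1, 27)
LastDay = date(2016, 3, 20)
DaysCovered = (LastDay - FirstDay).days

#Average readings per day and average seconds between readings
PerDay = Measurements / float(DaysCovered)
AverageGap = 86400.0 / PerDay

print(DaysCovered)
print(round(PerDay))
print(round(AverageGap, 1))
```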

As I've been doing lots recently I used R to analyse the data.  I tried this on my  Raspberry Pi 2 and, whilst I could load and manipulate the data using my Pi, R kept crashing when I tried to graph the data :-(.  Hence I resorted to using my PC which is a bit boring but needs must...

Load the CSV file full of heart rate data:
> FitbitHeart1 <- read.csv(file="c:/myfiles/persecheartlog.txt",head=FALSE,sep=",")

Create useful column names:
> colnames(FitbitHeart1) <- c("Date","Time","HeartRate")

Add a Posix style date/time column to help with graphing:
> FitbitHeart1$DateTimePosix <- as.POSIXlt(paste(FitbitHeart1$Date,FitbitHeart1$Time,sep=" "))

And graph (this took about 5 mins to run on my PC, it got a bit hot and the fan went into over-drive)!
> library(ggplot2)
> qplot(DateTimePosix,HeartRate,data=FitbitHeart1,geom=c("point","smooth"),xlab="Date",ylab="Heart Rate (bpm)",main="Fitbit Heart Rate Data")

Yielding this interesting graph:

Hmmm, so this is what 3.5 million points plotted on a graph looks like!  Maybe a dense Norwegian fir tree forest in the dead of night!  I think there's beauty in any graph and, whilst this is one only its Dad can love, I spot:
  • A regression line (blue) which is decreasing and so matches the Fitbit summarised chart and adds further proof that I'm getting fitter.
  • Gaps in the "trees" where my Fitbit has not been working for one reason or another*.
  • The bottom of the dense set of points (the tree roots maybe) nestling at about 50 beats per minute.  Just looking at the graph there appear to be more of these now than there were a year ago, suggesting my resting heart rate is decreasing.
  • The "canopy" of the forest at ~125 bpm, meaning my heart generally sits within the range 50 to 125 bpm.
  • Numerous trees peaking above 125 bpm which must be when I exercise.  There are more of these trees now as I do more exercise.
OK, that's the Norwegian forest analogy stretched a bit too far...
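
The tree roots observation could be checked with numbers rather than by eye, for example by taking the lowest reading for each day.  A minimal sketch in Python rather than R, assuming the date,time,heart rate layout the logging script writes; the three sample rows are made up purely for illustration and DailyMinimum is a hypothetical helper:

```python
import csv

#Made-up rows in the same date,time,heart rate layout as the log file
SampleRows = ["2016-03-10,00:00:00,62", "2016-03-10,00:00:15,58", "2016-03-11,00:00:00,55"]

#Hypothetical helper: lowest heart rate seen on each date
def DailyMinimum(Lines):
  Lowest = {}
  for Date, Time, HeartRate in csv.reader(Lines):
    Rate = int(HeartRate)
    if Date not in Lowest or Rate < Lowest[Date]:
      Lowest[Date] = Rate
  return Lowest

print(sorted(DailyMinimum(SampleRows).items()))   #prints [('2016-03-10', 58), ('2016-03-11', 55)]
```

For the real data you would pass the open log file instead of SampleRows.  A single low reading may be a sensor glitch, so something like a low percentile might turn out to be a steadier measure than the outright minimum.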

So maybe I need to think a bit more as to what to do with 3.5 million heart rate data points.  Something for a future blog post...

(*This was where my Fitbit broke during an upgrade and the lovely people from Fitbit replaced it free-of-charge).