I wanted to find some more interesting ways to visualise the data I get from the Fitbit API. My inspiration was “Information is Beautiful”, a book I bought from a well-known South American river book company just before Christmas. Apart from my foray into creating a Sleep infographic, I’ve always been a bit conservative in how I visualise data, relying on bog-standard, boring bar charts and scatter graphs. “Information is Beautiful” has many and various infographics that make analysing data accessible, intuitive and just, well, beautiful! Here’s the journey I went on…
Here's what I produced; I'll then tell you how I did it!
I've had my Fitbit Charge HR for just over a year now so I thought I'd "celebrate" by analysing a whole year's worth of data from the Fitbit API! To do this I used the OAuth 2.0 method I wrote about here.
To get a year's worth of data I simply had to use the following URL for the API call:
https://api.fitbit.com/1/user/-/activities/steps/date/2016-01-31/1y.json
So this is asking for my step data (activities/steps) for the one year period up to and including 2016-01-31. The command I ran was:
sudo python fitbit_oauth_request_v1.py > 2016-01-31.json
...meaning the output was redirected to the file 2016-01-31.json. The content of the file looked like this (after trimming off some initial text that came from the print statements in the Python script):
more 2016-01-31.json
{"activities-steps":[{"dateTime":"2015-02-01","value":"21803"},
{"dateTime":"2015-02-02","value":"7324"},
{"dateTime":"2015-02-03","value":"10293"},
{"dateTime":"2015-02-04","value":"12714"},
{"dateTime":"2015-02-05","value":"10383"},
{"dateTime":"2015-02-06","value":"11496"},
{"dateTime":"2015-02-07","value":"17795"},
{"dateTime":"2015-02-08","value":"19735"},
{"dateTime":"2015-02-09","value":"10808"},
{"dateTime":"2015-02-10","value":"8897"},
{"dateTime":"2015-02-11","value":"10106"},
{"dateTime":"2015-02-12","value":"9779"},
{"dateTime":"2015-02-13","value":"9850"},
{"dateTime":"2015-02-14","value":"12108"},
{"dateTime":"2015-02-15","value":"27393"},
{"dateTime":"2015-02-16","value":"12992"}
So a simple JSON structure that has one element per day of the year with a simple step count in it. I then transferred the JSON file to my PC to process it with R.
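The trimming step mentioned above (removing the text printed before the JSON) could also be done in R rather than by hand. The following is only a sketch: the two leading lines are invented stand-ins for whatever the Python script actually printed, and it assumes that text contains no `{` character.

```r
# Hypothetical raw capture: two made-up print lines followed by the JSON payload.
raw_lines <- c(
  "Requesting data...",          # invented stand-in for script print output
  "Response received:",          # invented stand-in for script print output
  '{"activities-steps":[{"dateTime":"2015-02-01","value":"21803"},',
  '{"dateTime":"2015-02-02","value":"7324"}]}'
)

# Join the lines, then drop everything before the first opening brace.
raw_text  <- paste(raw_lines, collapse = "")
json_text <- sub("^[^{]*", "", raw_text)

# json_text could now be written back out with writeLines() or parsed directly.
cat(json_text, "\n")
```

With a real capture, `readLines()` on the redirected file would take the place of the hard-coded `raw_lines` vector.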
I loaded up the JSON structure in R using:
> library(jsonlite)
> stepdata2015 <- fromJSON(file.choose(),flatten=TRUE)
Where file.choose() means the Windows file chooser form is opened to allow you to select the JSON file. The data looked like this (abridged):
> stepdata2015
$`activities-steps`
    dateTime value
1 2015-02-01 21803
2 2015-02-02  7324
3 2015-02-03 10293
4 2015-02-04 12714
5 2015-02-05 10383
Looking at the type of data I saw:
> stepdata2015[0]
named list()
So this was a list, not the "data frame" I've worked with in the past, which is why I couldn't manipulate the data the way I was used to. So I turned it into a data frame by doing this:
> stepdata2015_df <- as.data.frame(stepdata2015)
...which made the data look like this (abridged):
> stepdata2015_df
  activities.steps.dateTime activities.steps.value
1                2015-02-01                  21803
2                2015-02-02                   7324
3                2015-02-03                  10293
4                2015-02-04                  12714
5                2015-02-05                  10383
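The difference between the list and the data frame can be checked directly with `class()`. The sketch below uses an inline JSON string holding two of the records shown earlier instead of the real file, and also shows one possible alternative to `as.data.frame()`: pulling the single list element out by name, which keeps the shorter column names. The variable names here are illustrative only.

```r
library(jsonlite)

# Two records copied from the output above, as an inline JSON string.
json_text <- '{"activities-steps":[
  {"dateTime":"2015-02-01","value":"21803"},
  {"dateTime":"2015-02-02","value":"7324"}]}'

stepdata <- fromJSON(json_text)

class(stepdata)    # a plain list, not a data frame
names(stepdata)    # its single element: "activities-steps"

# Pull the element out by name; this keeps the short column names.
steps_df <- stepdata[["activities-steps"]]
class(steps_df)
names(steps_df)

# as.data.frame() on the whole list also works, but prefixes the column names.
names(as.data.frame(stepdata))
```

Either route ends with the same data; the only practical difference is whether the columns are called `dateTime`/`value` or carry the `activities.steps.` prefix.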
Then I graphed the data using these commands:
> library(ggplot2)
> graphval <- qplot(activities.steps.dateTime, activities.steps.value, data=stepdata2015_df)
> graphval + labs(title="Fitbit Step Data - 2015",x = "Day",y = "Steps")
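...which yielded this graph:
The graph didn't understand the X axis as a date or the Y axis as a number, so to turn the X axis values into dates I did:
> stepdata2015_df$TimePosix <- as.POSIXct(stepdata2015_df$activities.steps.dateTime)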
Then to turn the Y axis values into numbers I did:
> stepdata2015_df$StepsInt <- as.integer(stepdata2015_df$activities.steps.value)
...yielding:
> stepdata2015_df
  activities.steps.dateTime activities.steps.value  TimePosix StepsInt
1                2015-02-01                  21803 2015-02-01    21803
2                2015-02-02                   7324 2015-02-02     7324
3                2015-02-03                  10293 2015-02-03    10293
4                2015-02-04                  12714 2015-02-04    12714
5                2015-02-05                  10383 2015-02-05    10383
This gives a much nicer looking graph, which understands the X axis as a date and the Y axis as a number and intelligently provides fewer labels:
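Why the conversion matters can be seen on a tiny sample. This is a sketch using three rows from the output above: while the values are still text they sort alphabetically rather than numerically, which is presumably why the first chart looked wrong. It uses `as.Date()` where the steps above use `as.POSIXct()`; that swap is an assumption that only calendar days matter, not the original choice.

```r
# Three rows copied from the output above; both columns arrive as text.
steps_df <- data.frame(
  dateTime = c("2015-02-01", "2015-02-02", "2015-02-03"),
  value    = c("21803", "7324", "10293"),
  stringsAsFactors = FALSE
)

# As text, "7324" sorts after "21803" because "7" comes after "2".
sort(steps_df$value)

# Convert to real types (as.Date() is a swapped-in alternative to as.POSIXct()).
steps_df$Day      <- as.Date(steps_df$dateTime)
steps_df$StepsInt <- as.integer(steps_df$value)

sort(steps_df$StepsInt)   # now ordered numerically
str(steps_df)
```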
A nicer graph, but really just a random collection of points to my eye. A bit of reading showed me you could add a smoothed trendline to the chart by using the "geom" parameter and doing this:
> graphval <- qplot(TimePosix, StepsInt, data=stepdata2015_df,geom = c("point", "smooth"))
> graphval + labs(title="Fitbit Step Data - 2015",x = "Day",y = "Steps")
Yielding:
...which actually tells the story of my year quite nicely and shows how my step totals are really influenced by how much running I do. I started 2015 doing a little bit of running, did lots of running up to May/June, cut back over the summer as I got injured, and then did more towards the end of the year and into 2016 as I came back from injury. In fact, I've been really careful coming back from injury, increasing my weekly km by no more than 10%, and this is reflected in the gradient of the trendline.
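Newer ggplot2 releases have deprecated `qplot()`, so with a recent version the same points-plus-trendline chart might be built with `ggplot()` and explicit layers. This is only a sketch: the 60 days of step counts below are randomly generated for illustration and are not real Fitbit data.

```r
library(ggplot2)

# Made-up sample: 60 days of invented step counts, for illustration only.
set.seed(1)
sample_df <- data.frame(
  TimePosix = as.POSIXct("2015-02-01", tz = "UTC") + (0:59) * 86400,
  StepsInt  = as.integer(round(10000 + 50 * (0:59) + rnorm(60, sd = 2000)))
)

# Points plus a smoothed trendline, the equivalent of geom = c("point", "smooth").
p <- ggplot(sample_df, aes(x = TimePosix, y = StepsInt)) +
  geom_point() +
  geom_smooth() +
  labs(title = "Fitbit Step Data - 2015", x = "Day", y = "Steps")

# print(p) draws the chart in an interactive session.
```

Swapping `sample_df` for `stepdata2015_df` should reproduce the chart described above, since the column names match.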
I then decided the data needed aggregating into monthly totals and so did this:
> stepdata_2015_agg_sum <- aggregate(list(Steps = stepdata2015_df$StepsInt), list(month = cut(stepdata2015_df$TimePosix, "month")), sum)
Yielding (abridged):
> stepdata_2015_agg_sum
       month  Steps
1 2015-02-01 350767
2 2015-03-01 385209
3 2015-04-01 385578
4 2015-05-01 477423
5 2015-06-01 391484
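The monthly aggregation is easier to follow on a handful of rows. In this sketch the two February values come from the output above, while the two March values are invented so that a second month appears. `cut()` labels each timestamp with the first day of its month, and `aggregate()` then sums within each label.

```r
# First two rows are from the output above; the March rows are invented.
steps_df <- data.frame(
  TimePosix = as.POSIXct(c("2015-02-01", "2015-02-02",
                           "2015-03-01", "2015-03-02"), tz = "UTC"),
  StepsInt  = c(21803L, 7324L, 10000L, 12000L)
)

# cut() assigns every timestamp to a bucket named after the start of its month.
month_bucket <- cut(steps_df$TimePosix, "month")
as.character(month_bucket)

# aggregate() then sums the steps within each bucket.
agg <- aggregate(list(Steps = steps_df$StepsInt),
                 list(month = month_bucket), sum)
print(agg)
```

The February total here is 21803 + 7324 = 29127, which is a quick way to confirm the grouping does what is expected before running it on the full year.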
I also decided to create my own infographic to visualise my month-on-month step count.
To calculate how many footsteps I needed to show on my visualisation I added some summaries:
> stepdata_2015_agg_sum$tenthoublocks <- stepdata_2015_agg_sum$Steps / 10000
> stepdata_2015_agg_sum$footsteps <- round(stepdata_2015_agg_sum$tenthoublocks, digits=0)
...yielding (abridged):
> stepdata_2015_agg_sum
       month  Steps tenthoublocks footsteps
1 2015-02-01 350767       35.0767        35
2 2015-03-01 385209       38.5209        39
3 2015-04-01 385578       38.5578        39
4 2015-05-01 477423       47.7423        48
5 2015-06-01 391484       39.1484        39
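The footstep counts can also be previewed as a rough text pictograph without leaving R. This sketch uses the five monthly totals shown above and prints one `*` per 10,000 steps; the asterisk is just a stand-in for the foot image and the `bar` column is a name made up for this example.

```r
# Monthly totals copied from the output above.
monthly <- data.frame(
  month = c("2015-02-01", "2015-03-01", "2015-04-01", "2015-05-01", "2015-06-01"),
  Steps = c(350767, 385209, 385578, 477423, 391484)
)

# One icon per 10,000 steps, rounded to the nearest whole icon.
monthly$footsteps <- round(monthly$Steps / 10000)

# Draw each month as a row of "*" characters (a stand-in for the foot image).
monthly$bar <- strrep("*", monthly$footsteps)
cat(paste(monthly$month, monthly$bar), sep = "\n")
```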
I then opened the data in Excel to graph it (or, to use the proper lingo, to create a pictograph). Using this website to tell me how to create charts with images instead of boring bars, I came up with the chart below. Each foot represents 10,000 steps:
I then thought I'd create my own! Each step on the “path” below represents 10,000 steps and I did it by manually copying, pasting and formatting in Excel:
Notwithstanding that months are of different lengths, the infographic does nicely tally with my 2015 running profile of running a bit (Feb to April), running a lot (May - too much really), getting injured (June to October), getting back into running (October to Jan). It’s not the most beautiful infographic in the world and Mrs Geek thinks the footsteps look like butterflies but I’m happy with it!!
I think the standard Excel generated one was just fine!