Analysing this data has always been a bit laborious, but my newfound discovery of R (see here for how I used it for Fitbit data) means it's now easy.
Getting data from the Strava API is pretty easy: you just have to register, get a key, and then you can use simple HTTP GET requests to get your data (the first link above shows how I did it). So no HTTP POST, no forming a payload, no SHA-256 hashes or anything.
This makes it easy to import into R using the jsonlite library. Here's an example (assuming you've installed jsonlite):
library(jsonlite)
stravadata <- fromJSON('https://www.strava.com/api/v3/activities?access_token=<your key here>&per_page=200&after=1420070400', flatten=TRUE)
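The after=1420070400 part of that URL is a Unix timestamp for midnight UTC on 1 January 2015. As a minimal sketch, it could be computed in R rather than typed by hand. The helper name build_activities_url and the STRAVA_TOKEN environment variable are assumptions made up for this illustration; they are not part of jsonlite or the Strava API.

```r
# Convert a calendar date to the Unix timestamp used by the `after` parameter.
after_2015 <- as.numeric(as.POSIXct("2015-01-01 00:00:00", tz = "UTC"))

# Hypothetical helper (an assumption, not a library function):
# assembles the same request URL as the hand-written one.
build_activities_url <- function(token, after, per_page = 200) {
  sprintf(
    "https://www.strava.com/api/v3/activities?access_token=%s&per_page=%d&after=%d",
    token, as.integer(per_page), as.integer(after)
  )
}

# STRAVA_TOKEN is an assumed environment variable name; it keeps the key
# out of the script itself.
url <- build_activities_url(Sys.getenv("STRAVA_TOKEN"), after_2015)

# The URL can then be passed to jsonlite exactly as before:
# stravadata <- jsonlite::fromJSON(url, flatten = TRUE)
```

Keeping the key in an environment variable is just one option; pasting it into the URL by hand works equally well for a one-off analysis.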
The fromJSON call yields an R data frame that you can manipulate. First, have a quick look at the first row of the data frame:
> stravadata[c(1),]
id resource_state external_id upload_id name distance moving_time elapsed_time total_elevation_gain type start_date
1 236833349 2 <NA> NA First swim of 2015 1300 2700 2700 0 Swim 2015-01-04T20:45:00Z
start_date_local timezone start_latlng end_latlng location_city location_state location_country start_latitude start_longitude
1 2015-01-04T20:45:00Z (GMT+00:00) Europe/London NULL NULL <NA> <NA> United Kingdom NA NA
achievement_count kudos_count comment_count athlete_count photo_count trainer commute manual private flagged gear_id average_speed max_speed total_photo_count
1 0 0 0 1 0 FALSE FALSE TRUE FALSE FALSE <NA> 0.481 0 0
has_kudoed average_cadence average_watts device_watts average_heartrate max_heartrate elev_high elev_low workout_type kilojoules athlete.id athlete.resource_state
1 FALSE NA NA NA NA NA NA NA NA NA 4309532 1
map.id map.summary_polyline map.resource_state
1 a236833349 <NA> 2
This gives you a good idea of interesting fields to further analyse:
- name = column 5
- distance = column 6
- type = column 10
...which lets you look at the data in a more refined format. So, the first row and the three columns listed above:
> stravadata[c(1),c(5,6,10)]
name distance type
1 First swim of 2015 1300 Swim
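Column numbers like 5, 6 and 10 depend on the order in which the fields come back, so selecting by name may be a little more robust. A minimal sketch, using a one-row stand-in data frame built from the values printed above (the variable names activities and wanted are assumptions for illustration):

```r
# One-row stand-in for the Strava data frame, using values from the first row above.
activities <- data.frame(
  id = 236833349,
  resource_state = 2,
  name = "First swim of 2015",
  distance = 1300,
  moving_time = 2700,
  type = "Swim",
  stringsAsFactors = FALSE
)

# The fields of interest, referred to by name rather than by position.
wanted <- c("name", "distance", "type")

# match() reports where those columns sit in this particular data frame.
positions <- match(wanted, names(activities))

# Selecting by name gives the same kind of result as stravadata[c(1), c(5, 6, 10)].
first_row <- activities[1, wanted]
print(first_row)
```

With the real data, the same selection would be stravadata[1, c("name", "distance", "type")].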
Before going much further, I needed to filter the results down to just 2015, as my Strava API call would also have included everything from 2016 to date. Do this with:
strava2015 <- stravadata[grep("2015-", stravadata$start_date), ]
...which yields this (just first 3 rows shown):
> strava2015[c(1:3),c(5,6)]
name distance
1 First swim of 2015 1300.0
2 HIIT 20150106 4716.1
3 HIIT 20140108 4709.2
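The grep filter works because every start_date begins with the year. An alternative sketch parses the timestamps into proper date-times and compares the year, which may be handy if you later want to filter by month or date range. The data frame below is a stand-in: the first row comes from the output above, while the 2016 row is made up purely to show something being filtered out.

```r
# Stand-in data: the first row is from the post, the second is invented for illustration.
activities <- data.frame(
  name = c("First swim of 2015", "Made-up 2016 activity"),
  start_date = c("2015-01-04T20:45:00Z", "2016-01-02T09:00:00Z"),
  stringsAsFactors = FALSE
)

# Parse the ISO 8601 strings returned by the API into POSIXct date-times (UTC).
started <- as.POSIXct(activities$start_date,
                      format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

# Keep only the rows whose year is 2015.
in_2015 <- format(started, "%Y") == "2015"
activities_2015 <- activities[in_2015, ]
print(activities_2015)
```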
Then picking out just the type and the distance:
> strava2015simple <- strava2015[,c(10,6)]
...and looking at first 3 rows of this:
> strava2015simple[c(1:3),]
type distance
1 Swim 1300.0
2 Ride 4716.1
3 Ride 4709.2
This makes it very easy to compute some aggregated stats for the 2015 distances (Strava reports distance in metres).
First, averages:
> stravaagg <- aggregate(list(Distance = strava2015simple$distance), list(Type = strava2015simple$type), mean)
> stravaagg
Type Distance
1 Ride 17765.398
2 Run 5487.856
3 Swim 1067.619
...then totals:
> stravaagg <- aggregate(list(Distance = strava2015simple$distance), list(Type = strava2015simple$type), sum)
> stravaagg
Type Distance
1 Ride 1030393.1
2 Run 334759.2
3 Swim 50178.1
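Since the figures are in metres, it can be easier to read them in kilometres and to see the mean and total side by side. A minimal sketch using aggregate's formula interface and merge, run on the three rows shown earlier rather than the full year (the names sample, distance_km and summary_km are assumptions for illustration):

```r
# The first three 2015 rows shown earlier, as a small stand-in data frame.
sample <- data.frame(
  type = c("Swim", "Ride", "Ride"),
  distance = c(1300.0, 4716.1, 4709.2),
  stringsAsFactors = FALSE
)

# Convert metres to kilometres.
sample$distance_km <- sample$distance / 1000

# Mean and total per activity type, joined into a single table.
summary_km <- merge(
  aggregate(distance_km ~ type, data = sample, FUN = mean),
  aggregate(distance_km ~ type, data = sample, FUN = sum),
  by = "type", suffixes = c("_mean", "_total")
)
print(summary_km)
```

With the real data, strava2015simple would take the place of sample.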
So easy! (I'm not going to trouble the Brownlee brothers with these figures!)