Storing Strava Activity Data In Google Sheets

avatar

Welcome back. We've done a lot of good work so far in authenticating with the Strava API and getting some useful activity data back from Strava. In this article, we are going to store our data somewhere. In this instance, for something different, I thought I would use a google sheet to store the data. It is not the best option but I have used it before as a data store during prototyping applications and creating proof of concepts. It is also something a little different we can look at.

If you need to catch up on our previous articles, you can do so, by going to the following link:
https://hive.blog/hive-163521/@strava2hive/getting-started-using-the-strava-api
https://hive.blog/hive-163521/@strava2hive/updating-your-strava-access-tokens-with-the-strava-api
https://stemgeeks.net/hive-163521/@strava2hive/getting-activity-data-from-the-strava-api

For all our work so far, we have been using Python for our coding and to work with our Google Sheet, we are going to continue by using the pygsheets module. If you are not familiar with this module, it's all good, we are not going to do anything too crazy in this article, but you can also find the full documentation at the following link:
https://pygsheets.readthedocs.io/en/stable/index.html

To get this piece of code going, we need to do two major pieces of work.

  1. Get an authentication key from Google to allow the program to access and work with the google sheet.
  2. Set up the code that will use that authentication key, and store our data into the Google Sheet.

1. Get An Authentication Key From Google


Of course to work with Google Sheets, you need to have a valid Google account, and should be able to create google sheets already. I have created a sheet called NewUserRuns, which I am going to store my data in.

  • First go to the API's overview section in the Google Cloud console. The following link should get you there: https://console.cloud.google.com/apis/

  • If you don't have a project created already, create one and then click on the "Credentials" button on the left side menu, and click on the "Create Credentials" drop down menu.

  • Select "Service Account", providing a name, account ID and service account description and click on "Create and Continue"

  • You will then be able to click on the service account you just created and select "Create a New Key", and when presented with the option download the key as a json file. This json file will allow you to authenticate with the Google Sheet you just created.

  • Make sure you share the sheet with this service account. Select the sheet you want to work with and when you go to share this sheet, you will have the option to share to it with the new service account you just created.

2. Set Up The Code That Will Store The Data


In the previous article, although we had the option for a lot of data from Strava, we decided to keep things a little simple and only store the following:

  • Type
  • Date
  • Name
  • Distance
  • Duration
  • Link to activity

In the following form:

['Run', '2023-08-18', 'Lunch Run', '4.41', '24', 'https://www.strava.com/activities/9669641521']

We can seperate our new function into two sections. the first will authenticate with Google, access our Google Sheet and make sure we are working in the correct location. The second section will add all the items in the cells of the sheet.

The first part of our function is listed below. As you can see we start by taking all our parameters we have from the data we have retrieved from Strava:

import pygsheets

def record_post(activity_type, activity_date, activity_distance, activity_duration, activity_link):
  # Update the activity spreadsheet once you get the data from Strava
  gc = pygsheets.authorize(service_file='aithentication.json')
  sh = gc.open("NewUserRuns")
  wks = sh[0]
  cells = wks.get_all_values(majdim='ROWS', include_tailing_empty=False, include_tailing_empty_rows=False)

The second line starting with gc, uses the pygsheets.authorize module to use the JSON file we downloaded from the Google Cloud console that belongs to our Service Account. If you have stored your JSON file in a different location, you will need to place the full path after the service_file option. We then use this to authenticate with the NewUserRuns Google Sheet, and the sh[0] option points to the first work sheet. The final line we have listed basically finds the first empty row we have on the google sheet. This will make sure you are not copying over data that we already have in our sheet.

The next part of our function is listed below.

  # Add activity type
  cell_value = "A" + str(len(cells) + 1)
  wks.update_value(cell_value, athlete_type)
  # Now add activity date
  cell_value = "B" + str(len(cells) + 1)
  wks.update_value(cell_value, activity_date)
  # Add activity name
  cell_value = "C" + str(len(cells) + 1)
  wks.update_value(cell_value, activity_name)
  # Now add the activity distance
  cell_value = "D" + str(len(cells) + 1)
  wks.update_value(cell_value, activity_distance)
  # Now add the activity duration
  cell_value = "E" + str(len(cells) + 1)
  wks.update_value(cell_value, activity_duration)
  # Now add the activity link
  cell_value = "F" + str(len(cells) + 1)
  wks.update_value(cell_value, activity_link)

The rest of the function basically steps through all of the parameters we have provided and adds it into the Google Sheet. We could have probably done this with a loop, but for now, I am stating each item to add to the sheet manually.

Last week, we created the process_activities() function. We can use this function to store our values in a variable first, and then loop through the items in the array and add them to the sheet.

athlete_activities = process_activities(ath_activities)
# athlete_activities is an array, 
# so we should be able to loop through them and pass them to our function

for i in len(athlete_activities):
  record_post(athlete_activities[i][0], athlete_activities[i][1], athlete_activities[i][2], athlete_activities[i][3], athlete_activities[i][4], athlete_activities[i][5] )

We have updated our full code below and if everything works as it should, our new spreadsheet should be updated like the image below. Please note, I did add column headings when I set up the Google Sheet:

Screenshot 2023-09-02 8.16.35 PM.png

We have a full program now that

  1. Verifies our Strava access_code is up to date, and if not updates it
  2. Access the Strava API and gets a list of activities from the user
  3. Gets the data returned from the API and adds it to a Google Sheet

For our next article, we are going to move onto something a little different and see our we can gather more details information on the Strava activity.

#!/usr/bin/env python

import os
import requests
import time
import re
import random
import string
import boto3
import pygsheets
from boto3.dynamodb.conditions import Key
from datetime import datetime, timedelta

def dynamo_access():
    client = boto3.client('dynamodb', region_name='ap-southeast-2', aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'), aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),)
    dynamodb = boto3.resource('dynamodb', region_name='ap-southeast-2', aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'), aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),)
    ddb_exceptions = client.exceptions
    return dynamodb

def refresh_access_token(athlete):
    # Update the strava access token every six hours
    athlete_vals = athlete[0]
    code_val = athlete_vals['strava_one_time']
    try:
        response = requests.post("https://www.strava.com/api/v3/oauth/token", params={'client_id': os.getenv('STRAVA_CLIENT_ID'), 'client_secret': os.getenv('STRAVA_SECRET'), 'code': code_val, 'grant_type': 'refresh_token', 'refresh_token': athlete_vals['strava_refresh_token']})
        access_info = dict()
        activity_data = response.json()
        access_info['access_token'] = activity_data['access_token']
        access_info['expires_at'] = activity_data['expires_at']
        access_info['refresh_token'] = activity_data['refresh_token']
        return access_info['access_token'], access_info['expires_at']
    except:
        print("Something went wrong trying to refresh the access token")
        return False

def access_strava_activities(athlete_access_token):
    # Pass the athlete access token to strava to get activities
    #strava_access_token = athletedb_response['Items'][0]['strava_access_token']
    bearer_header = "Bearer "  + str(athlete_access_token)
    t = datetime.now() - timedelta(days=7)
    parameters = {"after": int(t.strftime("%s"))}
    headers = {'Content-Type': 'application/json', 'Authorization': bearer_header}
    response = requests.get("https://www.strava.com/api/v3/athlete/activities?per_page=10", headers=headers, params=parameters)
    activity_data = response.json()
    return activity_data

def process_activities(data_from_strava):
    # Take the strava data and process is how you want
    activities = []
    for i in data_from_strava:
        #print(i)
        if i["type"] != "Run":
            continue
        else:
            activity_vals = []
            activity_vals.append(i["type"])
            date_val = i["start_date_local"]
            date = datetime.strptime(date_val, "%Y-%m-%dT%H:%M:%SZ")
            new_date_val = date.strftime("%Y-%m-%d")
            activity_vals.append(new_date_val)
            activity_vals.append(i["name"])
            distance = str(round(i["distance"] * .001, 2))
            activity_vals.append(distance)
            duration = str(round(i["moving_time"] / 60))
            activity_vals.append(duration)
            activity_vals.append("https://www.strava.com/activities/" + str(i["id"]))
            print(activity_vals)
            activities.append(activity_vals)
    return activities


def record_post(activity_type, activity_date, activity_distance, activity_duration, activity_link):
  # Update the activity spreadsheet once you get the data from Strava
  gc = pygsheets.authorize(service_file='aithentication.json')
  sh = gc.open("NewUserRuns")
  wks = sh[0]
  cells = wks.get_all_values(majdim='ROWS', include_tailing_empty=False, include_tailing_empty_rows=False)
  # Add activity type
  cell_value = "A" + str(len(cells) + 1)
  wks.update_value(cell_value, athlete_type)
  # Now add activity date
  cell_value = "B" + str(len(cells) + 1)
  wks.update_value(cell_value, activity_date)
  # Add activity name
  cell_value = "C" + str(len(cells) + 1)
  wks.update_value(cell_value, activity_name)
  # Now add the activity distance
  cell_value = "D" + str(len(cells) + 1)
  wks.update_value(cell_value, activity_distance)
  # Now add the activity duration
  cell_value = "E" + str(len(cells) + 1)
  wks.update_value(cell_value, activity_duration)
  # Now add the activity link
  cell_value = "F" + str(len(cells) + 1)
  wks.update_value(cell_value, activity_link)


dynamoTable = 'ai_test_athletes'

dynamodb = dynamo_access()
print("Scanning table")
table = dynamodb.Table(dynamoTable)
athletedb_response = table.query(KeyConditionExpression=Key('athleteId').eq('1778778'))
strava_expire_date = athletedb_response['Items'][0]['strava_token_expires']
print(strava_expire_date)

expire_time = int(strava_expire_date)
current_time = time.time()
expired_value = expire_time - int(current_time)
print(expired_value)
if expired_value > 0:
    print("Strava Token Still Valid")
else:
    print("Strava Token Needs To Be Updated")
    new_access_token, new_expire_date = refresh_access_token(athletedb_response['Items'])
    print("Now we have to update the database")
    print(new_access_token)
    table = dynamodb.Table(dynamoTable)
    athletedb_response = table.query(KeyConditionExpression=Key('athleteId').eq('1778778'))
    print(athletedb_response['Items'][0])
    print("Update Strava Token")
    response = table.update_item(
            Key={ 'athleteId': int('1778778')},
            UpdateExpression='SET strava_access_token = :newStravaToken', 
            ExpressionAttributeValues={':newStravaToken': new_access_token}, 
            ReturnValues="UPDATED_NEW")
    response = table.update_item(Key={'athleteId': int('1778778')}, UpdateExpression='SET strava_token_expires = :newStravaExpire', ExpressionAttributeValues={':newStravaExpire': new_expire_date}, ReturnValues="UPDATED_NEW")


strava_access_token = athletedb_response['Items'][0]['strava_access_token']
ath_activities = access_strava_activities(strava_access_token)

athlete_activities = process_activities(ath_activities)
# athlete_activities is an array, 
# so we should be able to loop through them and pass them to our function

for i in len(athlete_activities):
  record_post(athlete_activities[i][0], athlete_activities[i][1], athlete_activities[i][2], athlete_activities[i][3], athlete_activities[i][4], athlete_activities[i][5] )

Posted with STEMGeeks



0
0
0.000
0 comments