Automating iRacing Driver Stats: A Step-By-Step Guide

Note: This is a republishing of a post originally published on https://spassmacherrennsport.com on August 6, 2024.

If you manage a sim racing team, you will likely find yourself tracking all sorts of interesting data in an attempt to gain a competitive advantage. In addition to tracking fastest and average lap times across various track and car conditions, I find it quite useful to keep an eye on drivers’ iRating and Safety Rating to estimate which split we are likely to find ourselves driving in for any given event. For endurance events, splits are determined based on the average iRating of the declared drivers for each team.
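As a rough illustration of that last point, the sketch below averages the iRatings of a declared lineup. The driver names and ratings are made up, and how iRacing turns that average into an actual split assignment is not covered here.

```python
from statistics import mean

# Hypothetical lineup: display name -> current iRating (made-up numbers)
declared_drivers = {
    "Driver A": 1790,
    "Driver B": 2450,
    "Driver C": 1310,
}


def team_average_irating(lineup):
    """Return the average iRating of the declared drivers, rounded to a whole number."""
    return round(mean(lineup.values()))


print(team_average_irating(declared_drivers))
```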

iRacing Terminology

iRating – a measure of a driver’s skill level and performance relative to other competitors. It can range from 0 to over 10000, with higher iRatings indicating stronger drivers.

Safety Rating – a metric that reflects a driver’s ability to navigate races cleanly and without incidents. It is measured on a scale from 0.00 to 4.99, with higher values indicating safer driving habits.

Split – a division of race participants based on their iRating, used to balance competitiveness and fairness and to prevent overcrowding on the track.

Tracking individual driver statistics, along with aggregating and averaging them across the entire team, is pretty straightforward in Google Sheets, and might look like this:

[Screenshot: Google Sheet displaying driver name, Safety Rating, and iRating]

Manually managing the collection and entry of this data is not difficult, but it is also a task perfectly suited for automation, so that is my preference.

How It Used to Work

Many have used iRacing’s web user interface to find various statistics needed and then written scripts to “scrape” data from those resulting pages to work with it elsewhere. While this would have been a viable option to start, iRacing provided links to comma-separated value files containing driver statistics for each of the core driving categories on the service, aggregating and updating them regularly throughout the day, to make gathering and using data a little more approachable. The files (which are no longer accessible) were:

https://ir-data-now.s3.amazonaws.com/csv/Oval_driver_stats.csv
https://ir-data-now.s3.amazonaws.com/csv/Sports_Car_driver_stats.csv
https://ir-data-now.s3.amazonaws.com/csv/Formula_Car_driver_stats.csv
https://ir-data-now.s3.amazonaws.com/csv/Road_driver_stats.csv
https://ir-data-now.s3.amazonaws.com/csv/Dirt_Oval_driver_stats.csv
https://ir-data-now.s3.amazonaws.com/csv/Dirt_Road_driver_stats.csv

Tech Terminology

Python – a computer programming language often used to build websites and software, automate tasks, and analyze data.

Pandas – a Python library used for working with data sets. It has functions for analyzing, cleaning, exploring, and manipulating data. The name “Pandas” refers to both “Panel Data” and “Python Data Analysis”.

Dataframe – a data structure constructed with rows and columns, similar to a database or Excel spreadsheet.

API – or Application Programming Interface, acts as the language that allows different software applications to talk to one another.

API Wrapper – a set of programming instructions that acts as a layer between an application and an API, simplifying interactions between the two.

JSON – or JavaScript Object Notation, is a text-based format for storing and exchanging data that can be read by humans and understood by computers.

Grabbing the data I needed from the appropriate CSV file and writing it into a Pandas dataframe was quite easy using Python:

import os

import pandas as pd
import requests

url = 'https://s3.amazonaws.com/ir-data-now/csv/Road_driver_stats.csv'

# Fetch the CSV file using requests and stop early on an HTTP error
response = requests.get(url, timeout=30)
response.raise_for_status()

# Write the content to a file
csv_file_path = 'Road_driver_stats.csv'
with open(csv_file_path, 'wb') as f:
    f.write(response.content)

# Read the CSV file with pandas
df = pd.read_csv(csv_file_path)

# Remove the CSV file
os.remove(csv_file_path)

# Filter for specific rows using the isin() method
drivers = ['Chip Witt', 'REDACTED', 'REDACTED', 'REDACTED', 'REDACTED']
df_filtered = df[df['DRIVER'].isin(drivers)]

# Extract specific columns using the loc[] method
df_extracted = df_filtered.loc[:, ['DRIVER', 'CLASS', 'IRATING']]

# Sort the resulting data using the sort_values() method
df_sorted = df_extracted.sort_values(by='DRIVER')

print(df_sorted)

The resulting dataframe could then be easily written to a Google Sheet. Setting up this script to run automatically each time my sim rig PC started kept the information in the spreadsheet up to date and immediately accessible when needed.

That is, until July 8, 2024, when iRacing permanently shut down their legacy web UI and removed access to the CSV files, leaving the new iRacing /data API as the only path to driver statistics. This required a new approach and an updated script.

The New Way

APIs are extremely useful, but can sometimes be tedious to code against directly. For that reason, I tend to look for wrappers that make accessing the API a tad easier, and make the code I write to get the job done more readable. Luckily, iracingdataapi (https://github.com/jasondilworth56/iracingdataapi) for Python is a well-maintained wrapper for iRacing’s new /data API.

API Authentication and Initiation

Before you can begin to use the API, you first need to authenticate. For automation tasks, you need to be mindful of the security implications of your code because, in most cases, the script performing the automated tasks will be acting as your proxy with your credentials. This means it is important to avoid hard-coding those credentials within the script, where anyone with access to it can read them, and to avoid other silly things that deviate from secure coding best practices. It is also important to take care that the script can’t “run away” and do unintended things, making testing (both happy and sad paths) a critical part of your work.

To authenticate, I chose to leverage the keyring and getpass Python modules. The script first checks whether the needed credentials are already present in the logged-in user’s keychain (a reasonably secure credential facility, variations of which are built into macOS, Windows, and Linux). If they are not, it collects them from the user at the command line and stores them securely in the keychain for future use. This means the script must be run manually at least once to collect the credentials before it is scheduled to run automatically. Once the credentials are available, the code initiates a connection to the API that will be used for making data calls throughout the remainder of the script. The process looks like this:

import getpass

import keyring
from iracingdataapi.client import irDataClient

def get_iracing_credentials():

    # Check if credentials are stored in the keyring
    iracing_username = keyring.get_password("iracing", "username")
    iracing_password = keyring.get_password("iracing", "password")

    if not iracing_username or not iracing_password:
        # If credentials are not stored, prompt the user for them
        iracing_username = input("Enter iRacing username: ")
        iracing_password = getpass.getpass("Enter iRacing password: ")

        # Store the credentials in the keyring for future use
        keyring.set_password("iracing", "username", iracing_username)
        keyring.set_password("iracing", "password", iracing_password)

    return iracing_username, iracing_password

# Define the credential variables from the function
iracing_username, iracing_password = get_iracing_credentials()

# Initiate a data client connection to the iRacing API using iracingdataapi
idc = irDataClient(username=iracing_username, password=iracing_password)

Note

With the 2024 iRacing Season 3 release, new /data/driver_stats_by_category endpoints were made available for downloading CSV versions of the driver stats previously available as CSVs on the member site. The iracingdataapi wrapper for Python was supposed to have support for these new endpoints as of version 1.2.1, but it was not present when I updated to that version. Instead, I accessed the driver stats I required using a different endpoint that was already supported by the wrapper.

Preparing to Collect Driver Statistics

Before we can collect statistics for individual drivers, we need to do a little work to identify each driver’s cust_id on the iRacing service by querying the /data/lookup/drivers endpoint with their name (the name must be the Display Name registered with iRacing). We accomplish this by way of a call to lookup_drivers through our wrapper client connection:

# List of driver names
driver_names = ['Chip Witt', 'REDACTED', 'REDACTED', 'REDACTED', 'REDACTED']

# Function to lookup driver IDs based on names
def get_driver_ids(driver_names):
    driver_ids = []
    for driver_name in driver_names:
        driver_id = idc.lookup_drivers(search_term=driver_name.strip())
        if driver_id:
            driver_ids.append(driver_id[0]['cust_id'])
    return driver_ids
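The function above takes the first search result for each name, which may not be the intended driver if the search term also matches other members. As a hedged sketch, a small helper could pick only an exact Display Name match from the results. The helper name pick_cust_id is hypothetical, the sample results are made up, and the 'display_name' key is an assumption about the response shape that should be confirmed against a real lookup_drivers result.

```python
def pick_cust_id(results, display_name):
    """Return the cust_id whose display_name matches exactly, or None.

    `results` is assumed to be a list of dicts, each with 'cust_id' and
    'display_name' keys ('display_name' is an assumption; check a real response).
    """
    wanted = display_name.strip().casefold()
    for entry in results:
        if entry.get("display_name", "").strip().casefold() == wanted:
            return entry["cust_id"]
    return None


# Made-up search results for illustration only
sample_results = [
    {"cust_id": 111111, "display_name": "Chip Wittman"},
    {"cust_id": 222222, "display_name": "Chip Witt"},
]
print(pick_cust_id(sample_results, "Chip Witt"))
```

Returning None for a missing driver, rather than silently skipping the name, makes it easier to notice a typo in the driver list.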

After exploring the iRacing API, I found that the /data/member/chart_data endpoint had exactly what I needed for collecting iRating and Safety Rating information for my drivers. It is made accessible through the member_chart_data call in the wrapper. The comments documenting the function in the wrapper help us understand how to call it and what it will return:

Get the irating, ttrating or safety rating chart data of a certain category.

        Args:
            cust_id (int): the iRacing cust_id. Defaults to the authenticated member.
            category_id (int): 1 - Oval; 2 - Road; 3 - Dirt oval; 4 - Dirt road; 5 - Sports Car; 6 - Formula Car
            chart_type (int): 1 - iRating; 2 - TT Rating; 3 - License/SR

        Returns:
            dict: a dict containing the time series chart data given the matching criteria.

A call with category_id=5, chart_type=1 for each driver grabs the iRating information, and a call with category_id=5, chart_type=3 gets the Safety Rating information. Respectively, the returned JSON looks like this:

{
    'blackout': False, 
    'category_id': 5, 
    'chart_type': 1, 
    'data': [
        {'when': '2024-03-06', 'value': 1701}, 
        {'when': '2024-03-25', 'value': 1696}, 
        {'when': '2024-04-01', 'value': 1666}, 
        {'when': '2024-06-16', 'value': 1746}, 
        {'when': '2024-08-03', 'value': 1790}],
    'success': True, 
    'cust_id': xxxxx
}

…and:

{
    'blackout': False, 
    'category_id': 5, 
    'chart_type': 3, 
    'data': [
        {'when': '2024-03-06', 'value': 5499}, 
        {'when': '2024-03-30', 'value': 5459}, 
        {'when': '2024-04-02', 'value': 5464}, 
        {'when': '2024-04-12', 'value': 5437}, 
        {'when': '2024-04-20', 'value': 5415}, 
        {'when': '2024-04-23', 'value': 5430}, 
        {'when': '2024-08-03', 'value': 5411}], 
    'success': True, 
    'cust_id': xxxxx
}

The iRating JSON series is pretty straightforward. The value is the iRating for the driver at the date specified by when. For my purposes, I am only interested in the last data value (the current iRating), so I will just grab that. The Safety Rating data requires a little bit more decoding to use; ultimately, I was able to deduce that the first digit in the returned value was a numeric representation of the driver’s license class. Mapping that out:

            license_classes = {
                '5': 'A',
                '4': 'B',
                '3': 'C',
                '2': 'D',
                '1': 'R'
            }

The remaining digits represent the decimal Safety Rating value, but are lacking the expected decimal point. In the example JSON above, the value ‘5411’ becomes ‘A 4.11’. Again, I’m only interested in the last value in the returned set, so the function to collect and transform the data in preparation for writing it to the dataframe becomes:

# Function to fetch iRating and Safety Rating data
def fetch_driver_data(driver_ids):
    driver_data = []
    for driver_id in driver_ids:
        try:
            irating_data = idc.member_chart_data(cust_id=driver_id, category_id=5, chart_type=1)
            srating_data = idc.member_chart_data(cust_id=driver_id, category_id=5, chart_type=3)

            irating_list = irating_data['data']
            final_irating_value = irating_list[-1]['value']

            srating_list = srating_data['data']
            last_srating_value = srating_list[-1]['value']

            license_classes = {
                '5': 'A',
                '4': 'B',
                '3': 'C',
                '2': 'D',
                '1': 'R'
            }

            srating_str = str(last_srating_value)
            license_class = license_classes.get(srating_str[0], 'Unknown')
            formatted_srating = f"{srating_str[1]}.{srating_str[2:]}"
            safety_rating_result = f"{license_class} {formatted_srating}"

            driver_data.append({
                'cust_id': driver_id,
                'iRating': final_irating_value,
                'Safety Rating': safety_rating_result
            })
        except Exception as e:
            print(f"No stats data collected for driver {driver_id}. Error message: {e}")
    return driver_data
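The decoding step inside the function above can also be pulled out into small pure functions, which makes it easy to test without calling the API. This is only a sketch: the function names are hypothetical, and the check that the value always has exactly four digits is an assumption based on the sample responses shown earlier.

```python
LICENSE_CLASSES = {"5": "A", "4": "B", "3": "C", "2": "D", "1": "R"}


def decode_safety_rating(raw_value):
    """Turn a chart value such as 5411 into a string such as 'A 4.11'."""
    digits = str(raw_value)
    # Assumption: one license-class digit followed by three Safety Rating digits
    if len(digits) != 4 or not digits.isdigit():
        raise ValueError(f"Unexpected Safety Rating value: {raw_value!r}")
    license_class = LICENSE_CLASSES.get(digits[0], "Unknown")
    return f"{license_class} {digits[1]}.{digits[2:]}"


def latest_value(chart):
    """Return the 'value' of the last entry in a chart response, or None if empty."""
    points = chart.get("data") or []
    return points[-1]["value"] if points else None


print(decode_safety_rating(5411))
```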

Collecting, Organizing, and Writing Data to the Dataframe

Consolidating the above work, writing the data to a dataframe just requires that we call the functions and do a little bit of reorganization of the returned data:

# Retrieve driver IDs
driver_ids = get_driver_ids(driver_names)

# Fetch driver data
driver_data = fetch_driver_data(driver_ids)

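# Combine driver names with their respective data.
# Note: zip() pairs items by position, so this assumes every name lookup
# and every stats fetch above succeeded and no driver was skipped.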
combined_data = []
for driver_name, data in zip(driver_names, driver_data):
    combined_data.append({
        'Driver': driver_name,
        'SR': data['Safety Rating'],
        'iR': data['iRating']
    })

# Create DataFrame
df = pd.DataFrame(combined_data)
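Because the names and the fetched data are paired by position, a single failed lookup or fetch would shift every later row onto the wrong driver. One way around that, sketched here with made-up data, is to match on cust_id instead. The names_by_id mapping is hypothetical; producing it would mean having get_driver_ids record each name alongside the ID it found.

```python
def combine_by_cust_id(names_by_id, driver_data):
    """Build spreadsheet rows by matching on cust_id instead of list position."""
    rows = []
    for record in driver_data:
        name = names_by_id.get(record["cust_id"])
        if name is None:
            continue  # no name known for this ID; skip rather than guess
        rows.append({
            "Driver": name,
            "SR": record["Safety Rating"],
            "iR": record["iRating"],
        })
    return rows


# Made-up example: stats for the second driver could not be fetched
names_by_id = {101: "Driver A", 102: "Driver B", 103: "Driver C"}
driver_data = [
    {"cust_id": 101, "iRating": 1790, "Safety Rating": "A 4.11"},
    {"cust_id": 103, "iRating": 2450, "Safety Rating": "B 3.20"},
]
rows = combine_by_cust_id(names_by_id, driver_data)
print(rows)
```

The resulting list of rows can be passed to pd.DataFrame in the same way as combined_data.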

Publishing the Data to a Google Sheet

Setting up Google Sheets API access appropriately is beyond the scope of this article. That being said, I recommend setting up OAuth access to discrete services separately, properly securing your access tokens so that others cannot gain unapproved access to them, and following instructions for how and where you allow access to edit documents using the API.

To enable this script to write to a Google Sheet, I chose to leverage the gspread, gspread_dataframe, and oauth2client.service_account Python modules. I do not care about persisting data previously written to the Google Sheet. After each successful write of new data, I want to post a timestamp to let any viewer of the sheet know when the data was last updated, and to give myself a very helpful indication of whether or not the script is continuing to run properly. The code for all of this is:

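## Write DataFrame to Google Sheets

import gspread
from gspread_dataframe import set_with_dataframe
from oauth2client.service_account import ServiceAccountCredentials

# Authenticate using a service account key file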
scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name('/path/to/keyfile.json', scope)
client = gspread.authorize(creds)

# Access the existing sheet by id
sheet_id = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
sheet = client.open_by_key(sheet_id)
tab = sheet.worksheet('Sheet1')

# Clear sheet and write dataframe
tab.clear()
set_with_dataframe(worksheet=tab, dataframe=df, include_index=False, include_column_header=True, resize=False)
tab.update('A8', "TEAM")
tab.update('C8', "=AVERAGE(C2:C6)", raw=False)
tab.format('A8:C8', {'textFormat': {'bold': True, 'fontSize': 12}})
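# Record when the data was last refreshed (local time)
from datetime import datetime
now = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
tab.update('A12', "Last Updated:")
tab.update('B12', now)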

print("Data written to Google Sheets successfully.")

Putting It All Together

Now that we’ve automated the “dirty work” (or “busy work”) with a script, we can stop there and simply run the script anytime we want to update our spreadsheet (or print the information for our team’s drivers to screen), but I’m lazier than that. The final step for me is to set this script to run periodically on its own so the data in my spreadsheet is always current. Most operating systems have some sort of task scheduler built in, including macOS (Automator), Windows (Scheduled Tasks), and Linux (cron), and those can certainly be useful for scheduling tasks however frequently you want them to run. Personally, I have chosen to leverage the Windows Subsystem for Linux, and have my script set to run automatically when that environment starts up at boot. Regardless of how you configure the schedule, you probably do not need to run this script more frequently than once a day; more than that is excessive, as you would most likely retrieve the same information over and over again in any single day (unless your drivers are very active in iRacing official races).
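If the script is triggered at startup, it may fire several times on a day with multiple reboots. A minimal sketch of one way to cap it at roughly once a day is to keep a small stamp file and compare its modification time against the clock. The function names and the stamp file itself are hypothetical and not part of the original script.

```python
import time
from pathlib import Path

ONE_DAY_SECONDS = 24 * 60 * 60


def should_run(stamp_path, now=None, min_interval=ONE_DAY_SECONDS):
    """Return True if the stamp file is missing or older than min_interval seconds."""
    now = time.time() if now is None else now
    stamp = Path(stamp_path)
    if not stamp.exists():
        return True
    return (now - stamp.stat().st_mtime) >= min_interval


def mark_run(stamp_path):
    """Create or touch the stamp file to record a successful run."""
    Path(stamp_path).touch()
```

The main script would call should_run at the top, exit early when it returns False, and call mark_run only after the sheet has been updated successfully.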

I hope that this has inspired you to dig in a little deeper with Python, the iRacing /data API, or the Google API, as all of them can result in hours of entertainment and, ultimately, many hours of saved effort performing mundane tasks. At the very least, I hope this has been insightful and worth your time. Thanks.
