F1 Data Analysis - Who Won When Leclerc Was on Pole?

In today's F1 data analysis, we're going to explore something that's been bugging me for a while: whenever Charles Leclerc grabs pole position, who ends up winning the race? As of the end of the 2023 season, Leclerc has started from pole 23 times, but surprisingly, he has only won 4 of those races. This got me thinking: what's going on in the other 19?

Sure, I could just Google the stats, but where's the fun in that? Instead, I'm going to roll up my sleeves and dive into the data myself. Using the FastF1 Python library, I'll go through the stats to uncover the story behind each of Leclerc's pole positions.

Official Pole Position

When I first started looking at the numbers, I pulled both qualifying and race results. My goal was to see who topped the qualifying charts and then cross-check that against the race results to see who won. But here's where it gets tricky: if a driver qualifies first but then receives a grid penalty, they don't officially get pole position. That pole goes to the next driver in line.

When I pulled pole positions from the qualifying results, Leclerc came out with 21 poles. That count was missing two races: Mexico 2019 and Spa 2023. In both, Max Verstappen was fastest in qualifying but received a grid penalty, so Leclerc inherited the official pole position.

This nuance in how pole positions are awarded is crucial for accurate analysis. It's not just about who's fastest on Saturday; it's about who actually starts the race at the front on Sunday.
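To make the distinction concrete, here is a minimal sketch in plain Python rather than FastF1. The three records are hand-typed from races mentioned in this post, and the field names fastest_qualifier and grid_p1 are my own labels for illustration, not FastF1 columns.

```python
# Hand-typed records for illustration only. The field names are
# hypothetical labels, not FastF1 column names.
races = [
    {"race": "Monza 2019", "fastest_qualifier": "LEC", "grid_p1": "LEC"},
    {"race": "Mexico City 2019", "fastest_qualifier": "VER", "grid_p1": "LEC"},
    {"race": "Spa-Francorchamps 2023", "fastest_qualifier": "VER", "grid_p1": "LEC"},
]


def count_poles(records, driver, key):
    """Count how often `driver` appears under `key` across the records."""
    return sum(1 for r in records if r[key] == driver)


# Counting by qualifying result misses the inherited poles...
by_qualifying = count_poles(races, "LEC", "fastest_qualifier")
# ...while counting by grid slot picks them up.
by_grid = count_poles(races, "LEC", "grid_p1")

# Races where Leclerc started first without setting the fastest time
inherited = [
    r["race"]
    for r in races
    if r["grid_p1"] == "LEC" and r["fastest_qualifier"] != "LEC"
]

print(by_qualifying, by_grid)
print(inherited)
```

This is why the script further down checks the starting grid in the race results instead of the qualifying classification.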

If you are completely new to the FastF1 library or Python in general, please check out my introductory post below.

F1 Data Analysis with Python - the Basics
There’s this great library called ‘FastF1’ that can really simplify things for you. It’s designed to make accessing and analysing F1 data a lot more straightforward and less time-consuming.

Python Script

import fastf1
import logging
import pandas as pd
from tabulate import tabulate

logging.getLogger('fastf1').setLevel(logging.CRITICAL)
pd.set_option('display.max_rows', None)
fastf1.Cache.enable_cache('cache_dir')

final_list = []
headers = ['Year','Track', 'Pole', 'Winner', 'Lec Finish']

for year in range(2018, 2024):
    schedule = fastf1.get_event_schedule(year)
    for index, row in schedule.iterrows():
        # Skip pre-season testing; only race weekends matter here
        if 'Pre-Season' in row['EventName']:
            continue

        track = row['Location']
        session = fastf1.get_session(year, track, 'R')
        session.load()
        df = session.results

        # Whoever started from P1 on the grid is the official pole sitter
        pole_sitter = df[df['GridPosition'] == 1.0]['Abbreviation'].iloc[0]
        if pole_sitter != 'LEC':
            continue

        each_race = []
        winner = df[df['ClassifiedPosition'] == '1']['FullName'].iloc[0]
        lec_row = df[df['Abbreviation'] == 'LEC'].iloc[0]
        try:
            lec_finish = int(lec_row['ClassifiedPosition'])
        except (ValueError, TypeError):
            # No numeric classification: fall back to the status text
            lec_finish = lec_row['Status']
        each_race.extend([year, track, 'LEC', winner, lec_finish])
        final_list.append(each_race)

table = tabulate(final_list, headers=headers)
print(table)

In this piece of code, we're using Python to analyze F1 data, specifically focusing on races where Charles Leclerc started from pole position.

First off, we import the necessary libraries: fastf1 is our go-to for F1 data, logging lets us manage log levels, pandas handles the tabular data, and tabulate presents our results neatly in table format.

We start by setting up our environment. We adjust the logging level for fastf1 to CRITICAL to avoid unnecessary info messages. Then, we tweak pandas to show all rows when printing data frames. We also enable caching in fastf1 to speed up data retrieval.
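If you're curious what the logging line does, here is a small sketch using only the standard library. It doesn't import FastF1 at all; it just shows how a logger named 'fastf1' behaves once its level is set to CRITICAL. How FastF1 routes its own messages through that logger is something I'm taking on trust here.

```python
import logging

# Same call as in the script. getLogger() creates the logger if it does
# not exist yet, so this runs even without FastF1 installed.
logger = logging.getLogger('fastf1')
logger.setLevel(logging.CRITICAL)

# INFO and WARNING messages fall below the threshold and are dropped;
# only CRITICAL messages would still get through.
info_enabled = logger.isEnabledFor(logging.INFO)
critical_enabled = logger.isEnabledFor(logging.CRITICAL)

print(info_enabled, critical_enabled)
```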

Then, we create an empty list, final_list, to store our findings, and define our table headers for later display. We're interested in the year, track, pole sitter, race winner, and Leclerc's finishing position.

We loop through each F1 season from 2018 to 2023. For each year, we fetch the event schedule using fastf1.get_event_schedule(year). We iterate over each event, skipping 'Pre-Season' events. We're only interested in the main events.

For each race, we determine the track location and fetch the race session data. fastf1.get_session(year, track, 'R') gets us the race session ('R') for the given year and track. We load the session data, which includes all the details about that race.

In the next step, we use a DataFrame, df, to hold our session results. We find the abbreviation of the driver who started from pole (GridPosition == 1.0). If it's 'LEC' (Leclerc), we go on to find the winner. If it's anyone else, we don't care and move on to the next race.

For each of Leclerc's pole starts, we create a list, each_race, to store details. We find the race winner and Leclerc's finishing position. If Leclerc wasn't classified with a finishing position, we record his status instead (like 'Turbo' or 'Accident'). We add these details to our each_race list and append it to final_list.
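The position-or-status fallback is easier to see in isolation. Below is a minimal sketch of the same idea as a standalone helper. The function name describe_finish is hypothetical, and the inputs are illustrative values I typed in, not rows pulled from FastF1.

```python
def describe_finish(classified_position, status):
    """Return the finishing position as an int, or the status text
    when the classification is not a number."""
    try:
        return int(classified_position)
    except (ValueError, TypeError):
        # Non-numeric or missing classification: use the status instead
        return status


# Illustrative inputs (assumed, not taken from the API)
print(describe_finish('3', 'Finished'))
print(describe_finish('R', 'Turbo'))
```

Catching only ValueError and TypeError, rather than everything, means a genuine bug elsewhere (a misspelled column name, say) would still surface as an error instead of being silently turned into a status.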

Finally, we use tabulate to neatly print our final_list as a table with our predefined headers. This gives us a clear, organized view of the races where Leclerc started from pole, who won those races, and where Leclerc finished. Here is the table we've all been waiting for.

  Year  Track              Pole    Winner            Lec Finish
------  -----------------  ------  ----------------  ------------
  2019  Sakhir             LEC     Lewis Hamilton    3
  2019  Spielberg          LEC     Max Verstappen    2
  2019  Spa-Francorchamps  LEC     Charles Leclerc   1
  2019  Monza              LEC     Charles Leclerc   1
  2019  Singapore          LEC     Sebastian Vettel  2
  2019  Sochi              LEC     Lewis Hamilton    3
  2019  Mexico City        LEC     Lewis Hamilton    4
  2021  Monte Carlo        LEC     Max Verstappen    Driveshaft
  2021  Baku               LEC     Sergio Perez      4
  2022  Sakhir             LEC     Charles Leclerc   1
  2022  Melbourne          LEC     Charles Leclerc   1
  2022  Miami              LEC     Max Verstappen    2
  2022  Barcelona          LEC     Max Verstappen    Turbo
  2022  Monaco             LEC     Sergio Perez      4
  2022  Baku               LEC     Max Verstappen    Power Unit
  2022  Le Castellet       LEC     Max Verstappen    Accident
  2022  Monza              LEC     Max Verstappen    2
  2022  Marina Bay         LEC     Sergio Perez      2
  2023  Baku               LEC     Sergio Perez      3
  2023  Spa-Francorchamps  LEC     Max Verstappen    3
  2023  Austin             LEC     Max Verstappen    Disqualified
  2023  Mexico City        LEC     Max Verstappen    3
  2023  Las Vegas          LEC     Max Verstappen    2
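To answer the question in the title directly, we can tally the Winner column. This is a quick sketch that works on the rows copied by hand from the table above, so it runs without FastF1 or a network connection. The variable names are my own.

```python
from collections import Counter

# (year, track, winner, Leclerc's finish) copied from the table above
rows = [
    (2019, "Sakhir", "Lewis Hamilton", 3),
    (2019, "Spielberg", "Max Verstappen", 2),
    (2019, "Spa-Francorchamps", "Charles Leclerc", 1),
    (2019, "Monza", "Charles Leclerc", 1),
    (2019, "Singapore", "Sebastian Vettel", 2),
    (2019, "Sochi", "Lewis Hamilton", 3),
    (2019, "Mexico City", "Lewis Hamilton", 4),
    (2021, "Monte Carlo", "Max Verstappen", "Driveshaft"),
    (2021, "Baku", "Sergio Perez", 4),
    (2022, "Sakhir", "Charles Leclerc", 1),
    (2022, "Melbourne", "Charles Leclerc", 1),
    (2022, "Miami", "Max Verstappen", 2),
    (2022, "Barcelona", "Max Verstappen", "Turbo"),
    (2022, "Monaco", "Sergio Perez", 4),
    (2022, "Baku", "Max Verstappen", "Power Unit"),
    (2022, "Le Castellet", "Max Verstappen", "Accident"),
    (2022, "Monza", "Max Verstappen", 2),
    (2022, "Marina Bay", "Sergio Perez", 2),
    (2023, "Baku", "Sergio Perez", 3),
    (2023, "Spa-Francorchamps", "Max Verstappen", 3),
    (2023, "Austin", "Max Verstappen", "Disqualified"),
    (2023, "Mexico City", "Max Verstappen", 3),
    (2023, "Las Vegas", "Max Verstappen", 2),
]

# Wins per driver across Leclerc's pole starts
wins = Counter(winner for _, _, winner, _ in rows)
for driver, count in wins.most_common():
    print(f"{driver}: {count}")

# Races where Leclerc has a status instead of a finishing position
not_classified = sum(1 for *_, finish in rows if not isinstance(finish, int))
print(f"Leclerc without a classified finish: {not_classified}")
```

Max Verstappen comes out on top with 11 wins from Leclerc's 23 poles, ahead of Leclerc himself and Sergio Perez on 4 each, Lewis Hamilton on 3 and Sebastian Vettel on 1.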

Closing Up

That's it for our analysis into Leclerc's pole positions and the race outcomes. It's always more fun to dig into the data ourselves, isn't it?

If you're into DIY stats like this, why not follow my blog for more? And if you've got any thoughts or feedback, I'd love to hear from you in the comments. Cheers and catch you in the next post.