Tom LEFRERE · Data Scientist

· data · pandas · python

Any%English, how much of my YouTube is in English?

Python script to compute the share of my YouTube views in English via the YouTube Data API v3.

The project

A question had been nagging at me for a while: since I started using YouTube, what percentage of the videos I've watched were in English? A bit of a silly question, sure, but one I thought would be a good proxy for my daily exposure to the language.

When I checked the YouTube Data API v3, I hit a catch: it only returns 50 history items at a time. Not enough for my consumption, so I had to get creative.

The solution

  • Export full history via Google Takeout.

  • Load and process with Pandas.

  • Send video IDs in batches of 50 to the YouTube Data API v3.

  • Pull the defaultAudioLanguage for each video.

  • Compute the final ratio.
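
As a rough sketch of how the steps above could fit together (not the project's actual script), the snippet below extracts video IDs from the Takeout history with Pandas, splits them into batches of 50, and asks the `videos.list` endpoint for each video's `defaultAudioLanguage`. Several things here are assumptions: that the history was exported as JSON (`watch-history.json`) with a `titleUrl` field per entry, that an API key is available, and all function names (`extract_video_ids`, `batched`, `fetch_audio_languages`, `language_by_video`), which are hypothetical.

```python
import json
import urllib.parse
import urllib.request

import pandas as pd

API_URL = "https://www.googleapis.com/youtube/v3/videos"
BATCH_SIZE = 50  # videos.list accepts at most 50 IDs per request


def extract_video_ids(history: pd.DataFrame) -> list:
    """Return one video ID per watched entry (assumes a `titleUrl` column)."""
    urls = history["titleUrl"].dropna()  # some entries have no URL at all
    ids = urls.str.extract(r"[?&]v=([\w-]+)", expand=False).dropna()
    return ids.tolist()


def batched(ids, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_audio_languages(batch, api_key):
    """Call videos.list for one batch; return {video_id: language or None}."""
    query = urllib.parse.urlencode(
        {"part": "snippet", "id": ",".join(batch), "key": api_key}
    )
    with urllib.request.urlopen(f"{API_URL}?{query}") as response:
        payload = json.load(response)
    # defaultAudioLanguage is optional, so fall back to None when missing.
    return {
        item["id"]: item["snippet"].get("defaultAudioLanguage")
        for item in payload.get("items", [])
    }


def language_by_video(video_ids, api_key, fetch=fetch_audio_languages):
    """Look up each distinct video once, batch by batch."""
    unique_ids = list(dict.fromkeys(video_ids))  # de-duplicate, keep order
    languages = {}
    for batch in batched(unique_ids):
        languages.update(fetch(batch, api_key))
    return languages
```

Under those assumptions, a run could look like `history = pd.read_json("watch-history.json")` followed by `language_by_video(extract_video_ids(history), api_key)`. Looking up each distinct ID only once keeps the number of requests down when the same video appears several times in the history; the `fetch` parameter is there so the batching logic can be tried without network access.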

Result

Across my last 16,000 videos: 53% in French, the rest mostly in English. More balanced than I expected, since my gut feeling was closer to 70/30 in favour of English.
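
A share like the one above could be computed along these lines. This is a minimal sketch, assuming a Pandas Series holding one `defaultAudioLanguage` value per watched video; the function name `language_shares` is hypothetical, and two choices are assumptions rather than something the project states: regional variants such as `en-US` are folded into their base language, and videos with no declared audio language are left out of the total.

```python
import pandas as pd


def language_shares(languages: pd.Series) -> pd.Series:
    """Fraction of videos per base language, ignoring missing values."""
    # Assumption: videos without a declared audio language are excluded.
    known = languages.dropna()
    # Assumption: "en-US", "en-GB", ... all count as "en".
    base = known.str.split("-").str[0].str.lower()
    return base.value_counts(normalize=True)


# Tiny made-up sample, only to show the shape of the input and output.
sample = pd.Series(["fr", "fr-FR", "en", "en-US", None, "fr"])
shares = language_shares(sample)
print(shares)
```

Whether unlabelled videos are dropped or counted as their own category changes the percentages, so that choice is worth stating alongside any result.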

Context

Personal project, born of a simple curiosity about my digital habits.

Tech stack

  • Python

  • Scrapy / Pandas

  • YouTube Data API v3