About This Project

Hi, I am Andrea. I am a software engineer from Italy with a background in Web Information and Data Engineering. I have always been passionate about data, the web, and building small projects that turn curiosity into something tangible.

Every Monday, I enjoy watching HumanSafari and his adventures around the world. If you have seen his videos, you know he has a habit of exploring local supermarkets and, at some point, he started casually mentioning Nutella prices in different countries.

That gave me an idea: why not build a simple data pipeline to analyze all his videos and extract every Nutella price he has ever mentioned?

For the curious and the geeks

Here is a more technical breakdown of how this dataset came to life:

  • It all started with scraping the transcripts of Nicolò’s (HumanSafari’s) videos.
  • I used a simple regex search for "Nutella" to identify relevant mentions.
  • For each match, I extracted a context window and checked for nearby references to prices and weights.
  • I normalized the data using regex patterns to extract weight, local price, and currency.
  • When possible, I inferred the country from the video title.
  • Then I used AI to label each entry as ready, missing data, ambiguous, or false positive.
  • After that, I manually reviewed everything, often going back to the exact moment in the video referenced by the transcript.
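
For the geeks who want to see what the regex steps might look like, here is a minimal Python sketch of the search, context window, and normalization stages. It is an illustration under assumptions, not the project's actual code: the patterns, the window radius of 80 characters, the short currency list, and the names context_windows and extract_observation are all hypothetical, and the sample sentence is made up.

```python
import re

# Hypothetical patterns: the project's real regexes are not shown in the text.
NUTELLA_RE = re.compile(r"nutella", re.IGNORECASE)
WEIGHT_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(kg|g)\b", re.IGNORECASE)
PRICE_RE = re.compile(
    r"(\d+(?:[.,]\d+)?)\s*(euro|dollari|pesos|yen)\b", re.IGNORECASE
)


def context_windows(transcript, radius=80):
    """Yield the text surrounding each 'Nutella' mention."""
    for match in NUTELLA_RE.finditer(transcript):
        start = max(0, match.start() - radius)
        end = min(len(transcript), match.end() + radius)
        yield transcript[start:end]


def to_number(text):
    """Parse '4,50' or '4.50' as a float (decimal comma or decimal point)."""
    return float(text.replace(",", "."))


def extract_observation(window):
    """Pull weight (in grams), local price, and currency out of one window.

    Fields that cannot be found are left as None, so a later labeling or
    review step can flag the entry as missing data.
    """
    weight = WEIGHT_RE.search(window)
    price = PRICE_RE.search(window)

    weight_g = None
    if weight:
        value = to_number(weight.group(1))
        # Normalize kilograms to grams so all entries share one unit.
        weight_g = value * 1000 if weight.group(2).lower() == "kg" else value

    return {
        "weight_g": weight_g,
        "price": to_number(price.group(1)) if price else None,
        "currency": price.group(2).lower() if price else None,
    }


# Made-up transcript line, used only to exercise the sketch.
sample = "Siamo al supermercato e qui la Nutella da 350 g costa 4,50 euro, incredibile."
for window in context_windows(sample):
    print(extract_observation(window))
```

A fixed-radius window is the simplest choice, but it can pick up a price that belongs to a different product mentioned nearby, which is one reason the labeling and manual review steps above still matter.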

The result

I may have missed a few entries here and there, but I managed to build a clean and usable dataset without watching thousands of hours of footage.

More importantly, I turned a fun idea into something you can explore, compare, and maybe even contribute to.

If you enjoy this project even half as much as I enjoyed building it, that is already a win 🙂

If you want to explore the project from another angle, open the guide, read the data information or submit a new observation.