
The first large-scale crowdsourced study of the presence of the Facebook Pixel and the data it collects in real-world scenarios.


title
Facebook Pixel Hunt

published
04/08/2022

story series
Read at The Markup

methodology
Read at The Markup




Have you ever shopped for a product online only to see the item appear over and over again in your Facebook or Instagram feeds? This is often the result of the Meta Pixel (formerly called the Facebook Pixel), a sophisticated snippet of computer code used for tracking you around the web, embedded in the HTML of a website. This code works by sending Meta, Facebook’s parent company, a detailed log of such user interactions as clicking on a product.

Few people are aware of how expansively Meta tracks their activities. Through Meta Pixel, for instance, a gambling app might notify Meta that you’ve registered with a particular email address—whether you have a Facebook account or not. The same tracking can occur as you submit an application for a student loan.

The Markup’s Facebook Pixel Hunt, in collaboration with Mozilla Rally, was the first large-scale crowdsourced study of the presence of the Meta Pixel and the data it collects in real-world scenarios—when logging in to websites, submitting forms, buying products, and during everyday browsing activities. Using network traffic data collected from participants who opted in to the study, we monitored the pervasiveness of the Meta Pixel on the internet and the kind of information the tracker collects.


Dataset

Within a few months, we collected a total of 2,635,130 pixel events, the majority of which (the first three noted below) are events sent by default:
  • 39.1% (1,032,218) are PageView events.
  • 22.8% (602,166) are Microdata events.
  • 14.6% (384,286) are SubscribedButtonClick events.
  • 9.0% (236,870) are standard events other than the default PageView.
  • 14.5% (381,919) are custom events, including 5,520 unique custom event types.
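A breakdown like the one above could be produced by bucketing event names, roughly as in the sketch below. This is an illustration only, not The Markup's analysis code: `categorize` and `breakdown` are hypothetical helpers, and the set of standard event names is an assumption taken from Meta's public pixel documentation that may be incomplete or out of date.

```python
from collections import Counter

# Events the text above describes as sent by default.
DEFAULT_EVENTS = {"PageView", "Microdata", "SubscribedButtonClick"}

# Assumption: standard (non-default) event names as listed in Meta's
# public documentation; this list may be incomplete or out of date.
OTHER_STANDARD_EVENTS = {
    "AddPaymentInfo", "AddToCart", "AddToWishlist", "CompleteRegistration",
    "Contact", "CustomizeProduct", "Donate", "FindLocation",
    "InitiateCheckout", "Lead", "Purchase", "Schedule", "Search",
    "StartTrial", "SubmitApplication", "Subscribe", "ViewContent",
}


def categorize(event_name: str) -> str:
    """Map a pixel event name to one of the five buckets used in the list."""
    if event_name in DEFAULT_EVENTS:
        return event_name
    if event_name in OTHER_STANDARD_EVENTS:
        return "standard"
    return "custom"


def breakdown(event_names: list[str]) -> dict[str, tuple[int, float]]:
    """Return {bucket: (count, percent of all events)} for the given names."""
    counts = Counter(categorize(name) for name in event_names)
    total = sum(counts.values())
    return {
        bucket: (count, round(100 * count / total, 1))
        for bucket, count in counts.items()
    }
```

Calling `breakdown` on a list of captured event names gives a count and a percentage per bucket. Any name that is neither a default nor a listed standard event falls into the custom bucket.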

Our main objective was to analyze the data and identify instances where data sent to Meta through the pixel violated Meta's own policies. We focused on health and financial data, as well as plain text email addresses and other private information. When we discovered these violations, we broadened our scope to determine whether the violation was a common pattern within an industry or system, rather than an isolated mistake made by one website owner.

Once we discovered a pattern, we tried to recreate the captured data by running tests with sample data. We saved these network traffic requests in HTTP Archive (HAR) format, a JSON-formatted archive file format for logging a web browser's interaction with a site, as evidence to support the findings.
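As a rough illustration of how a HAR capture can be inspected, the sketch below reads the JSON structure and picks out requests that look like pixel traffic. It is not The Markup's tooling: `pixel_requests` and `load_har` are hypothetical helpers, and the endpoint (`www.facebook.com/tr`) and the `ev` query parameter for the event name are assumptions based on how the pixel commonly appears in network traffic. Only data in the request URL is examined; anything sent in a request body is ignored.

```python
import json
from urllib.parse import parse_qs, urlsplit

# Assumption: pixel requests are sent to this host and path.
PIXEL_HOST = "www.facebook.com"
PIXEL_PATH = "/tr"


def load_har(path: str) -> dict:
    """Load a HAR file, which is plain JSON, into a dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def pixel_requests(har: dict) -> list[dict]:
    """Return the event name and query parameters of each pixel request."""
    found = []
    for entry in har.get("log", {}).get("entries", []):
        url = entry.get("request", {}).get("url", "")
        parts = urlsplit(url)
        # Skip anything that is not addressed to the assumed pixel endpoint.
        if parts.hostname != PIXEL_HOST or parts.path.rstrip("/") != PIXEL_PATH:
            continue
        params = parse_qs(parts.query)
        found.append({
            # Assumption: the event name travels in the "ev" parameter.
            "event": params.get("ev", [None])[0],
            "params": params,
        })
    return found
```

Running `pixel_requests(load_har("capture.har"))` on a recording (the file name here is a placeholder) lists each matching request with its decoded query parameters, which can then be checked for values such as email addresses or form contents.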

When site owners were made aware of the data they were sending to Meta, most acted promptly to remove the tracking. Stories from this study had notable impact: lawmakers demanded fines for tax filing companies that shared financial data, and health companies notified patients of data breaches in their online health systems.

Reporting highlights


title

Tax Filing Websites Have Been Sending Users’ Financial Information to Facebook

reported by
Simon Fondrie-Teitler, Angie Waller, and Colin Lecher

Major tax filing services, such as H&R Block, TaxAct, and TaxSlayer, were found to be transmitting sensitive financial information to Facebook via the Meta Pixel when Americans filed their taxes online.

The data includes not only basic information like names and email addresses, but also more detailed data such as users' income, filing status, refund amounts, and the amount of college scholarships received by dependents.

impact
  • After our investigation, three congressional Democrats demanded that the Internal Revenue Service investigate tax preparation companies for sharing sensitive taxpayer data with Facebook.
  • Citing The Markup’s investigation, representatives sent a letter to the head of the Treasury Inspector General for Tax Administration, which provides independent oversight of the IRS, calling upon the IRS to conduct an investigation. The letter questions whether the companies violated laws requiring tax preparers to keep taxpayers’ information confidential. It was signed by Reps. Adam Schiff and Judy Chu of California and Raja Krishnamoorthi of Illinois.



title
Facebook Is Receiving Sensitive Medical Information from Hospital Websites

reported by
Todd Feathers, Simon Fondrie-Teitler, Angie Waller, and Surya Mattu

The Markup tested the websites of Newsweek’s top 100 hospitals in America. The Meta Pixel was found on 33 of them, sending Facebook a packet of data whenever a person clicked a button to schedule a doctor’s appointment. The data was connected to an IP address—an identifier that’s like a computer’s mailing address and can generally be linked to a specific individual or household—creating an intimate receipt of the appointment request for Facebook.

The Markup also found the Meta Pixel installed inside the password-protected patient portals of seven health systems. On five of those systems’ pages, we documented the pixel sending Facebook data about real patients. The data sent to Facebook included the names of patients’ medications, descriptions of their allergic reactions, and details about their upcoming doctor’s appointments.

impact
Since this reporting, at least five class action lawsuits have been filed against Meta contending that the pixel’s data collection on hospital websites broke various state and federal laws. One, filed against the company on behalf of a Baltimore-based MedStar Health System patient, claims that Meta Pixels collected patient information from at least 664 different hospitals’ websites. The other lawsuits were brought on behalf of patients of Novant Health and hospitals in San Francisco, Los Angeles, and Chicago.

Credits
Lead Investigator Surya Mattu
Study Partner Mozilla Rally
Data Scientist Micha Gorelick
Research Manager Angie Waller
Infrastructure Engineer Simon Fondrie-Teitler
Data Editor Jeremy Singer-Vine
Copy Editor/Producer Jill Jaroff
Editors Rina Palta, Julia Angwin, Stephen J. Alder