title
Citizen Browser
published
01/05/2021
story series
Read at The Markup
methodology
Read at The Markup
Citizen Browser is the first application to track what is happening on Facebook in real time.
To better understand the content that Facebook's algorithms distributed during the 2020 US elections and the 2021 elections in Germany, The Markup developed Citizen Browser. The application periodically scraped the Facebook feeds of a paid panel of volunteers between 2020 and 2021. My role in the project included supporting the panelists in using the application, coordinating the technical development and maintenance of the app, training journalists to work with the data, and analyzing data for story development.
Citizen Browser collected data from over 3,500 panelists in the US and 600 contributors from Germany, which included:
- 20 million Facebook posts with interaction metrics
- 57 million recommended Facebook groups
- 3 million targeted ads
The application was designed to prioritize participant privacy. The raw data collected from the feeds remained private. Journalists and researchers could access only redacted data, which had been parsed to remove personal and private information. Because the application focused on Facebook's algorithm and ad targeting, no interactions, private posts, or comments from the volunteers were stored.
The application was designed to have minimal impact on the participants. They only needed to install the application and keep it running in the background.
One of the main technical challenges was maintaining the parsers that redacted private information. Facebook frequently updates its design and code elements, so new parsing scripts had to be written and tested to accommodate each change. Because Facebook rolls out code changes to different individuals at different times, we implemented a data pipeline with multiple parsers to capture redacted data from each individual's version of the interface.
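The multi-parser idea can be illustrated with a minimal Python sketch. It is not the project's actual code: the two markup layouts, the function names (`parse_layout_a`, `parse_layout_b`, `parse_redacted`), and the record fields are all hypothetical. Each parser recognizes one layout and copies out only an allow-listed set of fields, so anything else in the markup never reaches the stored record.

```python
import re
from typing import Callable, List, Optional

# A parser takes raw markup and returns a redacted record, or None when the
# markup does not match the layout that parser was written for.
Parser = Callable[[str], Optional[dict]]


def parse_layout_a(html: str) -> Optional[dict]:
    # Hypothetical layout A: post text in a <div>, like count in an attribute.
    m = re.search(r'<div class="post" data-likes="(\d+)">(.*?)</div>', html, re.S)
    if m is None:
        return None
    return {"text": m.group(2).strip(), "likes": int(m.group(1)), "layout": "a"}


def parse_layout_b(html: str) -> Optional[dict]:
    # Hypothetical layout B: same information under different element names.
    m = re.search(r'<article data-reactions="(\d+)"><p>(.*?)</p></article>', html, re.S)
    if m is None:
        return None
    return {"text": m.group(2).strip(), "likes": int(m.group(1)), "layout": "b"}


# Parsers are tried in order; supporting a new layout means appending a parser.
PARSERS: List[Parser] = [parse_layout_a, parse_layout_b]


def parse_redacted(html: str) -> Optional[dict]:
    for parser in PARSERS:
        record = parser(html)
        if record is not None:
            return record
    # Unknown layout: store nothing rather than risk keeping unredacted data.
    return None
```

In this sketch, a capture that matches no known layout yields `None`, which signals that a new parser needs to be written and tested before that layout's data is used.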
In addition to managing the project and contributing to stories, I provided support to the panel of over 3,000 volunteers. Because the project was engineered with close attention to privacy, I could confidently answer questions about user privacy and data protection. Originally intended to last a few months, the panel was extended to over a year. During this time, some panelists remained active throughout, while others temporarily left and later reinstalled the application because of technical issues or device replacements. As all user IDs were anonymous, I never connected the data to the individuals I supported. However, spending significant time on technical customer service gave me a deeper appreciation for participants, and a desire to support them as collaborators rather than treat them as mere datasets.
Reporting highlights
Split Screen: A tool for comparing Facebook feeds
Using data from Citizen Browser panelists, Split Screen offers an interactive method for individuals to observe what various groups of people see on their Facebook feed. This tool provides a glimpse into news stories, group recommendations, and common hashtags seen by individuals who share demographic information such as gender, age, or political leanings.
How the far-right nationalist party dominated Facebook prior to German elections
In the weeks leading up to the September 2021 election to replace long-serving Chancellor Angela Merkel, data from The Markup's Citizen Browser revealed that posts promoting the far-right nationalist political party Alternative für Deutschland (AfD) were shown to German panelists more than three times as often as posts from other political groups. This was the case even though our panel in Germany included more individuals who self-identified as supporters of the center-left Social Democratic Party and the center-right Christian democratic political alliance than as supporters of the AfD.
How Facebook allows advertisers to target protected groups
In late 2021, after years of criticism of its practices, Facebook announced a change to its multibillion-dollar advertising system. Under the change, companies buying ads would no longer be able to target people based on interest categories such as race, religion, health conditions, politics, or sexual orientation.
However, more than three months after the supposed implementation of this change, we discovered that ad targeting based on these categories was still available on Facebook's platform through clear proxies.
Project lead engineer and data analyst: Surya Mattu
API and data pipeline developer: Micha Gorelick
Project manager and data analyst: Angie Waller
App developers: Jeff Crouse and Ian Ardouin-Fumat
Infrastructure and security engineer: Simon Fondrie-Teitler
Copy editor/producer: Jill Jaroff
Data pipeline developer: Leon Yin
App design and graphics: Sam Morris
Editors: Julia Angwin and Rina Palta