INTRODUCTION
Beginning in 2018, on the heels of the Cambridge Analytica scandal, conversations about the scale and impact of political microtargeting (‘PMT’) began to significantly fuel the news cycle and shape the political agenda. The EU began its war on disinformation and convinced leading internet platforms, including Facebook, to self-regulate. As a result, internet users were offered new tools and disclaimers (see: Facebook moves towards transparency) meant to increase the transparency of targeted political ads. Researchers rushed to analyse data from Facebook’s ad library, hoping to find at least some answers to the very troubling questions raised by the Cambridge Analytica scandal.
These questions included: Is it possible to target people based on their psychometric profile, using only Facebook data? Under what circumstances does PMT have a significant impact on voters? How does PMT differ between the U.S. and Europe, given Europe’s more modest advertising budgets, stronger legal protections, and varied political cultures? Does the use of PMT call for new legal rules? Are we ready to name concrete problems, and identify which of them can be solved by regulators? And last but not least: Who should be the target of such regulation? Political parties and their spin doctors, because they commission targeted ads? Or Facebook and other internet platforms, because they deliver targeting results and build the algorithms, or “magic sauce,” that make it all possible?
Polish and European political momentum
With these questions in mind, Panoptykon Foundation, ePaństwo Foundation, and SmartNet Research & Solutions (all based in Poland but operating in international networks, see: About us) embarked on a joint research project titled “Political microtargeting in Poland uncovered: feeding domestic and European policy debate with evidence”. The project started in spring 2019 and was funded by the Network of European Foundations as part of the Civitates programme. The timing seemed ideal: Poland was in the middle of its electoral triathlon, which began with local elections (2018), continued through EU and domestic parliamentary elections (2019), and concluded with presidential elections (2020). We also cooperated with Who Targets Me to leverage their crowdsourced database of political advertisements from Facebook. (Beginning with the 2019 EU elections, the Who Targets Me plugin was available for Polish users).
Our research was based on two streams of data: data voluntarily shared with us by Polish Facebook users (via the Who Targets Me plugin), and data obtained by us directly from Facebook (via its official Ad Library and the API made available to researchers). Based on this data, we hoped to measure the value of Facebook’s new transparency tools as well as collect evidence on how targeted political advertising is used as a campaign tool in Poland. We believe European policy debates would benefit from more evidence on the use of PMT, especially in light of upcoming regulation of internet platforms (the so-called Digital Services Act, to be proposed by the European Commission by the end of 2020).
Research questions and (missing) answers
When designing our research, we sought to answer a handful of important questions:
- Are voters’ vulnerabilities exploited by political advertisers or by the platform itself?
- Are there different political messages for different categories of voters?
- Are there any contradictions in messages coming from the same political actor?
- Is PMT used to mobilise or demobilise specific social groups?
- What is the role of “underground,” or unofficial, political marketing in this game of influence?
After 12 months of collecting and analysing data, we still don’t have all the answers. And we won’t have them until we are able to force online platforms, Facebook in particular, to reveal all the parameters and factors that determine which users see which ads. In the course of this research, we found the missing answers (e.g. what we still don’t know about Facebook’s “magic sauce”) as interesting as the evidence we did find. We have also discovered that country-specific regulations are not equipped to secure the transparency of election campaigns. We hope that this work — in particular our recommendations — will be a meaningful contribution to the pool of research on the impact of PMT and the growing role of internet platforms in the game of political influence.
How to navigate the report
In the first part of the report, “Microtargeting, politics, and platforms,” you will find:
- An explanation of our choice to focus narrowly on PMT, given it is one of many techniques for influencing voters;
- An overview of the role of political parties, which use various targeting techniques (not restricted to PMT);
- An explanation of why online platforms, Facebook in particular, play a key role in microtargeting voters based on their behavioural data;
- An explanation of the ad targeting process and the roles that advertisers and Facebook play;
- And basic information about Facebook’s ad transparency tools and policies related to the use of PMT (self-regulation inspired by the European Commission).
In the second part of the report, “Microtargeting in the Polish 2019 election campaign. A case study,” you will find:
- An explanation of our research methodology and objectives;
- Key takeaways and collected evidence;
- Evaluation of transparency tools that are currently offered by Facebook;
- And a summary of what remains unknown and why these blind spots are so problematic.
In the third part of the report, “Recommendations,” you will find:
- Our recommendations for policymakers (based on our own work and the work of other researchers);
- Our interpretation of the GDPR in the context of PMT;
- Our proposal for new disclosures as well as new data management and advertising settings, which would help users control their data and manage their choices in the context of PMT.
Who is it for?
We drafted this report thinking about policymakers and experts interested in regulating PMT and the transparency of election campaigns in social media, as well as fellow researchers and activists who grapple with the same questions. We are aware that this audience does not need an overview of the existing literature on this topic. In the interest of producing a useful and concise report, we essentially limited its scope to the presentation of new evidence (i.e. the case study showing how PMT was used during the 2019 elections in Poland) and our recommendations, which we formulated in order to fuel the debate. Definitions, context, and the explanation of the ad targeting process (Part 1 of the report) have been kept to a minimum. For those who are interested in a broader picture (going beyond the use of PMT on Facebook) or need more information about the targeting process, we have included a reading list at the end of the report.
Part I. Microtargeting, politics, and platforms
1. Political microtargeting as (part of) the problem
Microtargeting and potential harms
Political campaigns increasingly use sophisticated campaigning strategies fueled by people’s personal data. One of these techniques is microtargeting: using voters’ data to divide them into small, niche groups and target them with messages tailored to their sensitive characteristics (such as psychometric profiles).
The UK Information Commissioner defines microtargeting as “targeting techniques that use data analytics to identify the specific interests of individuals, create more relevant or personalised messaging targeting those individuals, predict the impact of that messaging, and then deliver that messaging directly to them.” According to Tom Dobber, Ronan Ó Fathaigh and Frederik J. Zuiderveen Borgesius, the distinguishing feature of microtargeting is turning one heterogeneous group (e.g. inhabitants of a particular neighbourhood) into several homogeneous subgroups (e.g. people who desire more greenery in the city centre, people who use a particular type of SIM card, etc.). In the words of the researchers:
Micro-targeting differs from regular targeting not necessarily in the size of the target audience, but rather in the level of homogeneity, perceived by the political advertiser. Simply put, a micro-targeted audience receives a message tailored to one or several specific characteristic(s). This characteristic is perceived by the political advertiser as instrumental in making the audience member susceptible to that tailored message. A regular targeted message does not consider matters of audience heterogeneity.
Colin Bennett and Smith Oduro-Marfo observe that microtargeting can be conducted across a number of different variables: Not only the demographic characteristics of the audience, but also a clear geographic location, policy message, and means of communication:
The practice varies along a continuum with the “unified view” of the voter at one extreme end, and the mass general messaging to the entire population at the other. Most “micro-targeted” messages fall somewhere in between and are more or less “micro” depending on location, target audience, policy message, means of communication and so on. Thus, micro-targeted messages might be directed towards a precise demographic in many constituencies. But they may equally be directed towards a broader demographic within a more precise location. A precise and localised policy promise, for instance, might appeal to a very broad population within a specific region.
Existing research shows that political microtargeting (or PMT in this report) carries several risks related to its inherent lack of transparency. It may:
- Enable selective information exposure, resulting in voters’ biased perception of political actors and their agenda;
- Enable manipulation by matching messages to specific voters’ vulnerabilities;
- Fuel fragmentation of political debate, create echo chambers, and exacerbate polarisation;
- Enable an opaque campaign to which political competitors cannot respond;
- Be used to exclude specific audiences or discourage political participation;
- Make detection of misinformation more difficult;
- Facilitate non-transparent PMT paid for by unofficial actors;
- Raise serious privacy concerns.
Why we chose to focus on PMT
We acknowledge that political parties have many other tools at their disposal for influencing potential voters (see: Political parties as clients: hiding in the shadows). In this context, it is important to view PMT as one part of a much bigger picture. Nevertheless, in this report we will not look at the whole picture, but exclusively at the use of PMT. We chose this focus for the following reasons:
- Other researchers, civil society organisations, and investigative journalists are in a much better position than us to shed light on the advertising or communication practices of political actors. Our mission is to expose the practices of surveillance and the use of personal data to control human behaviour;
- Amongst many data-driven advertising practices, microtargeting — which is often based on users’ behavioural (observed) data and data inferred with the use of algorithms — poses specific threats and therefore deserves more attention as a relatively recent phenomenon in the world of advertising;
- Given technological trends such as the popularisation of smart, connected objects (IoT) and fast-growing investments in AI, it seems likely that in the coming years we will face an explosion of behavioural data and new breakthroughs in predictive data analysis;
- Unless restricted by legal regulations, both online platforms and their clients (including political advertisers) will be keen to experiment with these new sources of personal data and new predictive models in order to target ads even more efficiently;
- Facebook has already sent a strong signal to its clients that the best way to increase their audiences’ engagement on the platform is to choose sponsored content. Since Facebook’s business model is based on advertising revenue, we can expect that this trend will continue and — as the reach of organic posts continues to decline — political advertisers will be pushed to invest more money in techniques such as PMT;
- In the light of these trends, we saw the need to produce country-specific evidence on the use of PMT by Facebook (and its clients) during 2019 election campaigns, hoping to inform a pending political debate on whether the EU should introduce further regulations in this area.
2. Political parties as clients: hiding in the shadows
According to the British ICO, “as data controllers, political parties are the client for the political advertising model, and sit at the hub of the ecosystem of digital political campaigning.” At the same time, we know little about political parties’ data-driven techniques for voter analytics and political persuasion. The key problem here relates to non-transparent spending and the existence of the political marketing “underground” — unofficial actors like Cambridge Analytica acting on behalf of political parties or for their own interest. Current regulations written for traditional political campaigning do not account for problems generated by online platforms and data brokers, who are now key players in this game.
In their “Personal Data: Political Persuasion” report, the Tactical Tech Collective examines the “political influence industry” and the techniques at its disposal. The authors of the report categorise data-driven campaigning methods to “loosely reflect how value is created along the data pipeline” and distinguish three crucial types of techniques:
- Acquisition - data as a political asset: valuable sets of existing data on potential voters exchanged between political candidates, acquired from national repositories or sold or exposed to those who want to leverage them. This category includes a wide range of methods for gathering or accessing data, including consumer data, data available on the open internet, public data, and voter files.
- Analysis - data as political intelligence: data that is accumulated and interpreted by political campaigns to learn about voters’ political preferences and to inform campaign strategies and priorities, including the creation of voter profiles and the testing of campaign messaging. This includes techniques such as A/B testing, digital listening, and other methods for observing, testing, and analysing voters and political discussions.
- Application - data as political influence: data that is collected, analysed, and used to target and reach potential voters with the aim of influencing or manipulating their views or votes. This includes a diverse set of microtargeting technologies designed to reach individual types and profiles, from psychometric profiling to Addressable TV. The artful use of these techniques in unison has been touted by some campaigners as the key to their success.
In the Polish context, there is little we can do to uncover this kind of shadowy political influence, apart from looking at its online traces. Political microtargeting is one such trace. And Facebook is, at this point, the most open platform for documenting it. This realisation was one of the premises behind our research. There is, however, a second critical premise: Online platforms have their own tools of power and influence that matter in the context of political elections, and therefore should be subject to public scrutiny.
3. Facebook as a key player in the game
When we chose to focus on the use of PMT, it quickly became clear that we would also need to focus on Facebook.
In recent years, Facebook has established itself as a leader in the political advertising market. In the EU, its political advertising revenue dwarfs that of Google. According to self-assessment reports published by both companies, between March and September 2019 political advertisers spent €31 million (not including the UK) on Facebook and only €5 million on Google. Even taking into consideration Facebook’s broader definition of political and issue ads, the difference is immense.
While other platforms also allow for targeting users based on their behavioural data (e.g. Google’s AdWords draws on users’ search activity, which may include political content), Facebook has built its reputation as the platform with the most granular advertising interface. It has also boasted about its ability to match ads and users based on hidden (and sometimes vulnerable) characteristics.
In 2015, Facebook introduced the concept of custom audiences (see: Custom audience), allowing advertisers to create their own specific customer segments directly within Facebook Ads Manager, without the need to extract the data and analyse it externally. In 2016, this feature was expanded so that advertisers could also target “lookalikes” (see: Lookalikes) of their custom audience, benefiting from Facebook’s own data analytics. Since then, Facebook has been improving its matching algorithm. This effort has resulted in new patents for models and algorithms, including one that allegedly allows for insight into users’ psychometric profiles based on seemingly benign data such as Likes.
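Both features are exposed to advertisers programmatically. Below is a minimal sketch of how an advertiser might create a custom audience and then request its lookalikes via the Marketing API. The account id, token, and API version are placeholders, and parameter names vary between API versions, so treat this as illustrative rather than a verified recipe.

```python
import json
import requests

GRAPH = "https://graph.facebook.com/v5.0"  # placeholder API version
AD_ACCOUNT = "act_123456789"               # placeholder ad account id
TOKEN = "EAAB..."                          # placeholder access token

# 1. Create an empty custom audience, to be populated with a customer list.
resp = requests.post(
    f"{GRAPH}/{AD_ACCOUNT}/customaudiences",
    data={
        "name": "Newsletter subscribers",
        "subtype": "CUSTOM",
        "customer_file_source": "USER_PROVIDED_ONLY",
        "access_token": TOKEN,
    },
)
seed_audience_id = resp.json()["id"]

# 2. Ask Facebook to build a lookalike audience: users its models predict
#    to resemble the seed audience, here the closest 1% of Polish users.
requests.post(
    f"{GRAPH}/{AD_ACCOUNT}/customaudiences",
    data={
        "name": "Lookalikes of subscribers (PL, 1%)",
        "subtype": "LOOKALIKE",
        "origin_audience_id": seed_audience_id,
        "lookalike_spec": json.dumps({"country": "PL", "ratio": 0.01}),
        "access_token": TOKEN,
    },
)
```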
In this context, it is hardly surprising that Facebook became the target of investigations after the Cambridge Analytica scandal. While media reports focused on Facebook’s ability to microtarget voters and exploit their vulnerabilities, regulators turned out to be more concerned with the “security breach” — that is, with allowing malicious third parties, such as Cambridge Analytica, to access users’ data. In the U.S., these allegations resulted in a $5 billion fine from the Federal Trade Commission. In the UK, a similar investigation led by the Information Commissioner’s Office resulted in a fine of £500,000.
This narrow focus of public investigations shows the challenge we face, as a society, in understanding and exposing the true role that online platforms play in political influence. On the one hand, it is unquestionable that Facebook is an enabler of precision-targeted political messages, which requires access to behavioural (observed) data and sophisticated algorithms (both treated by the platform as its “property”). On the other hand, there is scarce evidence on how intrusive Facebook’s behavioural profiles really are and to what extent users’ vulnerabilities are exploited. From a regulator’s perspective, it is much easier to argue that Facebook’s policy has led to a “security breach” than to prove that microtargeted ads were based on inferred behavioural data and, as such, violated users’ privacy and self-determination.
In this report, we argue that Facebook’s role in microtargeting voters should not be underestimated. We believe it deserves at least the same attention and scrutiny as the role of so-called malicious third parties, be they data brokers like the infamous Cambridge Analytica or the proverbial Russian spin doctors.
Two pillars of Facebook’s power
Facebook’s ability to influence voters’ behaviour is supported by two pillars:
Data on voters’ behaviour
Research and journalistic investigations have established that Facebook constantly monitors users’ behaviour (e.g. their location, social interactions, emotional reactions to content, and clicking and reading patterns). By coupling this data with algorithmic processing and big data analysis, Facebook is capable of inferring political opinions, personality traits, and other characteristics, which can be used for political persuasion. As a result, Facebook knows more about citizens than political parties do. Political parties can commission a social survey or even buy customer data (see above), but they will never be able to profile the whole population, nor to verify declared opinions against facts (actual behaviour). Meanwhile, an estimated 56% of people with voting rights in Poland are Facebook users, and the number is growing. It is therefore not surprising that political parties are eager to tap into Facebook’s data to reach potential voters.
Algorithm-assisted ad delivery
Facebook is not merely a passive intermediary between advertisers and users. Rather, the platform plays an active role in targeting ads: Its algorithms interpret criteria selected by advertisers and deliver ads in a way that fulfills advertisers’ objectives. This is especially pertinent in countries where political parties do not have access to voters’ personal data via electoral registries (as is the case in most European countries), or do not engage in sophisticated voter analytics. In such cases, Facebook and other online platforms offer political parties the means to reach specific groups without having to collect data.
4. Ad targeting explained
In this section, we explain the key decisions made in the targeting process. Even though this process is initiated by the advertisers who create ad campaigns and choose the profile of their target audience, the key role still belongs to Facebook. In fact, the platform plays many roles: from data collection and analysis, which enables identifying users’ attributes for advertising purposes, to optimisation of ad delivery, and everything that happens in between. While political advertisers who use Facebook do make their own choices, those choices are shepherded by Facebook and increasingly rely on data that Facebook has collected or inferred.
Advertisers’ role: describing their audience
Facebook’s interface for advertisers allows them to select their target audience based on specific characteristics, which are defined by the platform. After making this selection, advertisers can assign their own names to the targeted audience and save it for future use. When it comes to defining targeting criteria and campaign strategies, advertisers have the following criteria at their disposal:
- Demographics
Advertisers may select age range and gender, as well as the language spoken by users they wish to target.
- Location
Advertisers may select different types of locations: from larger areas, such as country, state, region, city, and election districts (in the U.S.), to very precise areas defined by individual postcodes or a radius as small as one mile from a particular address (e.g. an election rally). For each type of location, advertisers can define whether they want to reach people who live in this location, people who were recently in this location, or people who are travelling in this location. These options make precise geotargeting simple.
- Predefined targeting attributes
Facebook allows advertisers to further refine their audience by selecting one or many additional criteria from a list of over 1,000 predefined attributes. These additional options can be based on life events or demographics (e.g. recently engaged users, expectant parents), interests (e.g. users interested in technology or fitness), or behaviours (e.g. users of a particular mobile operating system, people who have travelled recently).
- Free-form attributes
In addition to predefined attributes, Facebook offers advertisers the possibility to select free-form attributes. In Facebook’s Marketing API, advertisers can freely type user characteristics they are interested in and obtain suggestions from Facebook. Researchers have identified more than 250,000 free-text attributes, including Adult Children of Alcoholics, Cancer Awareness, and other sensitive classifications.
- Custom audience
Advertisers who have their own sources of data (e.g. customer lists, newsletter subscribers, lists of supporters, or — in some countries — registered voters) can upload them to Facebook. Facebook then matches this uploaded information (e.g. emails) with its own data about users (e.g. emails used for login or phone numbers used for two-factor authentication), without revealing the list of individual profiles to advertisers. This feature enables advertisers to reach specified individuals without knowing their Facebook profile names.
- Lookalikes
Advertisers can target users who are similar to an existing audience (e.g. people who liked the advertiser’s Facebook page, or people who visited their website or downloaded their app). In practice, targeting so-called lookalikes entails asking Facebook to find people who are predicted to share characteristics with the seed audience. These lookalikes are determined by Facebook using customer similarity estimations that are constantly being computed by Facebook’s matching algorithm.
- Exclusions
Advertisers can also exclude people from their audience, by defining excluded demographics, locations (e.g. an ad should be targeted to users in Poland with the exception of a particular city or region), attributes (e.g. targeting parents but excluding those who use the Android operating system), or individual users from the uploaded custom audience list. Regarding the latter: In January 2020, Facebook announced that users will be granted the possibility to opt in to see ads even if an advertiser used a custom audience list to exclude them.
- Desired advertising objective and budget considerations
Normally, the size of the relevant audience (users who fit the criteria selected by the advertiser) is bigger than the size of the audience the advertiser can reach with a limited budget. To ensure they reach the most relevant users, advertisers specify their desired advertising objective. This objective is taken into account by Facebook when optimising ad delivery and selecting users for the targeted audience (see more: Facebook’s Role). In the first step of this process, advertisers choose from three broad objectives:
  - Awareness (aimed at generating interest and usually measured in the number of ad views),
  - Consideration (aimed at making people seek more information and usually measured in the number of engagements with content),
  - Conversion (aimed at encouraging people to buy or use the product or service advertised and usually measured by analysing specific actions, e.g. store visits, purchases).
Advertisers may also define the timeframe for running the ad, the placement of the ad (e.g. newsfeed, Messenger, right sidebar), and the bid price (e.g. how much they are willing to pay for one ad view). Advertisers can also set limits on their budget (e.g. daily budget caps).
- A/B Testing
Advertisers can run A/B tests for particular ad versions, placements, target groups, and delivery-optimisation strategies. Advertisers test ads against each other to learn which strategies give them the best results, and then adjust accordingly.
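Taken together, these choices are expressed as a single targeting specification that the advertiser submits to Facebook. The sketch below shows the general shape of such a spec in the Marketing API; the ids and several field names are made up for illustration and should not be read as the verified schema.

```python
# Hypothetical targeting spec, loosely modelled on Facebook's Marketing API.
# All ids are invented; field names are illustrative, not verified.
targeting_spec = {
    "age_min": 18,
    "age_max": 34,
    "genders": [2],                  # numeric gender codes in the API
    "geo_locations": {
        "countries": ["PL"],
        # Precise geotargeting instead: a one-mile radius around an address.
        # "custom_locations": [{"latitude": 52.23, "longitude": 21.01,
        #                       "radius": 1, "distance_unit": "mile"}],
    },
    "excluded_geo_locations": {
        "cities": [{"key": "2430536"}]   # e.g. exclude one city
    },
    "interests": [
        {"id": 6003123456789, "name": "Environmentalism"},  # predefined attribute
    ],
    "custom_audiences": [{"id": "23840000000000001"}],           # uploaded list
    "excluded_custom_audiences": [{"id": "23840000000000002"}],  # exclusions
}
```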
Facebook’s role: building users’ profiles and targeting ads
Although political parties play an active role in defining their preferred audience, ultimately it is not up to them to decide which users fall into targeted groups. At this stage, key decisions are made by Facebook and are informed by everything Facebook knows about its users (including off-Facebook activity and data inferred by its algorithms). Political advertisers do not have access to the rationale behind these decisions, nor do they have access to the effects of Facebook’s analysis (e.g. individual user profiles). Facebook’s advertising ecosystem is a walled garden controlled and operated by the platform. In fact, as established by other researchers, even if advertisers do not aim to discriminate in how they select their preferred audience, Facebook’s ad delivery and optimisation process can still lead to discrimination and contribute to political polarisation.
Below, we share our insights from investigating how Facebook collects and analyses users’ personal data, builds their profiles, and applies the “magic sauce” that enables effective ad targeting. Please note that we do not have the full picture of that process, because it remains opaque. The understanding that we present below has been based on fragmented information revealed by Facebook (e.g. in its published patents) and on discoveries made by other investigators.
1. Data collection and profiling
Targeting starts with data collection, which provides a foundation for obtaining statistical knowledge about humans and predicting their behaviour. This process is both troubling and fascinating, and there exist many excellent investigations into how Facebook collects and analyses users’ data. (Our favorite is Vladan Joler and SHARE Lab’s series Facebook Algorithmic Factory).
For the purposes of this report, we will only cover the most common sources of data that are relevant to political targeting:
- Data provided by users (e.g. profile information, interactions, content uploaded);
- Observations of user activity and behaviour on and off Facebook. This ranges from metadata (e.g. time spent on the website), to the device used (e.g. IP addresses), to GPS coordinates, to browsing data collected via Facebook’s cookies and pixels on external websites;
- Data from other Facebook companies, like Instagram, Messenger, WhatsApp, and Facebook Payments;
- Information collected from Facebook partners and data brokers such as Acxiom and Datalogix (discontinued in 2018).
All of this data is analysed by algorithms and compared with data from other users, in the search for meaningful statistical correlations. Facebook’s algorithms can detect simple behavioural patterns (such as users’ daily routines, based on location) and social connections. But thanks to big data analysis, Facebook is also able to infer hidden characteristics that users themselves are not likely to reveal: their real purchasing power, psychometric profiles, IQ, family situation, addictions, illnesses, obsessions, and commitments. According to some researchers, with just 150 likes, Facebook is able to make a more accurate assessment of users’ personality than their parents. A larger goal behind this ongoing algorithmic analysis is to build a detailed and comprehensive profile of every single user; to understand what that person does, what she will do in the near future, and what motivates her.
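The personality findings cited above rest on fairly simple statistical machinery: a large, sparse users-by-likes matrix, dimensionality reduction, and a linear model trained on questionnaire scores. The sketch below is a schematic reconstruction of that published approach on toy random data, not Facebook’s actual pipeline.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LinearRegression

# Toy data: rows are users, columns are pages; 1 means the user liked the
# page. In the published studies the matrix has millions of rows.
rng = np.random.default_rng(0)
likes = csr_matrix(rng.binomial(1, 0.05, size=(1000, 500)))
openness = rng.normal(size=1000)  # trait scores from a personality test

# Compress the sparse like matrix into dense components, then fit a
# linear model mapping like patterns to the trait.
components = TruncatedSVD(n_components=50, random_state=0).fit_transform(likes)
model = LinearRegression().fit(components, openness)

# For a new user, a few dozen likes already yield a trait estimate.
print(model.predict(components[:1]))
```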
2. Ad targeting and optimisation
Determination of the relevant audience (ad targeting)
After an advertiser creates an ad campaign and selects criteria for the people they wish to reach, it is Facebook’s task to determine which users match this profile. All users who fulfill the advertiser’s criteria belong to the “relevant audience,” which should not be confused with the “targeted audience” (see below).
Depending on how the advertiser selected their target audience, Facebook will have slightly different tasks:
- Demographics, location, or attributes: Facebook will compare advertisers’ criteria with individual user profiles and determine which users meet these requirements;
- Lookalike audience: Facebook identifies common qualities of individuals who belong to the so-called seed audience (e.g. their demographic data or interests). Then, with the use of machine learning models, Facebook identifies users who are predicted to share the same qualities;
- Custom audience list: Facebook matches personal data uploaded by the advertiser with information it has already collected about users (e.g. emails used for login or phone numbers used for two-factor authentication).
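For custom audience lists, this matching is typically done on hashed identifiers rather than raw data: the advertiser uploads cryptographic hashes of normalised emails or phone numbers, and Facebook compares them against hashes of its own records. A minimal sketch of the idea, with our own (illustrative) normalisation rules:

```python
import hashlib

def match_key(email: str) -> str:
    """Normalise an email (trim, lowercase) and hash it with SHA-256,
    the way match keys are commonly prepared before upload."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()

# The advertiser uploads hashes, not raw addresses...
uploaded = {match_key(e) for e in ["Anna@example.com ", "jan@example.com"]}

# ...and the platform intersects them with hashes of its own user records,
# so neither side reveals its full list in the clear.
platform_side = {match_key(e) for e in ["anna@example.com", "ewa@example.com"]}
custom_audience = uploaded & platform_side
print(len(custom_audience))  # -> 1 matched user
```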
Determination of the targeted audience (ad optimisation)
As we mentioned above, an advertiser’s budget is usually not sufficient to reach all Facebook users who match the criteria selected when creating a given campaign (i.e. to reach everybody in the “relevant” audience). Therefore, Facebook, with the use of algorithms, makes one more choice: it selects the users from the relevant audience who should see a given ad (these users make up the “targeted” audience). This is what we call “ad optimisation.” In theory, this process should give advertisers the best possible result for the money they spend.
In order to select the targeted audience, Facebook takes the following factors into consideration:
- Optimisation goal selected by the advertiser (e.g. awareness, consideration, conversion);
- Frequency capping (Facebook will sometimes deprioritise an advertiser who has recently shown an ad to a particular user, in order not to flood the user with ads from the same advertiser);
- Budget considerations (e.g. daily capping or bid capping);
- Ad relevance score (on a scale from 1 to 10), which is calculated by analysing:
- Estimated action rates: predictions on the probability that showing an ad to a person will lead to the outcome desired by the advertiser (e.g. that it will lead to clicks or other engagement);
- Ad quality: a measure of the quality of an ad as determined by many sources, including feedback from people viewing or hiding the ad and assessments of low-quality attributes in the ad (e.g. too much text in the ad's image, withholding information, sensational language, and engagement bait).
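Public descriptions of Facebook’s ad auction suggest that these signals are folded into a single “total value” score per candidate ad, roughly of the form bid times estimated action rate plus ad quality. The exact formula and weights are not public, so the sketch below is a stylised model of the mechanism, not the real implementation.

```python
from dataclasses import dataclass

@dataclass
class CandidateAd:
    bid: float                    # what the advertiser is willing to pay
    estimated_action_rate: float  # predicted chance of the desired outcome
    ad_quality: float             # feedback- and content-based quality signal

def total_value(ad: CandidateAd) -> float:
    # Stylised version of the publicly described formula; the real
    # weighting between the terms is unknown.
    return ad.bid * ad.estimated_action_rate + ad.ad_quality

# A cheap, engaging ad can beat a higher bid with a poor predicted response.
ads = [CandidateAd(2.0, 0.01, 0.2), CandidateAd(0.5, 0.08, 0.9)]
winner = max(ads, key=total_value)  # the ad this user actually sees
print(winner)
```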
Automated Ads: Facebook is in charge of the whole campaign
Facebook offers advertisers a number of features to automate ad creation and targeting, including its Automated Ads service. The idea behind the Automated Ads service is simple but powerful: advertisers can rely on Facebook to prepare an entire advertising campaign, almost from start to finish.
Currently, advertisers can ask Facebook to:
- Give them creative suggestions for different versions of ads (e.g. adding call-to-action buttons, text, and other creative details);
- Prepare personalised versions of the ad for everyone who sees it, based on which ad types they are most likely to respond to (so-called dynamic ads);
- Automatically translate ads;
- Suggest automated audiences tailored to an advertiser’s business/activity;
- Recommend a budget that will be sufficient for achieving an advertiser’s goals;
- Suggest changes to running ads (e.g. refreshing the image);
- Automate where the ad will appear (automatic ad placement).
These features show that Facebook can play an even more active role in determining which users will be reached with specific ads. The platform can also facilitate microtargeting by creating versions of an ad that will most likely appeal to a particular person. Trends in the marketing industry show that Facebook’s role in ad automation will continue to grow. In the near future, the platform might take over the entire advertising process, from designing the ad creative to determining ad budgets and appropriate groups to target.
Facebook moves towards transparency
Amid scrutiny of how Facebook collects users’ data, builds behavioural profiles, and optimises ad delivery, the platform has taken a few steps toward transparency in recent years. This shift is far from comprehensive, and it likely would not have happened without pressure from European regulators. But it is important to acknowledge.
In September 2018, the European Commission launched the Code of Practice on Disinformation, a self-regulatory instrument which encourages its signatories to commit to a variety of actions to tackle online disinformation. In terms of advertising, signatories committed to providing transparency into political and issue-based advertising, and to helping consumers understand why they are seeing particular ads. The code has been signed by leading online platforms, including Facebook, Google, and Twitter.
As a signatory, Facebook has committed to offer users the possibility to view more information about Facebook Pages and their active ads; to introduce mandatory policies and processes for advertisers who run political and issue ads; and to give users controls over what ads they see, in addition to explanations of why they are seeing a given ad.
In March 2019, Facebook introduced a public repository of ads in the European Union dubbed the Ad Library (previously functioning in the U.S. under the name of Ad Archive). Simultaneously, Facebook expanded access to the Ad Library application programming interface (API), which allows researchers to perform customised searches and analyse ads stored in the Ad Library.
The Facebook Ad Library encompasses all advertisements, with additional insights available for ads related to social issues, elections, or politics. These insights include the range of paid views the ad received; the range of the amount spent on it; and the distribution of age, gender, and location (limited to regions) of the people who saw it. In January 2020, Facebook announced it would add “potential reach” for each political and issue ad: an estimate of how many people an advertiser wanted to reach (as opposed to how many they eventually managed to reach). Political and issue ads and the insights related to them are archived and remain in the Ad Library for seven years, while other ads are available only while they are active.
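Researchers access these records through the Ad Library API mentioned above. A minimal sketch of a query for Polish political ads follows; the token is a placeholder and the field names reflect the API as documented during our research, so they may have changed since.

```python
import requests

resp = requests.get(
    "https://graph.facebook.com/v5.0/ads_archive",
    params={
        "access_token": "EAAB...",          # placeholder researcher token
        "ad_type": "POLITICAL_AND_ISSUE_ADS",
        "ad_reached_countries": "['PL']",
        "search_terms": "wybory",           # "elections" in Polish
        "fields": "page_name,funding_entity,spend,impressions,"
                  "demographic_distribution,ad_delivery_start_time",
        "limit": 100,
    },
)

for ad in resp.json().get("data", []):
    # Spend and impressions come back as ranges, e.g. {"lower_bound": "100",
    # "upper_bound": "199"}; this is why analyses work with brackets.
    print(ad.get("page_name"), ad.get("spend"), ad.get("impressions"))
```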
Advertisers who want to publish political or issue ads are obliged to go through an authorisation process and set a disclaimer indicating the entity that paid for the ad. All Facebook ads are reviewed by a combination of AI and humans before they are shown, in order to verify whether the ad is political, election, or issue related. This review process occurs separately from the self-authorisation of advertisers, which means ads identified as political will appear in the Ad Library anyway, even if the advertiser does not include a disclaimer.
Further, Facebook offers Ad Library reports (also downloadable in CSV format), which include daily aggregated statistics on political and issue ads (e.g. the total spend on ads by country and by advertiser).
In addition to the public repository, Facebook gives users individual explanations about ad targeting. Real-time explanations are accessible by clicking an information button placed next to the advertisement (“Why am I seeing this ad”). Through their privacy settings, users can also access information about their ad preferences; check the list of advertisers who have uploaded their personal data to Facebook; and, as of January 2020, control how data collected about them on external websites or apps is used for advertising (off-Facebook activity). Recently-announced changes will also enable users to opt-out of custom audience targeting, and make themselves eligible to see ads even if an advertiser has used a list to exclude them.
In the second part of this report (see: Facebook transparency and control tools: the crash test), we evaluate these tools.
Part II. Microtargeting in the Polish 2019 election campaign: A case study
Key takeaways
- The scale of microtargeting by Polish political parties was small.
- Nonetheless, our observations suggest that the role of Facebook in optimising ad delivery could have been significant.
- Facebook’s transparency tools are insufficient to verify targeting criteria and determine whether voters’ vulnerabilities were exploited (either by political advertisers or by the platform itself).
- Facebook accurately identifies political ads; only 1% of ads captured by the WhoTargetsMe browser extension and identified as political were not disclosed in the Ad Library.
- The funding entity was not accurately indicated in 23% of ads published in the Ad Library.
- Political ads in Poland were largely part of an image-building campaign promoting particular candidates. Only 37% of ads were directly related to a political party’s programme or social issues.
- The Polish National Election Commission does not have the tools to effectively monitor and supervise election campaigns on social media, including their financing.
1. Methodology and timeline
Our case study focused on the campaign for the Polish parliamentary elections scheduled for 13 October 2019. In preparation for the main phase, we tested our tools in a pilot run during the European elections in May 2019.
Our research involved:
- Monitoring and analysing political ads on Facebook with the use of data collection and analytics tools;
- Analysing actual campaign expenses and advertising contracts of political parties;
- Analysing observations of other stakeholders, including the National Electoral Commission, the National Broadcasting Council, the Polish Data Protection Authority, OSCE Office for Democratic Institutions and Human Rights, and law enforcement agencies.
For the purposes of data collection and analysis, we used the following tools:
- Facebook Ad Library, Ad Library API, and Marketing API;
- AI tools (e.g. natural language processing and computer vision) for data enrichment;
- The WhoTargetsMe browser extension, adapted for the Polish context.
Our analysis of data available in the Ad Library included political ads published by Polish political parties and candidates between 1 August and 13 October 2019, while the WhoTargetsMe extension collected data between 17 August and 13 October 2019. It is important to note that the official election campaign began on 9 August 2019 and finished on 11 October 2019. (On 12 and 13 October, pre-election and election day silence applied.)
The WhoTargetsMe browser extension collects all ads seen by its users, together with individual targeting information available in the “Why am I seeing this ad” feature. The goal of our WhoTargetsMe collaboration was to use crowdsourcing to gather and analyse more insights into ad targeting than those available in the Ad Library. During our case study, over 6,200 Polish Facebook users installed the WTM extension in their browsers. It is important to note that this group was not representative of the entire population, which pushed us to adopt a qualitative rather than quantitative approach to data analysis.
We have also created a searchable database of all political ads collected via the WhoTargetsMe browser extension. The database, available online, allows you to filter ads by political party, targeting attribute, and interests.
Chart 1. WhoTargetsMe users by gender, age, and political views
2. Polish election context and basic statistics
During the Polish parliamentary elections campaign, we monitored ads published by all election committees registered with the National Electoral Commission. But we focused our detailed analysis on the five committees that registered in all electoral districts:
- Law and Justice (Prawo i Sprawiedliwość);
- Civic Coalition (Koalicja Obywatelska);
- The Left (Lewica);
- Polish People’s Party (Polskie Stronnictwo Ludowe);
- Confederation Liberty and Independence (Konfederacja Wolność i Niepodległość).
From a set of over 28,000 political and issue ads available in the Ad Library during the election campaign, we identified 17,673 ads published by political parties and candidates which cost a total of 4,153,850 Polish zlotys (approximately €967,000). For comparison, during the UK general election campaign, which was a month shorter than the Polish one, British political parties published a little over 20,500 ads for nearly £3,500,000 (slightly above €4 million). Facebook’s self-assessment report on the application of the EU Code of practice on disinformation indicates that between March 2019 and 9 September 2019, Poland was eighth highest in the EU27 in terms of the number of political ads, and 13th in terms of money spent on these ads. Taking into consideration the size of the Polish population (over 37 million, fifth in the EU after Brexit), we can say that spending on political ads was very moderate.
The chart below shows how many ads individual political parties published and how much they paid for them. Expenses on Facebook ads (not including the cost of designing ad creatives) constituted a fairly small part of overall campaign budgets. Civic Coalition, the highest spender on Facebook and the biggest opposition committee, spent 8.3% of their campaign budget on Facebook ad targeting. Law and Justice, the winning party, spent 2.6%.
Chart 2. Number of ads and ad spend per committee
We also crowdsourced 90,506 ads via the WhoTargetsMe browser extension, but only 523 of them were political. They were published by eight parties and 126 candidates, with almost half of all political ads sponsored by the Civic Coalition.
Chart 3. Percentage of crowdsourced ads per committee
3. Key findings
I. Microtargeting: not by political parties, but what about by Facebook?
Our analysis shows that despite the small amounts spent on ads and their relatively small reach (which might suggest that ads were microtargeted), Polish political parties did not use microtargeting techniques on a large scale. Cases of candidates targeting ads to particular towns and villages were marginal. Further, the analysis of the content of the ads indicates that Polish political parties largely focused on image-building and did not differentiate messages depending on targeted groups. At the same time, widely-defined audiences paired with small ad budgets might suggest that the role of Facebook in optimising ad delivery (into which we have no insight) was significant (see more: Facebook's Role).
In order to reach these conclusions, we analysed data about ads collected from the Facebook Ad Library and via the WhoTargetsMe browser extension from various perspectives, the most important of which are described in the next sections.
Small ad budgets and ad reach could suggest microtargeting
The analysis of amounts spent on political ads revealed that ads with very small budgets of up to 100 Polish zlotys (approximately €25) were dominant for all political parties. For the two leading political parties — Law and Justice and Civic Coalition — they constituted around two-thirds of all purchased ads. The average price of an ad varied between 25 and 340 PLN (€6-79).
Chart 4. Number of ads per ad budget (in PLN)
In terms of reach, most ads were delivered to a maximum of 50,000 people each, with the most popular category being between 1,000 and 4,999 users (the second-smallest reach range presented in the Ad Library). Seven percent of ads reached more than 50,000 users, and only 25 (or 0.1% of all ads) were seen by more than one million people each. Combining data on money spent on ads with data on ad reach enabled us to see that, in general, a small ad budget translated to small reach. This is intuitive but not to be underestimated, since in Polish political practice a couple of thousand votes are sometimes sufficient to obtain a seat in parliament. Only seven ads which cost up to 100 zlotys (€25) reached between 50,000 and 99,000 users. All of these ads were published by the ruling party Law and Justice and promoted individual candidates rather than particular topics on the political agenda. Because of the lack of insight into Facebook’s ad optimisation methods (see more: Facebook's role), we do not know what factors were responsible for these ads standing out.
Chart 5. Comparison of ad reach and ad budget (in PLN)
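The comparison behind Chart 5 can be reproduced from Ad Library exports. A sketch of the kind of cross-tabulation we ran follows; the file and column names are ours (hypothetical), with the lower/upper columns mirroring the bracketed spend and impression values returned by the API.

```python
import pandas as pd

# Hypothetical export of Ad Library records; column names are ours.
ads = pd.read_csv("pl_political_ads.csv")

# Midpoint of the reported spend bracket, bucketed into budget ranges.
ads["spend_mid"] = (ads["spend_lower"] + ads["spend_upper"]) / 2
ads["budget_bucket"] = pd.cut(
    ads["spend_mid"],
    bins=[0, 100, 500, 2000, float("inf")],
    labels=["<=100 PLN", "101-500", "501-2000", ">2000"],
)

# Budget vs. the reach bracket reported by Facebook: small budgets
# should map to small reach, which is what we observed.
print(pd.crosstab(ads["budget_bucket"], ads["reach_bracket"]))

# Outliers: very cheap ads with outsized reach, like the seven ads
# described above.
print(ads[(ads["spend_mid"] <= 100) & (ads["impressions_lower"] >= 50_000)])
```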
Small ad budgets and relatively small reach might suggest that particular messages were A/B tested or microtargeted at narrow groups of users. We set out to verify this by analysing individual ad messages, as well as targeted audiences.
No fragmentation in policy messages
We have not identified experiments with varied, potentially contradictory messages targeted at different audiences, which are typical of microtargeting. In fact, individual ads often covered multiple topics at once. A vast majority of political ads (65%) were part of an image-building campaign, while ads about direct policy proposals or social issues were less popular. Among the latter, the most frequently covered topics were: business, environment, women’s issues, transport, health, and education.
Chart 6. Topics of political ads
Negative campaign and hate speech
We have identified a couple of campaigns aimed at discouraging potential voters from voting for political opponents. These campaigns grew more intense toward the end of the campaign. Negative ads often focused on building fear around concrete proposals by opponents. Here are some examples:
- Civic Coalition aimed to scare small business owners with the increase in social insurance contributions planned by the ruling Law and Justice;
- The Left criticised Law and Justice’s energy policy based on fossil fuels, and also called for the indictment of the justice minister over the judiciary reforms;
- Law and Justice prepared videos criticising opponents and sowing fear over their policy proposals or building antipathy for opposition leaders.
It seems these negative ads were targeted at undecided voters. However, because Facebook offers only basic demographic insights into reached groups, it is impossible to determine which characteristics political advertisers targeted (and which characteristics Facebook itself targeted in the process of ad optimisation).
Although the language and messages used in negative ads published by political parties’ Facebook pages were sharper, in our qualitative analysis we did not identify ads that could be qualified as hate speech or that incited violence against particular groups. At the same time, ads published by individual candidates tended to be more aggressive and included more direct attacks on politicians. The line between an aggressive negative campaign and hate speech is thin, and some ads bordered on the latter (if not crossing the line). For instance, one ad featured the motto “Stop the LGBT ideology,” while attacking political opponents (Civic Coalition and The Left) for supporting “modern” Western solutions rather than the traditional Christian family model.
This does not mean the Polish political campaign was free of hate speech. However, hate speech was present in organic, not sponsored, posts on other social media platforms (e.g. Twitter), or in offline campaigns. What was sponsored on Facebook were often responses to these attacks. For instance, a member of Civic Coalition sponsored an ad with a #StopHate hashtag in response to an organic Twitter post published by Law and Justice that compared politicians from Civic Coalition to excrement.
Rare examples of precise targeting by political parties
We noted only a few cases of precise targeting of messages to a concrete group of people. Two such cases were the regional campaigns of two candidates, one from Civic Coalition and one from The Left, who each created a set of ads (see examples below) in which the only difference between the messages was the name of the town. This suggests that the candidates made use of Facebook’s precise geotargeting tools.
Straightforward examples like this were rare. Even when looking only at demographics (e.g. gender and age), we noticed that there were very few ads — 1.8% of the total — directly targeted both to a particular gender and to a particular age category. Usually, such ads were targeted to people under the age of 35. In general, political advertisers poorly calibrated their messages to targeted groups and did not specify narrow audiences. For instance, ads about issues relevant to elderly people reached all age groups, but — as a result of Facebook’s optimisation rather than criteria selected by advertisers — were by our estimations more likely to be seen by the elderly. Ads about women’s issues (e.g. feminism, in vitro, contraception, abortion), despite mostly reaching young women, were likely to be seen as often by men and by women over 55. This leads to two conclusions: Political parties did not microtarget ads, and Facebook might have played a significant role in ad optimisation and delivery.
Chart 7. Estimated average number of ad views per topic
Our analysis of Facebook’s explanations to users who saw the 523 political ads collected by the WhoTargetsMe browser extension also confirms these conclusions. Our small sample of ads (please note that it might not be representative of all ads) suggests that political advertisers usually did not use advanced targeting for their ads. Rather, they relied on standard demographic criteria such as age or gender. However, this varied across parties and candidates.
The party that won the elections (Law and Justice) used rather traditional attributes:
- “Demographics” (100% of ads)
- “Speaks language” (86.7% of ads)
- “Interacted with content” (< 1% of ads)
- “Group” (< 1% of ads)
Their main opponent (Civic Coalition) used more advanced features of the Facebook ad engine, such as lookalikes or particular interests:
- “Demographics” (100% of ads)
- “Lookalike” (36.4% of ads)
- “Interests” (33.3% of ads)
- “Likes page” (12.1% of ads)
- “Group” (< 1% of ads)
- “Likes other pages” (< 1% of ads)
In terms of interests selected by advertisers, the top three were: “business,” “Law and Justice” (the name of the party), and “European Union.” We have also seen more sensitive criteria such as “LGBT,” “sustainability,” “veganism,” “gender,” and “climate,” but our analysis shows that they were mostly used by The Left (Lewica) and were closely linked to the content of the ad, which simply referred to the party’s programme (e.g. an ad about same-sex marriage was targeted to people interested in “LGBT”). We have not seen these sensitive criteria being used to promote misinformation, polarising content, or messages that were not thematically related to the selected interests.
Chart 8. Targeting attributes crowdsourced via WhoTargetsMe and the number of ads
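Tallies like those above can be computed directly from the crowdsourced records. A sketch of the per-party attribute counts behind Chart 8 follows; the file layout and field names are hypothetical, as the real WhoTargetsMe schema may differ.

```python
import json
from collections import Counter

# Hypothetical layout: one record per crowdsourced ad, carrying the party
# and the targeting types scraped from "Why am I seeing this ad".
with open("wtm_political_ads.json") as f:
    ads = json.load(f)

per_party: dict[str, Counter] = {}
for ad in ads:
    counts = per_party.setdefault(ad["party"], Counter())
    counts.update(ad["targeting_attributes"])  # e.g. ["Demographics", "Lookalike"]

for party, counts in per_party.items():
    total = sum(1 for ad in ads if ad["party"] == party)
    for attribute, n in counts.most_common(5):
        print(f"{party}: {attribute} appears in {100 * n / total:.1f}% of ads")
```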
It is important to acknowledge that our tools have limitations that prevent us from seeing the full picture of microtargeting. An interesting example of potentially more sensitive and detailed targeting by political advertisers emerged during the analysis of political parties’ financial statements submitted to the National Electoral Commission. The Left (Lewica) submitted an invoice from Facebook which, among other details, disclosed the titles of ad campaigns, including “Fans of Razem [political party],” “Cultural lefties 18-25” (“Lewaki kulturowe 18-25”), and “Tumour - Poland” (“Nowotwór - Polska”). In marketing practice, such titles often correspond to the characteristics of targeted groups, and we suspect that this was also the case here. This shows that, due to the lack of transparency into targeting, we are not getting the full picture of how political parties choose their audiences, and that we see only the targeting criteria Facebook chooses to reveal (see: Facebook transparency and control tools: the crash test).
II. Spending on online ads beyond effective control
Financial statements do not offer enough insight
Elections in Poland are regulated by the 2011 Electoral Code and the 1997 Constitution of the Republic of Poland (which introduces the general rule that “the financing of political parties shall be open to public inspection”). Under the Electoral Code, all registered election committees must submit, within three months of the election day, a report on revenues, expenses, and financial obligations to the National Electoral Commission. These provisions are made more specific by a 2011 regulation by the Minister of Finance, which states financial statements should include expenses on external services in connection with “the execution of electoral materials, including conceptual work, design work and production, broken down into (...) online advertising,” as well as with “the use of mass media and poster carriers, broken down by services rendered on (...) advertising on the Internet.” Although the template of the financial statement obliges election committees to present exact amounts spent on online ads, it does not differentiate between social media and online news publishers. As a result, election committees are not obliged to state how much they have spent on ads on online platforms.
An additional hurdle to the analysis of campaign expenses is that all documents are submitted on paper — no electronic version is required (although there are some committees which attach CDs with selected files). Therefore, the actual analysis of financial reports has to be done in person in Warsaw and involves manually browsing thousands of documents.
Invoices submitted to the NEC often present only general information that an intermediary (usually a media agency) received an overall payment for “advertising activities online.” It is impossible to determine how much of it was in fact later transferred to online platforms, and for which specific ads.
More detailed invoices that mention payments for “advertising services on Facebook, according to the media plan” still lack crucial information (e.g. the media plan itself, the exact content that was promoted, and the audiences that were selected). As the description is quite vague, it is also impossible to determine whether the entire amount was transferred to Facebook or partly retained by the agency.
The analysis of financial statements after the two elections in 2019 shows that most intermediaries are not big agencies or entities specialised in political marketing on social media (in the vein of Cambridge Analytica). Very often, these are small companies dealing with all sorts of marketing (not just political) which provide comprehensive campaign support — not only by running campaigns on Facebook, but also by designing ad creatives and buying advertising space on billboards or broadcast time on local radio.
In addition, the general way of describing the service leaves room for interpretation: the money was not necessarily used to sponsor actual ads, but may have paid influencers or other people to boost organic posts. The latter is not subject to any scrutiny.
As a result, neither researchers nor the oversight body can compare amounts presented in financial statements with financial data published in the Facebook Ad Library.
This may also cause problems in identifying the entity that paid for political ads, as it may not be the election committee directly but rather an employee of the PR firm engaged in the campaign (see remarks about disclaimers below). The documents also lack information about which groups of users were targeted with political ads, when each ad was active, and what its topic was.
The most useful information was available in the detailed invoices issued directly by Facebook to one of the committees, which bought ads without help from intermediaries. Such an invoice contains the titles of campaigns and ads (which can, and often do, reveal the targeted groups; see remarks above), the dates the ads ran, the number of views, and the exact amount spent on them. This data enables more scrutiny over party spending on social media. It is also in line with the 2011 Regulation, which obliges committees to report the actual amount spent online (there is, however, no legal requirement to report spending on political ads in social media as a subcategory). When detailed invoices like this are available, the oversight body can compare spending on Facebook with copies of expense reports from the committee’s bank account. However, of almost €1 million spent on Facebook ads by all political parties, only The Left presented some invoices issued directly by Facebook.
Facebook does not remove ads without disclaimers fast enough
As mentioned in Part 1 (see: Facebook moves towards transparency), Facebook created an obligation for political advertisers to include a disclaimer indicating the entity that paid for the ad. The Polish Electoral Code also requires that all election materials be clearly labelled with the name of the committee. Despite these requirements, almost one-fourth (23.6%) of all political ads were mislabelled: nearly 1,200 political ads ran without any disclaimer at all (7% of all ads), while over 2,800 ads (16% of all ads) indicated individuals or media agencies as the paying entities. This makes it difficult to verify whether a particular ad was sponsored by a registered election committee and to compare it with the financial statements submitted to the National Electoral Commission, which undermines the transparency of campaign spending.
Chart 9. Number of ads and ad spend by funding entity
Looking into the amounts spent on particular types of ads, we have observed that less money was spent on ads without disclaimers (84 zlotys on average, as opposed to 288 zlotys for ads paid for by a committee). This means that the reach of ads without disclaimers, and consequently their impact, might not have been as significant. The small amounts spent on ads without disclaimers might also indicate that Facebook effectively disabled and interrupted them, as the platform had declared it would before the European elections. However, our analysis of the running time of ads with and without disclaimers shows that this is not the case.
Chart 10. Average ad running time per funding entity
We have established that ads which ran without a disclaimer were, on average, displayed longer than ads with disclaimers. Facebook identified and interrupted them with significant delay — on average, it took about a week. At the same time, we have observed that most political ads were published in the last days of the election campaign, so fast and effective interruption of ads without disclaimers is key. In our case study, we have seen that the closer it was to election day, the better job political advertisers did at adding disclaimers. But if this had not been the case, ads without disclaimers would probably have kept running until the end of the campaign — which seriously undermines the transparency of political advertising.
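The comparison behind Charts 9 and 10 can be reproduced with a few lines of code. Below is a minimal sketch, assuming the collected ads have been flattened into a CSV file; the file name and column names are our own illustration, not an official export format.

```python
# A minimal sketch, assuming a flattened CSV of collected ads with
# hypothetical column names; not an official Ad Library export format.
import pandas as pd

ads = pd.read_csv(
    "ads_archive_export.csv",
    parse_dates=["ad_delivery_start_time", "ad_delivery_stop_time"],
)

# Label each ad by how its funding entity was disclosed.
def disclosure_type(funding_entity) -> str:
    if pd.isna(funding_entity) or not str(funding_entity).strip():
        return "no disclaimer"
    return "with disclaimer"

ads["disclosure"] = ads["funding_entity"].apply(disclosure_type)

# Running time in days, from delivery start to delivery stop.
ads["running_days"] = (
    ads["ad_delivery_stop_time"] - ads["ad_delivery_start_time"]
).dt.days

# Average spend and running time per disclosure type (cf. Charts 9 and 10).
print(ads.groupby("disclosure")[["spend", "running_days"]].mean())
```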
III. Unregulated pro-turnout and media campaigns
In our attempt to identify actors other than political parties or candidates who engaged in sponsoring ads related to elections, we uncovered quite a few ads aimed at encouraging voter turnout and ads from media outlets that had a clear election-related message. Ads in both of these categories were in large part not politically neutral. Indeed, they were quite the opposite — they aimed for selective mobilisation of different groups of voters and included very specific election-related messages.
One example was the “How to vote” campaign, targeted mainly at women between 18 and 25 who studied and lived away from their hometowns. The ads included a link to instructions on how to register and vote, and the accompanying photo presented the faces of Jarosław Kaczyński — the leader of the conservative Law and Justice party — and two famous priests — Tadeusz Rydzyk and archbishop Marek Jędraszewski — along with the slogan: “Do not let old men choose your future.” The ad reached over 50,000 women.
Other pro-turnout campaigns were led by NGOs and also targeted very particular audiences. The example below shows an ad published by Campaign Against Homophobia (Kampania Przeciw Homofobii) — an NGO fighting for LGBT rights — targeting people with the “LGBT” attribute with the pro-turnout slogan “I vote for love.”
A controversial negative campaign against the ruling Law and Justice, which we crowdsourced via the WhoTargetsMe extension, was run by one of the biggest Polish newspapers, Gazeta Wyborcza.
Voter mobilisation ads and political ads from media outlets are beyond the scope of election regulations. At the same time, the clear political messages included in these ads made them relevant in the context of electoral persuasion.
4.Facebook transparency and control tools: the crash test
I. Ad Library and Ad Library API
It is beyond doubt that the Facebook Ad Library increased transparency into ads’ content. Researchers and interested users can now see a whole range of ads being promoted, not only those in their personal newsfeeds. However, insights offered for political and issue ads are still insufficient to enable an effective identification of abuses and a thorough analysis of the entire ad targeting process.
While evaluating the Facebook Ad Library, we took three aspects into consideration:
- Comprehensiveness and accuracy
We have established that only 1% of political ads collected via the WhoTargetsMe browser extension were not qualified as political by Facebook and, as a result, were not disclosed in the Ad Library. It is important to note, however, that our sample was relatively small (523 ads). In fact, research from other countries (see: Why we need an ad library for all ads) suggests that the identification of political ads remains a problem. In addition, according to the Polish National Broadcasting Council, there were inconsistencies between the number of ads presented in the Ad Library report — a tool enabling the download of daily aggregated statistics about ads — and the Ad Library itself. These inconsistencies were first observed during the European elections in May and still occurred during the Polish parliamentary elections in October.
- Effectiveness of disclaimers
As described in key findings, the funding entity of almost a quarter of all political ads was not correctly disclosed, including 7% that ran without any disclaimer at all. Also, the average running time of ads without a disclaimer was longer than that of ads that were correctly labelled. Facebook was too slow in identifying these cases and in interrupting ads without an indication of the funding entity (this process took about a week). This problem is all the more pertinent given that the vast majority of political ads are published in the last days of the election campaign.
- Scope of insights into targeting
The Ad Library does not disclose detailed information on targeting, which makes public scrutiny of political marketing practices impossible. In particular, the Ad Library shows neither the audiences selected by advertisers nor the optimisation methods used by the platform itself. We only see the outcome of the targeting process (reach) in terms of basic categories: location (in Poland limited to the region), basic demographics (gender and age groups) and — since very recently — potential reach. Given that the targeting options available to advertisers are far more refined (see: Advertisers' role), our analysis leads to the conclusion that the Ad Library is incomplete and presents only the tip of the iceberg when it comes to targeting criteria.
Mozilla on Facebook’s Ad Library API
It is impossible to determine whether Facebook’s API is comprehensive, because it requires you to use keywords to search the database. It does not provide you with all ad data and allow you to filter it down using specific criteria or filters, the way nearly all other online databases do. And since you cannot download data in bulk and ads in the API are not given a unique identifier, Facebook makes it impossible to get a complete picture of all of the ads running on their platform.
The API provides no information on targeting criteria, so researchers have no way to tell which audiences advertisers are paying to reach. The API also does not provide any engagement data (e.g. clicks, likes, and shares), which means researchers cannot see how users interacted with an ad. Targeting and engagement data are important because they let researchers see what types of users an advertiser is trying to influence, and whether or not those attempts were successful.
The current API design places major constraints on researchers, rather than allowing them to discover what is really happening on the platform. The limitations in each of these categories, coupled with search rate limits, mean it could take researchers months to evaluate ads in a certain region or on a certain topic.
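To make this constraint concrete, below is a minimal sketch of the keyword-driven workflow the API imposes on researchers. It assumes the ads_archive endpoint and parameters available to researchers at the time of writing, a valid access token, and an illustrative field list.

```python
# A minimal sketch of keyword-based retrieval from the Ad Library API,
# assuming the ads_archive endpoint available at the time of writing;
# the field list is illustrative and a researcher access token is required.
import requests

ENDPOINT = "https://graph.facebook.com/v5.0/ads_archive"

def search_ads(keyword: str, token: str, country: str = "PL") -> list:
    """Page through results for one keyword. There is no way to enumerate
    all ads, so coverage depends entirely on the keyword list used."""
    params = {
        "search_terms": keyword,
        "ad_reached_countries": country,
        "ad_type": "POLITICAL_AND_ISSUE_ADS",
        "fields": "ad_creative_body,funding_entity,spend,impressions",
        "limit": 100,
        "access_token": token,
    }
    url, ads = ENDPOINT, []
    while url:
        response = requests.get(url, params=params).json()
        ads.extend(response.get("data", []))
        # Rate limits apply to every call, which makes large-scale
        # retrieval slow; the "next" URL carries all query parameters.
        url = response.get("paging", {}).get("next")
        params = {}
    return ads
```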
II. Individual ad explanations - Why am I seeing this ad?
Apart from a publicly available repository of ads, Facebook provides users with an individual, personalised explanation of ad targeting via the “Why am I seeing this ad?” feature. Users can access this feature by clicking a button next to ads in their newsfeed. The explanation is available for all ads, not only political and issue ads. It is important to note that Facebook significantly changed the structure, language, and scope of the explanations offered by this feature in the last quarter of 2019, and that our case study covered the period between August and October 2019.
On the basis of our observations and empirical research, we identified the following deficiencies of individual explanations as of October 2019:
- They are incomplete. Facebook reveals only one attribute to users, while advertisers can (and usually do) select multiple criteria. Researchers have also established that Facebook does not show whether users were excluded from a particular group, or on the basis of which characteristics. Also, if the advertiser uploads a custom audience list, Facebook does not reveal the type of personal information that was uploaded (e.g. email, phone number).
- They are misleading by:
- Presenting only the most common attribute. Facebook shows only the attribute that is linked to the biggest potential audience. For example, if an advertiser selects two attributes — interest in “medicine” (potential reach of 668 million) and interest in “pregnancy” (potential reach of 316 million) — the user’s explanation will only contain “medicine” as the more common attribute. This example is not incidental: During the European elections campaign in May 2019, a person who was pregnant at the time saw a political ad referring to prenatal screenings and perinatal care. “Why am I seeing this ad?” informed her that she was targeted because she was interested in medicine.
- Preferring attributes related to demographics over other types of targeting attributes. Researchers found that whenever the advertiser used one demographic-related attribute (e.g. education level, generation, life events, work, relationships) in addition to other attributes (e.g. recent travel, particular hobbies), the demographic-based attribute would be the one shown in the explanation.
- They do not explain Facebook’s role in ad delivery optimisation. Individual disclosures do not explain why the user was qualified for the targeted audience (e.g. which data provided by the user or which observed behaviour was taken into account). Individual explanations also do not offer any insights into the logic of optimisation.
All of these deficiencies make it impossible for users to fully understand and control how their data is used for advertising purposes.
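To make the “most common attribute” rule tangible, here is a toy sketch of the selection logic that researchers observed, using the reach figures from the example above; this is our reconstruction of the observed behaviour, not Facebook’s actual code.

```python
# A toy reconstruction of the observed rule: out of all attributes selected
# by the advertiser, only the one with the largest potential audience is shown.
selected_attributes = {
    "interest: medicine": 668_000_000,   # potential reach
    "interest: pregnancy": 316_000_000,
}

shown = max(selected_attributes, key=selected_attributes.get)
print(shown)  # "interest: medicine" - the narrower, more telling attribute stays hidden
```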
What has changed after the update?
We are not aware of any studies published after Facebook’s update that address this topic. Our initial insights suggest that the structure and language of the explanations are clearer than before. There is more information available, including the list of criteria selected by advertisers and more than one attribute. But without empirical research, it is impossible to say whether this information is comprehensive.
However, with the exception of age, users still will not find out which of their characteristics or tracked behaviours were taken into account by Facebook when matching them with a particular ad. For location, interests, and lookalikes, Facebook offers users only a general, non-individualised explanation. This leads us to the conclusion that the updated explanations are still incomplete and do not offer meaningful information on why exactly a particular person saw a particular ad.
III. Data transparency and control: settings and ad preferences
Facebook users can gain insight into how their personal data is used for advertising purposes — and, to some extent, control such use of their data — via the Ad Preferences page in their account settings.
In terms of transparency of users’ profiles, we have the following observations:
- Facebook presents the outputs of the profiling process in the form of inferred interests, but it does not offer any meaningful information about the inputs (e.g. what concrete actions and behaviours by the user were responsible for assigning particular interests to them). Instead, Facebook offers generic and vague information (e.g. “you have this preference because of your activity on Facebook related to X, for example liking their Page or one of their Page posts”);
- Facebook presents only interest-based attributes, while hiding other attributes that advertisers can select in the advertising interface (e.g. those related to particular life events, demographics such as income brackets, or inferred education level). The only example of an additional category of attributes that we found is a simple characterisation of Facebook’s users based on their use of internet networks;
- Despite the fact that attributed interests change dynamically (with every advertising campaign), Facebook doesn’t offer users insights into how their profiles change and grow over time;
- Users can see the list of advertisers who uploaded their information to Facebook via the custom audience feature, but Facebook doesn’t reveal what type of information was uploaded (e.g. e-mail, phone number).
In terms of control tools offered by Facebook to its users, we have the following observations:
- Removing individual interests presented in Ad Preferences is not equivalent to removing the data that led to this interest being attributed to the user — the only effect is that the user will not see ads based on that interest;
- Facebook’s “Clear history” tool does not allow users to remove data collected by Facebook via pixels and other tracking technologies on other websites and in apps. It only enables removing the connection between this particular source of data and other data Facebook collected about a user;
- Users can only opt out of:
- Ads based on behavioural observations outside Facebook (via Facebook pixels and other trackers), as well as ads based on data about users’ offline activity (e.g. from customer data brokers);
- Seeing ads outside of Facebook (delivered via the Facebook Audience Network);
- Ads about selected topics (political issues are not among them, but in January 2020 Facebook announced that it will offer users the option to see fewer political ads).
In January 2020, Facebook announced that it will give users more control over how their data is used by advertisers who upload custom audience lists. Users will be able to opt out of seeing ads from a particular advertiser that were targeted on the basis of an uploaded custom audience list, or make themselves eligible to see ads from which an advertiser’s list had excluded them. This suggests that Facebook is not planning to give users the right to globally opt out of all ads targeted with the use of custom audiences.
Facebook’s ad settings have been built on an “opt out” rather than an active “opt in” premise, which means that none of these limitations and restrictions to ad targeting will apply by default. Each and every user has to select them in order to regain some control over the use of their own data for advertising purposes. In Part 3, we argue that this design of advertising settings is not in line with the GDPR standard.
Big unknowns
Because of the opacity of the targeting process and the insufficiencies of Facebook’s transparency tools, it is impossible to comprehensively investigate PMT and the use of users’ data for this particular purpose. From the perspective of both researchers and concerned users, the following gaps are most problematic:
- The Ad Library presents insights limited to platform-defined political and issue ads, which makes it difficult to verify whether election-related content was or was not sponsored by actors other than political parties and candidates;
- These insights are too general and largely insufficient to determine the profile of users who were reached with a specific ad;
- Individual explanations offered to users are incomplete and potentially misleading;
- Users do not have access to their behavioural and inferred data, which often shape their marketing profiles (it is on that basis that Facebook assigns its users particular attributes);
- Control tools offered by Facebook to its users are superficial and do not give them real control over their data and its use for (political) advertising;
- Lack of insight into Facebook’s ad optimisation process (e.g. what prediction models are used) makes it impossible to evaluate its impact and potential risk (in particular, whether ad optimisation may lead to discrimination or enable microtargeting based on sensitive characteristics).
Part III. Recommendations
Country-specific evidence in a pool of other research on the use of PMT
Our case study of how PMT was used in Poland during the 2019 elections added country-specific evidence to a much greater pool of relevant research. From the very beginning of our work, we have been aware of other pending projects and wanted to use this opportunity to build on each other’s observations. We are confident that by looking at the same problem from the perspective of different countries, and by asking slightly different research questions, we can bring more value to the European discussion on the use (and potential regulation) of PMT.
Being mindful of this common objective, we frequently quote observations made by other researchers who analysed the use of PMT in other countries or looked at advertising practices of online platforms. We invited some of them to comment on our findings and help us develop our final recommendations (see “Expert boxes”). We want to make it very clear that recommendations formulated in this report are based not only on our work in Poland, but also on evidence and arguments published by our peers.
While each of our recommendations can be boiled down to a straightforward demand (in most cases a demand for some form of binding regulation), we see the need to explain the reasoning that led us to formulate it. Therefore, each point in this chapter includes background, our argumentation, and the dilemmas we faced in the process of developing our recommendations.
Below, we place all recommendations on one map in order to show their orientation on two axes:
- From basic transparency requirements toward new control tools and strict limitations;
- From self-regulation and existing regulation towards new binding legislation.
We argue that the use of PMT by online platforms poses enough risks to justify more control over the use of behavioural data for advertising purposes and additional limitations to prevent societal harm. We also argue that platforms’ self-regulation is not the right approach to ensure real transparency and meaningful control tools for users and the general public. Therefore many of our recommendations call for binding regulation on online platforms, which can only be adopted on the EU level.
Chart 11. Map of recommendations (recommendations plotted along two axes: from transparency towards control, and from building on existing law & self-regulation towards new regulation)
1.Scope of ad libraries: beyond the political
Our research reveals that the gap between ads labelled as “political” by the platform and ads perceived as political by researchers did exist in Poland, but was not as significant as in other countries (see: Ad Library and Ad Library API). However, there is a wealth of research from elsewhere (see the sources quoted in the expert opinion below) confirming that online platforms struggle to identify political content. Following these findings, we argue that the scope of public ad libraries should be extended to include all ads, not only political ones.
Why we need an ad library for all ads
Paddy Leerssen LL.M., PhD Candidate at Institute for Information Law (IViR), University of Amsterdam, Non-Resident Fellow at Stanford University Center for Internet and Society
In countries around the world, platforms have shown that they struggle to identify political ads at scale. This is no surprise, since the concept of a “political” ad is highly ambiguous and difficult to enforce in practice. Depending on platforms to identify political ads is a bad idea; instead, researchers need comprehensive access to all platform ads. This is the only way to ensure we aren’t missing anything.
In archiving political ads, the first challenge is defining what actually qualifies as a “political” ad. If you focus only on official election ads, then a lot of important political activity is ignored. For instance, many of the Russian ads disseminated on Facebook during the 2016 U.S. election agitated on polarizing social issues without directly referencing the election. To capture such activity, a broader definition of political issues is needed — but this is complex and subjective. Is the coronavirus political, for instance? What about Bitcoin? Or climate change? Google itself has admitted to the European Commission that they struggle to come up with a consistent definition of political advertising. Facebook, for its part, bases its European Union archive on an extremely general and unpredictable list of “social issues,” which includes “economy,” “political values and governance,” and “health”. This ambiguity seriously limits the value of their data, as you cannot be sure what is missing.
Whatever definition of “political” advertising platforms end up using, enforcement remains a major challenge. Big platforms process many millions of ads, but automated systems are not suitable to interpret nuanced human concepts like “politics” at such a scale. As a result, the platform almost inevitably makes mistakes. An independent study from Princeton found many false positives and false negatives in Facebook’s U.S. Ad Library. Another study by ProPublica found unarchived ads on issues including civil rights, gun rights, electoral reform, anti-corruption, and health care policy. In December 2019, a bug caused almost all political ads from the preceding month to be deleted from the UK archive, massively disrupting research efforts. And the French government has observed that “Facebook removed 31% of ads in the French library over the week of the European parliamentary elections, including at least 12 ads that were illegal under French law.” Such omissions are a major barrier for accountability and regulation: If the relevant ads are not public, we cannot start to identify harmful activity, or to enforce relevant laws against it.
Because of this conceptual ambiguity and difficulty of enforcement, building an archive only for “political” ads is a bad idea. Instead of relying on a filtered dataset created by the platform, researchers should have access to all of the ads. Firstly, this avoids the problems in agreeing on a controversial definition of “political issues.” Secondly, it is the only way for researchers to make sure we are not missing anything, and that platforms’ rules are actually being enforced properly.
Looking beyond political advertising, an additional benefit of having comprehensive ad archives is that they enable research into harmful commercial advertising, and are important from the perspective of consumer protection authorities. Commercial advertising online creates many risks that may require intervention — especially on behalf of children and other vulnerable groups. These include the sale of harmful, illegal, or regulated products, the use of manipulative or deceptive sales tactics, and discriminatory targeting of ads. A comprehensive ad library would help to research these practices and hold them accountable.
Another way of putting this is that all ads are political in some sense. If a public health campaign to reduce obesity is “political,” then why isn’t a McDonald’s advertisement for unhealthy fast food? If a pro-recycling campaign is political, then isn’t the advertisement of wasteful plastic products political, as well? Ultimately, all of these practices deserve to be open to criticism and accountability.
All of this means that, although platforms like Facebook should continue their own efforts to detect political ads, we need a comprehensive archive. This archive should be public, and accessible through both a browser search portal and an API. Just like current political ad libraries, it should include, at a minimum, the ad message, spending data, the identity of the buyer, and anonymized audience demographics. It should also include the targeting criteria selected by the ad buyer, so we can start to understand how advertisers reach their audiences. Instead of relying on archives based on inflexible and unenforceable definitions from platforms or legislators, this comprehensive approach is more reliable, more powerful, and more future-proof.
We acknowledge that with regard to commercial ads some information (e.g. spending) might be considered trade secrets. In our recommendations we point out that the level of transparency available in public ad archives might be different for political and commercial ads but deciding where exactly the line should be drawn is beyond the scope of this report and requires further discussion.
2.Transparency of ad targeting: from ad creation to ad delivery
Access to information about how ads are targeted is essential from the user’s perspective, as it conditions their ability to exercise other data protection rights, such as the correction or deletion of their own personal data. As a society, we also have good reasons to scrutinise targeted ads beyond the context of political campaigns. These reasons stem from legitimate concerns about the impact of microtargeting on voters’ opinions, about the scale of disinformation, and about the exploitation of individual vulnerabilities for both political and commercial goals. Above (see: Facebook transparency and control tools: the crash test), we explain why the existing transparency mechanisms, introduced voluntarily by Facebook and other internet platforms, are not sufficient.
We recommend that:
- Public disclosures made by online platforms should cover decisions made by both actors in the targeting process: the advertiser (both the selection of the audience and the determination of the campaign strategy, including what it was optimised for), as well as the methods and parameters used by the platform in the optimisation process (see: Ad targeting explained: from ad creation to ad delivery).
- For decisions made by advertisers, public disclosures should include the same level of information as is offered to the actual ad buyer when commissioning the campaign (e.g. actual audience demographics, location, and other targeting criteria).
- With regard to Facebook’s role, public disclosures should explain the prediction model used to achieve the optimisation goal.
- Users should have access both to all ads that have been targeted at them and to all advertisers who targeted them in the last five years. We imagine this new interface as a personal ad library. Because it would show only ads that are relevant to an individual user, this interface would not be as overwhelming as the public ad library, which presents a vast number of ads and requires advanced research skills.
- Users should also be provided with real-time information explaining the logic behind targeting (“why am I seeing this ad”). At a minimum, this individual explanation should include the following data:
- Specific demographic and location data, if the ad was targeted based on these criteria;
- Attributes relevant to the user, if the ad was targeted based on platform-defined user attributes or free-form attributes (see: Advertisers' role);
- Reasons why the user has qualified for the targeted audience, including personal data that were relevant in this process;
- Source of user's data (e.g. website tracking, mobile app, newsletter, loyalty card), if the ad was targeted based on custom audience;
- Optimisation goal selected by the advertiser and reasons why the user has been reached with this ad.
Real-time information should not be overwhelming and, therefore, should focus on explaining why the user in question was targeted with a specific ad. Information should also contain direct links to other interfaces: to the ad library for public disclosures, and to data management settings for a full list of personal data used in the targeting process.
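To illustrate the recommendation above, here is a hypothetical sketch of what a minimally complete real-time explanation could contain; all field names and values are our own illustration, not an existing Facebook format.

```python
# A hypothetical real-time explanation payload; field names are illustrative.
explanation = {
    "ad_id": "123456789",
    "demographics": {"age_range": "18-25", "gender": "female", "location": "Warsaw"},
    "matched_attributes": ["interest: medicine", "interest: pregnancy"],
    "qualification_reasons": [
        "liked page X",
        "visited website Y (tracked via a pixel)",
    ],
    "data_source": "custom audience (e-mail address uploaded by advertiser Z)",
    "optimisation_goal": "link clicks",
    "links": {
        "public_ad_library": "https://example.org/ad-library/123456789",
        "data_settings": "https://example.org/settings/ads",
    },
}
```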
The graph below shows the scope and type of information (including users’ personal data) that should be revealed by a combination of public and personal interfaces:
Also see the graph “User Tools: How They Function Together”, which explains how different tools for transparency and control could function together.
3.The GDPR standard for targeted political advertising: to be clarified
We argue that the application of the GDPR to targeted political advertising should be clarified, preferably in EDPB guidelines or in practice by Data Protection Authorities. PMT is usually based on a user’s behavioural data (collected without the user’s awareness and control), which may include sensitive data. In this context, it is possible to argue that PMT should only be allowed after obtaining explicit consent. As suggested by the ICO, in some cases political microtargeting can also qualify as an automated decision with a significant impact on the data subject, which triggers additional safeguards under Article 22 of the GDPR.
We acknowledge that there are controversies regarding the application of the GDPR to PMT, and we look forward to having these issues clarified by relevant authorities. For the sake of discussion, below we propose answers to some of the key questions, based on opinions of the Article 29 Working Party, which we find relevant in the context of targeted political advertising.
Requirement for explicit consent
While it remains uncontroversial that first-party marketing under the GDPR can sometimes be based on legitimate interest, there is an ongoing debate regarding legal grounds for third-party marketing served by online media and internet platforms. In the following legal argumentation we will focus on one type of third-party advertising, namely political advertising targeted on the basis of users’ behavioural data. Based on our reading of the GDPR and arguments developed by the Article 29 Working Party in its opinions (cited below), we argue that online platforms should not serve political advertising without users’ explicit consent for this particular purpose.
This claim can be based on the following, independent arguments:
1. Processing sensitive data
While it cannot be assumed that microtargeting of political ads always entails processing of data about users' political beliefs, at least in some instances it will be the case. If either the advertiser or the platform aims to reach a specific audience based on their political opinions (be that declared or inferred from their behavioural data), such a campaign should qualify as processing sensitive data. According to Article 9 of the GDPR, such processing requires explicit consent.
On the other hand, we acknowledge that this line is easily blurred when political ads are targeted on the basis of behavioural data, which only indirectly and only in some circumstances will function as proxies for political opinions (i.e. sensitive data). For example, it is debatable whether targeting users based on the fact that they visited a website run by a specific political party or liked a Facebook page run by a specific political party qualifies as processing their sensitive data. For some users it will be true that this behaviour reveals their political opinions, but for others (e.g. journalists or marketing professionals) it will not be the case (see Expert box).
Proxy variables
Agata Foryciarz, Stanford Computer Science Department
Even in the absence of sensitive data, individual characteristics can still be revealed in a dataset because of “proxy variables” — elements of the dataset that are closely related to sensitive attributes. For example, the number of hours spent daily on a platform may be a proxy for age or employment status, and the set of liked pages may be a proxy for gender, sexual orientation, or political opinion.
This phenomenon becomes even more pronounced when machine learning algorithms are used to select target groups, since algorithms excel at combining multiple proxies to find patterns equivalent to sensitive attributes. An algorithm may therefore recognize individuals as similar, and treat them as such, without labeling them as “male” and “female” or “liberal” and “conservative.”
Additionally, if a dataset does not contain other truly informative data about individuals that can help make a prediction, the similarity of individuals (corresponding to their sensitive features) may end up becoming the most pronounced pattern in the data. The predictions generated based on this data would then look similar to predictions generated on the basis of a sensitive feature, even if the dataset contained no such features.
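The proxy effect described in the box above is easy to reproduce on synthetic data. In the sketch below (our own illustration, with all data simulated), a classifier never sees the sensitive attribute, yet recovers it from “liked pages” alone well above chance.

```python
# Synthetic illustration of proxy variables: political opinion is never in
# the feature set, yet "liked pages" alone recover it well above chance.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_users, n_pages = 5000, 40

opinion = rng.integers(0, 2, n_users)        # hidden sensitive attribute
base_rate = rng.uniform(0.1, 0.4, n_pages)   # general popularity of each page
lean = rng.uniform(-0.25, 0.25, n_pages)     # how strongly a page correlates with opinion
like_prob = np.clip(base_rate + np.outer(opinion * 2 - 1, lean), 0.0, 1.0)
likes = rng.binomial(1, like_prob)           # users x pages binary "like" matrix

X_train, X_test, y_train, y_test = train_test_split(likes, opinion, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Accuracy from likes alone: {model.score(X_test, y_test):.2f}")  # well above 0.5
```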
2. Balancing test according to article 6(f)
Drawing on recital 47 of the GDPR, many argue that third-party online advertising targeted on the basis of behavioural data cannot be based on legitimate interest. There are strong arguments supporting this interpretation. Firstly, such advertising activity certainly does not qualify as "direct marketing," which is explicitly mentioned in the GDPR as one of the purposes which may qualify as legitimate interest.
Secondly, the balancing test, which determines the scope of what is legitimate in the context of marketing, refers to "reasonable expectations" of the data subject. According to the Article 29 Working Party's Opinion on the notion of legitimate interests, reasonable expectations have to be judged based on the relationship between the data subject and the data controller. Advertisers who commission targeted ads are not part of this relationship.
Furthermore, behavioural data (e.g. users’ engagement with posts, exact location, browsing data from other websites and apps) used in the targeting process is usually (at least in the case of Facebook) collected without users’ awareness and control. Since users are not even aware of their dynamic marketing profiles, one cannot assume that they reasonably expect these profiles to be used for third-party marketing.
3. Purpose limitation
In its Opinion on purpose limitation, the Article 29 Working Party surfaced yet another argument for requiring users’ explicit consent for advertising based on behavioural data. It rests on the premise that users should be able to control the further use (for marketing purposes) of data generated as a result of profiling:
when an organisation specifically wants to analyse or predict the personal preferences, behaviour and attitudes of individual customers, which will subsequently inform ‘measures or decisions’ that are taken with regard to those customers [...] free, specific, informed and unambiguous ‘opt-in’ consent would almost always be required.
At the same time, the Article 29 Working Party argues that the requirement for the purpose of data processing to be specific means that:
it must be detailed enough to determine what kind of processing is and is not included within the specified purpose, and to allow that compliance with the law can be assessed and data protection safeguards applied. For these reasons a purpose that is vague or general, such as for instance (...) 'marketing purposes', will — without more detail — usually not meet the criteria of being ‘specific'.
This is why it is not sufficient for online platforms to ask their users for consent for broadly-defined marketing purposes. In the context of PMT, we argue that explicit consent should be requested precisely for political advertising.
Access to all inferred and observed data
According to the GDPR (recital 63 and Article 15), users should be able to access all personal data processed about them by online platforms. Following a broad definition of what qualifies as “personal data” under the GDPR, there is no reason to exclude users’ marketing profiles and all observed or inferred data.
On the contrary, it is essential that users are given access to personal data that was collected from other sources or generated (e.g. inferred based on behavioural observation) beyond their control. In the context of online behavioural advertising, this means that every piece of personal data that has been taken into account in the targeting process (including the opaque process of ad optimisation) should be subject to users’ control via data management settings.
This interpretation is strengthened by recital 60 of the GDPR, which states that providing information about profiling (and its results) is part of the controller’s transparency obligations under Article 5(1)(a). It means that the data subject has a right to be informed by the controller about “profiling,” regardless of whether automated individual decision-making based on profiling takes place.
Information about the logic behind targeting
We argue that under the GDPR, users have the right to be informed about the logic behind targeting, especially in the PMT context. In practice it means that the platform should reveal all reasons why a particular user has been targeted with a specific ad, including: characteristics attributed to them, their location, demographics, and optimisation criteria used in the targeting process (see the “Disclosures of Ad Targeting” graph for details).
This interpretation of the GDPR is based on the following arguments:
- According to the ICO, microtargeting by political parties and campaigns may qualify as automated decision making with sufficiently significant effects on individuals. Therefore additional safeguards under Articles 22 and 15 of the GDPR apply.
- According to Article 15 of the GDPR, a data subject has the right to obtain meaningful information about the logic involved in automated decision making, as well as the significance and the envisaged consequences of such processing, at least (i.e. not only!) when such profiling has legal or significant consequences (reference to Article 22).
- The GDPR does not give examples of cases when automated decision making will not qualify for safeguards provided by Article 22 but nevertheless will deserve a higher standard of transparency, according to Article 15. Because PMT can be qualified as high-risk data processing, it is reasonable to expect a higher standard of transparency in this case (even when “significant impact” of a specific political ad cannot be established).
Our recommendations regarding transparency and control tools, which should be provided by internet platforms to individual users (see: Effective control tools for users), are based on this interpretation of the GDPR.
Custom audience controversy: Is explicit consent a must?
Political advertisers can use Facebook’s custom audience feature to upload data about individuals they want to reach from other sources, such as app or website tracking mechanisms (Facebook’s own pixel), newsletter subscriptions, or offline lists of political supporters. Currently, this feature does not require users’ explicit consent (Facebook recently decided to give users the possibility to opt out at the level of individual advertisers). Facebook argues that — when delivering ads based on personal data uploaded by the advertiser — it acts only as a data processor (not a data controller) and therefore does not need a separate legal basis to process such data. This argument is backed by the fact that advertisers who want to use the custom audience feature are required to sign a data processing agreement with Facebook.
On the other hand, there are reasons to doubt whether Facebook, when delivering ads based on custom audiences, acts only as a data processor. In practice it would mean acting solely on behalf of the advertiser and not for Facebook’s own purposes. The first reason is related to timing. Facebook introduced data processing agreements and started using its “data processor” defence in 2018, after the Bavarian Data Protection Authority banned advertisers from using the custom audience feature (and uploading people’s data to Facebook) without explicit user consent. The Higher Administrative Court of the federal state of Bavaria upheld this decision. Before this German case, Facebook and its clients had no second thoughts about uploading users’ data for advertising purposes and enriching users’ marketing profiles with this data.
The second reason is related to wording. The data processing agreement explicitly mentions a number of activities that Facebook will not undertake with regard to users’ data uploaded by the advertiser. But it does not say that Facebook will only pursue purposes defined by the data controller (i.e. the advertiser who uploaded the data). There must be a good reason why Facebook’s lawyers left that grey area. It is possible that Facebook uses the uploaded data for its own purposes, such as improving its services or conducting research.
If data uploaded by an advertiser from another (off-Facebook) source at any point gets integrated with a user's profile, it effectively starts working for Facebook. If this is the case, Facebook does much more than processing custom audience data “on behalf of the advertiser” who uploaded the data and, therefore, needs a valid legal basis to do so. According to the Bavarian data protection authority, this basis should be nothing short of users’ explicit consent. Sharing data with Facebook can be a significant intrusion into users’ privacy and may lead to real harm. Therefore the advertiser’s or Facebook’s legitimate interest is not sufficient to justify such a transfer. We argue that users’ explicit consent should be collected by the advertiser before uploading their data to Facebook.
4.Fully-functional APIs for researchers
Based on our own and other researchers’ experience, we concluded that the transparency tools provided voluntarily by leading platforms have limited functionality, which does not allow for independent analysis of the collected data (see: Facebook transparency and control tools: the crash test). In an open letter to Facebook and Google, Mozilla and a cohort of independent researchers called for an ad library API which would enable advanced research and the development of tools to analyse political ads targeted at EU residents. The letter specifies a long list of requirements for a fully-functional API, including the following important design features:
- Unique identifiers associated with each advertisement and advertiser to allow for trend analysis over time and across platforms.
- All images, videos, and other content in a machine-readable format accessible via a programmatic interface.
- The ability to download a week’s worth of data in less than 12 hours and a day’s worth of data in less than two hours.
- Bulk downloading functionality of all relevant content. It should be feasible to download all historical data within one week.
- Search functionality by the text of the content itself, by the content author or by date range.
Based on our own experience with retrieving data from the Facebook Ad Library, we strongly support this set of recommendations. Moreover, we argue that minimum technical requirements for a fully functional API should be defined in a binding regulation (see: Binding regulation on PMT is necessary). Transparency in the world of data is mediated through and conditioned by interfaces. Therefore it should not be left to online platforms alone to shape them. While the law is not the best tool to shape proprietary interfaces offered by online platforms, it can formulate minimum technical requirements for their APIs.
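As an illustration of what such minimum requirements could look like in practice, below is a sketch of the minimal record a fully-functional ad archive API could return; the field names are our own proposal, derived from the list above, not an existing specification.

```python
# A sketch of the minimum record a fully-functional ad archive API could
# return, following the requirements above; field names are illustrative.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class ArchivedAd:
    ad_id: str                  # unique, stable across queries and platforms
    advertiser_id: str          # unique identifier of the ad buyer
    creative_url: str           # machine-readable media (image/video) location
    text: str                   # full ad copy, searchable by content
    first_seen: datetime        # start of the delivery period
    last_seen: datetime         # end of the delivery period
    spend_eur: float            # exact spend, not a coarse range
    targeting_criteria: dict = field(default_factory=dict)      # as selected by the buyer
    audience_demographics: dict = field(default_factory=dict)   # anonymised reach
```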
5.Effective control tools for users
We argue that Facebook users do not currently have control over their marketing profiles (see: Facebook transparency and control tools: the crash test). Neither can they verify the true reasons for being included in a particular target audience. In fact, users can only verify (check, delete, or correct) a short list of interests that the platform is willing to reveal. This is by no means an exhaustive list of the results of the constant behavioural observation and algorithmic analysis carried out on the platform. Users’ marketing profiles change and grow with every targeted ad campaign, which entails assigning them new characteristics. Some of these characteristics can be sensitive or reveal users’ vulnerabilities. There is a wealth of research and writing on this topic, including facts revealed by whistleblowers in the Cambridge Analytica scandal.
We acknowledge that it may be difficult for the platform to provide a meaningful explanation of the ad optimisation process, which exploits users’ behavioural patterns but does not necessarily lead to inferred data being named and documented in any database (see box below). At the same time, we argue that there are many choices made either by the advertiser or the platform in the targeting process that should be subject to users’ control via advertising and data management settings. In this context, there is clearly a need for more granular settings, as described below.
Predictive models and interpretability
Agata Foryciarz, PhD Candidate at Stanford University Computer Science Department
Tens of thousands of data points about a given individual are often used in ad optimisation. In current machine learning models, complex combinations of those data points are used to produce an output. While it is usually technically possible to re-trace those combinations and determine the inputs that contributed to the prediction, the process is tedious and its results are often uninformative. This makes it challenging to explain with confidence what factors had the most decisive impact on generating an ad recommendation for a particular individual.
Additionally, the process of tailoring a target audience may entail multiple steps, including inferring information about users from their behavioural patterns and arranging users into target groups. The advertisements shown to a particular user may depend on the number and activity of other users who fit the targeting criteria. Data about an individual may be aggregated with that of other individuals in order to make inferences about their hidden features that were absent in the dataset (such as political leanings or similarity to other people present in the dataset). These inferred representations about individuals then become inputs to the final prediction.
While the representations may correspond to specific users’ characteristics (such as membership in a marginalized community, employment status, or gender identity), they are often not created with the explicit goal of capturing this (possibly sensitive) information, and may not always be interpretable as such. Platforms usually create these representations in order to improve performance of their predictive models, rather than to increase model interpretability.
However, these complexities should not justify the opacity of predictive models. Users should have full access to their data, including inferred representations. Platforms should produce thorough documentation of their models and allow for external audits. Regulators should be able to verify and challenge algorithmic design choices that involve sensitive information about individuals, even in the presence of the complexities described above.
The graph below explains the functional relationship between user tools, which we propose in our recommendations:
Features increasing users’ control over their personal data:
- Access to the full (marketing) profile
Users should have access to their dynamic marketing profiles (namely, all characteristics attributed to them) as well as to the full list of inferred personal data, regardless of whether it was generated for marketing purposes. Users should have an easy option to delete or correct every piece of inferred or observed data.
- Opt-in for the use of behavioural data for advertising purposes
According to the GDPR, users’ behavioural (be that observed or inferred) data cannot be used for marketing purposes (be that political or not) without their explicit consent (see: Requirement for explicit consent). Therefore advertising settings should include an opt-in feature allowing advertisers to use such data. For transparency purposes, platforms should differentiate between data:
  - Observed in the course of a user’s activity on the platform (e.g. likes, engagement patterns, visited sites);
  - Tracked outside the platform (via pixels, cookies, and other trackers);
  - Inferred by the platform’s algorithms.
- Opt-in for political advertising
Advertising settings should include an opt-in feature for political ads.
- Opt-out to block specific campaigns/advertisers
Based on the list of all third parties that targeted them with ads on Facebook (available via a new interface, which we call the personal ad library), users should be able to block specific campaigns or all campaigns from particular advertisers.
After making their initial choices (e.g. consenting to political advertising run by official political parties in their country), users should also be able to easily modify their data management and advertising settings based on their experience. While this goal can also be achieved by offering users a more granular and intuitive interface, we argue that in the long run the best approach would be to offer users personal APIs and establish open protocols for communicating their choices.
Personal APIs and open protocols to communicate users’ choices
Imagine the marketing profile of a Facebook user with at least 10 years of activity on the platform. Tens of thousands of actions over this period would feed behavioural observation, and potentially thousands of characteristics would be attributed to this person. This is an extraordinary amount of data, difficult to parse in a simple graphical interface. Therefore we argue that individual users should also be offered access to their data via a programmable, fully-functional API.
A user’s personal API should enable connection with the user's own client (e.g. a trusted data management service) and make all data provided by the platform accessible to researchers. Such a tool would also enable more efficient management of personal data, independent from the platform’s proprietary interface. The same logic applies to advertising settings: While it may be time-consuming for users to manage their list of trusted entities and granular consents, a personal API would allow them to programme the advertising interface so that control settings change automatically if some results appear (“block every entity that uploaded my data to Facebook”) or during different times of the day (“office mode” or “night mode”).
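As a thought experiment, the sketch below shows what rule-driven advertising settings on top of such a personal API could look like. None of these endpoints exist today; they are hypothetical and only illustrate the kind of programmability we argue for.

```python
# A hypothetical sketch of rule-driven advertising settings on top of a
# personal API. None of these endpoints exist; they illustrate the kind of
# programmability we argue for.
import requests

PERSONAL_API = "https://platform.example/v1/me"  # hypothetical endpoint

def block_custom_audience_uploaders(token: str) -> None:
    """Implements the rule "block every entity that uploaded my data"."""
    headers = {"Authorization": f"Bearer {token}"}
    # Query the (hypothetical) log of custom audience uploads...
    uploads = requests.get(
        f"{PERSONAL_API}/custom-audience-uploads", headers=headers
    ).json()
    # ...and revoke each uploader's permission to target this user.
    for entry in uploads:
        requests.post(
            f"{PERSONAL_API}/advertisers/{entry['advertiser_id']}/block",
            headers=headers,
        )
```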
While a personal API would help users to manage their platform settings and access their own data, it would not correct the power imbalance between platforms and users. Technically speaking, the platform maintains full control over API structure and services that it offers to users. Therefore we argue that a more meaningful way to increase users’ control over their own data is to establish open protocols that communicate users’ choices (under the GDPR or other legislation, yet to come). Internet platforms should be forced to respect users’ choices about data management, as long as these choices are communicated to the platform with the use of open protocols. Establishing this independent language for platform-user communication could open space for new services and tools, such as data management clients or browser-based advertising settings (e.g. new Do Not Track feature).
New Do Not Track feature and ‘exception sets’
Alan Toner, independent policy analyst and former member of the W3C Tracking Protection Working Group on behalf of the Electronic Frontier Foundation
Consent mechanisms (also known as “consent management platforms”) have become ubiquitous due to the GDPR and are much criticised. Users are required to make privacy choices on each site individually, are often subject to manipulative design, and are not — and cannot be — properly informed about how their data will be used should they agree.
For a consent scheme to be credible it must meet basic usability requirements. First, it must function based on machine readability, so that users can set a general preference which will be complied with. And users should be called to make choices regarding exceptions only occasionally. Second, there must be standard definitions of who is allowed to use the data and for what purpose, so that the consequences of the agreement are clear.
A real improvement to the existing consent mechanism (unilaterally controlled by the platform) would be a GDPR-compliant version of the “old” Do Not Track feature. Do Not Track is a header signal which can be enabled in the browser to communicate to the server that the user does not consent to tracking. There is currently no legal obligation to comply with DNT, but the option to activate the signal exists in almost all browsers (but not Safari). DNT is a superior method for expressing user preferences because it is persistent and does not rely on a central register of identity information.
A new Do Not Track feature should enable users to set criteria for when a site can, or must, ask for an exception from their general preference (“do not track me!”). For example, an exception request might be permitted only when the number of visits to a site exceeds a given threshold. And it should only be possible to ask if the site’s processing practices match established standards. Users may also decide that sites that want to target ads at them using their activity on the specific site can do so without making an exception request.
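For illustration, here is a minimal sketch of how a site could honour the (real) DNT request header, extended with the hypothetical visit-count exception rule described above; the threshold and responses are purely illustrative.

```python
# A minimal sketch: honour the DNT header and only allow an exception
# request once a (hypothetical) visit-count threshold is exceeded.
from flask import Flask, request, session

app = Flask(__name__)
app.secret_key = "change-me"  # needed for the per-user visit counter

VISIT_THRESHOLD = 5  # illustrative: ask for an exception only after 5 visits

@app.route("/")
def index():
    session["visits"] = session.get("visits", 0) + 1
    if request.headers.get("DNT") != "1":
        return "No general Do Not Track preference set."
    if session["visits"] > VISIT_THRESHOLD:
        return "DNT set; the site may now ask for an exception."
    return "DNT set; tracking disabled and no exception request allowed yet."
```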
Users of the new Do Not Track feature could also choose to rely on “exception sets” managed by trusted third parties (“TTPs”). These TTPs would collect sites in a given category, check their compliance with agreed processing practices, grade them, and package them in a manner which matches user expectations. A user who values news could subscribe to an exception package allowing news producers to use coarse location data together with context to target ads at them. That package of publications would be editable via an interface, enabling the exclusion of default components or the addition of extra sources (e.g. a network of popular blogs or forums).
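An exception set could be modelled roughly as follows; the structure, field names, and the “SomeTTP” curator are illustrative assumptions only.

```python
from dataclasses import dataclass, field

@dataclass
class ExceptionSet:
    """A hypothetical package of vetted sites curated by a trusted third party (TTP)."""
    name: str
    curator: str
    permitted_practices: tuple  # e.g. coarse location data plus context
    sites: set = field(default_factory=set)

    def exclude(self, site: str) -> None:
        self.sites.discard(site)  # user removes a default component

    def include(self, site: str) -> None:
        self.sites.add(site)      # user adds e.g. a favourite blog network


news_package = ExceptionSet(
    name="quality-news",
    curator="SomeTTP",
    permitted_practices=("coarse_location", "context"),
    sites={"daily.example", "herald.example"},
)
news_package.exclude("daily.example")
news_package.include("blogs.example")
```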
As is the case with common security issues, the browser is best positioned to provide users with the support that they need to protect their privacy. Browsers could also make it easier for users to think through their role in the production of public goods such as news. Browsers could offer the activation of exception packages for sites designated as having a high social value during the browser installation process. Such a system would offer publishers a strong motivation to converge on reasonable and intelligible uniform processing practices, and provide users with a trust framework which would make consent meaningful.
6. Limitations on PMT to protect societal interests
Transparency and control tools offered to researchers and the general public can certainly increase platforms’ accountability in the area of PMT, but they do not address all concerns related to the use of this technology (see: Political microtargeting as (part of) the problem). Different responses are needed to address societal harms, such as the polarisation of public debate or the manipulation of voters’ opinions. While the societal impact of PMT was outside the scope of our research, this issue has been analysed by other researchers, including Frederik Zuiderveen Borgesius, Judith Möller, Sanne Kruikemeier, and others from the IViR Institute at the University of Amsterdam. In the conclusion of their paper “Online Political Microtargeting: Promises and Threats for Democracy,” these authors make the following claims:
[...] microtargeting also brings disadvantages for democracy. It can invade people’s privacy, and could be used to exclude or manipulate people. For example, microtargeting enables a political party to, misleadingly, present itself as a different one-issue party to different people. [...] A risk for public opinion is that the priorities of political parties may become opaque. Moreover, political discussions may become fragmented when different voter groups focus on different topics. These risks are serious, and if they materialise, they threaten democracy.
In this context, we argue that the use of PMT should be subject to specific limitations in order to protect societal interests (interests which cannot be reduced to the sum of individual harms). We acknowledge that this position will require more research and deliberation. However, for the sake of debate, we have developed the following propositions:
- Prohibiting PMT based on characteristics which expose our mental or physical vulnerabilities (e.g. depression, anxiety, addiction, illness);
- Mandating that internet platforms use and publish fairness criteria for their ad optimisation, particularly with regard to PMT;
- Classifying PMT as a high-risk application of so-called Artificial Intelligence (e.g. machine learning or other algorithms used to make automated decisions that might have a significant, harmful impact on humans), and therefore subjecting it to independent auditing procedures controlled by a relevant regulator.
7. Constraints on financing online political campaigns
As we established in our case study, effective oversight of campaign spending on political ads on online platforms is very difficult given the current legal framework in Poland. Acknowledging that we have only researched the Polish context and that regulating national and local elections falls under the competence of Member States (rather than the EU institutions), we present the following recommendations:
- Financial reports of election committees and political parties should be filed in an electronic format, preferably open and searchable (not scans);
- Financial statements should include the direct invoice/transaction between the political party and the online platform and should indicate at least: the name of the party or committee that paid for the ad; the title of the ad selected by the advertiser; the ad ID from the Facebook Ad Library; and the amount spent (in the case of a recurring ad, emission times and amounts). All disclosures available for an ad in the public ad library (see: Disclosures of Ad Targeting) should also be part of the invoice (an illustrative record is sketched after this list);
- Identical information should be prepared and presented to the oversight body when the party/committee uses the services of an intermediary (e.g. a media agency);
- Expenses for online ads should be classified in two separate categories: one for ads on online platforms or any other services that enable targeting and microtargeting, and one for online ads that are contextual or aimed at unspecified groups of users (by “unspecified,” we mean a group of people that cannot be assigned particular attributes based on personal data);
- Financial reports should separately identify services purchased from an entity or an individual that entail sharing information about the party’s or committee’s activities, their messages, or any other political content on the profiles and Facebook pages that they manage.
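To make the invoice-level recommendation concrete, the sketch below shows what a single machine-readable report entry might contain. All field names and values are placeholders of our own devising, not an existing reporting format.

```python
# Hypothetical structure for one entry in an election committee's electronic
# financial report, covering the fields recommended above. Every value is a
# placeholder, not real data.
report_entry = {
    "payer": "Example Election Committee",
    "platform": "Facebook",
    "ad_title": "Title selected by the advertiser",
    "ad_library_id": "<ID from the Facebook Ad Library>",
    "amount_spent": {"currency": "PLN", "value": 1000.00},
    "emissions": [  # for a recurring ad: each emission with its own spend
        {"period": "2019-10-01/2019-10-07", "value": 600.00},
        {"period": "2019-10-08/2019-10-13", "value": 400.00},
    ],
    "intermediary": None,  # or the media agency used, per the recommendation above
    "targeting_disclosures": "all disclosures shown in the public ad library",
}
```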
8. Binding regulation on PMT is necessary
In line with observations made by other researchers, we conclude that the transparency tools offered voluntarily by Facebook do not allow for independent scrutiny of targeted political ads. In addition, Facebook’s privacy and advertising settings do not allow users to control the use of their (inferred) data for advertising purposes, nor to express the choices they are entitled to make under the GDPR. We also argue that both the general public (including researchers) and individual users should have access to the respective transparency and control tools via fully functional APIs, and not just proprietary interfaces designed by the platform. Finally, we argue that the use of targeted political advertising should be limited in order to protect societal interests.
Does this mean the EU should specifically regulate the use of PMT by online platforms? Our answer is “yes,” although most of the transparency measures and some of the limitations we propose could just as easily apply to behavioural targeting in general.
We acknowledge that flaws in existing privacy and advertising settings could, to some extent, be remedied by stronger enforcement of the GDPR and by authoritative guidelines on its interpretation in the PMT context. Our other recommendations, however, require moving from self-regulation toward binding regulation. Having observed the policies and practices of online platforms over the last decade, we have learned that platforms will not voluntarily reveal any information that could threaten their business model. As long as platforms’ top priority is pleasing shareholders, they will not prioritise users’ rights or the public interest unless forced to by strong regulation.
We argue that platforms play a key role in delivering targeted political ads; their actions and capabilities should therefore not be hidden in the shadows, nor should these entities be left alone to determine the scope and format of disclosures related to PMT. It is for the European regulator to define and enforce these disclosures, together with minimum technical standards for APIs that allow independent scrutiny of targeted political ads. The European regulator must also introduce limitations on the use of PMT to protect societal interests and ensure that the same rules apply across the digital single market.
In Europe, there is currently neither clarity nor a common standard governing how much transparency into online platforms’ advertising practices is required. This observation, which calls for legal intervention and more harmonisation, is supported by a study commissioned by the European Commission in 2018:
The main legal challenge, as apparent from the diversity of examples documented during the desk research, is that there is an abundance of disclosure practices fragmented across devices, jurisdictions and providers, while the legislative framework is open as to how and how much disclosure must be provided.
At the same time, we acknowledge that the EU does not have the competence to set rules for political parties when it comes to financing (online) advertising campaigns and the related reporting obligations. While European regulation is the best tool to increase the transparency and accountability of online platforms in the context of PMT, new obligations imposed on political parties will be left to Member States.
Recommended reading
Reports and investigations:
- SHARE LAB (V. Joler), Facebook Research, August 2016 - May 2017
- The Guardian, The Cambridge Analytica Files
- UK Information Commissioner’s Office, Democracy Disrupted? Personal Information and Political Influence, July 2018
- D. Ghosh, B. Scott, Digital Deceit II. A Policy Agenda to Fight Disinformation on the Internet, September 2018
- ProPublica, What We Learned From Collecting 100,000 Targeted Facebook Ads, December 2018
- Electoral Reform Society, Reining in the Political ‘Wild West’. Campaign Rules for the 21st Century, February 2019
- Tactical Tech, Personal Data: Political Persuasion, March 2019
- Panoptykon (K. Szymielewicz), Three layers of your digital profile, March 2019
- Oxford Internet Institute, The Market of Disinformation, October 2019
Assessments and recommendations:
- Mozilla, Facebook and Google: This Is What an Effective Ad Archive API Looks Like, March 2019
- European Commission, Annual self-assessment reports of signatories to the Code of Practice on Disinformation, October 2019
- UK Centre for Data Ethics and Innovation, Review of Online Targeting, February 2020
Selected academic papers:
- A. Andreou, G. Venkatadri, O. Goga, K. Gummadi, P. Loiseau, A. Mislove, Investigating Ad Transparency Mechanisms in Social Media: A Case Study of Facebook’s Explanations, January 2018
- F. Zuiderveen Borgesius, J. Möller, S. Kruikemeier, R. Ó Fathaigh, K. Irion, T. Dobber, B. Bódó, C. H. de Vreese, Online Political Microtargeting: Promises and Threats for Democracy, February 2018
- A. Andreou, M. Silva, F. Benevenuto, O. Goga, P. Loiseau, A. Mislove, Measuring the Facebook Advertising Ecosystem, December 2018
- R. Deibert, The Road to Digital Unfreedom: Three Painful Truths About Social Media, January 2019
- M. Ali, P. Sapiezynski, M. Bogen, A. Korolova, A. Mislove, A. Rieke, Discrimination through optimization: How Facebook’s ad delivery can lead to skewed outcomes, September 2019
- P. Leerssen, J. Ausloos, B. Zarouali, N. Helberger, C. H. de Vreese, Platform ad archives: promises and pitfalls, October 2019
- C. Bennett, S. Oduro Marfo, Privacy, Voter Surveillance and Democratic Engagement: Challenges for Data Protection Authorities, October 2019
- T. Dobber, R. Ó Fathaigh, F. Zuiderveen Borgesius, The regulation of online political micro-targeting in Europe, December 2019
- M. Ali, P. Sapiezynski, A. Korolova, A. Mislove, A. Rieke, Ad Delivery Algorithms: The Hidden Arbiters of Political Messaging, December 2019
- M. Silva, L. Santos de Oliveira, A. Andreou, P. Olmo Vaz de Melo, O. Goga, F. Benevenuto, Facebook Ads Monitor: An Independent Auditing System for Political Ads on Facebook, January 2020
ABOUT US
Who (really) targets you is a project by Panoptykon Foundation, ePaństwo Foundation, and SmartNet Research & Solutions (provider of Sotrender), funded by Civitates. Panoptykon watches over the government and corporations and fights for freedom and privacy online; ePaństwo opens up public data to make authorities more transparent; Sotrender provides research and analytical tools. We teamed up to cast more light on political advertising practices on Facebook.