Tuesday, 05 February 2019 13:57

Facebook Algorithms and Personal Data

[This article was originally published on pewinternet.org, written by Paul Hitlin and Lee Rainie. Uploaded by AIRS Member: Jasper Solander]

About half of Facebook users say they are not comfortable when they see how the platform categorizes them, and 27% maintain the site's classifications do not accurately represent them

Most commercial sites, from social media platforms to news outlets to online retailers, collect a wide variety of data about their users’ behaviors. Platforms use this data to deliver content and recommendations based on users’ interests and traits and to allow advertisers to target ads to relatively precise segments of the public. But how well do Americans understand these algorithm-driven classification systems, and how much do they think their lives line up with what gets reported about them? As a window into this hard-to-study phenomenon, a new Pew Research Center survey asked a representative sample of users of the nation’s most popular social media platform, Facebook, to reflect on the data that had been collected about them. (See more about why we study Facebook in the box below.)

Many Facebook users say they do not know the platform classifies their interests, and roughly half are not comfortable with being categorized

Facebook makes it relatively easy for users to find out how the site’s algorithm has categorized their interests via a “Your ad preferences” page. Overall, however, 74% of Facebook users say they did not know that this list of their traits and interests existed until they were directed to their page as part of this study.

When directed to the “ad preferences” page, the large majority of Facebook users (88%) found that the site had generated some material for them. A majority of users (59%) say these categories reflect their real-life interests, while 27% say they are not very or not at all accurate in describing them. And once shown how the platform classifies their interests, roughly half of Facebook users (51%) say they are not comfortable that the company created such a list.

The survey also asked targeted questions about two of the specific listings that are part of Facebook’s classification system: users’ political leanings, and their racial and ethnic “affinities.”

In both cases, more Facebook users say the site’s categorization of them is accurate than say it is inaccurate. At the same time, the findings show that portions of users think Facebook’s listings for them are not on the mark.

When it comes to politics, about half of Facebook users (51%) are assigned a political “affinity” by the site. Among those who are assigned a political category by the site, 73% say the platform’s categorization of their politics is very or somewhat accurate, while 27% say it describes them not very or not at all accurately. Put differently, 37% of Facebook users are both assigned a political affinity and say that affinity describes them well, while 14% are both assigned a category and say it does not represent them accurately.
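The "put differently" figures come from multiplying the share of users who are assigned a political label by the shares within that group. A minimal Python sketch of that arithmetic, using only the percentages reported above; the function name is illustrative, and the published figures were presumably computed from unrounded weighted data, so agreement is only to the nearest point:

```python
def share_of_all_users(assigned_share, share_within_assigned):
    """Convert a share among labeled users into a share of all Facebook users."""
    return assigned_share * share_within_assigned

assigned = 0.51      # users assigned a political affinity
accurate = 0.73      # of those, say the label is very or somewhat accurate
inaccurate = 0.27    # of those, say it is not very or not at all accurate

assigned_and_accurate = share_of_all_users(assigned, accurate)
assigned_and_inaccurate = share_of_all_users(assigned, inaccurate)

print(f"Assigned and accurate:   {assigned_and_accurate:.0%}")
print(f"Assigned and inaccurate: {assigned_and_inaccurate:.0%}")
```

Rounded to whole points, the two products match the 37% and 14% quoted in the text.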

For some users, Facebook also lists a category called “multicultural affinity.” According to third-party online courses about how to target ads on Facebook, this listing is meant to designate a user’s “affinity” with various racial and ethnic groups, rather than assign them to groups reflecting their actual race or ethnic background. Only about a fifth of Facebook users (21%) say they are listed as having a “multicultural affinity.” Overall, 60% of users who are assigned a multicultural affinity category say they do in fact have a very or somewhat strong affinity for the group to which they are assigned, while 37% say their affinity for that group is not particularly strong. Some 57% of those who are assigned to this category say they do in fact consider themselves to be a member of the racial or ethnic group to which Facebook assigned them.

These are among the findings from a survey of a nationally representative sample of 963 U.S. Facebook users ages 18 and older conducted Sept. 4 to Oct. 1, 2018, on GfK’s KnowledgePanel.

Social media users say it is easy for sites to identify their race and interests

A second survey, conducted on Pew Research Center’s American Trends Panel among a representative sample of all U.S. adults who use social media – including Facebook and other platforms like Twitter and Instagram – gives broader context to the insights from the Facebook-specific study.

This second survey, conducted May 29 to June 11, 2018, reveals that social media users generally believe it would be relatively easy for the platforms they use to determine key traits about them based on the data they have amassed about their behaviors. Majorities of social media users say it would be very or somewhat easy for these platforms to determine their race or ethnicity (84%), their hobbies and interests (79%), their political affiliation (71%) or their religious beliefs (65%). Some 28% of social media users believe it would be difficult for these platforms to figure out their political views, nearly matching the share of Facebook users who are assigned a political listing but believe that listing is not very or not at all accurate.

Why we study Facebook

Pew Research Center chose to study Facebook for this research on public attitudes about digital tracking systems and algorithms for a number of reasons. For one, the platform is used by a considerably bigger number of Americans than other popular social media platforms like Twitter and Instagram. Indeed, its global user base is bigger than the population of many countries. Facebook is the third most trafficked website in the world and fourth most in the United States. Along with Google, Facebook dominates the digital advertising market, and the firm itself elaborately documents how advertisers can micro-target audience segments. In addition, the Center’s studies have shown that Facebook holds a special and meaningful place in the social and civic universe of its users.

The company allows users to view at least a partial compilation of how it classifies them on the page called “Your ad preferences.” It is relatively simple to find this page, which allows researchers to direct Facebook users to their preferences page and ask them about what they see.

Users can find their own preferences page by following the directions in the Methodology section of this report. They can opt out of being categorized this way for ad targeting, but they will still get other kinds of less-targeted ads on Facebook.

Most Facebook users say they are assigned categories on their ad preferences page

A substantial share of websites and apps track how people use digital services, and they use that data to deliver services, content or advertising targeted to those with specific interests or traits. Typically, the precise workings of the proprietary algorithms that perform these analyses are unknowable outside the companies who use them. At the same time, it is clear the process of algorithmically assessing users and their interests involves a lot of informed guesswork about the meaning of a user’s activities and how those activities add up to elements of a user’s identity.

Facebook, the most prominent social network in the world, analyzes scores of different dimensions of its users’ lives that advertisers are then invited to target. The company allows users to view at least a partial compilation of how it classifies them on the page called “Your ad preferences.” The page, which is different for each user, displays several types of personal information about the individual user, including “your categories” – a list of a user’s purported interests crafted by Facebook’s algorithm. The categorization system takes into account data provided by users to the site and their engagement with content on the site, such as the material they have posted, liked, commented on and shared.

These categories might also include insights Facebook has gathered from a user’s online behavior outside of the Facebook platform. Millions of companies and organizations around the world have activated Facebook pixel on their websites. The Facebook pixel records the activity of Facebook users on these websites and passes this data back to Facebook. This information then allows the companies and organizations who have activated the pixel to better target advertising to their website users who also use the Facebook platform. Beyond that, Facebook has a tool allowing advertisers to link offline conversions and purchases to users – that is, track the offline activity of users after they saw or clicked on a Facebook ad – and find audiences similar to people who have converted offline. (Users can opt out of having their information used by this targeting feature.)

Overall, the array of information can cover users’ demographics, social networks and relationships, political leanings, life events, food preferences, hobbies, entertainment interests and the digital devices they use. Advertisers can select from these categories to target groups of users for their messages. The existence of this material on the Facebook profile for each user allows researchers to work with Facebook users to explore their own digital portrait as constructed by Facebook.

The Center’s representative sample of American Facebook users finds that 88% say they are assigned categories in this system, while 11% say that after they are directed to their ad preferences page they get a message saying, “You have no behaviors.”

A majority of Facebook users have 10 or more categories listed on their ad preferences page

Some six-in-ten Facebook users report their preferences page lists either 10 to 20 (27%) or 21 or more (33%) categories for them, while 27% note their list contains fewer than 10 categories.

Those who are heavier users of Facebook and those who have used the site the longest are more likely to be listed in a larger number of personal interest categories. Some 40% of those who use the platform multiple times a day are listed in 21 or more categories, compared with 16% of those who are less-than-daily users. Similarly, those who have been using Facebook for 10 years or longer are more than twice as likely as those with less than five years of experience to be listed in 21 or more categories (48% vs. 22%).

74% of Facebook users say they did not know about the platform’s list of their interests

Most Facebook users do not know the platform lists their interests for advertisers, and half are not comfortable with these lists

About three-quarters of Facebook users (74%) say they did not know this list of categories existed on Facebook before being directed to the page in the Center’s survey, while 12% say they were aware of it. Put differently, 84% of those who reported that Facebook had categorized their interests did not know about it until they were directed to their ad preferences page.

When asked how accurately they feel the list represents them and their interests, 59% of Facebook users say the list very (13%) or somewhat (46%) accurately reflects their interests. Meanwhile, 27% of Facebook users say the list not very (22%) or not at all accurately (5%) represents them.

Some Facebook users do not agree with the political label the platform assigns them

Yet even with a majority of users noting that Facebook at least somewhat accurately assesses their interests, about half of users (51%) say they are not very or not at all comfortable with Facebook creating this list about their interests and traits. This means that 58% of those whom Facebook categorizes are not generally comfortable with that process. Conversely, 5% of Facebook users say they are very comfortable with the company creating this list and another 31% declare they are somewhat comfortable.
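The 84% and 58% figures restate shares of all Facebook users as shares of the 88% who say Facebook has categorized them. The sketch below reproduces that conversion in Python. It assumes that everyone counted in the 74% and 51% belongs to the categorized group, which the text implies but does not state outright; the function name is illustrative.

```python
def share_among_categorized(share_of_all_users, categorized_share=0.88):
    """Re-express a share of all users as a share of users with categories.

    Assumes the people counted in share_of_all_users are all in the
    categorized group (an assumption, not stated explicitly in the report).
    """
    return share_of_all_users / categorized_share

unaware_of_list = share_among_categorized(0.74)
not_comfortable = share_among_categorized(0.51)

print(f"Unaware, among categorized users:         {unaware_of_list:.0%}")
print(f"Not comfortable, among categorized users: {not_comfortable:.0%}")
```

Both results round to the figures given in the text, so the report's two sets of percentages are consistent with each other.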

There is clear interplay between users’ comfort with the Facebook traits-assignment process and the accuracy they attribute to the process. About three-quarters of those who feel the listings for them are not very or not at all accurate (78%) say they are uncomfortable with lists being created about them, compared with 48% of those who feel their listing is accurate.

Facebook’s political and ‘racial affinity’ labels do not always match users’ views

It is relatively common for Facebook to assign political labels to its users. Roughly half (51%) of those in this survey are given such a label. Those assigned a political label are roughly equally divided between those classified as liberal or very liberal (34%), conservative or very conservative (35%) and moderate (29%).

Among those who are assigned a label on their political views, close to three-quarters (73%) say the listing very accurately or somewhat accurately describes their views. Meanwhile, 27% of those given political classifications by Facebook say that label is not very or not at all accurate.

There is some variance between what users say about their political ideology and what Facebook attributes to them. Specifically, self-described moderate Facebook users are more likely than others to say they are not classified accurately. Among those assigned a political category, some 20% of self-described liberals and 25% of those who describe themselves as conservative say they are not described well by the labels Facebook assigns to them. But that share rises to 36% among self-described moderates.

In addition to categorizing users’ political views, Facebook’s algorithm assigns some users to groups by “multicultural affinity,” which the firm says it assigns to people whose Facebook activity “aligns with” certain cultures. About one-in-five Facebook users (21%) say they are assigned such an affinity.

The use of multicultural affinity as a tool for advertisers to exclude certain groups has created controversies. Following pressure from Congress and investigations by ProPublica, Facebook signed an agreement in July 2018 with the Washington State Attorney General saying it would no longer let advertisers unlawfully exclude users by race, religion, sexual orientation and other protected classes.

In this survey, 43% of those given an affinity designation are said by Facebook’s algorithm to have an interest in African American culture, and the same share (43%) is assigned an affinity with Hispanic culture. One-in-ten are assigned an affinity with Asian American culture. Facebook’s detailed targeting tool for ads does not offer affinity classifications for any other cultures in the U.S., including Caucasian or white culture.

Of those assigned a multicultural affinity, 60% say they have a “very” or “somewhat” strong affinity for the group they were assigned, compared with 37% who say they do not have a strong affinity or interest. And 57% of those assigned a group say they consider themselves to be a member of that group, while 39% say they are not members of that group.

This report is a collaborative effort based on the input and analysis of the following individuals. Find related reports online at pewresearch.org/internet.

Primary researcher

Paul Hitlin, Senior Researcher
Lee Rainie, Director, Internet and Technology Research

Research team

Aaron Smith, Associate Director, Research
Kenneth Olmstead, Research Associate
Andrew Perrin, Research Analyst
Andrea Caumont, Social Media Editor

Editorial and graphic design

Margaret Porteus, Information Graphics Designer
David Kent, Copy Editor

Communications and web publishing

Shawnee Cohn, Communications Manager
Sara Atske, Assistant Digital Producer

Facebook user survey

The analysis in this report is based on a nationally representative survey conducted from Sept. 4 to Oct. 1, 2018, among a sample of 963 U.S. adults ages 18 years and older who have a Facebook account. The margin of error for the full sample is plus or minus 3.4 percentage points.

The survey was conducted by the GfK Group in English and Spanish using KnowledgePanel, its nationally representative online research panel. KnowledgePanel members are recruited through probability sampling methods and include those with internet access and those who did not have internet access at the time of their recruitment (KnowledgePanel provides internet access for those who do not have it, and if needed, a device to access the internet when they join the panel). A combination of random-digit dialing (RDD) and address-based sampling (ABS) methodologies has been used to recruit panel members (in 2009 KnowledgePanel switched its sampling methodology for recruiting members from RDD to ABS).

KnowledgePanel continually recruits new panel members throughout the year to offset panel attrition as people leave the panel. All active members of the GfK panel with an active Facebook account were eligible for inclusion in this study. In all, 1,419 panelists were invited to take part in the survey. All sampled members received an initial email that notified them of the survey and provided a link to the survey questionnaire. Follow-up reminders were sent as needed to those who had not responded. In total, 1,040 people completed the survey. Of those, 963 cases were determined to be valid and included in the final analyses. The other 77 cases were excluded due to evidence of speeding through the survey or because the respondent was not able to log in to Facebook or find the right page.

To complete the survey, respondents were asked to log in to their Facebook account and navigate to the page containing their Facebook ad categories. The survey then asked them to answer a series of questions about the contents of that page. All findings in this study are based on these self-reported results – the Center did not gain access to users’ Facebook accounts or collect any additional data (whether passively or otherwise) about users’ Facebook accounts beyond what was self-reported in the survey.

The process for finding the page of categories that Facebook has developed about a given user may differ depending on the device being used to access Facebook.

Respondents completing the survey on a laptop or desktop computer were instructed to follow these steps:

  1. Log on to your Facebook.com account.
  2. On the upper right side of the screen click on the upside-down black triangle. You will get a dropdown menu. Click on “Settings” near the bottom of the menu.
  3. On the “General Account Settings” page, click on “ads” on the lower part of the left column.
  4. This should put you on a page called “Your ad preferences.” Click on the tab of this page called “Your information.”
  5. You will see two choices right under “Your information,” one that says “About you” and one that says “Your categories.” Click on “Your categories.”
  6. Once you select “your categories,” you should see one of two options:
    1. A list of boxes with information about your hometown, birthday, interests, etc. You might need to select “see more” to see all the categories on your list.
    2. A message that says you do not have any “behaviors” listed. (This will likely appear if you have previously changed your privacy settings to prevent Facebook from collecting certain information.)

Respondents completing the survey on a mobile device were instructed to follow these steps:

  1. Open the Facebook app and sign in or open a web browser and navigate to Facebook.com and log in.
  2. Near the top or bottom of your screen (depending on your device), you will see three horizontal lines. Click those lines.
  3. On the next screen, scroll down and select “settings.”
  4. On the next screen, scroll down and select “ad preferences.”
  5. On the next screen, select the option that reads “your information.”
  6. Select the option that reads “Review and Manages your Categories.”
  7. You should see a screen with one of two options:
    1. A list of boxes with information about your hometown, birthday, interests, etc. You might need to select “see more” to see all the categories on your list.
    2. A message that says you do not have any “behaviors” listed. (This will likely appear if you have previously changed your privacy settings to prevent Facebook from collecting certain information.)

The final sample of 963 adults was weighted using an iterative technique that matches gender, age, race, Hispanic origin, education, region, household income, home ownership status and metropolitan area to the parameters of the Census Bureau’s March 2018 Current Population Survey (CPS) Supplement Data. This weight is multiplied by an initial sampling or base weight that corrects for differences in the probability of selection of various segments of GfK’s sample and by a panel weight that adjusts for any biases due to nonresponse and noncoverage at the panel recruitment stage (using all of the parameters described above).
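The report does not name the iterative technique, but matching a sample to several population margins at once is commonly done by raking (iterative proportional fitting). The Python sketch below illustrates that general idea only. The respondents, variables and target shares are hypothetical toy values, not GfK's or the Census Bureau's, and the sketch omits the base and panel weights described above.

```python
def rake(respondents, targets, max_iter=100, tol=1e-9):
    """Rake weights so weighted category shares match the target shares.

    respondents: list of dicts, e.g. {"gender": "f", "age": "18-49"}
    targets: {variable: {category: population share}}, shares sum to 1
    Every target category must appear in the sample at least once.
    """
    weights = [1.0] * len(respondents)
    for _ in range(max_iter):
        largest_change = 0.0
        for variable, shares in targets.items():
            total = sum(weights)
            # Current weighted total for each category of this variable.
            current = {category: 0.0 for category in shares}
            for person, weight in zip(respondents, weights):
                current[person[variable]] += weight
            # Rescale so this variable's weighted shares hit the targets.
            for i, person in enumerate(respondents):
                category = person[variable]
                factor = shares[category] * total / current[category]
                largest_change = max(largest_change, abs(factor - 1.0))
                weights[i] *= factor
        if largest_change < tol:
            break  # all margins already match; further passes change nothing
    return weights

# Hypothetical toy sample and targets, for illustration only.
sample = [
    {"gender": "f", "age": "18-49"},
    {"gender": "f", "age": "18-49"},
    {"gender": "f", "age": "50+"},
    {"gender": "m", "age": "18-49"},
    {"gender": "m", "age": "50+"},
]
targets = {
    "gender": {"f": 0.5, "m": 0.5},
    "age": {"18-49": 0.55, "50+": 0.45},
}
weights = rake(sample, targets)

def weighted_share(variable, category):
    """Weighted share of the toy sample falling in one category."""
    matching = sum(w for p, w in zip(sample, weights) if p[variable] == category)
    return matching / sum(weights)

print(round(weighted_share("gender", "f"), 3), round(weighted_share("age", "18-49"), 3))
```

Each pass fixes one variable's margins and slightly disturbs the others, so the loop repeats until no adjustment factor differs meaningfully from 1.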

Sampling errors and statistical tests of significance take into account the effect of weighting at each of these stages.
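One way to see the effect of weighting is to compare the reported margin of error (plus or minus 3.4 points for 963 respondents) with the margin a simple random sample of the same size would give. The Python sketch below does this under conventional assumptions that the report does not state here: a 95% confidence level (z = 1.96) and a proportion of 0.5. The implied design effect is an inference from rounded figures, not a number published in the report.

```python
import math

def srs_margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

sample_size = 963
reported_margin = 0.034  # plus or minus 3.4 percentage points, from the report

srs_margin = srs_margin_of_error(sample_size)
# Ratio of variances: how much weighting appears to widen the interval.
implied_design_effect = (reported_margin / srs_margin) ** 2

print(f"Simple-random-sample margin: {srs_margin * 100:.1f} points")
print(f"Implied design effect:       {implied_design_effect:.2f}")
```

The simple-random-sample margin comes out a little under the reported 3.4 points, which is the direction one would expect once weighting is taken into account.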

In addition to sampling error, one should bear in mind that question-wording and practical difficulties in conducting surveys can introduce error or bias into the findings of opinion polls.

American Trends Panel survey

The American Trends Panel (ATP), created by Pew Research Center, is a nationally representative panel of randomly selected U.S. adults recruited from landline and cellphone random-digit-dial (RDD) surveys. Panelists participate via monthly self-administered web surveys. Panelists who do not have internet access are provided with a tablet and wireless internet connection. The panel is being managed by GfK.

Data in this report are drawn from the panel wave conducted May 29-June 11, 2018, among 4,594 respondents. The margin of sampling error for the full sample of 4,594 respondents is plus or minus 2.4 percentage points.

Members of the American Trends Panel were recruited from several large, national landline and cellphone RDD surveys conducted in English and Spanish. At the end of each survey, respondents were invited to join the panel. The first group of panelists was recruited from the 2014 Political Polarization and Typology Survey, conducted Jan. 23 to March 16, 2014. Of the 10,013 adults interviewed, 9,809 were invited to take part in the panel and a total of 5,338 agreed to participate. The second group of panelists was recruited from the 2015 Pew Research Center Survey on Government, conducted Aug. 27 to Oct. 4, 2015. Of the 6,004 adults interviewed, all were invited to join the panel, and 2,976 agreed to participate. The third group of panelists was recruited from a survey conducted from April 25 to June 4, 2017. Of the 5,012 adults interviewed in the survey or pretest, 3,905 were invited to take part in the panel and a total of 1,628 agreed to participate.

The ATP data were weighted in a multistep process that begins with a base weight incorporating the respondents’ original survey selection probability and the fact that in 2014 some panelists were subsampled for invitation to the panel. Next, an adjustment was made for the fact that the propensity to join the panel and remain an active panelist varied across different groups in the sample. The final step in the weighting uses an iterative technique that aligns the sample to population benchmarks on a number of dimensions. Gender, age, education, race, Hispanic origin, and region parameters come from the U.S. Census Bureau’s 2016 American Community Survey. The county-level population density parameter (deciles) comes from the 2010 U.S. decennial census. The telephone service benchmark comes from the July-December 2016 National Health Interview Survey and is projected to 2017. The volunteerism benchmark comes from the 2015 Current Population Survey Volunteer Supplement. The party affiliation benchmark is the average of the three most recent Pew Research Center general public telephone surveys. The internet access benchmark comes from the 2017 ATP Panel Refresh Survey. Respondents who did not previously have internet access are treated as not having internet access for weighting purposes. Sampling errors and statistical tests of significance take into account the effect of weighting. Interviews are conducted in both English and Spanish, but the Hispanic sample in the ATP is predominantly native-born and English speaking.

In addition to sampling error, one should bear in mind that question-wording and practical difficulties in conducting surveys can introduce error or bias into the findings of opinion polls.

The May 2018 wave had a response rate of 84% (4,594 responses among 5,486 individuals in the panel). Taking account of the combined, weighted response rate for the recruitment surveys (10.0%) and attrition from panel members who were removed at their request or for inactivity, the cumulative response rate for the wave is 2.4%.
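The wave response rate can be checked directly from the counts given. The Python sketch below also shows how the stages might combine, assuming the cumulative rate is a simple product of stage rates. The report does not spell out its formula, so the "remaining factor" is only an implied value covering attrition and anything else not itemized above.

```python
responses = 4594
panel_members = 5486

wave_response_rate = responses / panel_members
print(f"Wave response rate: {wave_response_rate:.0%}")

recruitment_response_rate = 0.100   # combined, weighted rate from the report
cumulative_response_rate = 0.024    # as reported

# Assumption: cumulative = recruitment rate x remaining factor x wave rate.
implied_remaining_factor = cumulative_response_rate / (
    recruitment_response_rate * wave_response_rate
)
print(f"Implied remaining factor: {implied_remaining_factor:.2f}")
```

Under that assumption, somewhat under a third of the people who responded to the recruitment surveys were still active panelists at the time of this wave.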

Pew Research Center is a nonprofit, tax-exempt 501(c)(3) organization and a subsidiary of The Pew Charitable Trusts, its primary funder.

