The first official tally of American households was recorded in the 1790 Census. Every ten years since, the Census Bureau has collected data about the number of people living in the United States. Once again, the Census Bureau is asking all people who live in the United States to stand and be counted.
The 2020 Census questionnaire, distributed in mid-March, asks who resides in a particular household, the type of dwelling they occupy, and the relationships that link them as members of a household unit. Obtaining accurate data is essential to ensuring the most equitable distribution of government funds as well as political representation at every level of government. Demographers and statisticians, who worry that low compliance will make the official count inaccurate, are especially concerned about this census for two reasons: the increasing mistrust of government among some people living in the United States and the coronavirus pandemic. These factors may reduce people's willingness or ability to complete the questionnaire, forcing reliance on other federal government data to supplement the Census and provide a more realistic picture of the population as a whole.
The Census Bureau is trying to allay fears of government data misuse and reassure people that personal data will be kept confidential. Protecting privacy in the age of big data is a monumental challenge. When so much of our personal identification information is already in the hands of Silicon Valley leviathans less concerned with privacy than profits, and data breaches are increasingly common, it’s not difficult to see why people might question the government’s promise to maintain confidentiality. Moreover, it is increasingly easy to identify an individual from just a few demographic facts, since so many other data points already exist to flesh out exactly who someone may be. This “mosaic effect” allows savvy data miners to combine existing open data sets with, for instance, the 2020 Census data to identify specific individuals. To combat this, the government builds various disclosure avoidance methods into its calculations. For the 2020 Census, the method of choice is called “differential privacy,” a strategy that has both adherents and skeptics but that, for now, many experts claim is the best option for balancing risk of disclosure against accuracy of data. To learn more about this approach to data management and the Census Bureau’s “privacy loss budget,” see the following:
Differential Privacy for Census Data Explained (National Conference of State Legislatures)
Will the Census Improve Open Data Privacy Protections? (Government Tech)
Can a Set of Equations Keep U.S. Census Data Private? (Science)
Census 2020 Will Protect Your Privacy More than Ever – But at the Risk of Accuracy (The Conversation)
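For readers who want a feel for the trade-off between disclosure risk and accuracy, here is a minimal sketch of the general idea behind differential privacy. It uses the textbook Laplace mechanism on a simple head count and is not the Census Bureau's actual method, which is considerably more elaborate. The function names (noisy_count, split_budget) and the parameter epsilon, which stands in for the "privacy loss budget," are assumptions chosen for illustration.

```python
import random


def noisy_count(true_count, epsilon, rng):
    """Return a count with Laplace noise added (illustrative only).

    A head count changes by at most 1 when one person is added or
    removed, so the noise scale is 1 / epsilon. A smaller epsilon
    means more noise: stronger privacy, lower accuracy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rate = epsilon  # rate of each exponential = 1 / scale = epsilon
    # The difference of two independent exponential draws with the
    # same rate follows a Laplace distribution centered on zero.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


def split_budget(total_epsilon, n_queries):
    """Divide a total privacy loss budget evenly across several queries.

    Hypothetical helper: each published statistic spends part of the
    budget, so more published tables means more noise in each one.
    """
    if n_queries <= 0:
        raise ValueError("n_queries must be positive")
    return [total_epsilon / n_queries] * n_queries
```

Calling noisy_count(100, 0.1, random.Random()) will typically return a value several people away from 100, while an epsilon of 1.0 will usually land within a person or two. That tension is what the debate among adherents and skeptics is about.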