Saturday, November 17, 2012

Cost-cutting and "Third-Party Records"

On October 25-26, 2012, I attended the first in-person meeting of the NAC in Washington, DC.  At the meeting, I met members of the Census Bureau staff and most of the NAC members.  During the meeting, we received presentations on various topics relevant to the NAC's work.  In this post, I'll say a bit about one of the topics I think is most relevant to issues of racial justice and to people who mark Two Or More Races (TOMR):  the use of Third-Party Records as a cost-efficiency strategy.

The Census Bureau is responsible for producing the Constitutionally-mandated decennial (once every 10 years) census, but it is also responsible for a variety of other services, including the annual American Community Survey (ACS) and other surveys.  The decennial Census and other surveys provide data that are important for a number of reasons.  Among their many uses, the data are used for apportioning Congressional seats for political representation, for allocating tax dollars for programs, and for providing a baseline against which claims of civil rights violations can be evaluated.  To provide one simplified example, if we know that a particular racial group is X% of the total U.S. population, but that they're more than three times X% of the people stopped-and-frisked by police, then we can use that information as part of an argument that that racial group is being discriminated against.  Without the baseline data, we don't have a point of comparison.  The decennial Census is one such survey that provides some of those baseline data.
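For readers who like to see the arithmetic, the baseline comparison above can be sketched in a few lines of Python.  This is only an illustration: the function name and the two percentages are hypothetical numbers I made up for the example, not Census or police figures.

```python
def disparity_ratio(share_of_stops: float, share_of_population: float) -> float:
    """Return a group's share of stops divided by its share of the population.

    A ratio of 1.0 means the group is stopped in proportion to its size;
    a ratio above 1.0 means it is over-represented among people stopped.
    """
    if share_of_population <= 0:
        raise ValueError("share_of_population must be positive")
    return share_of_stops / share_of_population


# Hypothetical figures, for illustration only (not real data).
population_share = 0.10  # the group is 10% of the total population (the baseline)
stop_share = 0.35        # the group is 35% of the people stopped

ratio = disparity_ratio(stop_share, population_share)
print(f"Over-representation among stops: {ratio:.1f}x")
```

The point of the sketch is that the baseline sits in the denominator: if the population share is wrong, the whole comparison is wrong with it.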

Congressional budget cuts are putting pressure on the Census Bureau to be more efficient.  For example, Congress has directed the Census Bureau to deliver Census 2020 at approximately the same cost as Census 2000 -- that's a major budget cut, particularly given that each decade, the Census costs more to conduct, not less.  At the same time, people are less responsive to survey efforts and are more wary of participating; this is, I'm told, a global trend.  So, the Census Bureau is looking at various ways to increase accuracy while cutting costs.

One notable strategy being considered is the use of what is being called, variously, "administrative records" or "third-party records" (TPR).  Imagine this: You're the Census Bureau and you're conducting a survey.  You want to survey as many of the people relevant to the survey as you can; you want to be accurate; and you want to be as efficient as possible.  Perhaps you start by sending people a paper survey; that's relatively inexpensive and many people will respond right away.  However, to reach the people who don't respond right away via the paper survey, you might want to follow up with them via a phone call or, if that doesn't work, to talk to them in person -- and that's expensive.  That's where third-party records might start to look appealing.

Say you haven't been able to get Jane X to respond to your survey.  You've sent her a survey.  You've sent her a follow-up note.  You've called her three times.  This is getting expensive.  Remember, there are millions of non-respondents like Jane X; it adds up.  You could send someone to Jane X's house -- maybe she's there, maybe she isn't, and each visit is expensive.  Or, you could decide that, at some point in the cost-curve, you're just going to ask someone else about Jane X -- someone who HAS been able to successfully gather information from Jane X.  That "someone" is a "third party."  Third parties might include other governmental agencies (e.g., the Department of Education) and private entities (e.g., businesses who gather, track, and analyze data -- aka "Big Data" -- but also other businesses that you might know, like Amazon.com).  So, instead of visiting Jane X for a first, second, or third time, maybe you buy access to the databases of the Department of Education or some "Big Data" company.  Jane X has probably filled out some Department of Education form at some point -- and, if you can't reach Jane X, well, maybe you could just take some of that data and use it to fill in what you don't already know about Jane X.  Sound like a technically sensible strategy?

But, there are some problems with using Third-Party Records.  One set of problems has to do with privacy and data security.  That set of problems is outside my area of expertise and there are a few people on the NAC who're experts and advocates on such issues.  Another set of problems more directly relates to my own areas of interest: racial justice.  The Census Bureau runs small-scale experiments to test out possible strategies -- and the results from tests of using Third-Party Records seem to indicate that there are racial disparities in the accuracy of TPR data.

TPR data is worse at filling in information about People of Color than it is about White people.  And TPR data is particularly bad at filling in information about People of Color who identify as Two Or More Races (TOMR).  And when I say particularly bad, I mean that TPR data on Whites might be 90+% accurate -- but for Monoracial People of Color, it might be somewhere in the 70% to 90% range -- and for TOMR People of Color, it's somewhere between 4% and 36% accurate.  And that's a really marked racial disparity in accuracy.  This disparity for TOMR respondents might be created, in part, because many third parties don't allow people to "Mark One or More" races -- this, despite the Office of Management and Budget (OMB) Directive 15, which instructs Federal agencies and those entities that receive Federal funding to use formats that allow people to "Mark One or More" races.  With data that is inaccurate in racially skewed ways, it becomes more difficult to use data to make claims and cases about racial discrimination -- we're back to the baseline data idea.

Currently, it's my sense that the Census Bureau is seriously considering the use of Third-Party Records as a cost-efficiency strategy, given the deep financial cuts.  So, while people might consider advocating against use of TPR data, I'm not sure how strategic or winnable that might be.  I think we should, at least, be considering and discussing ways to reduce the racial disparities created by the use of TPR data.

I have many questions about the use of Third-Party Records.  What creates these racial disparities in accuracy?  What would the consequences of these racial disparities be?  How might we reduce those racial disparities and improve the accuracy of TPR data use?  I, along with some of my fellow NAC members, have proposed convening a Working Group to explore questions about Third-Party Record use.

QUESTIONS FOR YOU:  What're your thoughts about Third-Party Record use?  What're your questions and concerns?  And do you know people who might be important to include in discussions about such issues?  Maybe people who'd be available to participate in an NAC Working Group?

Please discuss in the comments section, below -- and/or email me at CensusNAC@gmail.com.