Q&A with Eitan Hersh: how campaigns use data to target voters

The first Republican presidential debate is already over. The media is churning out a daily cycle of campaign coverage. The candidates are busy stumping even though the nearest primary is in February.

As voters learn about the presidential hopefuls, the candidates’ campaigns are combing databases to learn about voters. President Barack Obama’s successful 2008 and 2012 campaigns achieved renown for their data-driven voter outreach.

“Hacking the Electorate: How Campaigns Perceive Voters,” a new book by Yale political scientist Eitan D. Hersh, examines how political campaigns use data to target voters. Hersh argues in the book that a core set of public records provides campaigns most of what they know about voters and that variations in open records laws across the country cause campaigns to perceive voters differently in different places.

Hersh, an assistant professor in the Department of Political Science and a resident faculty fellow at the Institution for Social and Policy Studies, discussed his findings with YaleNews.

Following the 2008 and 2012 presidential campaigns, the media highlighted the Obama campaign’s data-driven efforts to mobilize voters. Has the press overhyped political campaigns’ ability to reach voters through collecting personal data?

A campaign’s ability to gather data has not been overhyped. What’s been overhyped is what a campaign can do with that data, how valuable it is. Compared to 10 to 20 years ago, campaigns are collecting a lot more data. And that’s not just because technology makes it easier to do so. There is a public policy angle to the data collection. We have voter registration laws that went into place after the 2000 election mishaps in Florida. Now, because of those laws, every state has to make a digital voter file and that file becomes a public record that campaigns can collect.

Once campaigns have a list of registered voters, then they can start appending to the list all sorts of interesting data coming from private and public sources. Some of those sources are commercial sources, but mostly they are public — like census information and licensing data, such as who has a hunting license.

While campaigns certainly are collecting a lot of data, the storyline in the press about that data is often off base. Journalists tend to overhype how useful campaign data are or what the data mean. I recently saw an article in USA Today describing a correlation between pro-life attitudes and preferences for desserts. Journalists tend to latch onto juicy tidbits like that, but these odd correlations that pop up are usually meaningless. My book explains where modern campaign data comes from and how it’s useful and not useful.

What public records do campaigns rely on most?

There are three big sources of government records that campaigns rely on. First and foremost are voter registration records. This data is interesting because it is very political. It contains data on which party you’re registered with, which past elections you voted in, and, in many states, whether you’ve voted in Democratic or Republican primaries. The data also contain an array of politically relevant demographic characteristics like your age, gender, and, in some cases, race. All of this, combined with the fact that the voter registration record contains your name, address, and phone number, makes it very valuable to campaigns.

The next set of records is census data and other geographic information. The only real purpose of the U.S. Census, according to the Constitution, is to count the number of citizens in each state to apportion congressional districts. Very quickly the census was co-opted for other purposes. Some of those purposes are governmental. For example, the government may want to know how many poor people live in each state so it can allocate federal funds. For that purpose, it might want to collect information on citizens’ income. But a lot of the data that the census collects primarily serves the business community. In commercial marketing as well as campaign marketing, census data is core offline data that marketers are using because it is so comprehensive. For a fairly small geographic area, it tells you all sorts of information about the population.

The last bit of information that is becoming more prominent is licensing data. There have been some news stories lately about states requiring all sorts of occupations to have licensing where the government interest is not exactly clear — like licensing hairstylists. Campaigns can get this data through open records laws. They can learn who has a hunting license, a fishing license, a teaching license, a medical license — these things become part of the campaigns’ database.

You point out in your book that politicians have crafted open records laws to enable campaigns to collect people’s information. Can you explain more about that?

Almost everyone who has their hands in data laws has a political stake. The person who runs the election office — usually an elected secretary of state — is typically a partisan politician. Elected legislators are in charge of crafting data laws. In the book, I recount a number of instances where a state legislator was asking for some piece of data to be collected about voters for no other purpose than that it would be useful for campaign politics.

This is interesting because a) almost no one pays attention to state-level laws about data despite their importance, and b) an open records law that seemingly has only an administrative purpose turns out to be hugely important for how political campaigns get data, market to voters, and build electoral coalitions. One goal of the book is to show this connection.

Americans often view politics with a horserace mentality. Political scientists bring structure to this view. What we’re observing in campaigns is strategic actors working within a set of rules, regulations, and norms. My task here is to show how x affects y, how rules affect strategies and outcomes. In this case, it’s to show a connection between data law and electoral politics.

What kinds of data do campaigns tend to find most useful?

Political campaigns have a very difficult task. They have to engage a lot of voters — at the presidential level, it’s tens of millions of voters — and they don’t know much about these individuals. One perspective on campaign data is that by collecting lots of individual-level and neighborhood-level data, they can learn about the preferences of their voters. And what could be bad about that?

The other perspective is that actually all they really want to know is who’s with them and who’s against them. So they spend most of their time trying to identify who is a partisan. That’s their main goal, and so data that contribute to that goal are data like party affiliation, past behavior in party primaries, and demographic characteristics that are highly correlated with partisanship, such as race. They’re figuring out who is going to vote and who is likely to vote for them. They are less equipped to use data to try to discern the issue preferences of voters. Issue preferences and nuanced dispositions are much more complicated to predict than partisan loyalty.

Is information about people’s consumer preferences — the cars they drive, the food they prefer to eat, where they vacation — useful to political campaigns?

There are some relationships between consumer preferences and political attitudes that are out there. We all sense them. Usually, campaigns don’t need them. One example I give in the book concerns boat ownership, which is correlated with being a Republican, but so is being a rich white dude who lives in a fancy neighborhood — and that I already know from public records. Once I know your age, race, gender, income level, and the neighborhood you live in, the fact that you own a boat becomes irrelevant.

There are not too many consumer preferences that are strongly correlated with partisanship. Most people don’t divide based on these sorts of things. Those consumer preferences that are correlated with political views don’t tell us much because they are highly correlated with demographics that we already know are related to political preferences.

What happens in states that don’t provide information on race or party affiliation in public records? How do campaigns respond?

It has a big effect on efficiency. Think about white voters in Wisconsin. They are generally split Democratic and Republican, tend to live in mixed partisan neighborhoods, and there is really nothing that I can collect that outs them as a Democrat or Republican because the state’s voter registration system doesn’t provide information on party affiliation. If I’m a campaign, what am I going to do?

What I argue in the book is that a campaign will do a couple of things. First, the campaign will focus more on persuasion than mobilization. They’re going to knock on doors, send mailers, and call people on the phone, not to mobilize them (because they don’t know which voters are in their camp) but to persuade them.

The other thing that happens is a lot of accidental contact with the other side. In a state that collects information from people about their partisanship, campaigns basically cease contact with the other party. Here in Connecticut, we have party registration so Democratic campaigns usually see no reason to interact with Republican voters in their campaign tactics. They get the list from the voter registration system, delete all the Republicans, and interact with Democrats and independents. In a state like Wisconsin, where voters aren’t listed as Democratic or Republican, campaigns can’t do that so they end up interacting a lot with voters on the other side.

In the book, I don’t take a strong position on which situation is better or worse. There are tradeoffs. If we like a kind of politics in which campaigns interact more with the other side and are more focused on persuasion than just drumming up the base, then less data is better. If we prefer a kind of politics in which campaigns can build stronger coalitions by reinforcing their message to their base, then more data might be better. What I hope to show in the book is how public policy about data is a mechanism for voter engagement. In this sense, policy is also a tool for the public to use in order to nudge electoral politics in its preferred direction.