What if there were a Google search bar for Census data?
The Census Bureau’s Chief Data Officer Zach Whitman and his team are making that a reality for data geeks and curious citizens alike.
Speaking at the June 12-14 Amazon Web Services Public Sector Summit in Washington, D.C., Whitman said data.census.gov is replacing the bureau’s American FactFinder “in both form and function” for a majority of users looking to do basic things like extract data and visualize maps.
“We are trying to expand our market share because the biggest thing we hear from people who are not American FactFinder users but need to use American FactFinder, is that it’s hard to use, they don’t understand it,” Whitman told Federal News Radio. “That slows down their ability to get to the right answer; it’s a lot of unclear table names that they don’t understand and isn’t quite related to the question. If you think of like a Google-type approach where it’s a single search bar, type in what you want — and it can be American FactFinder-type language where you type in a specific table ID, or if you can just [type] ‘I want to know the average commute time in Maryland’ — we’re trying to bridge that gap and allow for the novice user experience since Google has trained everybody to use that type of experience. We’re trying to emulate that as close as possible with our data.”
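For readers curious how a single search bar might handle both kinds of input Whitman describes, here is a minimal Python sketch. It is not the bureau's code, which has not been released. The function name `classify_query` and the table ID pattern are assumptions for illustration only, and the sample IDs are there just to show the general shape of an ID.

```python
import re

# Assumed shape of a table ID: one or two letters, two to five digits,
# and an optional trailing letter (e.g. "B08303", "S0801", "DP03").
# This pattern is a guess for illustration, not an official specification.
TABLE_ID_PATTERN = re.compile(r"^[A-Z]{1,2}\d{2,5}[A-Z]?$")


def classify_query(text: str) -> str:
    """Return "table_id" if the input looks like a table ID, else "free_text"."""
    candidate = text.strip().upper()
    if TABLE_ID_PATTERN.match(candidate):
        return "table_id"
    return "free_text"


if __name__ == "__main__":
    for query in ["B08303", "I want to know the average commute time in Maryland"]:
        print(query, "->", classify_query(query))
```

In a sketch like this, anything that does not look like a table ID falls through to free-text search, so a novice never needs to know that IDs exist.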
The switch isn’t happening quite yet, but Whitman’s team is working on two-week sprint cycles, with release dates at a roughly 40-day cadence.
“We’re making sure that our testing is very thorough before we end up releasing,” Whitman said. “We haven’t open sourced our code yet, but it’s coming.”
At this point in the cycle, Whitman said, the team is looking for feedback from heavy and “hardcore” American FactFinder users — the ones who use the site enough to notice details and give input on data responses.
“We have a very strict mandate to emulate the same standards American FactFinder has, so we need to match that and so we’re starting with them,” Whitman said. “It’s also due to the fact that we know the basic users don’t like the interface on American FactFinder quite so much.”
That doesn’t mean Census won’t seek input from basic users, Whitman said, but right now the focus is on advanced users.
Bridging the gap
Two problems Census hopes to address with the site are “discoverability” and the consolidation of mapping experiences.
With the site, Whitman said, Census wants to build a platform that allows users “to grab the data that you’re looking for in a way that’s understandable, consumable and accessible.”
Many end users have to learn a lot about the Census Bureau before they can even begin using the data, Whitman said. More than 1 trillion Census estimates are available, and that’s not counting the metadata that explains Census data.
“What we’re finding is if we take an end-user perspective, take an end-user-centric design perspective, we end up finding a lot of frustration,” Whitman said. “We see people going and bouncing between one data dissemination product to another, not sure what they’re grabbing. They grab an estimate that might look right but might actually not be appropriate for their exact question and that leads to misinformation for the end user; they don’t actually get the right answer that they’re looking for, mostly because it’s just too hard.”
As for mapping, Whitman said one of the biggest problems is that users want to have data, but they don’t know what geography it might relate to.
“People who know geospatial tend not to know data-data and vice versa, and that existed in the data science world as well,” Whitman said. “We’re trying to bridge that gap and make it easier, so for the geographer … they know geospatial data but they don’t know where to get the data from, and the tabular data. We’re going to help them and then in the other direction the data geek, we can get them shapes so they can quickly just throw a map up.”
Whitman said the next sprint finish line is near the July 4th holiday; in September the American Community Survey 1-year estimates are due out, followed by the 5-year estimates in December; in January, information about the 2017 Economic Census rolls out.
Once those deadlines have passed, the team can get to work improving the user experience even more.
“We’re busy trying to build as many summary levels into the service that we’re using, so that as easily as you can get to a table, you can get to a map and get to a thematic map,” Whitman said. “What’s great about that is that they’re coupled, so when a user has an interface they can build up a very custom view of a table or data element that they’re looking for, and then on top of that a custom geography set.”
Excited for the 2020 Census
Whitman said Census is also leaning on this type of service because it allows outside help to extend the bureau’s limited capabilities and resources.
“With funding and everything being so tight, we can only do so much,” Whitman said. “So what we need to do is focus that energy into a place that we know others can take it and run. And if we do our job right we can enable those types of behaviors rather than making it kind of a pain and forcing other people to do the hard yards and struggle through it. We want to make the experience of collecting data as pleasant as possible.”
Whitman said another hope with the site is to increase awareness of what the Census actually does, and why it’s important for people to participate in it.
“The more we can offer a value proposition to the public, I think the better we’ll see in terms of when you get something in the mail, or when you’re asked to fill something out online, maybe you’re not so encumbered to fill it out, maybe you’re actually excited to do it,” Whitman said.
All eyes are on the bureau as it gears up for the 2020 Census.
At nearly $13 billion, the 2010 Census was the most expensive count in U.S. history, and it cost the bureau about $100 per household. New systems like the Census Enterprise Data Collection and Processing (CEDCaP) effort are supposed to reduce costs by digitizing much of the process, but congressional overseers and federal auditors warn that timing and cost overruns could negatively impact the decennial count.