The Obama administration’s goal with its open data initiative is simple: take the data that’s already readily available, make it public in a machine-readable format, then wait for the magic to happen.
Steve VanRoekel, the federal chief information officer, said the economic potential of the data will overcome concerns about trust in government, the so-called mosaic effect and even the cost of managing and releasing the stores of information that agencies currently hold.
“The opportunity we have inside government is all the efficiencies gains that come from that. When you have the ability to connect data that makes it programmable, amazing things happen from your insights to make decisions to bring clarity to certain things and others,” he said during a recent press conference with reporters at the FOSE trade show. “Then outside government, you have the opportunity to create economic opportunity to drive the ability for researchers to look at education evidence and garner insights from that to change the way we educate kids; in the way it creates economic opportunity in jobs for entrepreneurs and innovators to create solutions for citizens, Americans or others out there.”
VanRoekel said open doesn’t always imply public, but the goal for agencies is to treat data as an asset.
Earlier this month, President Barack Obama signed an executive order and the Office of Management and Budget released a policy that together set a new norm for how agencies treat data and make it accessible.
VanRoekel said if agencies could just release the treasure trove of data that’s locked up in paper today or in proprietary formats, the impact would be huge.
“We can really change the way people perceive government and government services,” he said.
One concern raised by some open government advocates centers on the section of the policy that discusses the mosaic effect.
The mosaic effect is the idea that data sets that reveal no private information when examined one by one could, when combined, expose personal or classified data.
Open government advocates say agencies could use it as an excuse not to release data.
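To make the mosaic effect concrete, here is a minimal Python sketch of how two releases could be combined. Everything in it is an assumption made for illustration: the records are invented, and the field names (`zip`, `birth_year`, `sex`, `diagnosis`) and the helper `link_records` are hypothetical, not drawn from any actual federal data set or from the policy itself.

```python
from collections import defaultdict

# Invented records for illustration only. The payment data carries no names,
# and the registry carries no medical details.
payments = [
    {"zip": "12345", "birth_year": 1961, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "12345", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "54321", "birth_year": 1980, "sex": "F", "diagnosis": "flu"},
]
registry = [
    {"name": "A. Example", "zip": "12345", "birth_year": 1961, "sex": "M"},
    {"name": "B. Sample", "zip": "54321", "birth_year": 1980, "sex": "F"},
    {"name": "C. Sample", "zip": "54321", "birth_year": 1980, "sex": "F"},
]

# Hypothetical fields that appear in both releases.
KEYS = ("zip", "birth_year", "sex")


def link_records(anonymous, named, keys=KEYS):
    """Return (name, diagnosis) pairs where the shared fields match exactly one person."""
    names_by_key = defaultdict(list)
    for row in named:
        names_by_key[tuple(row[k] for k in keys)].append(row["name"])

    linked = []
    for row in anonymous:
        candidates = names_by_key.get(tuple(row[k] for k in keys), [])
        # Only a unique match pins a record to one individual.
        if len(candidates) == 1:
            linked.append((candidates[0], row["diagnosis"]))
    return linked


matches = link_records(payments, registry)
print(matches)
```

In this toy case, only the first payment record matches exactly one registry entry, so it is the only one tied to a name. The third record matches two people and stays ambiguous, which is why coarser fields or larger groups can reduce the risk.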
“The open data policy and executive order changes nothing about the Freedom of Information Act really in any way,” VanRoekel said. “It’s really a discussion of the format of the data and the opting for where things are public to make them public and doing that assessment. We will still follow FOIA to the letter of the law.”
VanRoekel said the mosaic effect section grew out of the work officials did in releasing medical payment data. He said that approach has been in place for the last year or so, and they haven’t seen any impact on FOIA requests or releases of data.
“This is an important part of setting the new default as well,” he said. “If we were to say, we will do machine readability of data that would be retroactive, I think the way we collected data because paper is a less computable environment, we didn’t capture information in the past in a record form or individual form that is respective of privacy and security because you can take that piece of paper and lock it in the room. I’m excited about the new default, because we will have a fresh look at what kind of data we are releasing and how we are releasing it.”
VanRoekel also addressed concerns about the cost of managing and releasing such a large amount of information.
He said the benefits of releasing the data will outweigh the burden of changing how agencies capture information. VanRoekel said agencies have plenty of opportunities to change their processes in ways that will save money.
“There are so many examples of the form that gets filled out, scanned in and reprinted for some process, there are so many of these that exist out there,” he said. “The principles of open data, quality in from a collection side, quality use inside and open data dissemination of the data will streamline these things in a way that will have ripple effects into the savings. You are seeing this in the private sector. I understand things like data storage. … We have to be very smart about solutions where those kind of scenarios will exist. I don’t think government as a massive API repository for access to live government data streams isn’t necessary in every case. We have to be smart about things. We have to look at options for things like timely bulk downloads or things like that to manage around some of the resource constraints in the future.”
VanRoekel said the updated version of data.gov will have some of those tools, including a key manager that lets developers give the government analytics on the data they are downloading so the system isn’t overtaxed.