The government collects an impressive amount of data that’s used in both the public and private sectors. But agencies could better use that data if they collected and stored it the same way, or if automation tools made for faster analysis.
That’s the thinking behind the data accountability and transparency piece of the President’s Management Agenda, one of the plan’s 14 cross-agency priority (CAP) goals. Federal Chief Information Officer Suzette Kent said the data CAP goal will help the government get its arms around its huge trove of data.
“We can’t use the data if we can’t understand it,” Kent said at a Data Coalition event in Washington, D.C. on Wednesday. “The data CAP goal is the focused effort on how we transform the governmentwide approach to how we look at and use data.”
The CAP goal aims to improve how agencies carry out their missions and to make government data more accessible to the private sector.
“The data that the U.S. government has is the most valuable data in the world, and there are many companies that already use U.S. government data. The geospatial data actually supports many current companies and the success in how they go to market,” Kent said. “But data from the IRS, from Census, from Transportation, does more to develop a national resource. It powers our innovation.”
Too much data too fast
For a sense of how much data the government produces, consider that the Commerce Department puts out as much as 20 terabytes of data every day. Kent estimated that by 2025 the government will produce at least 163 zettabytes of data, more than 250 billion DVDs could hold.
Kent said agencies need better automation tools, because they’re producing way too much data for humans to process.
“If one of us tried to process a terabyte of data, we would have to watch the equivalent of 400 90-minute videos,” she said. “I know some of you binge-watchers might say ‘Hey, I’m up for that,’ but that’s a lot. But using technology, and with the right discipline around data, we can process that in seconds. But it has to be structured, and we have to understand it.”
The President’s Management Agenda puts the Commerce Department and Small Business Administration in charge of the data CAP goal. They’ll help other agencies reach governmentwide benchmarks.
The CAP goal team is led by Commerce Undersecretary Karen Dunn Kelley; SBA Chief Officer Pradeep Belur; Jack Wilmer, the senior policy adviser with the White House’s Office of Science and Technology Policy; and Chief Statistician Nancy Potok.
But the Commerce Department and SBA have other priorities beyond just making data more accessible. Kent said the CAP goal also sets benchmarks to improve cybersecurity standards.
“From a cyber posture, we’re changing the concept of high-value assets to not necessarily be about applications, but to be about the data. That’s what’s important,” she said. “We’re looking [at] our goals and we’re aligning how data moves, and what we do with it when it’s at rest.”
Striking a balance of privacy, transparency
Kent added that the CAP goal will have agencies moving from a perimeter-based approach to cybersecurity, and instead adopting a security framework focused on protecting the information.
Agencies have other priorities laid out for them under the data CAP goal, such as striking a balance between privacy and data transparency.
The CAP goal also includes priorities for the federal workforce. Kent said agencies will need to hire not just cybersecurity professionals, but also data analysts. She said in some cases, private sector companies are hiring more data people than cybersecurity people.
“We have to evolve roles for data scientists, data labelers, model builders, deep learning development, inquiry designers,” Kent said. “As we move down this path that’s wholly data-driven, we need different types of capabilities in our workforce.”
Former Federal CIO Tony Scott said the government needs to invest in more machine learning to get ahead of its data.
“We’ve got to get a lot more automation, because it’s very clear that even with an accelerated plan and a highly successful plan, we’re just not going to have the people in place to do all the work that’s required to keep all the systems that we have afloat and running effectively,” Scott said yesterday at a cloud computing event sponsored by MeriTalk in Washington, D.C. “So we’ve got to automate like crazy.”
Scott added that agencies will also need to hire significantly more employees in their IT departments to help keep up with demand.
“We’ve got to go work on these human resource things for the longer term. We’re going to need to increase our hiring 20-50 percent in some agencies just to stay even over the next five years,” he said.
The Trump administration is kicking off this governmentwide data challenge, but Kent said the White House wants this effort to continue into the next administration and beyond.
“Through our CAP goal, we’re endeavoring to define a long-term strategy — something that will last many decades,” Kent said.