The intelligence community is moving from “connect-the-dots” to sharing some or all of the picture created by connecting those dots.
As part of this effort, a working group is developing data standards to ease the burden of understanding the picture the dots create.
“What we are finding is each agency or department sort of has their own view of the data they own and/or manage and exploit. What we are trying to do [is figure out] how do you harmonize that knowledge from one agency to another effectively, and how do you do it quickly, particularly in a time-sensitive situation?” said Dirk Rankin, the chief technology officer for the National Counterterrorism Center and the co-chairman of the Data Aggregation Working Group (DAWG), in an exclusive interview with Federal News Radio.
Rankin said the DAWG actually may have “stumbled” onto something where standards do not exist but are especially needed.
Given that niche, the three-year-old working group is putting together a series of tools and architectures to address this big data problem.
“The driving use case is how do you quickly discern whether or not agency A knows X information about person Y and sharing that [at a] basic, very high level. Just a quick scan, do we have anything on that individual across other elements of the U.S. government?” Rankin said. “I think if we had an ability to do that rapidly through machine-to-machine interactions and cloud technologies and so forth, given the security concerns and everything else that needs to be baked into that, we would have a much better chance as a government to prevent future negative events from happening.”
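The “quick scan” Rankin describes can be pictured as a fan-out existence check: each agency answers only yes or no, without exposing the underlying record. The sketch below is purely illustrative; the agency names, identifiers and in-memory stores stand in for what would, in practice, be secure machine-to-machine API calls.

```python
# Illustrative sketch of the cross-agency "do we have anything on this
# individual?" check. All names and data here are invented.

from typing import Dict, Set

# Stand-ins for each agency's indexed holdings. In a real deployment these
# would be authenticated service endpoints, not in-memory sets.
AGENCY_INDEXES: Dict[str, Set[str]] = {
    "agency_a": {"person-001", "person-042"},
    "agency_b": {"person-107"},
    "agency_c": set(),
}

def existence_check(person_id: str) -> Dict[str, bool]:
    """Return, per agency, a yes/no answer: do you hold anything on this ID?

    Only the high-level flag is shared -- not the record itself --
    mirroring the quick scan described in the article.
    """
    return {agency: person_id in index for agency, index in AGENCY_INDEXES.items()}

print(existence_check("person-042"))
# {'agency_a': True, 'agency_b': False, 'agency_c': False}
```

The design point is that the answer set is small and uniform, so the query can run machine-to-machine in seconds rather than requiring analysts to search each database by hand.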
Clear need for the DAWG
Recent terrorist attacks against the United States demonstrate just how far the work has to go.
Rankin said while there has been a lot of work on data standards and data interoperability over the last decade, there still are challenges in correlating data.
He said the Christmas Day bomber helped launch the DAWG, and the Boston Marathon attack reinforced the need for systems to communicate with each other.
Paul Reynolds, the other co-chairman of the Data Aggregation Working Group, said without data standards and system interoperability, it’s much more difficult for the disparate databases to talk.
“We wanted to know very quickly what was going on and who was involved. From our perspective, there were a high number of people spending a lot of time working across different databases within different parts of the organization that I work for, who were not necessarily working together to try and get this single picture,” said Reynolds, who has received approval to speak only on behalf of the working group, as long as he didn’t identify his agency. “It took a lot longer than it should have, and it really doesn’t need to. That’s the thing. If we do this right, it doesn’t need to take that much effort. What we want to do is we want to get the people who are good at thinking about these things and really solving the problems and really asking the tough questions … and bring them to the point where they have a base knowledge of the information. Then we are in a good spot.”
He added the product of his agency’s work then could be sent over to another intelligence or law enforcement agency in a matter of seconds to help complete the picture. Or the information could be “staged” at the end of the agency’s domain space, where other officials could query on an as-needed basis.
Two pilots and a tool kit
The Information Sharing Environment brought together the Data Aggregation Working Group after these events showed there was an opportunity to improve data standards.
In its annual report to Congress from September and its more recent Strategic Implementation Plan for the President’s National Strategy for Information Sharing and Safeguarding, the ISE listed data tagging and developing a reference architecture as major priority areas.
Over the last three years, the DAWG has made progress in exploring and bringing together data aggregation best practices, especially from the health and finance sectors.
The working group currently is reviewing industry responses to a request for information issued in March.
The RFI responses will help the DAWG identify what is working, or has the potential to work, for inclusion in a data aggregation reference architecture.
“We primarily are focusing on data aggregation systems, those systems that do the consolidation of information and correlation. By focusing on data aggregation systems, what we want to do is we want to build out an environment where these systems are available and ready to talk to each other in this format,” Reynolds said. “The first step in doing that is understanding where we are today, and part of the reference architecture is helping people figure out where they are from a maturity perspective, as far as those systems in particular go. Then we also, in this reference architecture, are building out a vision of where we want to go. We know it will take a while to get there. We also want to help people for this to be a tool. We want them to have a clear understanding at the end of this of where their system is, and they’ll have an idea of where they could take their system. And at least they’ll do it in the right direction of where the whole of government is going.”
Rankin added the reference architecture also is a call to action to the data community to operationalize and pull extra value out of these systems.
The working group still will have to fill in some data standard gaps, because there are things it just doesn’t know about yet, or areas that will arise as systems mature.
The working group is expected to complete version 1.0 of the data reference architecture later this year. Then the intelligence and law enforcement communities would take three years to implement and build on the initial architecture.
In preparing to develop the architecture, the working group already ran two pilot programs using existing federal information exchanges across agency boundaries.
A call to action
Reynolds said the first pilot used the National Information Exchange Model (NIEM) and talked through the process of how to best exchange data, focusing on creating consistency, quality and timeliness.
He said the pilot focused on one organization that was sending data to another, but not in a consistent way, which made machine-to-machine communication difficult. So the organizations worked out how best to structure the data, based on the NIEM model, to make it easier for the machines to understand.
“By doing so, we also had some data quality improvements. It just created rigor in our process,” he said. “We also found a way, because of the consistency of sending it in an XML schema format, the machines pulling the information into the other systems could do it much more rapidly, so we found huge time savings on the consuming end. We were able to test this out a couple of times. We never really implemented it fully. When it came time to actually do production work, it got a little stuck. But it was a good lesson learned, which was the intent of the pilot.”
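The time savings Reynolds describes come from the consumer being able to parse mechanically once the sender commits to one schema. The fragment below sketches that consuming side; the element names are simplified stand-ins, not the actual NIEM schema, which defines namespaced elements through a formal exchange specification.

```python
# Minimal sketch of the consuming end of a consistent XML exchange.
# Element names are invented stand-ins, not real NIEM elements.

import xml.etree.ElementTree as ET

SAMPLE_MESSAGE = """
<ExchangeMessage>
  <Person>
    <PersonFullName>Jane Q. Example</PersonFullName>
    <PersonBirthDate>1980-01-01</PersonBirthDate>
  </Person>
</ExchangeMessage>
"""

def parse_person(xml_text: str) -> dict:
    """Pull structured fields out of a consistently formatted message.

    Because every message follows the same structure, this parsing is
    mechanical -- no per-record human interpretation is needed.
    """
    root = ET.fromstring(xml_text)
    person = root.find("Person")
    return {
        "name": person.findtext("PersonFullName"),
        "birth_date": person.findtext("PersonBirthDate"),
    }

print(parse_person(SAMPLE_MESSAGE))
```

When the sender varies its format, each consuming system needs custom handling per feed; a shared schema collapses that to one parser, which is where the “huge time savings on the consuming end” show up.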
He said from the first pilot, the group created a tool kit that’s available on the ISE website so others could duplicate its efforts.
“The second time around, we used our own tool kit and it worked. But we also took a hard look at the mission value: how is the mission going to specifically benefit from this effort? So we added that feedback back into the tool kit and actually improved the tool kit,” Reynolds said. “We focused on a round-trip loop if information is being sent across to another agency. If we can fix just a little bit of internal processing on that end, we can actually add to that information and send the information back to the originating agency and provide them value from a business perspective with additional information to make a better counterterrorism decision. We were able to actually impact both the information going out and the information coming back. That is just about ready to be put into production. That is a real change that we’ve made with these pilots.”
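The round-trip loop from the second pilot can be sketched in a few lines: the receiving agency merges what it knows into the incoming record and returns the enriched copy to the originator. The field names and data below are invented for illustration.

```python
# Illustrative sketch of the round-trip enrichment loop: agency B receives
# a record, appends matching data it holds, and sends the result back.

def enrich_and_return(incoming: dict, local_holdings: dict) -> dict:
    """Merge the receiving agency's matching data into the incoming record.

    The originator gets back its own record plus any fields the receiver
    could add -- value flows in both directions of the exchange.
    """
    enriched = dict(incoming)  # copy, so the original message is untouched
    extra = local_holdings.get(incoming["person_id"], {})
    enriched.update(extra)
    return enriched

holdings = {"person-042": {"known_alias": "J. Doe"}}
record = {"person_id": "person-042", "name": "Jane Q. Example"}
print(enrich_and_return(record, holdings))
# {'person_id': 'person-042', 'name': 'Jane Q. Example', 'known_alias': 'J. Doe'}
```

If the receiver holds nothing on the subject, the originator simply gets its record back unchanged, so the loop adds information but never loses any.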
Reynolds said one of the end goals with the reference architecture is to benefit many data exchanges, not just single instances of information sharing.
Rankin said the different communities the DAWG serves need to keep in mind this effort is in the early stages.
“We do expect to see some alignment of programs. And once the reference architecture can actually align things and people start catching the vision, then we can expect to see some greater synergies,” he said. “Fundamentally, this is a coalition of the willing. There is no authority here. We are just putting out guide posts here to help the government writ large across this whole of government space do a better job of what its core mission is.”