The United States Agency for International Development called on volunteers to clean up some 20,000 records on economic programming.
Over the weekend, USAID held its first crowdsourcing event, asking for volunteers from the public to help code and research information on 20,000 records.
The online participants included students, geography experts and transparency advocates, according to the USAID website. Each record carried geographic information, but that information was not collected in a standard format. It was “a messy jumble,” said Shadrock Roberts, senior geographic information systems analyst at USAID, in an interview with The Federal Drive with Tom Temin and Emily Kopp. Within the first 16 hours, the volunteers were able to sort through the records that were not machine-readable and determine where each record should be located, at the city or state level, Roberts said.
“A big group of people can do a little piece of the puzzle and create the bigger whole,” Roberts said.
Crowdsourcing attracts experts, as well as “non-specialists.” For this project, USAID drew on volunteers from two organizations — Standby Task Force and GISCorps.
USAID is now conducting an accuracy assessment.
“One of the questions about this volunteered info is, How credible is it? How accurate is it?” Roberts said. “We’re assuming there’s going to be some error in the data because we’ve asked non-specialists to do it.”
However, Roberts added, even automated data can contain errors. The goal of the accuracy assessment is to determine the limits of the datasets when they are released on Data.gov, he said.
Within USAID, the data will be used to map where aid money is flowing, showing whether aid is concentrated in certain areas and where there may be gaps.
Beyond the in-house analysis, the “wonderful thing” about open data is that members of the public will produce their own analyses of the datasets, Roberts said.
“If you’re an economic masters student in Kenya and you can find something useful in the data, you may find something interesting and publish it, share your research. And that’s what we really hope happens with this dataset,” Roberts said.
Tom Temin is the host of The Federal Drive, which airs from 6 to 9 a.m. on 1500 AM in the Washington, DC region and online everywhere. Tom has 30 years of experience in journalism, mostly in technology markets. Before coming to Federal News Radio, he was a long-serving editor-in-chief of Government Computer News and Washington Technology magazines.