Data Hackathon evolves to accelerate change

  • Published
  • By Tech. Sgt. Robert Cloys
  • Air Force Test Center

“Problem solvers” (also referred to as “hackers”) and “problem owners” met virtually and in person at Edwards Air Force Base, Calif., March 14-18 for the Air Force Test Center’s second Data Hackathon event. Participants from a multitude of career fields across AFTC, the 412th Test Wing at Edwards AFB, the 96th Test Wing at Eglin AFB, Fla., and Arnold Engineering Development Complex, Tenn., combined efforts to tackle complex, unit-sponsored problems with open-source and Air Force-provided tools.

The Hackathon, organized by Capt. Troy Soileau, 96th Cyberspace Test Group chief data officer, is only in its second iteration and is already showing substantial growth and evolution. The concept, which began as a way to see what data software a team could use to accelerate capabilities, has gone from solving data problems sponsored solely by the United States Air Force Test Pilot School to an efficient think tank of problem solvers tackling issues sponsored by seven units at once.

“This event is about raising the overall data competency baseline in the Center. We work with problem owners to convert an aspect of their mission into a scoped problem, we provide hackers access, awareness, and training to current cloud analytics tools, and we provide a space for the mission problems and the talent to solve them to meet,” said Soileau. “It doesn’t matter what level of data capability you or your unit is at; there is value in participating in this event.”

Though the participants worked from different locations, they collaborated through the VAULT cloud data science platform from the Air Force Chief Data Office.

The Data Cluster Identification Conundrum

Noah Jackowitz from AFTC’s Multi-Domain Test Force brought one of three complex problems the Hackathon looked to solve.

In certain airspace pictures, multiple aircraft can squawk on the same frequency, so radar systems provide only some of the needed information rather than a clear picture.

“We are given data that has really good content but is really badly structured and a lot of that meaningfulness of the data is really hard to interpret. So, this requires a lot of up-front processing before you’re able to do any analysis of it,” said Jackowitz. “We are trying to handle that up-front processing and then deliver processed data to folks so they can start analyzing stuff that was just too difficult to wrangle before.”

The solution proposed by the team used the open-source Python library scikit-learn to create a machine learning tool for predictive data analysis that further categorizes and groups data points through their kinematics.

“We are identifying individual aircraft out of a cluster of data points,” said Aaron Krause, AFTC Emerging Technologies CTF autonomy test engineer. “The concept is that with a human we can see a line of dots, with a computer it’s just data. So, what we’re functionally doing is putting a circle around each dot, and if there’s another dot within that circle it’s part of the track. Then what the computer can do that a human can’t do is see that that circle is not just space, but it’s also time.” 

The challenge in this problem is finding algorithms that are neither too aggressive, which creates false clusters, nor too relaxed, which can lead the machine into thinking two aircraft are one.  

“Really what we’re using is a one-liner that we pulled from the internet,” said Krause. “These things are being developed pretty aggressively and pretty fast. And, as long as we can set up architecture, [we can] then drop in new clustering algorithms and get better and better looks at it.”
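The “circle around each dot” idea Krause describes can be sketched in a few lines. The article does not name the clustering algorithm the team used, so the version below is only a minimal pure-Python illustration: it links radar returns that fall within an assumed distance and time window of each other and chains linked returns into tracks. The function name `cluster_tracks`, the `(t, x, y)` point layout, the thresholds, and the sample points are all assumptions made for illustration, not the team's code.

```python
from math import hypot

def cluster_tracks(points, max_dist, max_dt):
    """Group (t, x, y) returns into tracks (hypothetical helper).

    Two returns are linked when they are within max_dist in space AND
    within max_dt in time; linked returns are chained into one track.
    Returns one integer track label per input point.
    """
    n = len(points)
    labels = [-1] * n              # -1 means "not yet assigned"
    next_label = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:               # grow the track outward from the seed
            i = stack.pop()
            ti, xi, yi = points[i]
            for j in range(n):
                if labels[j] != -1:
                    continue
                tj, xj, yj = points[j]
                near_in_time = abs(ti - tj) <= max_dt
                near_in_space = hypot(xi - xj, yi - yj) <= max_dist
                if near_in_time and near_in_space:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

# Made-up returns from two aircraft flying parallel paths 10 units apart.
returns = [(0, 0.0, 0.0), (1, 1.0, 0.0), (2, 2.0, 0.0),
           (0, 0.0, 10.0), (1, 1.0, 10.0), (2, 2.0, 10.0)]
print(cluster_tracks(returns, max_dist=1.5, max_dt=1.0))  # [0, 0, 0, 1, 1, 1]
```

With these made-up points, a radius of 1.5 separates the two aircraft, a radius of 20 merges them into a single track (too relaxed), and a radius of 0.5 puts every return in its own cluster (too aggressive). A library routine such as scikit-learn's DBSCAN applies a similar neighborhood idea and could be dropped in behind the same interface.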

By using Air Force-approved cloud storage and computing environments, Data Hackathon teams can not only centralize tools and analysis but also work on data sets that are too large for an individual computer.

“A cloud environment is a scalable environment,” said Krause.

Parker Brown, 416th Flight Test Squadron weapon integration engineer, is now on his second Data Hackathon and has taken a new role in helping solve technology problems amongst the different groups.

“Someone in the Air Force at some point decided that the cloud was a good thing and they were right,” said Brown.

Having taken on Vault’s cloud computing environment on his own, Brown has charged himself with passing the knowledge he gained from his previous Vault use on to new participants.

“I’m helping teams transition from a local environment to how do you make this work in a cloud environment,” he said. “How do you use the tools in a cloud environment to make better processes that are faster and more efficient?”

Though the solution may not be complete, a task that normally would be time prohibitive suddenly becomes much more plausible. The time savings associated with taking large quantities of data and grouping them could potentially allow for battlespace pictures to be seen more completely than ever before.

Weather Data-Scraping Dilemma

The 96th Weather Squadron at Eglin AFB, a second “problem owner” during the Hackathon, also presented an issue for the teams to solve. Though its problem was not as complex as machine learning data clusters, the squadron sought a solution that would both save time and reduce human error in its weather data collection.

Before taking its problem to the Hackathon, the squadron gathered much of its data by hand from multiple sources on the web.

“They want to see some more automation. They are doing a lot of copying and pasting of data and pulling it in,” said Clinton Bowers from the 412th Communications Squadron Mission Defense Team, who volunteered his time and programming skills to the Hackathon. “What we’re able to do is using the tools we have here, we’ve come up with a script that goes through and parses through the same information at the sites. By providing that data to a front end, they can show all the data in one place without having to manually pull from multiple sources or manually calculate.”

Because the script is automated and scrapes the same information from the web each time, it reduces human error by eliminating the “hand-jamming” of information into spreadsheets. Additionally, though Terminal Aerodrome Forecasts can be pulled manually, the script automates the process, pulling data from several bases simultaneously and collecting and displaying it all in one place through Vault.
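The kind of script described here might look roughly like the sketch below. It is not the team's code: the article does not say which sites were scraped or how, so the download step is left as a caller-supplied `fetch` function, and the names `parse_taf` and `collect_forecasts` are hypothetical. The parser assumes the usual TAF header layout (station identifier, issue time, validity period, wind group), and the sample forecast strings are invented for illustration.

```python
import re

# Assumed header layout: TAF [AMD|COR] ICAO DDHHMMZ DDHH/DDHH dddff[Ggg]KT
TAF_HEADER = re.compile(
    r"^TAF\s+(?:AMD\s+|COR\s+)?(?P<station>[A-Z]{4})\s+"
    r"(?P<issued>\d{6})Z\s+(?P<valid>\d{4}/\d{4})\s+"
    r"(?P<wind_dir>\d{3}|VRB)(?P<wind_kt>\d{2,3})(?:G\d{2,3})?KT"
)

def parse_taf(text):
    """Pull a few header fields out of one raw TAF string, or return None."""
    match = TAF_HEADER.match(text.strip())
    if match is None:
        return None
    row = match.groupdict()
    row["wind_kt"] = int(row["wind_kt"])   # sustained wind speed in knots
    return row

def collect_forecasts(stations, fetch):
    """Fetch and parse one TAF per station; skip anything unparseable.

    `fetch` is any callable mapping a station identifier to raw TAF text,
    for example a web request in production or a dict lookup in a test.
    """
    rows = []
    for station in stations:
        row = parse_taf(fetch(station) or "")
        if row is not None:
            rows.append(row)
    return rows

# Invented sample text standing in for what a weather site might return.
sample = {
    "KVPS": "TAF KVPS 141730Z 1418/1524 18010KT P6SM SCT040",
    "KEDW": "TAF KEDW 141700Z 1417/1523 24015G25KT 9999 FEW100",
}
table = collect_forecasts(["KVPS", "KEDW", "XXXX"], sample.get)
for row in table:
    print(row["station"], row["valid"], row["wind_dir"], row["wind_kt"])
```

Keeping the download step separate from the parsing step means the same parser can serve several bases at once, and the resulting rows can be handed to whatever front end displays them.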

“Right now, we’re using all open source,” said Bowers. “The Air Force is encouraging more and more of that along with the DoD is using open source.”

The long-term goal is to make the solution scalable to bases across the Air Force.

Edgar Alcaraz from the 412th CS Communication Focal Point also lent a hand with the weather squadron’s solution, despite having minimal experience in the area. This breadth of experience among participants is highly encouraged during the event.

“This is my first Hackathon. I’ve never really dealt with the whole data management and data analysis,” said Alcaraz. “When I saw the e-mail come through there was no minimum experience necessary so I wanted to get a feel for it and learn about it. Now that I’m learning about all the tools, and when I go back [to my job] I’ll be able to start developing what I’m learning here and grow on that.”

Test Point Parallels

Continuing the work on United States Air Force Test Pilot School data from AFTC’s first Data Hackathon, members worked together to further the data collection and to effectively organize, categorize and recall test points, with the aim of saving the DoD substantial time and money.

“This time we are focusing on analyzing test points of opportunity,” said Megan Runyan, 416th Flight Test Squadron weapons integration engineer and lead for the TPS problem. “What we are trying to do is compare the test plan to the actual data to see were they ever on conditions and were they ever on conditions more times than they thought they were. This is beneficial to future mission planning to reduce flight hours. If you find out you’re constantly doing the same points that are required without having to fly that specifically, then that will reduce mission hours and the cost of flight as well as helping with analysis by having more data points.”

“The idea behind this is reducing flights; being able to do multiple tests from one flight,” said Quinn Seys, 416th Flight Test Squadron weapons integration engineer. “There’s a lot of data that’s being collected that’s not actually being looked at all the time.”

The team modified an algorithm to search through the large amounts of data collected on previous flights and check whether test point conditions have already been met, which can negate the need for further flying. A flight that once was dedicated to one test can then be reallocated to a newer objective, substantially reducing the test cost and production timeline of Air Force aircraft.
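As a rough illustration of that search, the sketch below scans recorded flight samples for stretches where the aircraft happened to hold a planned test point. It is not the team's algorithm: the field names (`alt_ft`, `mach`), the tolerances, the minimum dwell length, and the sample data are all assumptions chosen to keep the example small.

```python
def on_condition_windows(samples, point, min_samples=3):
    """Find runs of consecutive samples that satisfy a test point.

    samples: list of dicts with "alt_ft" and "mach" (hypothetical fields)
    point:   dict with target "alt_ft" and "mach" plus their tolerances
    Returns a list of (start_index, end_index) pairs, inclusive.
    """
    windows = []
    start = None
    for i, s in enumerate(samples):
        ok = (abs(s["alt_ft"] - point["alt_ft"]) <= point["alt_tol_ft"]
              and abs(s["mach"] - point["mach"]) <= point["mach_tol"])
        if ok and start is None:
            start = i                       # a new run begins here
        elif not ok and start is not None:
            if i - start >= min_samples:    # keep only runs that dwell long enough
                windows.append((start, i - 1))
            start = None
    if start is not None and len(samples) - start >= min_samples:
        windows.append((start, len(samples) - 1))
    return windows

# Made-up samples from a past flight, one per time step.
alts = [19000, 19950, 20010, 20030, 20500, 20000, 20005, 19990, 20010]
machs = [0.70, 0.80, 0.805, 0.795, 0.80, 0.80, 0.80, 0.81, 0.80]
flight = [{"alt_ft": a, "mach": m} for a, m in zip(alts, machs)]

# Hypothetical planned test point: 20,000 ft at Mach 0.8.
test_point = {"alt_ft": 20000, "alt_tol_ft": 100, "mach": 0.8, "mach_tol": 0.02}
print(on_condition_windows(flight, test_point))  # [(1, 3), (5, 8)]
```

Each window returned is a candidate “test point of opportunity”: data already in hand that may satisfy the plan without another sortie, subject to an engineer's review.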

Being a part of Data Hackathons ends up as a win-win situation for participants looking to bring their newfound knowledge back to the work center.

“We can see how applicable it might be for our own data analysis and solutions that we create for ourselves,” said Stefan Johnson, 419th Flight Test Squadron weapons integration engineer.

Runyan echoed the sentiment.

“It’s a great time to just learn and spend time learning,” she said. “This is my third Hackathon now, and in those three weeks, I have learned more about data science in general than I have in my entire career so far. It’s a chance to learn and network.”

Learning was exactly what Donna Witt, Air Force Research Lab’s librarian, had in mind too when she heard about the Hackathon from a SparkEd meeting.

“I just want to learn anything and everything,” said Witt. “This is totally new to me but it makes me think in different ways.”

Witt contributes to the team by pre-processing, extracting, and organizing information so that the team can work that data into their roles.  

“I’ve always had an interest in coding, but at a really low level, along with processes and automation,” she said.

The Future of Data Hackathon

More information about the AFTC Data Hackathon can be found in DAF365 Teams by searching “AFTC Data Hackathon” or by using the join code b4x02wc.

Data Hackathon representatives encourage people from any career field, regardless of skill set, to participate and help broaden improvements in data analytics.

“Everyone has a part to play in the data revolution,” Soileau said. “This is how we accelerate change in the 21st century.”