Multiple Software Products
Start with a group of data scientists. Divide them into teams of about five. Give each team specific research questions and goals. Tell them to collaborate and be prepared to make their results publicly available.
Then lock the doors for three days.
That’s basically a hackathon, NCBI-style, except for the part about locking the doors. And there’s no locking up the creativity.
Hackathons are events for computer programmers and other professionals to collaborate on software projects. The National Center for Biotechnology Information hackathons began in response to a specific need.
“It first started with a very specific use case,” explained Ben Busby, the center’s genomics outreach coordinator and organizer of the hackathons. “The NIH high-performance computing group, which manages computing systems for NIH staff, approached me and said, ‘Ben, there are lots of people trying to find all viruses in metagenomic samples, and while they are accomplishing the task, it’s computationally inefficient.’”
Busby’s response: “Well, yeah, we can get some people together and try to fix that.”
And so he did.
On January 5, 2015, the first NCBI hackathon launched. The result was four genomics pipelines, each a series of computational steps, put together in three days. These pipelines enabled scientists to run particular analyses on genomes, transcriptomes, epigenomes and metagenomes in one reproducible step.
The January gathering was followed by two more three-day events. In an August 2015 hackathon, participants developed six functional software products, and in January 2016, they put together five. One of the teams released an RNA-seq tutorial, which became part of an open online course available through NCBI that 650 people completed.
Admission to the hackathons is competitive, with typically about 100 applications for 35 slots. People have come from down the street and around the world. NCBI recruits more than data scientists; librarians also participate.
“During the hackathon, people become obsessed,” said Busby. “By the middle of the third day, they have to be reminded to break for lunch.”
More events are in the works. All the previous events were held in Bethesda, MD, but new hackathons will be held outside the DC area. Financial aid is not offered, and everyone must participate in person.
“We’re trying to build a community,” said Busby. “We want projects that build off one another. The idea is to get a critical mass of computational data scientists interested in a number of these software products.”
This is a good thing, because Busby is ambitious. “We built a product that allows anyone to put in a protein sequence and tell you if this protein sequence, with the given threshold, generates an immunogenic response in humans. We want to look at bacterial peptides, as well as peptides that come from cancer cells, and look at immunogenic responses. Then we want to match that to population data.”
For Busby, the bottom line is the ability to detect illnesses earlier. “If you could use this in a clinical setting for early detection, then you could detect even solid tumors or bacterial infections way sooner. We want to build software tools to ask those questions.”
For most participants, the collaboration continues even after the three-day event ends.
Busby said, “This is the coolest part of my job, and I have a pretty cool job.”
So it’s fun?