From September 25 to 27, 2017, NCBI will help with a bioinformatics hackathon at the University of Pittsburgh. The hackathon will focus on advanced bioinformatics analysis of next-generation sequencing data, proteomics and metadata. See here for an article on NCBI-style hackathons.

This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for bioinformatics analyses from high-throughput experiments. Some projects are available to non-scientific developers, mathematicians and librarians.

The event is open to anyone selected for the hackathon and willing to travel to the University of Pittsburgh (see below for venue address).

Participants will be organized into five to seven teams of five to six individuals each. These teams will build pipelines and tools to analyze large datasets within a cloud infrastructure. Potential subjects for this iteration include:

  • Machine learning pipelines for germline rare variants linked to phenotypes
  • Building an interactive online environment to run NCBI-style hackathons
  • An integrated pipeline for novel virus discovery
  • Probabilistic identification of past viral exposure based on non-native sequences in host genome
  • Packaging and distributing an automatic corpus-updater for NLP tools
  • Phenotypic indexing of (CRISPR-derived) mouse models

Please see the application form for more details and additional projects.

Organization

After a brief organizational session, teams will spend three days addressing a challenging set of scientific problems related to a group of datasets. Participants will analyze and combine datasets to work on these problems.

Datasets

Datasets will come from public repositories or will be supplied by the project lead. During the hackathon, participants will have an opportunity to include other datasets and tools for analysis. Please note, if you use your own data during the hackathon, we ask that you submit it to a public database within six months of the end of the event.

Products

All pipelines and other scripts, software and programs generated in this hackathon will be added to a public GitHub repository designed for that purpose. Manuscripts describing the design and usage of the software tools constructed by each team may be submitted to an appropriate journal, such as the F1000Research hackathons channel.

Application

To apply, complete this form (it takes approximately 10 minutes). Applications are due Tuesday, August 22, 2017, by 5 pm ET.

Participants will be selected based on the experience and motivation they describe on the form. Prior participants and applicants are especially encouraged to apply.

The first round of accepted applicants will be notified on August 25 by 5 pm ET and will have until August 28 at 3 pm ET to confirm their participation. Please confirm only if it is highly likely that you can attend, as confirming and then not attending prevents other data scientists from taking part in this event. Please include a monitored email address in case there are follow-up questions.

Note:

Participants will need to bring their own laptop to this program.

A working knowledge of scripting (e.g., Shell, Python, R) is necessary to be successful in this event. Familiarity with higher-level scripting or programming languages may also be useful.

Applicants must be willing to commit to all three days of the event. No financial support for travel, lodging or meals is available for this event. A block of rooms has been reserved at the Wyndham Pittsburgh University Center; details will be provided upon acceptance.

Also, note that the hackathon may extend into the evening hours on Monday and/or Tuesday. Please make any necessary arrangements to accommodate this possibility.

Please contact ben.busby@nih.gov and iwema@pitt.edu with any questions.

Venue:

University of Pittsburgh, Hillman Library, Digital Scholarship Commons, 221 Schenley Drive, Oakland, PA 15213.

Organized by the following University of Pittsburgh groups:

Health Sciences Library System, University Library System, Center for Research Computing, Computing Services and Systems Development, School of Computing and Information.

***This article was originally posted on the NCBI Insights blog on July 27, 2017.***