This fall, CMU Libraries is hosting a hackathon in partnership with DNAnexus on the topic of data management and graph extraction for large models in the biomedical space. The hackathon will be held in person at CMU, October 19-21, 2023.

Participants will be divided into teams, each with a dedicated team lead. Organizers are currently seeking qualified individuals (typically junior faculty, postdocs, or senior doctoral students) to serve as team leads. If you are interested, please fill out this brief form.

The hackathon is collaborative rather than competitive: each team works on a dedicated part of the problem. Teams will focus on the following topics:

  • Knowledge graph-based validation for variant (genomic) assertions
  • Continuous monitoring for RLHF and flexible infrastructure for layering assertions with rollback
  • Flexible tokenization of complex data types
  • Assertion tracking in large models
  • Column headers for data harmonization

All pipelines, scripts, software, and other programs produced during the hackathon will be added to a public GitHub repository created for that purpose. Outputs are often published as preprints or on the F1000 hackathon channel.

General registration for participants will open later this summer.

Contact Melanie Gainey with any questions about the hackathon or serving as a team lead.