Are you wondering about the quality of a human, mouse, or rat genome that you have assembled?

We offer a new service for evaluating the completeness, correctness, and base accuracy of your human, mouse, or rat genome assembly compared to a reference assembly. You simply provide NCBI with one or more assemblies in FASTA format, and we will perform an annotation-based evaluation of the genome(s) using the expert-curated, high-confidence RefSeq transcripts for that species.

When we’ve finished the evaluation, you’ll receive:

  • The counts of transcripts and genes that are poorly annotated, not annotated, or frameshifted on your assembly compared to the reference assembly (typically GRCh38, GRCm39, or mRatBN7.2)
  • The corresponding lists of transcripts and genes
  • The coordinates of problem loci
  • A preliminary annotation of the assembly (not a full-fledged RefSeq annotation)
  • The RefSeq transcript alignments to the genome
  • A help document on how to interpret the included files

To run your assembled sequence through this pipeline:

  • Email refseq-support@nlm.nih.gov and request that we perform a quality evaluation run on your genome assembly. Be sure to:
    • Include a link to your sequences in the request. You can use NCBI submission preload or any other accessible file store.
    • Verify that there are no privacy concerns. (See the WARNING below.)
  • NCBI will run the evaluation pipeline. Expect a two-week turnaround time for one or two assemblies.
  • You’ll receive an email with a link to your results.
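
Before sending the link, it may be worth confirming that the file actually parses as FASTA. The following is a minimal sketch of such a local check, not part of NCBI's pipeline: the function names fasta_summary and summarize_file are hypothetical, and the choice of what to report (sequence count, total bases, and N bases) is an assumption about what is useful to look at before submitting.

```python
import gzip
import io


def fasta_summary(handle):
    """Return (sequence_count, total_bases, n_bases) for a FASTA text stream."""
    count = total = n_bases = 0
    seen_header = False
    for line_no, raw in enumerate(handle, start=1):
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        if line.startswith(">"):
            count += 1
            seen_header = True
        else:
            if not seen_header:
                raise ValueError(
                    f"line {line_no}: sequence data before the first '>' header"
                )
            total += len(line)
            n_bases += line.upper().count("N")
    return count, total, n_bases


def summarize_file(path):
    """Hypothetical helper: open a plain or gzip-compressed FASTA file by path."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return fasta_summary(handle)


if __name__ == "__main__":
    # Tiny in-memory example instead of a real assembly file.
    demo = io.StringIO(">chr1\nACGTNNAC\nGGTT\n>chr2\nacgtn\n")
    print(fasta_summary(demo))  # prints (2, 17, 3)
```

For a real assembly, calling summarize_file on the local file path returns the same three numbers. An unexpectedly low sequence count or a ValueError usually points to a truncated or mis-formatted file.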

While this is currently only available for human, mouse, and rat genomes, we may expand to additional species if there is enough interest. Please contact us at info@ncbi.nlm.nih.gov to provide feedback on this new service.

WARNING: Use this service only for human samples or cell lines that have no privacy concerns. For all studies involving human subjects, it is your responsibility to ensure that the information supplied protects participant privacy in accordance with all applicable laws, regulations, and institutional policies. Make sure to remove any direct personal identifiers from your submission. If there are patient privacy concerns about making data fully public, please submit samples and data to NCBI’s dbGaP database. dbGaP has controlled-access mechanisms and is an appropriate resource for hosting sensitive patient data.