The function of a protein is often determined by how it folds into a 3D structure. Knowing a protein’s structure is therefore essential for a deeper understanding of its role in cellular processes. However, most known proteins have no experimentally determined structure. For instance, the universal protein database UniProt archives 229 million unique protein sequences, while the Protein Data Bank, the single worldwide archive for experimentally resolved protein structures, holds only 206,000 structures. X-ray crystallography and cryo-electron microscopy, the traditional structure-determination methods, which fire X-rays or electron beams at proteins to create a picture of their shape, are time-consuming and technically challenging. This contributes to the massive (more than 1,000-fold) gap between known protein sequences and experimental protein structures.
This gap could be closed by predicting proteins’ 3D configurations directly from their linear amino acid sequences, a solution that AlphaFold may offer. AlphaFold is an artificial intelligence (AI) program developed by DeepMind, part of Alphabet Inc., Google’s parent company. It predicts a protein’s structure from its sequence with high accuracy. EMBL’s European Bioinformatics Institute (EMBL-EBI), in partnership with DeepMind, has made the predicted structures of over 200 million cataloged proteins available to science through the AlphaFold Protein Structure Database (AlphaFold DB). This freely available resource offers programmatic access to its data and interactive visualization of predicted structures.
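As a rough illustration of what programmatic access can look like, here is a minimal Python sketch that requests the prediction record for one protein. It assumes that AlphaFold DB serves a REST endpoint of the form `https://alphafold.ebi.ac.uk/api/prediction/<UniProt accession>` that returns JSON; the helper names `prediction_url` and `fetch_prediction` are hypothetical, and the endpoint path and response layout should be checked against the current AlphaFold DB documentation before relying on them.

```python
import json
import urllib.request

# Assumed API root for AlphaFold DB; verify against the official documentation.
API_ROOT = "https://alphafold.ebi.ac.uk/api"


def prediction_url(accession: str) -> str:
    """Build the (assumed) prediction endpoint URL for a UniProt accession."""
    acc = accession.strip().upper()
    # Simple sanity check only: plain UniProt accessions are alphanumeric.
    if not acc.isalnum():
        raise ValueError(f"not a plain UniProt accession: {accession!r}")
    return f"{API_ROOT}/prediction/{acc}"


def fetch_prediction(accession: str, timeout: float = 30.0):
    """Download and decode the JSON record(s) for one accession.

    Requires network access; raises urllib.error.HTTPError if the
    accession has no entry or the endpoint has changed.
    """
    with urllib.request.urlopen(prediction_url(accession), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # P69905 is the UniProt accession for human hemoglobin subunit alpha.
    records = fetch_prediction("P69905")
    # Inspect the decoded payload rather than assuming specific field names.
    print(type(records).__name__)
    print(json.dumps(records, indent=2)[:500])
```

Printing the start of the decoded payload, rather than indexing into specific fields, keeps the sketch independent of the exact response schema, which may change between database releases.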