Rx Not Done Yet
The human genome, from end to end, has been sequenced, meaning scientists worldwide have identified most of the nearly 20,000 protein-coding genes. However, an international group of scientists — including several researchers from Johns Hopkins — notes there’s more work to be done.
The scientists point out that even though we have nearly converged on the identities of the 20,000 genes, the genes can be cut and spliced to create approximately 100,000 proteins, and gene experts are far from agreement on what those 100,000 proteins are.
The group, which convened in fall 2022 at Cold Spring Harbor Laboratory in New York, has now published a guide for prioritizing the next steps in the effort to complete the human gene “catalog.” The guide appeared as a perspective piece in Nature.
“Many scientists have been working on efforts to fully understand the human genome, and it’s much more difficult and complex than we thought,” says Steven Salzberg, Bloomberg Distinguished Professor of Biomedical Engineering, Computer Science and Biostatistics at Johns Hopkins. “We have provided a state of the human gene catalog and a guide on what’s needed to complete it.”
1. Dive Deeper into Protein ‘Isoforms’
The scientists say that while the final list of protein-coding genes is nearly complete, researchers have not yet fully cataloged the variety of ways that a gene can be cut, or spliced, resulting in slightly different “isoforms” of a protein. Some isoforms will not affect the protein’s function, but others may be different enough to result in increased risk for a particular trait, condition or illness. To complete the catalog, the scientists propose a comprehensive look at how each gene is expressed into functional and nonfunctional proteins, and at the three-dimensional shape of those proteins.
2. Focus on Noncoding RNA Genes
The authors also propose a focus on cataloging noncoding RNA genes. RNA is the genetic material that is transcribed from DNA and, for protein-coding genes, follows a molecular path to making proteins. Noncoding RNA genes do not encode proteins; instead, the RNA molecules they produce perform cellular functions themselves.
3. Enhance Databases, Develop New Technology
Finally, the international group emphasizes the importance of enhancing commonly used databases of gene variations that cause illness and disease, improving clinical laboratory standards for annotating DNA sequencing results, and developing new technology that can more effectively and precisely match the wide array of proteins with the genes that encode them.
Conclusion
The perspective’s authors, a group that also included Johns Hopkins biomedical engineer Mihaela Pertea, postdoctoral researcher Ales Varabyou and 19 other scientists, ended their piece with an important observation: “Note that, even with a complete gene annotation of a finished genome, we will have only one example of the human gene catalogue — one that will not apply to all humans,” they wrote. “It is probable that many healthy individuals have more or fewer copies of some genes, and future efforts to survey the diversity of the human population will be an important step towards achieving a more complete view of the gene content of our genome.”