The Human Genome Project: Friend or Foe?
Over the past century, key advances have paved the way for the decoding of the human genome. These include the discovery of the DNA double helix, the identification of chromosomes as the vehicles of hereditary information, and, more recently, the development of recombinant DNA cloning and sequencing technologies, which together established a cutting-edge field of research called genomics. Thousands of genes and full genomes have been decoded since then. The Human Genome Project is one of the most significant scientific breakthroughs in history. A worldwide, multidisciplinary team of scientists embarked on a biological expedition to study and interpret the entirety of human DNA (known as the genome). The Human Genome Project, which began in October 1990 and ended in April 2003, decoded the genetic make-up of the human race, that is, the full sequence of human DNA, thereby boosting the study of human biology and improving medical practice.
What Is a Genome?
Before delving into the history and science behind the Human Genome Project, it is important to understand what a genome is and the enormous impact its decoding has had on healthcare and personalized therapy. DNA, or deoxyribonucleic acid, is a molecule that contains the genetic information necessary for an organism to grow and function. It takes the form of a double helix: two intertwined strands that wrap around each other to create a twisted ladder (Watson & Crick, 1953). Each strand comprises an alternating backbone of sugar (deoxyribose) and phosphate groups. Each sugar is linked to one of the four DNA bases: adenine (A), cytosine (C), guanine (G), or thymine (T). Chemical bonds between the bases connect the two strands in a complementary manner: adenine pairs with thymine and cytosine with guanine. A base, a sugar and a phosphate together form a nucleotide. One particularly striking finding of the Human Genome Project was that human DNA consists of approximately 3.2 billion base pairs, of which 99.9% are shared between humans (IHGSC, 2004; Venter et al., 2001).
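The complementary pairing rule described above (A with T, C with G) means that one strand fully determines the other. A minimal Python sketch of that rule follows; the function names and the example sequence are illustrative assumptions, not part of the Human Genome Project's own tooling.

```python
# Watson-Crick base pairing: adenine pairs with thymine, cytosine with guanine.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own direction.

    The two strands of the helix run in opposite directions, so the
    partner strand is the complement read backwards.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    example = "ATGCGT"  # hypothetical six-base fragment
    print(complement(example))          # TACGCA
    print(reverse_complement(example))  # ACGCAT
```

Any character other than A, C, G, or T raises a KeyError in this sketch, which is a deliberate simplification: real sequence data also contains ambiguity codes such as N.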
Variations in the remaining 0.1% of the DNA sequence are responsible for our individuality. Despite representing a minimal fraction of the genome, this genetic variation accounts for the phenotypic differences observed between individuals, such as traits (height, intelligence, hair and eye color), disease susceptibility, and drug responses (Jorde & Wooding, 2004). The order, or sequence, of the bases dictates the information available to build and maintain an organism, much like letters of the alphabet appear in a specific order to form words and sentences. DNA is the molecule found in all of our cells that holds the instructions for the development and evolution of life. The entire collection of DNA in an organism is referred to as the genome. In other words, the genome is the biological instruction manual for life, holding all the genetic information required for the development, function, and reproduction of an individual. This code is passed on from parents to their offspring through genes, the basic physical and functional units of heredity (Fantini, 1988). A gene is a segment of DNA that stores the information needed to build specific proteins that either have a specific function in a cell or determine an individual physical trait, such as eye or hair color (O’Connor, 2008).
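To put the figures quoted above in absolute terms, 0.1% of roughly 3.2 billion base pairs is about 3.2 million positions at which two people may differ. The short Python sketch below works through that arithmetic and shows, on two made-up toy sequences, how a percent-identity figure of this kind is computed; the names and sequences are assumptions chosen for illustration only.

```python
GENOME_SIZE_BP = 3_200_000_000  # approximate size quoted in the text
VARIABLE_PER_THOUSAND = 1       # 0.1% expressed as 1 in 1,000


def variable_positions(genome_size: int, per_thousand: int) -> int:
    """Number of base pairs expected to differ between two people."""
    return genome_size * per_thousand // 1000


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    print(variable_positions(GENOME_SIZE_BP, VARIABLE_PER_THOUSAND))  # 3200000
    # Two hypothetical 8-base fragments that differ at the last position.
    print(percent_identity("ATGCGTAC", "ATGCGTAA"))  # 87.5
```

Comparing real genomes is harder than this position-by-position count suggests, because insertions and deletions shift the alignment; the sketch only illustrates the scale of the numbers.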
Origin and Goals of the Human Genome Project
Renato Dulbecco openly campaigned for the Human Genome Project (HGP) in 1984, arguing that decoding the human genome sequence could help explain the causes of cancer (Dulbecco, 1986). The Human Genome Project took shape in 1985, when Robert Sinsheimer, Chancellor of the University of California, Santa Cruz, held a conference with 12 experts to examine the feasibility of such a project (Sinsheimer, 1989). The project was deemed technically feasible, albeit extremely challenging. However, there was debate as to whether it was a good idea, with six of those present voting in support and six voting against. Those opposed believed that the benefits of mapping the human genome sequence would not outweigh the project's risks and costs, and that the scientific community was unprepared for such a demanding endeavor and should instead wait until the technology was suited to the purpose. The key question, in short, was whether the potential benefits would justify the project's huge cost. Despite the controversy, the HGP was launched in 1990 with support from the National Institutes of Health and the U.S. Department of Energy (NIH, 1990); the American geneticist Francis Collins took over its leadership at the NIH in 1993.
From its inception, the Human Genome Project revolved around two key principles. First, it welcomed scientists from all over the world in an effort to cross borders and create an all-encompassing project aimed at understanding our molecular heritage. The International Human Genome Sequencing Consortium (IHGSC) was founded by publicly funded researchers from twenty institutes in six countries (International Human Genome Sequencing Consortium, 2001; Palao, 2022). Second, the HGP operated under the Bermuda Principles drafted in 1996 (Maxson Jones et al., 2018). This meant that all human genome sequencing data had to be released and made publicly available within 24 hours of its generation. This greatly accelerated ongoing research by giving scientists worldwide access to HGP data.
The Human Genome Project was a large-scale, publicly funded and highly collaborative multinational research project whose primary goal was to decipher the chemical fingerprint underlying the human genome. Before tackling such an intricate genome, however, researchers first carried out several genome projects on simpler, well-established model organisms, including yeast, bacteria, Drosophila, and Arabidopsis thaliana (a rapidly developing plant with a small genome) (IHGSC, 2001). The project was launched on the premise that the isolation and analysis of human genetic material (DNA) could offer researchers new avenues and cutting-edge tools to pinpoint the causes of disease and develop new therapeutic and preventive strategies (Collins & Fink, 1995; IHGSC, 2004). The fact that many human diseases, with the exception of physical injuries, are linked to changes (i.e. mutations) in the structure and function of DNA offered additional support for this bold undertaking. These disorders include over 4,000 inherited 'Mendelian' diseases caused by mutations in a single gene, such as cystic fibrosis (Chial, 2008); complex diseases resulting from hereditary defects in multiple genes, such as Alzheimer's disease; and diseases such as cancer, which develop over time as a result of acquired DNA mutations (Collins & Fink, 1995). The ultimate goal of the Human Genome Project was to catalog each of the estimated 50,000 to 100,000 genes then thought to make up the human genome (Schuler et al., 1996) and to provide the research tools needed to analyze this genetic information, so that the genes underlying both rare and common diseases could be tracked down.
Milestones of the Human Genome Project
Based on a 15-year timeline, the initial cost of the Human Genome Project was estimated at $3 billion. The first five-year plan involved developing DNA analysis tools, mapping the human genome, and determining the order of the 3.2 billion 'letters' that make up the human genome (NIH, 2022). The first mapping goal of the HGP, a comprehensive human genetic linkage map, was accomplished in 1994, one full year ahead of schedule (Murray et al., 1994). Linkage maps are created using a collection of markers, which are patterns of DNA (such as genes, variants, and other DNA sequences of interest) found within chromosomes. Because they give a gene's approximate location on a chromosome, linkage maps are a primary tool used by researchers to locate a disease gene. Determining the location of genes on human chromosomes, including altered genes that cause disease, is the first step in determining the causes of genetic disorders. The HGP's second goal was achieved a year later with the completion of a physical map containing graphical representations of the physical locations of these identifiable markers on chromosomes (Bell et al., 1995). The physical map serves as the backbone for the final assembly of the complete DNA sequence of the human genome.
Celera Genomics, a private biotechnology company, joined the race to sequence the human genome in 1998. Celera, led by Dr. Craig Venter, proposed sequencing the entire human genome in three years. While the IHGSC and Celera used different strategies to determine the human genome sequence, they employed the same general method for DNA sequencing (Hood & Galas, 2003), and both groups came to the same conclusions. In doing so, the researchers finished two years ahead of their own projected timeline and proved the naysayers wrong.
The third and most challenging goal of the Human Genome Project was to determine the order (i.e. sequence), unit by unit, of all 3 billion nucleotides that make up the human genome. Once the genetic and physical maps were complete, a sequence map could be constructed. Such a map allows scientists to locate genes, characterize regions of DNA that control gene activity, and link DNA structure to function. In June 2000, Collins and Venter announced the completion of a working draft of the human genome sequence (International Human Genome Sequencing Consortium Announces “Working Draft” of Human Genome, 2000). However, this first draft accounted for only 90% of the human genome and included more than 150,000 regions where the DNA sequence could not be precisely deciphered (known as gaps). Driven by technological advances, the human genome sequence was improved, expanded and further investigated over the next three years. In April 2003, the HGP was declared 'complete', marking the 50th anniversary of the publication in which Watson and Crick described the double helix structure of DNA (Watson & Crick, 1953), yet the sequence covered only about 92% of the genome (NIH, n.d.-b).
"Never would I have dreamed in 1953 that my scientific life would encompass the path from DNA's double helix to the 3 billion steps of the human genome. But when the opportunity arose to sequence the human genome, I knew it was something that could be done - and that must be done," - James D. Watson, The Double Helix
By 2021, only 0.3% of the human genome's nucleotides remained undeciphered, a challenge that was eventually overcome when a gapless assembly was completed in January 2022 (NIH, n.d.-b; NIH, n.d.-c). Contrary to the conventional wisdom that over 100,000 genes were required to carry out the various cellular functions that support life, the HGP unexpectedly revealed that humans have only about 25,000 genes, which is comparable to the number found in the flowering plant Arabidopsis and slightly fewer than in the worm Caenorhabditis elegans (“MIT Tech Talk,” 2004). These results stand in stark contrast to the previous assumption that the complexity of an organism is directly related to the number of its genes.
The Other Side of the Same Coin: Ethical Concerns in the Genomic Era
The Human Genome Project did not decode the genome of a single person. Instead, it compiled information from several individuals, whose identities were purposefully concealed to protect their privacy. To gather blood samples for analysis, researchers carefully chose volunteers and obtained their informed consent (NIH, n.d.-b). The scope of the project raised major ethical and social issues, particularly in relation to the collection and use of genetic information. This caused considerable concern among both the general public and the researchers involved in the project. While they were aware of the pitfalls and benefits of integrating new genetic information into research and therapy, they were particularly concerned about the misuse of genetic information in a variety of contexts, including insurance and employment (Hood & Kevles, 1993; T. H. Murray, 1991; Ott, 1995). This led to the establishment in 1990 of the Human Genome Project's Ethical, Legal and Social Implications (ELSI) committee to address the ethical, legal and social challenges the project posed (Fundacion BBV Documenta, 1995).
The possibility of gene therapy is one such topic that has been explored. Gene therapy was first attempted in the early 1980s, before the HGP was established (Williams et al., 1984). Yet the procedures were time-consuming and had little success. The HGP sought to provide key research tools that would enable scientists to identify the genes involved in normal biology, as well as in rare and common diseases. One such tool is positional cloning (Collins, 1992), which allows researchers to search for disease-related genes directly in the genome without first having to identify the protein product or the function of the genes. The Human Genome Project paved the way for gene therapy by facilitating the genetic identification and cloning of defective genes that predispose to numerous human diseases (Cavazzana-Calvo et al., 2004). Technology has now advanced to the point where faulty genes can, in principle, be repaired or replaced with healthy ones; but genome editing still has many pitfalls owing to the possibility of off-target effects (edits occurring in the wrong location) and mosaicism (when some cells carry the edited gene but others do not), and safety remains a primary concern (Abdelnour et al., 2021; Liu et al., 2017). Most ethical considerations about genome editing focus on human germline editing, because these modifications target reproductive cells (egg or sperm) and are therefore passed on to every cell of the offspring (Lanphier et al., 2015). In an extreme scenario, fueled by the misuse of genetic information and gene editing tools, the hunt for the genetically perfect child (designer babies) and genetic surgery to enhance specific traits in healthy humans (rendering them stronger, smarter, or taller) could become a reality.
Genetic discrimination has also been highlighted as a possible risk. The HGP pioneered the development of cutting-edge technology for analyzing DNA and understanding genes. Greater knowledge of genes through the HGP has supported the development of novel diagnostic tests, thereby expanding genetic testing (Collins & Impey, 2012). As a result, diagnoses of the underlying causes of hereditary diseases have soared, and many common diseases and a significant number of rare diseases can be linked to a faulty gene once the underlying mutation has been identified. The HGP has thus paved the way for specific and accurate diagnostic tools that allow the identification of genetic predisposition and the likelihood of developing a given disease (van Ommen, 2002). However, concerns have been raised that companies may gain access to applicants' genetic information prior to hiring. An employee may face discrimination if found to be genetically predisposed to traits considered undesirable in the workplace. The same scenario is conceivable for insurers: health insurance companies may gain access to an individual's genetic information and consequently withhold coverage or charge higher premiums. If a person carries a disease-linked gene, obtaining health insurance can become extremely difficult (T. H. Murray, 1991; Ott, 1995). Predictive medicine offers great hope, but also raises serious concerns and ethical dilemmas. The prospect of access to vastly expanding genetic data about individuals and populations requires decisions about what that information should be and who should manage its development and transmission. To prevent discrimination, confidentiality, a moral imperative of medicine, is required (T. H. Murray, 1991).
The Legacy of the Human Genome Project
Sequencing the entire human genome, that is, the complete set of human DNA, was far from straightforward. In humans, the genome consists of 23 pairs of chromosomes that carry 20,000 to 25,000 genes and about 3 billion base pairs of DNA. The enormous effort that went into this work has allowed scientists to decode the blueprint of life. This achievement represented a significant leap in human biology and a major step forward for future genetic investigations. Numerous genes linked to hereditary disorders have been identified, paving the way for the development of novel diagnostic tests and therapies, as well as new studies to understand the genetic processes that operate in specific diseases. Yet the initial hope of accelerating the development of new therapies was not necessarily met by the Human Genome Project (Chial, 2008).
With the sequence of the human genome in hand, we have learned that it takes more than knowing the base-pair arrangement of our genome to cure human disease. Understanding the genome demands not only greater knowledge of its many components, but also of how they interact and what their functions are. The decoding of the human genome sequence was a crucial first step in the right direction, allowing us to catalog the parts of every human gene, but it is not sufficient on its own. Current efforts are focused on characterizing the protein products encoded by our genes. When a gene is mutated, the associated protein is usually defective. Proteomics is a rapidly evolving field that seeks to understand how protein function and expression are affected in disease. Notwithstanding its limitations, this massive project sparked an ongoing revolution in the fight against human disease and defined a new vision for the future of medicine - a vision that has yet to be completely realized.
References
Abdelnour, S. A., Xie, L., Hassanin, A. A., Zuo, E., & Lu, Y. (2021). The Potential of CRISPR/Cas9 Gene Editing as a Treatment Strategy for Inherited Diseases. Frontiers in Cell and Developmental Biology, 9. https://doi.org/10.3389/fcell.2021.699597
Cavazzana-Calvo, M., Thrasher, A., & Mavilio, F. (2004). The future of gene therapy. Nature, 427, 779–781. https://doi.org/10.1038/427779a
Chial, H. (2008). Mendelian genetics: Patterns of inheritance and single-gene disorders. Nature Education, 1(1), 63.
Collins, F. S., & Fink, L. (1995). The Human Genome Project. Alcohol Health and Research World, 19(3), 190–195. http://www.ncbi.nlm.nih.gov/pubmed/31798046
Collins, S. L., & Impey, L. (2012). Prenatal diagnosis: types and techniques. Early human development, 88(1), 3–8. https://doi.org/10.1016/j.earlhumdev.2011.11.003
Dulbecco, R. (1986). A Turning Point in Cancer Research: Sequencing the Human Genome. Science, 231(4742), 1055–1056. https://doi.org/10.1126/science.3945817
Fantini, B. (1988). Genes and DNA. “The transforming principle. Discovering that genes are made of DNA.” By M. McCarty. Essay review. History and Philosophy of the Life Sciences, 10(1), 145–151.
Fridovich-Keil, J. L. (2023). Human Genome Project. In Encyclopedia Britannica.
Fundacion BBV Documenta. (1995). The Human Genome Project: Legal Aspects (Volume I). Fundación BBV.
Hood, L. E., & Kevles, D. J. (1993). The code of codes : scientific and social issues in the human genome project (1st Harvar). Cambridge, Mass.
Hood, L., & Galas, D. (2003). The digital code of DNA. Nature, 421, 444–448. https://doi.org/10.1038/nature01410
International Human Genome Sequencing Consortium. (2001). Initial sequencing and analysis of the human genome. Nature, 409, 860–921
International Human Genome Sequencing Consortium. (2004). Finishing the euchromatic sequence of the human genome. Nature, 431 (7011), 931–945. https://doi.org/10.1038/nature03001
International Human Genome Sequencing Consortium Announces “Working Draft” of Human Genome. (2000). https://www.genome.gov/10001457/2000-release-working-draft-of-human-genome-sequence
Bell, C. J., Budarf, M. L., Nieuwenhuijsen, B. W., Barnoski, B. L., Buetow, K. H., Campbell, K., Colbert, A. M. E., Collins, J., Daly, M., Desjardins, P. R., DeZwaan, T., Eckman, B., Foote, S., Hart, K., Hiester, K., van het Hoog, M. J., Hopper, E., Kaufman, A., McDermid, H. E., … Hudson, T. J. (1995). Integration of physical, breakpoint and genetic maps of chromosome 22. Localization of 587 yeast artificial chromosomes with 238 mapped markers. Human Molecular Genetics, 4(1), 59–69. https://doi.org/10.1093/hmg/4.1.59
Jorde, L. B., & Wooding, S. P. (2004). Genetic variation, classification and “race.” Nature Genetics, 36(S11), S28–S33. https://doi.org/10.1038/ng1435
Lanphier, E., Urnov, F., Haecker, S. E., Werner, M., & Smolenski, J. (2015). Don’t edit the human germ line. Nature, 519(7544), 410–411. https://doi.org/10.1038/519410a
Liu, C., Zhang, L., Liu, H., & Cheng, K. (2017). Delivery strategies of the CRISPR-Cas9 gene-editing system for therapeutic applications. Journal of Controlled Release, 266, 17–26. https://doi.org/10.1016/j.jconrel.2017.09.012
Maxson Jones, K., Ankeny, R. A., & Cook-Deegan, R. (2018). The Bermuda Triangle: The Pragmatics, Policies, and Principles for Data Sharing in the History of the Human Genome Project. Journal of the History of Biology, 51, 693–805. https://doi.org/10.1007/s10739-018-9538-7
MIT Tech Talk. (2004). News Office., 49(9).
Morse, A. (1999). Searching for the Holy Grail: The Human Genome Project and Its Implications. Journal of Law and Health, 13(2).
Murray, J. C., Buetow, K. H., Weber, J. L., Ludwigsen, S., Scherpbier-Heddema, T., Manion, F., Quillen, J., Sheffield, V. C., Sunden, S., Duyk, G. M., Weissenbach, J., Gyapay, G., Dib, C., Morrissette, J., Lathrop, G. M., Vignal, A., White, R., Matsunami, N., Gerken, S., … Cann, H. (1994). A Comprehensive Human Linkage Map with Centimorgan Density. Science, 265(5181), 2049–2054. https://doi.org/10.1126/science.8091227
Murray, T. H. (1991). Ethical issues in human genome research. The FASEB Journal, 5(1), 55–60. https://doi.org/10.1096/fasebj.5.1.1825074
National Institutes of Health. (2022). Human Genome Project Fact Sheet: How Much Did the Human Genome Project Cost? National Human Genome Research Institute.
National Institutes of Health. (n.d.-a). CHM13 T2T v1.1 – Genome – Assembly. National Human Genome Research Institute.
National Institutes of Health. (n.d.-b). Human Genome Project Completion: Frequently Asked Questions. National Human Genome Research Institute.
National Institutes of Health. (n.d.-c). T2T-CHM13v2.0 - Genome - Assembly. National Human Genome Research Institute.
National Institutes of Health. (1990). Understanding Our Genetic Inheritance. The U.S. Human Genome Project: The First Five Years. In U.S Department of Health and Human Services U.S. Department of Energy.
O’Connor, C. (2008). Discovery of DNA as the hereditary material using Streptococcus pneumoniae. Nature Education, 1(1), 104.
Ott, B. B. (1995). The human genome project: An overview of ethical issues and public policy concerns. Nursing Outlook, 43(5), 228–231. https://doi.org/10.1016/S0029-6554(05)80009-0
Palao, B. (2022). Human Genome Project: What it is and how it paved the way for personalized medicine. Veritas.
Schuler, G. D., Boguski, M. S., Stewart, E. A., Stein, L. D., Gyapay, G., Rice, K., White, R. E., Rodriguez-Tomé, P., Aggarwal, A., Bajorek, E., Bentolila, S., Birren, B. B., Butler, A., Castle, A. B., Chiannilkulchai, N., Chu, A., Clee, C., Cowles, S., Day, P. J., … Hudson, T. J. (1996). A gene map of the human genome. Science (New York, N.Y.), 274(5287), 540–546. http://www.ncbi.nlm.nih.gov/pubmed/8849440
Sinsheimer, R. L. (1989). The Santa Cruz Workshop—May 1985. Genomics, 5(4), 954–956. https://doi.org/10.1016/0888-7543(89)90142-0
van Ommen G. J. (2002). The Human Genome Project and the future of diagnostics, treatment and prevention. Journal of inherited metabolic disease, 25(3), 183–188. https://doi.org/10.1023/a:1015673727498
Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., Gocayne, J. D., Amanatides, P., Ballew, R. M., Huson, D. H., Wortman, J. R., Zhang, Q., Kodira, C. D., Zheng, X. H., Chen, L., … Zhu, X. (2001). The Sequence of the Human Genome. Science, 291(5507), 1304–1351. https://doi.org/10.1126/science.1058040
Watson, J. D., & Cook‐Deegan, R. M. (1991). Origins of the human genome project. The FASEB Journal, 5(1), 8–11. https://doi.org/10.1096/fasebj.5.1.1991595
Watson, J. D., & Crick, F. H. C. (1953). Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid. Nature, 171(4356), 737–738. https://doi.org/10.1038/171737a0
Williams, D., Lemischka, I., Nathan, D., et al. (1984). Introduction of new genetic material into pluripotent haematopoietic stem cells of the mouse. Nature, 310, 476–480. https://doi.org/10.1038/310476a0
Figure 1 - Winslow Terese. (2015). DNA Structure. National Cancer Institute. [image]. https://visualsonline.cancer.gov/details.cfm?imageid=10062
Figure 2 - Wang, X., Xia, Z., Chen, C., et al. (2018). The international Human Genome Project (HGP) and China’s contribution. Protein Cell, 9, 317–321. https://doi.org/10.1007/s13238-017-0474-7
Figure 3 - The Human Genome Project turns the big 3-0! (2001). National Human Genome Research Institute (NHGRI). [image]. https://www.genome.gov/news/news-release/the-Human-Genome-Project-turns-the-big-3-0