The Human Genome

Your genome is your complete set of DNA: every letter, every gene, every chromosome. The human genome contains around 3 billion base pairs arranged into 23 pairs of chromosomes, with about 20,000 genes. The huge effort to read the entire human genome for the first time was called the Human Genome Project, which started in 1990 and finished in 2003. It is one of the biggest scientific achievements in history, and it has transformed almost every part of biology and medicine.

  • Genome size: approx. 3 billion base pairs (about 750 megabytes when packed at 2 bits per letter)
  • Number of genes: approx. 20,000 (fewer than scientists once predicted)
  • Chromosomes: 23 pairs (46 total)
  • Project started: 1990 (led by the US NIH and international partners)
  • First draft: 2001 (published by both public and private teams)
  • Complete sequence: 2003 (50 years after the discovery of the DNA double helix)
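
The 750 megabyte figure can be checked with a little arithmetic. The sketch below assumes each DNA letter (A, C, G or T) is packed into 2 bits, which is enough to tell four letters apart. Saved as ordinary text at one byte (8 bits) per letter, the same sequence would take about 3,000 megabytes. The function name and the 2-bit packing are illustrative assumptions, not a description of any real file format.

```python
BASE_PAIRS = 3_000_000_000  # approximate genome size quoted above


def size_in_megabytes(base_pairs, bits_per_letter):
    """Return the storage size in megabytes (1 MB = 1,000,000 bytes)."""
    total_bits = base_pairs * bits_per_letter
    return total_bits / 8 / 1_000_000


# Assumption: 2 bits per letter, enough to distinguish A, C, G and T.
packed_mb = size_in_megabytes(BASE_PAIRS, 2)

# Assumption: plain text, one byte (8 bits) per letter.
text_mb = size_in_megabytes(BASE_PAIRS, 8)

print(f"Packed at 2 bits per letter: {packed_mb:.0f} MB")
print(f"Plain text at 8 bits per letter: {text_mb:.0f} MB")
```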

What is "the" human genome?

Strictly speaking, there is no single "human genome": every person has their own slightly different version. But all human genomes are about 99.9% identical to each other. The Human Genome Project worked out a reference sequence, pieced together from the DNA of a small number of anonymous volunteers, that scientists could compare any individual's DNA against.

The tiny 0.1% of differences between people works out to around 3 million letter changes. That sounds like a lot, but it is all that separates you from your neighbour, or from any other human ever born. That same 0.1% of variation underlies every inherited difference in eye colour, hair colour, height, blood type, voice, face shape, susceptibility to disease, and the many other features that make every human unique.

The Human Genome Project

The Human Genome Project was an international scientific effort to read the entire DNA sequence of a human for the first time. It started in 1990 and was led by the US National Institutes of Health (NIH), with major contributions from the UK's Wellcome Sanger Institute and labs in France, Germany, Japan and China.

The project was originally supposed to take 15 years and cost about $3 billion. In the late 1990s a private company called Celera Genomics, led by Craig Venter, started a competing effort using faster methods. The race accelerated everything. A first draft of the genome was published by both groups simultaneously in February 2001, and the complete sequence was officially declared finished in April 2003, two years ahead of schedule. By coincidence, 2003 was also the 50th anniversary of Watson and Crick's discovery of the DNA double helix.

What we learned

The Human Genome Project produced some genuine surprises.

  • The number of human genes turned out to be much smaller than expected: only about 20,000, compared to predictions of over 100,000 in the 1990s. Humans have roughly the same number of genes as a tiny worm.
  • Only about 2% of human DNA actually codes for proteins. The other 98% does other things (regulation, structure) or has no clear function.
  • About 8% of our DNA is the leftover code of ancient viruses that infected our ancestors millions of years ago.
  • The differences between any two unrelated humans are only about 0.1% of the genome, much smaller than the differences between us and chimpanzees (approx. 1%) or mice (approx. 15%).
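
To make those percentages concrete, the sketch below turns each one into an approximate number of differing letters, using the 3 billion base pair genome size and the rounded percentages quoted in this article. The names are illustrative, and the results are rough estimates, not measured counts.

```python
GENOME_LETTERS = 3_000_000_000  # approximate genome size quoted in this article

# Approximate percentage of the genome that differs, as quoted in the list above.
DIFFERENCE_PERCENT = {
    "another human": 0.1,
    "a chimpanzee": 1.0,
    "a mouse": 15.0,
}


def differing_letters(percent):
    """Convert a percentage difference into an approximate letter count."""
    return round(GENOME_LETTERS * percent / 100)


for other, percent in DIFFERENCE_PERCENT.items():
    print(f"You vs {other}: about {differing_letters(percent):,} letters differ")
```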

How fast sequencing has become

The Human Genome Project took 13 years and cost about $3 billion. The technology has improved so fast since then that sequencing a complete human genome today takes less than a day and costs around $200. On those figures, that is a drop in cost of roughly fifteen-million-fold in about 20 years, far faster even than the famous "Moore's Law" of computer chips. Whole new fields of medicine (cancer genetics, prenatal screening, pharmacogenomics) have opened up as a result.
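
The comparison with Moore's Law can be worked through using the figures in the paragraph above. The sketch assumes Moore's Law means one doubling every two years, which is a common simplification. It also treats the $3 billion as the price of the first genome, although that sum covered the whole project.

```python
import math

PROJECT_COST_USD = 3_000_000_000  # figure quoted in the text
TODAY_COST_USD = 200              # figure quoted in the text
YEARS = 20                        # approximate time span quoted in the text

# How many times cheaper sequencing has become.
cost_drop = PROJECT_COST_USD / TODAY_COST_USD

# Assumption: Moore's Law taken as one doubling every 2 years.
MOORE_DOUBLING_YEARS = 2
moore_gain = 2 ** (YEARS / MOORE_DOUBLING_YEARS)

# How often the cost would have to halve to achieve that drop in 20 years.
halving_time_years = YEARS / math.log2(cost_drop)

print(f"Sequencing cost fell about {cost_drop:,.0f}-fold")
print(f"Moore's Law over the same period: about {moore_gain:,.0f}-fold")
print(f"Cost halved roughly every {halving_time_years * 12:.0f} months")
```

On these assumptions the cost of sequencing halved roughly every ten months, compared with every two years for computer chips.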

Fact: If you typed out your entire genome at 60 letters per line and 60 lines per page, it would fill more than 800,000 pages. That is over 800 books of 1,000 pages each: enough to fill several library shelves. Yet a complete copy of all that information is squeezed into the nucleus of each one of your cells, which is far smaller than a grain of dust. Biology is the best information compression engineer on Earth by a long way.
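
The page count behind this comparison can be worked out directly. The sketch below uses the 60 letters per line and 60 lines per page from the text. The number of books depends on how many pages each book holds, so the 1,000 pages per book used here is an assumption for illustration.

```python
import math

GENOME_LETTERS = 3_000_000_000  # approximate genome size quoted in this article
LETTERS_PER_LINE = 60           # from the text
LINES_PER_PAGE = 60             # from the text


def pages_needed(letters):
    """Pages required to print the given number of letters."""
    letters_per_page = LETTERS_PER_LINE * LINES_PER_PAGE  # 3,600 letters
    return math.ceil(letters / letters_per_page)


def books_needed(pages, pages_per_book):
    """Books required, rounding up to whole books."""
    return math.ceil(pages / pages_per_book)


pages = pages_needed(GENOME_LETTERS)

# Assumption: a "large book" has 1,000 pages.
books = books_needed(pages, 1_000)

print(f"Pages: {pages:,}")
print(f"Books of 1,000 pages: {books:,}")
```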

What the genome lets us do

Knowing the human genome has opened up entire new areas of medicine.

  • Personalised medicine: doctors can sometimes choose the best drug for a particular patient based on their genome.
  • Cancer treatment: many modern cancer treatments are tailored to the specific mutations in a particular patient's tumour.
  • Genetic testing: parents can find out if their unborn baby is at risk of certain inherited conditions.
  • Ancestry research: companies like 23andMe and Ancestry can trace your family origins back hundreds or even thousands of years.
  • Forensic science: DNA evidence can identify suspects or victims with near-certainty.
  • Disease research: scientists can now compare the genomes of large groups of patients to find genes linked to almost any condition.

Did you know? Even though the Human Genome Project was "completed" in 2003, the result was not really 100% complete: a few tricky regions of the genome were too hard to sequence with the technology of the time. The first truly complete human genome was only published in 2022, almost 20 years later, by a project called the Telomere-to-Telomere (T2T) Consortium. They finally read every single one of the roughly 3 billion base pairs in order.

Deeper dive: ethical questions raised by the human genome

The Human Genome Project did not just produce a list of letters: it raised a wave of ethical questions that society is still working through.

Genetic privacy. Your genome is essentially the most personal information you have. It can reveal your risk of various diseases, your ancestry, who you are biologically related to, even features that have not yet shown up in you. Should employers be allowed to see it? Insurance companies? In the US, a law called GINA limits how health insurers and employers can use genetic information, and in the UK insurers follow an agreed code on the use of genetic test results, but the rules differ from country to country.

Embryo screening. Couples having a baby through IVF can now screen embryos for certain serious genetic diseases before implantation. This is a great help for families with histories of cystic fibrosis or Huntington's disease. But it raises tough questions: should parents be allowed to screen for less severe conditions? For non-medical features like sex or eye colour? Where do we draw the line?

Gene editing. Tools like CRISPR-Cas9 (first shown to work as a gene-editing tool in 2012) make it possible to edit specific genes in living cells. So far this has mostly been used for treating diseases in adults. But in 2018, a Chinese scientist called He Jiankui announced that he had edited the genomes of twin baby girls while they were still embryos, to try to make them resistant to HIV. The international scientific community was outraged, and a Chinese court later sentenced him to three years in prison. Many countries ban heritable human gene editing, and leading scientists have called for a global moratorium, but the technology exists and the temptation will not go away.

The genome is also a piece of evidence in growing arguments about human identity, race, mental health and many other deep topics. The science is now mostly settled in the sense that "race" has very little biological reality at the genome level (humans are 99.9% identical regardless of skin colour or geography), but social and legal debates carry on.

For the basic building blocks, see "what is DNA" and "genes and chromosomes". For modern gene editing, see "genetic engineering".