DNA 🧬

DNA (deoxyribonucleic acid) is the genetic material in humans and most other organisms. DNA makes up genes and shapes the health and heredity of an individual.

Sampath Amitash Gadi

The human body is a very efficient machine. It achieves this efficiency through the trillions of cells that make it up.

Most cell types (there are a few exceptions) have DNA inside them. In fact, most cells have two copies of their DNA. One copy came from the mother and the other from the father.

Most of the DNA is located in the nucleus of the cell. This DNA is like the control centre of the cell.

A small fraction of DNA can also be found in other parts of the cell, such as mitochondria.

What is DNA made of?

DNA stands for deoxyribonucleic acid. It is a long, chain-like chemical made of many links. Each link is called a nucleotide. Each nucleotide is composed of 3 molecules:

A sugar called deoxyribose
A phosphate
One of the four nitrogen bases - Adenine, Thymine, Guanine and Cytosine

The length of a DNA molecule depends on the number of nucleotides in it. A long DNA molecule has more nucleotides than a shorter one.

You may have already noticed that 4 nucleotides are possible. Each nucleotide is determined by the nitrogen base in it. The sugar (deoxyribose) and phosphate are always the same in every nucleotide. It is only the nitrogen base that varies.

If the nitrogen base is Adenine, then the nucleotide is called “A”. Similarly, if the nitrogen base is Cytosine, the nucleotide is called “C”.

Therefore, there are 4 nucleotides in DNA. They are A, T, G and C. The order in which these nucleotides are arranged in a DNA molecule is its DNA sequence.
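Because a DNA sequence is just an ordered run of the letters A, T, G and C, it can be modelled as a plain string. The following is a minimal Python sketch of that idea. The function name and the example sequence are made up for illustration.

```python
from collections import Counter

DNA_LETTERS = "ATGC"


def nucleotide_counts(sequence: str) -> dict:
    """Count how often each of the 4 nucleotides occurs in a DNA sequence."""
    sequence = sequence.upper()
    invalid = set(sequence) - set(DNA_LETTERS)
    if invalid:
        # Anything other than A, T, G or C is not a DNA nucleotide.
        raise ValueError(f"not DNA nucleotides: {sorted(invalid)}")
    counts = Counter(sequence)
    return {base: counts[base] for base in DNA_LETTERS}


example = "ATGCGATACC"  # made-up sequence, 10 nucleotides long
print(len(example), nucleotide_counts(example))
```

The length of the string is the length of the DNA molecule in nucleotides, and the order of the letters is its sequence.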

What is the structure of DNA?

The structure of the DNA was deduced by James Watson and Francis Crick together with Rosalind Franklin and Maurice Wilkins.

Watson, Crick and Wilkins were awarded the Nobel Prize in 1962 for this remarkable achievement.

Rosalind Franklin was not awarded the Nobel Prize due to technical reasons.

The Nobel Committee at that time did not award people posthumously. The award could not be shared by more than three people. As Rosalind Franklin passed away in 1958, the remaining three scientists received the award.

Photos of James Watson, Francis Crick, Rosalind Franklin and Maurice Wilkins. — From left to right: James Watson, Francis Crick, Rosalind Franklin and Maurice Wilkins. They were the key people involved in deducing the structure of DNA.

DNA is polar

The DNA double helix is made of two complementary strands of DNA. These two strands have opposite polarities. The chemical nature of the nucleotides gives these strands polarities. Let me explain.

Recall that the DNA is a poly-nucleotide chain i.e. it is a long chain of nucleotides. Each nucleotide has a sugar and a phosphate group.

The sugar molecule of each nucleotide is bound to the phosphate group of the same nucleotide by an ester bond. This sugar molecule is also bound to the phosphate group of the adjacent nucleotide by another ester bond. Together, these two ester bonds connect adjacent nucleotides through what is called a phosphodiester bond.

Due to this arrangement, at one end of the DNA, a phosphate group is exposed. At the other end of the DNA, a sugar molecule is exposed.

At the end where a phosphate group is exposed, that phosphate is attached to the 5th carbon of the sugar.

At the end where a sugar is exposed, the sugar carries a free hydroxyl group (-OH) on its 3rd carbon.

As a result, every DNA strand has polarity. The end with the exposed phosphate is called the 5´ (5-prime) end. The end with the exposed hydroxyl on the sugar is called the 3´ (3-prime) end.

Illustration of a section of the structure of DNA showing hydrogen bonds between the bases. — The chemistry of base pairing between complementary DNA strands. Note that A and T base pair via 2 hydrogen bonds. G and C base pair via 3 hydrogen bonds. Within each strand of DNA, nucleotides are connected by a phosphodiester bond between two adjacent sugar-phosphate molecules. Adapted from Darryl Leja (NHGRI).

DNA is a double helix

Each DNA molecule contains two complementary DNA strands arranged in the form of a helix.

The two strands are called complementary strands because the DNA sequence on one strand determines the complementary sequence on the other.

In addition, these two strands have opposite polarity. When one strand runs in the 5´ to 3´ direction, the complementary strand runs in the 3´ to 5´ direction. This is the antiparallel nature of DNA.

This complementarity is between the nitrogen bases in nucleotides of each strand. Complementary bases pair with each other with hydrogen bonds to form the double-stranded DNA.

Adenine (A) and Thymine (T) are complementary via two hydrogen bonds
Cytosine (C) and Guanine (G) are complementary via three hydrogen bonds
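The pairing rules and the antiparallel arrangement can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sketch assumes that sequences are written in the 5´ to 3´ direction, which is the usual convention.

```python
# Base-pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'.

    Each base is swapped for its partner. The result is then reversed
    because the two strands run in opposite directions (antiparallel).
    """
    paired = [COMPLEMENT[base] for base in strand_5_to_3.upper()]
    return "".join(reversed(paired))


def hydrogen_bonds(strand: str) -> int:
    """Count hydrogen bonds in the duplex: 2 per A-T pair, 3 per G-C pair."""
    return sum(2 if base in "AT" else 3 for base in strand.upper())


print(complementary_strand("ATGC"))  # GCAT
print(hydrogen_bonds("ATGC"))        # 2 + 2 + 3 + 3 = 10
```

Applying complementary_strand twice gives back the original strand, which is another way of saying that each strand fully determines the other.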

Illustration of the structure of DNA showing the double helix. — Complementary DNA strands in double-stranded DNA twist around each other to form the DNA double helix. This is a simplified illustration.

The double helix is also known as the Watson and Crick model because they described this model in great detail.

There are 10 base-pairs per turn of the DNA double helix.
The distance between adjacent base-pairs is 3.4 Å (angstroms) or 0.34 nm (nanometres).
So, the length of each helix turn, called the pitch, is 34 Å (angstroms) or 3.4 nm (nanometres).
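The pitch follows directly from the two numbers above: 10 base-pairs per turn multiplied by 3.4 Å per base-pair. Here is the same arithmetic as a small Python sketch. The constant and function names are made up for illustration, and the values are those of the simplified model described in this article.

```python
RISE_PER_BASE_PAIR_ANGSTROM = 3.4  # distance between adjacent base-pairs
BASE_PAIRS_PER_TURN = 10           # base-pairs in one full turn of the helix


def helix_length_angstrom(base_pairs: int) -> float:
    """Length of a straight double helix with the given number of base-pairs."""
    return base_pairs * RISE_PER_BASE_PAIR_ANGSTROM


def helix_turns(base_pairs: int) -> float:
    """Number of full turns made by the given number of base-pairs."""
    return base_pairs / BASE_PAIRS_PER_TURN


# The pitch is the length of exactly one turn.
pitch = helix_length_angstrom(BASE_PAIRS_PER_TURN)
print(f"pitch: {pitch:.1f} angstroms = {pitch / 10:.1f} nm")
```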

The idea of complementarity was first proposed by Erwin Chargaff in 1950. Watson and Crick further solidified this idea in 1953.

Illustration of a DNA double helix showing its size. — The DNA double helix has ten base-pairs per turn. This is a simplified illustration. I have highlighted the second strand to make it easy to visualise the two strands.

How is DNA organised in the nucleus of the cell?

DNA from a single cell of your body measures almost 6 feet end to end. The average cell is only between 30 and 50 μm (micrometres) long. So, the DNA within a cell is more than 35,000 times longer than the cell itself. How, then, does the cell fit such a long DNA molecule within its nucleus?
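The "more than 35,000 times" figure can be checked with a quick calculation. The sketch below uses the numbers quoted above (6 feet of DNA, cells of 30 to 50 μm). The function name is made up for illustration, and the result is only as precise as those rounded inputs.

```python
METRES_PER_FOOT = 0.3048
DNA_LENGTH_M = 6 * METRES_PER_FOOT  # roughly 1.83 m of DNA per cell


def dna_to_cell_ratio(cell_length_um: float) -> float:
    """How many times longer the DNA is than a cell of the given length."""
    cell_length_m = cell_length_um * 1e-6  # micrometres to metres
    return DNA_LENGTH_M / cell_length_m


for cell_um in (30, 50):
    print(f"{cell_um} um cell: DNA is about {dna_to_cell_ratio(cell_um):,.0f} times longer")
```

Even for the largest cell size in that range, the ratio stays above 35,000.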

Let’s see how a long DNA molecule is packed and organised inside a cell in our body.

DNA is packed into a compact structure called a chromosome.

First, the DNA strand is wound around special proteins called histones to form nucleosomes. This structure looks like beads on a string.

Nucleosomes fold and organise into chromatin fibres about 30 nm in diameter.

Finally, chromatin fibres undergo further folding with help from scaffolding proteins to form a chromosome.

One copy of the DNA in a human cell is 3.2 billion base-pairs long and is organised into 23 chromosomes. This is called the genome of the cell.

Humans are diploid, meaning there are two copies of every chromosome. One comes from the father and the other from the mother.

So, in total, each cell has 23 + 23 = 46 chromosomes. Each cell therefore has 6.4 billion base pairs of DNA.
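The totals above are simple doubling. The sketch below repeats that arithmetic and, as a rough cross-check, estimates the stretched-out length of the diploid DNA using the 0.34 nm spacing between base-pairs given earlier. The variable names are made up for illustration, and the length is an estimate under those rounded numbers.

```python
HAPLOID_CHROMOSOMES = 23
HAPLOID_BASE_PAIRS = 3_200_000_000  # 3.2 billion base-pairs
RISE_PER_BASE_PAIR_NM = 0.34        # spacing between adjacent base-pairs

# Diploid cells carry two copies: one from each parent.
diploid_chromosomes = 2 * HAPLOID_CHROMOSOMES
diploid_base_pairs = 2 * HAPLOID_BASE_PAIRS

# Stretched-out length of all the DNA in one diploid cell, in metres.
length_m = diploid_base_pairs * RISE_PER_BASE_PAIR_NM * 1e-9

print(diploid_chromosomes, "chromosomes")
print(f"{diploid_base_pairs:,} base-pairs")
print(f"about {length_m:.1f} m of DNA")
```

The estimate comes out at about 2 metres, the same order of magnitude as the roughly 6 feet quoted earlier.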

Therefore, your haploid genome is 23 chromosomes and diploid genome is 46 chromosomes. Many cells in your body contain a diploid genome.

Sperm cells (in males) and oocytes (in females) contain only a haploid genome. When these two cells fuse during fertilisation, they generate a diploid offspring.

Illustration of DNA organisation in the nucleus of a cell — DNA is organised efficiently in the nucleus of a cell. First, double helix DNA is wound around histones to form nucleosomes. Then, nucleosomes fold further to form chromatin fibres. Finally, after further scaffolding, chromatin is organised as chromosomes in the nucleus of the cell.

What is a DNA sequence?

The order of nucleotides in a DNA molecule is called the DNA sequence. All cells in your body harbour the same DNA sequence.

The DNA sequence in your cells is organised into functional units called genes. Genes code for proteins. Simply put, the DNA sequence in each gene is translated into a protein. Let me explain.

DNA sequence is read to make proteins in a two-step process:

First, DNA makes RNA (mRNA or messenger RNA) by a process called transcription.
Second, mRNA makes proteins by a process called translation.

These two steps are considered the central dogma of life.

The central dogma of life defines the flow of genetic information from genes to proteins, via a messenger like RNA. It is consistent in all forms of life.

Let’s step back a little bit and understand a bit more about RNA and proteins.

What is RNA?

RNA stands for ribonucleic acid. It is a molecule similar to DNA with three key variations:

RNA is usually single-stranded whereas DNA is usually double-stranded.
In RNA, the sugar is ribose instead of deoxyribose.
In RNA, the nitrogen base Thymine (T) is replaced by Uracil (U).
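At the level of sequence, the third difference is easy to show in code. The following is a simplified Python sketch of transcription, not a model of the real enzymatic process. The function names are hypothetical. It assumes the usual convention that the mRNA has the same sequence as the coding strand of the gene (the strand that is not used as the template), with U in place of T.

```python
# RNA pairing rules when copying from a DNA template: A in DNA pairs with U in RNA.
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_coding_strand(coding_5_to_3: str) -> str:
    """mRNA (5' to 3') from the coding strand: same letters, T replaced by U."""
    return coding_5_to_3.upper().replace("T", "U")


def transcribe_template_strand(template_3_to_5: str) -> str:
    """mRNA (5' to 3') from the template strand written 3' to 5'.

    Each template base is paired with its RNA partner, one by one.
    """
    return "".join(DNA_TO_RNA_PAIR[base] for base in template_3_to_5.upper())


print(transcribe_coding_strand("ATGGCC"))    # AUGGCC
print(transcribe_template_strand("TACCGG"))  # AUGGCC
```

Both calls describe the same made-up gene fragment from its two complementary strands, so they give the same mRNA.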

Both DNA and RNA are classified as nucleic acids. The process of converting genetic information on genes (DNA) into RNA and proteins is also called the expression of genes.

What is a Protein?

Proteins are long chains of amino acids that fold into 3D shapes and carry out specific functions. There are 20 common amino acids in nature, each with a 3-letter and a 1-letter abbreviation.

Amino Acid	3-letter code	1-letter code
Alanine	Ala	A
Arginine	Arg	R
Asparagine	Asn	N
Aspartic acid	Asp	D
Cysteine	Cys	C
Glutamine	Gln	Q
Glutamic acid	Glu	E
Glycine	Gly	G
Histidine	His	H
Isoleucine	Ile	I
Leucine	Leu	L
Lysine	Lys	K
Methionine	Met	M
Phenylalanine	Phe	F
Proline	Pro	P
Serine	Ser	S
Threonine	Thr	T
Tryptophan	Trp	W
Tyrosine	Tyr	Y
Valine	Val	V
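The table above is effectively a lookup from one notation to the other. As a small illustration, the Python sketch below stores it as a dictionary and converts a protein written in 3-letter codes into 1-letter codes. The function name and the hyphen-separated input format are assumptions made for this example.

```python
# 3-letter code -> 1-letter code, copied from the table above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(protein: str) -> str:
    """Convert e.g. 'Met-Ala-Gly' into 'MAG'."""
    return "".join(THREE_TO_ONE[code] for code in protein.split("-"))


print(to_one_letter("Met-Ala-Gly"))  # MAG
```

The 1-letter form is the one normally used when writing out long protein sequences.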

The activity of proteins manifests as traits. For example, skin colour in humans is due to expression of a set of genes in melanocytes (skin cells).

The Genetic Code

Genes act as code for proteins. A gene with a specific sequence will always make the same protein, no matter which organism. This is called the genetic code, and it is consistent in nearly all forms of life.

Recall that the DNA sequence of genes codes for mRNA (messenger RNA). The mRNA sequence is read as triplets of nucleotides called codons. Each codon containing 3 nucleotides codes for 1 amino acid, the basic unit of protein.

In the genetic code, some codons (usually coding for Methionine) are start codons. They mark the beginning of genes. The codons which mark the end of the gene are called stop codons.
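Reading an mRNA codon by codon can be sketched in Python as follows. This is a simplified illustration, not how the ribosome works in detail: it looks for the first AUG, reads triplets from there, and stops at the first stop codon. The function name and the example mRNA are made up. The lookup table is the standard genetic code, written compactly with "*" standing for a stop codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5' to 3') into a protein in 1-letter codes."""
    mrna = mrna.upper()
    start = mrna.find("AUG")  # start codon, codes for Methionine
    if start == -1:
        return ""
    protein = []
    # Step through the sequence three nucleotides at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: end of the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCCUUUUAAGG"))  # MAF
```

In the example, the codons read are AUG (Met), GCC (Ala), UUU (Phe) and then the stop codon UAA.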

A long chain of amino acids is called a polypeptide. A single or multiple polypeptides fold into 3D structures to become functional proteins.

As there are 4 different nucleotides, a total of 4³ = 64 codons are possible. That’s great as we need to be able to make just 20 amino acids for all the proteins.
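The count of 64 can be verified by simply listing every possible triplet. The short Python sketch below does that, and also shows why triplets are needed: with only 2 nucleotides per codon there would be 4² = 16 combinations, which is fewer than the 20 amino acids. The variable names are made up for illustration.

```python
from itertools import product

RNA_BASES = "UCAG"
NUMBER_OF_AMINO_ACIDS = 20

# Every possible codon of length 3 built from the 4 nucleotides.
codons = ["".join(triplet) for triplet in product(RNA_BASES, repeat=3)]
print(len(codons), "codons, for example:", codons[:4])

# Compare codon lengths 1, 2 and 3 against the 20 amino acids.
for length in (1, 2, 3):
    combinations = len(RNA_BASES) ** length
    enough = combinations >= NUMBER_OF_AMINO_ACIDS
    print(f"length {length}: {combinations} combinations, enough for 20 amino acids: {enough}")
```

Since 64 is more than 20, several codons can code for the same amino acid, and a few are left over to act as stop codons.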

The following is a codon chart for mRNA.

The genetic code illustrated as a wheel to easily view all the 64 codons coding for 20 amino acids and stop codons. — The genetic code codons wheel depicting all the 64 codons, including the stop codons. Note that the codon AUG coding for Methionine is the start codon. This is the starting point of gene expression. Read the chart from inside to outside (5´ to 3´ direction). Adapted by Sampath Amitash Gadi based on an illustration obtained from SigmaAldrich.com.

How is DNA sequence maintained?

DNA sequence of genes has to be accurate for a functional protein to be made. A change in the sequence of DNA is called a mutation. A gene with a mutation may result in the production of a non-functional protein.
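One simple kind of mutation is a single nucleotide being swapped for another. The Python sketch below finds such changes by comparing two sequences of the same length position by position. The function name and both sequences are made up for illustration, and the sketch deliberately ignores other kinds of mutation, such as inserted or deleted nucleotides.

```python
def point_mutations(reference: str, observed: str) -> list:
    """List single-nucleotide changes as (position, reference base, observed base).

    Positions are counted from 1. Only substitutions are handled, so both
    sequences must have the same length.
    """
    if len(reference) != len(observed):
        raise ValueError("sequences must have the same length")
    return [
        (position, ref_base, obs_base)
        for position, (ref_base, obs_base) in enumerate(zip(reference, observed), start=1)
        if ref_base != obs_base
    ]


reference = "ATGGAGCTT"  # made-up reference sequence
observed = "ATGGTGCTT"   # same sequence with one nucleotide changed
print(point_mutations(reference, observed))  # [(5, 'A', 'T')]
```

Here the codon GAG has become GTG, so this single change would put a different amino acid (Valine instead of Glutamic acid) into the protein.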

Loss of protein function may render a cell inefficient or completely disabled. This is the basis of genetic diseases.

Like traits, mutations in genes, and therefore diseases, can also be inherited by children. Understanding the genetics of a person can help to better diagnose genetic diseases and disorders.

But how is the DNA sequence copied from parents into their children? DNA is copied by a process called DNA replication.

What is DNA Replication?

The process of copying the sequence on one DNA strand into a new DNA strand is called DNA replication.

Cells have evolved a very elegant and elaborate mechanism to accurately and faithfully replicate their DNA sequence.

It is a very complex process, and thousands of research papers have been published describing the intricate details of DNA replication. In this overview, I will give you only a bird's-eye view of DNA replication.

Key steps in DNA replication

Step 1 - Unzipping DNA strands

To replicate double-stranded DNA, it first has to be unzipped into two single-stranded DNAs. This is achieved by a special group of proteins called the replicative helicase.

Step 2 - Synthesise new DNA strands

The unzipped single-stranded DNAs act as templates for another special group of proteins, called DNA polymerases, which synthesise a new complementary DNA strand on each of them.

Replicative DNA polymerases can only synthesise a new strand in 5´ to 3´ direction. This means they will always read the template strand in 3´ to 5´ direction.

This means on one strand DNA replication proceeds continuously in the direction of unzipping by the helicase. This is called the leading strand.

On the other strand, DNA replication runs against the direction of unzipping by the helicase. Therefore, it is instead replicated in small pieces called Okazaki fragments. This strand is called the lagging strand.

Because of these differences in replication on the two strands of DNA, replication is termed semi-discontinuous.
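The difference between the two strands can be illustrated with a toy Python model. It is a heavy simplification made for this overview: the helicase is assumed to unzip the DNA from left to right, the fragment size of 3 nucleotides is arbitrary (real Okazaki fragments are far longer), and steps such as joining the fragments together are left out. All names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def replicate(top_5_to_3: str, fragment_size: int = 3):
    """Toy model of semi-discontinuous replication, fork moving left to right.

    Returns the leading strand (one piece) and the Okazaki fragments of the
    lagging strand, every new strand written 5' to 3'.
    """
    top = top_5_to_3.upper()
    # The bottom strand pairs with the top one; read left to right it runs 3' to 5'.
    bottom_3_to_5 = "".join(COMPLEMENT[base] for base in top)

    # Leading strand: the bottom template is read 3' to 5' in the direction
    # of unzipping, so the new strand grows continuously.
    leading = "".join(COMPLEMENT[base] for base in bottom_3_to_5)

    # Lagging strand: the top template must be read right to left, against
    # the fork, so each newly exposed window is copied as a separate fragment.
    fragments = []
    for start in range(0, len(top), fragment_size):
        window = top[start:start + fragment_size]
        fragments.append("".join(COMPLEMENT[base] for base in reversed(window)))
    return leading, fragments


leading, fragments = replicate("ATGGCATTC")
print("leading strand:", leading)
print("Okazaki fragments:", fragments)
# Joining the fragments, newest first, gives the full lagging strand 5' to 3'.
print("lagging strand:", "".join(reversed(fragments)))
```

In this model, the leading strand comes out as one continuous piece, while the lagging strand only becomes complete once its fragments are joined.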

Step 3 - Separate both DNA copies

After the entire DNA sequence is replicated, the new copy of DNA is called the daughter DNA molecule. From 2 copies of DNA per cell (diploid), we are now at 4 copies of DNA (tetraploid).

The newly copied daughter DNA is eventually transferred to a daughter cell, leaving us with two cells from one.

Illustration depicting the key steps in DNA replication. — Semi-discontinuous DNA replication. DNA is first unzipped by helicases to expose single strands. Polymerases use each single strand as templates to synthesise a new double-stranded DNA.

DNA replication ensures that DNA sequence is maintained and inherited through generations. This is the basis of evolution.

Think of it this way. You obtained your DNA from your parents. Your parents obtained their DNA from their parents and so on. In this way, we can go back up to the last universal common ancestor.

Summary

In this article, I have explored the basics of DNA with the intention of providing an overview. I first described what DNA is and what it is made of. Then, I described the structure of DNA, touching upon its properties. Next, I wrote about the functions of DNA and the DNA sequence and how it is used in life. Finally, I touched upon how DNA is maintained and inherited through generations.

This article is intended for readers who are just starting to learn about DNA and genetics. I hope it provides a solid starting point in your quest to understand DNA.

Authors

Sampath Amitash Gadi, Ph.D.

Sampath works as a DNA researcher at the University of Copenhagen. His research deals with how proteins and signaling among proteins help to counter DNA Damage in human cells.