The Science of Existence – Genes


“We do not know what most of our DNA does, nor how, or to what extent it governs traits.” ~ English physicist and chemist Philip Ball

German zoologist Wilhelm Haacke came up with the concept of genes near the end of the 19th century. He imagined that hereditary traits were molecularly self-contained. A gene was originally idealized as a molecular unit of trait heredity.

English evolutionary biologist William Bateson coined the term genetics in 1905, from the Greek word gennō: “to give birth.” Danish botanist Wilhelm Johannsen followed in 1909, using the term genes to define dollops of inheritance information.

These concepts were concocted in anticipation of pursuing the path trod by Gregor Mendel. Researchers set their sights on discovering what they wanted to see rather than exploring what was.

In the 1950s, as molecular biology progressed, a gene was partially redefined as a template for producing a protein. As proteins are the workhorses of cellular life, the fallacious assumption that proteins were associated with heredity lingered, by virtue of their terminological association as genes. Thus, genetics proceeded upon a dogma of woolly wishful thinking.

Simplism in genetics was part and parcel of the mechanistic mindset which has pervaded science since the 17th century, when Descartes formulated the reductionism that materialists have found so appealing.

The facile notion of genes as localized trait heredity units belies the incredibly sophisticated network of knowledge involved in a cell managing its life. Hewing to the simplistic gene paradigm impeded comprehension for decades, and has only loosened in the 21st century, as understanding of epigenetics has dissolved the formulaic concept of genes.

Most of the assumptions that we operate on in molecular biology derive from the initial assumption that most genetic information is transacted by proteins. And while that’s largely true in bacteria, it’s not true for humans. ~ Australian molecular biologist John Mattick

Proteins are the prototypal gene product, but there are also innumerable DNA regions that do not code for protein, but which instead are employed to create useful RNA products. These different RNAs perform a diversity of tasks, including assisting protein synthesis and training, catalyzing biological reactions, cellular communication, and acting instrumentally in gene expression.

Genes are often described as if they are linear sequences, awaiting ready decoding as construction templates for useful products. Nothing could be further from the truth.

The DNA coding schema defies easy characterization because it defies topological comprehension. Further, workable regions vary dynamically, influenced by a variety of factors.

Hemoglobin is the iron-based oxygen transport protein in the red blood cells of vertebrates. Many different proteins go into hemoglobin, as well as other molecular constructions necessary to produce hemoglobin. The instructions for those different components lie on different chromosomes.

The example of hemoglobin highlights not only the distribution of DNA strands, but also the use of nested instructions. The formula for hemoglobin is a set of recipes for distinct components, where each component has its own template (coding sequence).

Genetic material is even more complex. The encoding represented by DNA is nothing more than the score of a symphony that is played by an orchestra, where every player contributes to the overall effect. Lacking a singular conductor, it is an interpretive exercise by a multitude. The way that template data are arrayed and move within cellular space profoundly affects their functioning.

Individual chromosomes occupy distinct territories in the cell nucleus. Where they reside, and what other chromosomes are in the neighborhood, can strongly influence whether the genetic material in a chromosome is active and how it functions.

Operationally, the gene can be defined only as the smallest segment of the gene-string that can be shown to be consistently associated with the occurrence of a specific genetic effect. ~ American geneticist Lewis Stadler in 1954

The definition of gene is non-specific for good reason. A gene is conceptual, not an actual entity: a term for the information encoded within polynucleotides. Genes don’t exist. They are only construals in the minds of geneticists.

20th-century biology was structured according to a linear Newtonian worldview. Molecular biologists were so set about linearity that when the gene came along, they took the gene to be the be-all and end-all of basic biology. That comes out of thinking in terms of particles and linear interactions. ~ Carl Woese

Mapping the notion of genes to reality has meant a constant revision of presumption. Definition of the term gene itself has been a moving target, and its meaning still varies widely. The concept itself is debased in understanding that, however the term is defined, a ‘gene’ is not functionally or structurally delimited. Theoretically, genetics is nothing more than sloppy sophistic philosophy.

The term gene is used as if the recipe and result were synonymous. They are not. Research into epigenetics has shown that treating genes as an adhered-to rulebook is an inapt simplification.

However misrepresentative, the concept of genes is so ubiquitously doctrinal that it is the requisite context for introductory exposition. So we proceed.

Genetic Coding

We may have totally misunderstood the nature of the genomic programming. ~ John Mattick

Canonically, the information encoded in a gene serves as a template for assembling a protein from the amino acid level on up. In other words, the construction of each protein is represented by a unique amino acid sequence, which is specified by the nucleotide sequence of the gene encoding the protein.

The relationship between a nucleotide sequence and the corresponding amino acid sequence represents the genetic code. The code is seldom simply translated, nor is it readily packaged in one place.

The canonical genetic code is assumed to be deeply conserved across all domains of life with very few exceptions. ~ Russian geneticist Natalia Ivanova et al

Microbes pay no mind to the canonical dogma of geneticists. Viruses are especially prone to freely interpret genetic codes to suit themselves; a practice called recoding. The little parasites exploit the knowledge that their host typically follows standard scripting, thereby gaining leverage for their wily manipulations.

Codons & Cistrons

DNA and RNA each have 4 nucleobases. Combining base pairs comes up short in expressing enough variations to encode 20 different amino acids: 4² = 16. But 4³ = 64. So, encoding 20 amino acids requires an arrangement capable of more combinations than base pairs alone can provide.
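The counting argument can be checked by brute force. What follows is a minimal sketch in Python; the helper name `count_words` is invented here for illustration and is not from any library.

```python
from itertools import product

BASES = "ACGT"  # the 4 DNA nucleobases (RNA uses U in place of T)

def count_words(length: int) -> int:
    """Count the distinct base sequences of the given length."""
    return sum(1 for _ in product(BASES, repeat=length))

# Print how many distinct "words" each word length allows.
for length in (1, 2, 3):
    print(length, count_words(length))
```

With 20 amino acids to specify, words of 2 bases fall short, and words of 3 bases are the shortest that suffice, with room to spare.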

Hence, genetic coding is not as simple as matched base pairs on the rung-by-rung DNA ladder. Instead, amino acid templates were originally presumed as specified by nucleotide triplets, termed codons, which run along the length of a DNA ladder; not base pairs.
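As a rough illustration of the canonical reading only, the sketch below translates a short messenger RNA string triplet by triplet using the standard codon table. The function name `translate` and the sample sequence are assumptions made up for this example; real cells, as the surrounding text stresses, seldom read the code this simply.

```python
BASES = "UCAG"
# Standard genetic code, one letter per amino acid; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build all 64 codons in UCAG order and pair each with its amino acid.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(mrna: str) -> str:
    """Read an mRNA string codon by codon until a stop codon or the end."""
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Hypothetical 4-codon message: Met, Ala, Trp, then a stop codon.
print(translate("AUGGCCUGGUAA"))
```

The table holds 64 codons but only 20 amino acids plus stop signals, so several codons map to the same amino acid.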

It was once thought that genes would be codon sequences regularly arranged in some discernible order, and that a single gene was a unit of heredity for a trait. Instead, as little as 10% of human DNA has a known coding function. Genetic instructions turned out to be enormously more complicated than codon sequences.

The term codon, while facilely descriptive of the way that genetic data is stored, came to be considered insufficient. Codons are not the physical equivalent of genetic function that they were once thought to be.

Genetics turned out to be infinitely more intricate than early optimism justified. Sometimes the definition of a word is clarified with further understanding, as it was with nucleic acid. Other times, as in the case of codon, where the original definition turns out to be partly inaccurate, a new word is concocted to cover the deficiency.

Though codon is still commonly used, presumed precision of coding sequence to gene is now termed a cistron. A cistron is a hypothetical localized segment of DNA with all the template information required for producing a single protein. The term cistron emphasizes that a gene provides for a specific trait.

The terms gene and cistron are synonymous; 2 names for a baroque complex for producing bioproducts – particularly proteins – needed by cells. These words speciously congeal what Nature adroitly conceals.

Genetic Variations

A poet can survive everything but a misprint. ~ Irish writer Oscar Wilde

Altering a sequence of nucleotides may change the corresponding amino acid sequence, which in turn may affect the structure or function of a protein encoded by DNA. This is the basis of genetic mutation.
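To make the "may" concrete, here is a hedged sketch that changes one base in a DNA codon and reports whether the encoded amino acid changes. The labels silent, missense, and nonsense are standard textbook terms rather than terms used in this chapter, and `classify_substitution` is a made-up helper name.

```python
BASES = "TCAG"
# Standard genetic code for DNA codons; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base change by its effect on the encoded amino acid."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "silent"    # same amino acid: the protein is unchanged
    if after == "*":
        return "nonsense"  # a stop codon appears: the protein is cut short
    return "missense"      # a different amino acid is substituted

print(classify_substitution("GAG", 2, "A"))  # GAG -> GAA
print(classify_substitution("GAG", 1, "T"))  # GAG -> GTG
print(classify_substitution("GAG", 0, "T"))  # GAG -> TAG
```

The middle case, GAG to GTG (glutamic acid to valine), is the textbook substitution behind sickle-cell hemoglobin: one altered base, one altered amino acid, and a protein that behaves differently.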

A gene’s locus is its position in a genophore or chromosome: the highest level of DNA packaging. Determining the locus for a certain biological trait is termed gene mapping. Genes occasionally move their locus.

Sometimes a gene has a single form. More often there are alleles: genetic variants at the same locus. If the alleles at a locus are the same, they are homozygous; if different, heterozygous.

Different alleles may result in distinct traits, but not necessarily. Alleles may have different levels of influence on genetic expression: equal, or unequal, and unequal to varying degrees.

One allele may be dominant, the other recessive. The dominance hierarchy of alleles is controlled by small RNA (sRNA) molecules that form a regulatory network.

A dominant allele may completely mask a recessive allele. A recessive trait often appears only with homozygous alleles that are recessive.

There are more than 70 known alleles at the gene locus for determining blood type, the ABO locus. This creates a plethora of blood types, some of which are compatible for transfusions between types, others not.

In humans, ~250 different forms of each gene exist. Over half the genes in an individual are unique to that person. Albeit often subtle, this spells a tremendous diversity in personalized proteins and other genetic products.


Our genomes possess an intrinsic level of instability, resulting from the misincorporation of RNA, the chemical sister of DNA. ~ English biochemist Keith Caldecott

While DNA is the universal storage medium for genetic information, cells are awash in RNA, as RNA is employed in decoding the instructions locked away in DNA. There is an operational tension between the two. RNA may mistakenly be incorporated during DNA synthesis. So, there are several quality control and repair processes that ensure DNA integrity.

The notion that the purity of DNA is an intrinsic property of its synthesis is wrong. Effort is required by cells to ensure that genetic material retains its DNA identity. ~ Keith Caldecott

Gregor Mendel

Gregor Mendel (1822–1884) was an Austrian monk with an inquisitive mind and a love of gardening. Between 1856 and 1863, Mendel cultivated and examined 29,000 pea plants, from which he derived laws of genetic heredity, which he exposited in 1866. His paper was understood as being about hybridization, not heredity, and was ignored for over 3 decades until it was rediscovered.

Mendel hypothesized hereditary units, as well as speculating about how inheritance manifested, hybridization, and expression of dominant or recessive characteristics. Mendel’s heredity laws were: 1) the law of segregation, and 2) the law of independent assortment.

While genes are paired in normal cells, they are segregated in sex cells (eggs or sperm), which unite to form a gene pair. The pair express either a dominant or recessive characteristic.

A dominant gene trumps a recessive gene, so it takes 2 recessive genes for a recessive characteristic to be expressed. Whence Mendel’s law of segregation.
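The idealized bookkeeping behind the law of segregation can be sketched as a Punnett square. The example below is a minimal illustration, assuming a single gene with a dominant allele written "A" and a recessive allele written "a"; the function names `cross` and `phenotype` are invented for this sketch.

```python
from collections import Counter
from itertools import product

def cross(parent1: str, parent2: str) -> Counter:
    """Tally offspring genotypes, taking one allele from each parent."""
    return Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))

def phenotype(genotype: str, dominant: str = "A") -> str:
    """One dominant allele suffices to mask a recessive one."""
    return "dominant" if dominant in genotype else "recessive"

# Cross two heterozygous parents and tally the results.
genotypes = cross("Aa", "Aa")
phenotypes = Counter()
for genotype, count in genotypes.items():
    phenotypes[phenotype(genotype)] += count

print(dict(genotypes))
print(dict(phenotypes))
```

Two heterozygous parents yield genotypes in a 1:2:1 ratio and characteristics in a 3:1 ratio, with the recessive characteristic appearing only in the "aa" offspring. This is the nominal case only.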

Mendel’s law of segregation is nominal, and subject to violation. Plants are known to sometimes employ ancestral alleles, not parental genes. This paramutation may occur by inheritance via double-stranded RNA, not DNA.

Mendel’s 2nd law – independent assortment – was that the expression of any one genetic characteristic is not influenced by another. It may have seemed that way for pea plants at first blush, but inheritance is generally much more complex than that.

Genes are often linked. For example, biorhythms determined by an organism’s circadian clock are the product of a gene complex. Many processes are under biorhythmic sway: from organs to tissues to cells. Even the production of ATP in mitochondria oscillates by a molecular clock.


Adaptation is commonly a multidimensional problem, with changes in multiple traits required to match a complex environment. ~ English zoologists M.J. Thompson & C.D. Jiggins

The original quest of geneticists was to comprehend the units of heredity which determined traits – hence the concept of a gene defining a trait with a one-to-one correspondence.

Nature is not so neat. A single gene may influence multiple traits (pleiotropy). Conversely, groups of genes may be inherited together because of close genetic linkage and being functionally related in an evolutionary sense. Such an inherited genetic group is termed a supergene.

Supergenes commonly express variations in traits among a population; an effect achieved by cis-regulatory elements, which are noncoding DNA that regulate transcription of nearby genes. Trait variety may promote population survival when environmental stresses take a toll.

Supergenes and the processes that create them have diverse and far-reaching roles in adaptation and evolution across many groups of organisms. ~ Swedish geneticist Andreas Wallberg

Expression & Regulation

Each cell expresses only a limited amount of its full genetic potential. ~ American biochemist Gordon Tomkins et al

The value of a gene is in its expression: the process of using the genetic information to synthesize a functional protein or other bioproduct.

The conventional view has been that traits manifest on a gene-by-gene basis. Instead, many genes are expressed in groups, and their expressions are affected by innumerable interactions, even contact among DNA strands.

DNA is coiled and tangled like spaghetti inside the cell. So there are many places where the DNA touches and intersects. These interactions could be crucial to how the information in the DNA is read and interpreted by the cell. ~ South African geneticist Marc Weinberg

Regulation is the control of gene expression. There are many ways in which expression may be modified or silenced.

Conventional wisdom holds that modifying a gene to make the encoded protein inactive — ‘knocking out’ the gene — will have more severe effects than merely reducing the gene’s expression level. However, there are many cases in which the opposite occurs. In fact, the knockout of a gene sometimes has no discernible impact, whereas the reduction of expression (knockdown) of the same gene causes major defects. ~ American obstetrician Miles Wilkinson

Sea Lampreys

Sea lampreys are an ancient jawless fish with over a half-billion years in lineage. Their approach to gene-driven development is extremely conservative.

The genic instructions that guide embryonic development produce pluripotent stem cells, which can differentiate into any cell type. To prevent untoward problems, these potent genes are laid aside in lampreys after early development; sealed away so as not to risk their being misexpressed.

Lampreys experience rampant programmed genome rearrangement and losses during early development. The genes are restricted to the germline compartment suggesting a deeper biological strategy to regulate the genome for highly precise, normal functioning. The strategy removes the possibility that the genes will be expressed in deleterious ways. Humans, on the other hand, must contain these genes through other epigenetic mechanisms that are not foolproof. ~ American biologist Chris Amemiya


A genetic script is not always expressed as it was coded. Gene expression is a multiple-step process, with numerous agents that act with some degree of independence in their performance, albeit under chemical guidance from past events. The protein produced by a gene may itself regulate its natal genetic expression.

Gene expression is a more complex process in eukaryotes than earlier-evolved prokaryotes; an evolutionary elaboration granting further adaptive flexibilities, and perhaps something of a compensation for the ready ability of prokaryotes to share and selectively absorb new genetic information via horizontal gene transfer.

Extracting information from DNA and employing it to produce a biological product is an elaborate process, albeit with 2 principal steps: transcription and translation.


The transcription of a gene is fundamental to the parsing of the genetic code and is highly regulated by the cell. ~ American geneticist Gautham Nair & Indian American molecular biologist Arjun Raj

Transcription is the process of producing an RNA copy from a DNA sequence. An RNA polymerase enzyme unwinds a specific strand of DNA determined by a promoter: a special nucleotide sequence that provides a secure binding site for the RNA polymerase.

The RNA polymerase breaks the hydrogen bonds between the complementary nucleotides, separating the 2 strands of DNA. It then adds matching RNA nucleotides that pair with the complementary DNA bases.

An RNA sugar-phosphate backbone is formed, with assistance from the RNA polymerase. The RNA copy – a transcription unit – is complete. The hydrogen bonds of the untwisted RNA + DNA helices break, freeing the transcription unit.
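The base-pairing step of transcription can be sketched in a few lines. This is a simplified illustration, not a model of RNA polymerase: it assumes the template strand is written in the direction the polymerase reads it, so the RNA comes out in the order it is built. The name `transcribe` and the sample strand are made up for the example.

```python
# Each DNA template base pairs with one RNA base (RNA uses U where DNA uses T).
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_strand: str) -> str:
    """Build the RNA copy by pairing each template base with its complement."""
    return "".join(PAIRING[base] for base in template_strand)

# Hypothetical 9-base stretch of template DNA.
print(transcribe("TACCGGACC"))
```

The RNA copy is complementary to the template strand it was read from, so it matches the opposite DNA strand except that U replaces T.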

◊ ◊ ◊

A transcription unit encodes at least 1 gene. If a transcribed gene encodes a protein, the transcription unit is termed messenger RNA (mRNA). Otherwise, the transcription unit may encode various other products: a regulatory RNA (e.g., microRNA), ribosomal RNA, a component used in protein assembly, or a ribozyme.

A central dogma of genetics has long been that the RNA transcription unit is a faithful copy of the DNA master. Geneticists came to this axiom from studying E. coli bacteria, a common gut microbe. They presumed that what worked for a popular prokaryote was a universal genetic truth. Instead, RNA transcription units are often subject to tampering.

Transcription is a highly regulated process, with many decisions made. Various actors in the process communicate in a regulatory network that extends across the transcriptome: all the RNA molecules in a cell or population of cells. Alterations emerge from networked decision-making.

Most genes in the cell are regulated by several transcription factors in a combinatorial fashion, as parts of a complex network. There is also a layer of time-based regulation. ~ Chinese chemist Long Cai

RNA misspellings are common, and they are not random. On average, 20% of RNA copies of human genes contain misspellings. The most common substitution is changing DNA–A to RNA–G. How the misspellings occur is not yet known, nor is their effect understood.

Transcription has other tricks as well. RNA may copy from the strand opposite the one that codes a protein. This too is mysterious.

Protein coding sequences are strongly conserved over evolutionary time. In contrast, changes in transcription binding often factor in speciation. Altering the regulation of transcription is a common avenue for evolutionary adaptation.

◊ ◊ ◊

RNA polymers are compacted and organized in cells to allow protein synthesis. ~ Canadian geneticist Daniel Zenklusen

In prokaryotes, the mRNA created by transcription is ready for translation. The eukaryotic path is more complex.

DNA normally never leaves a eukaryotic cell nucleus. But mRNA can. Messenger RNA carries its amino acid codes out of the nucleus and into the cytoplasm, to a nearby ribosome, which synthesizes proteins from peptide pieces.

In eukaryotes, the mRNA transcription product undergoes a series of modifications prior to translation. Part of this may be an editing process termed RNA splicing, with cutting and pasting segments of exons and introns.


Genes in prokaryotes are continuous DNA strands. In contrast, eukaryotic genes have coding regions (exons) interspersed with noncoding segments (introns).

Introns are nucleotide segments in either DNA or RNA, some of which may be self-splicing: able to extract and insert themselves into gene products. Some introns encode specific proteins; others, functional RNA.

Introns exist in the genomes of bacteria and eukaryotes. Their capabilities are not well understood, but they are known to enhance gene expression. Introns in yeast cells have been found to promote resistance to starvation and promote growth.

DNA coding regions typically comprise several separated exons (coding sequences) that are joined as an RNA transcript. The exons are brought together when the intervening segments (introns) are removed from the precursor RNA by RNA splicing.

Simultaneous encoding of amino acid and regulatory information within exons is a major functional feature of complex genomes. The information architecture of the received genetic code is optimized for superimposition of additional information, and this intrinsic flexibility has been extensively exploited in evolution. ~ American geneticist Andrew Stergachis et al

When proteins are made from intron-containing genes, RNA splicing is part of the RNA processing pathway that follows transcription and precedes translation.
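Splicing can be pictured as cutting out introns and joining what remains. The sketch below is a toy illustration under stated assumptions: the sequences and exon coordinates are invented, the exon boundaries are simply given rather than discovered, and `splice` is a hypothetical helper name.

```python
# A made-up precursor RNA: 3 exons separated by 2 introns.
EXON_1, INTRON_1 = "AUGGCC", "GUAAGUUUUCAG"
EXON_2, INTRON_2 = "UGGGAA", "GUAAGUCCCCAG"
EXON_3 = "CACUAA"
PRE_MRNA = EXON_1 + INTRON_1 + EXON_2 + INTRON_2 + EXON_3

# (start, end) positions of each exon within the precursor RNA.
EXONS = [(0, 6), (18, 24), (36, 42)]

def splice(pre_mrna: str, exons: list) -> str:
    """Join the selected exon segments in order, discarding everything between."""
    return "".join(pre_mrna[start:end] for start, end in exons)

print(splice(PRE_MRNA, EXONS))                  # all 3 exons joined
print(splice(PRE_MRNA, [EXONS[0], EXONS[2]]))   # the middle exon skipped
```

Selecting a different subset of exons from the same precursor gives a different message, which is one way a single gene can yield more than one product.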


Translation is the process by which the genetic information contained in mRNA is used to determine the sequential order of amino acids in a protein. ~ English biochemist Michael Ibba & German biochemist Dieter Söll

During translation, messenger RNA is decoded by a ribosome to produce the intended polypeptide: an interim product that will later be folded into an active protein. Just prior to initiating translation is a key point for regulating gene expression – before a ribosome has committed to the energy-intensive process of synthesizing a polypeptide.

Gene expression during translation may be regulated by affecting the stability of mRNA, or by altering whether and how translation transpires. Besides edits, stifling gene expression, either altogether or to a certain degree, plays a significant role.

Proteins may be modified after translation. Some proteins require modification before becoming active.


Closely related protein isoforms can exhibit functional differences. ~ Russian biochemist Anna Kashina et al

Actin is a family of proteins essential to cell functioning. Actin is found in all eukaryotic cells except roundworm sperm. Actin has equivalent cousins (homologs) in prokaryotes.

Distinct actin variants – isoforms – perform a wide variety of different roles, including maintaining cell shape, cell motility, cell division, cell signaling, and intercellular communication. These isoforms are created by mRNA selecting specific exons or through post-translation modifications.

Humans have 6 actin isoforms. 2 in particular – β-actin and γ-actin – are nearly identical in their structure. (β-actin and γ-actin exhibit only minor differences in 4 amino acids in just 1 region of these proteins. Actin altogether comprises 376 amino acids, folded into a labyrinthian arrangement that defines the functional potentialities of this protein family.) Yet these near-twin proteins carry out distinct roles. In mammals, β-actin is critical to embryogenesis, whereas γ-actin plays a regulatory role in managing the proteins in a cell’s cytoskeleton.

Via epigenetic activity, a single gene can produce multiple proteins. Very minor physical changes can have a profound impact on proteome diversity and the behaviors of proteins.

Conversely, different genes may encode selfsame bioproducts. Despite being nearly identical proteins, β-actin and γ-actin are encoded by separate genes. (6 actin genes are expressed in birds and mammals.) The epigenetic activities in producing β-actin and γ-actin are significantly dissimilar yet yield physical self-similarity.

The parts of genes that we think of as being silent actually encode very key functional information. ~ Anna Kashina

Actin illustrates how labyrinthine genetics is, but also shows that there is an energetic component essential to life at the molecular level. The study of genetics has been confined to physical artifacts – DNA sequences – and associated processes upon those artifacts. Neither genetics nor biochemistry can explain how a slight physical difference in proteins affords very distinct behavioral profiles, as with β-actin and γ-actin. More generally, these sciences have no explanation for how DNA sequences can encode the divergent behavioral paradigms which proteins exhibit. (Geneticists cannot even explain how genes encode the patterns of folding which practically define the behavioral potentialities of proteins.) Knowing about epigenetic tweaks does not demystify how protein personalities exist.

Matter transformations cannot explain coherent energy patterns. The issue becomes completely perplexing when considering how molecules such as proteins can behave intelligently through their production via genetics: how matter can inform knowledge and decision-making ability, which are clearly traits of a mind, not a molecular body.


Cephalopods provide a vivid illustration of how genetics is so much more convoluted than geneticists ever imagined, and that so little is understood.


Octopus, squid, and cuttlefish – the coleoid cephalopods – are surprisingly savvy creatures. Scientists have long wondered how soft-bodied coleoids are so much cleverer than their hard-shelled cousin: the nautilus.

One evolutionary hypothesis is that in losing their protective shell, these short-lived creatures compensated with superior acumen; a hypothesis not far removed from hominins losing physical power and gaining abstract reasoning as recompense. This is a big-picture view with molecular implications.

After transcription (transcribing DNA into RNA), coleoids extensively edit directions for making proteins, particularly those involved in making the cells of intelligence: glia and neurons. All told, ~12% of the protein-building instructions related to brain cells are selectively edited. Coleoids also edit RNA related to other tissues, but not nearly as extensively.

It introduces immense complexity and diversity. ~ Israeli geneticist Eli Eisenberg

Coleoid RNA editing has evolutionary significance. Limiting DNA alterations in favor of RNA editing has slowed evolution in coleoids. 10–26% fewer DNA mutations are found in RNA-edited genes than others.

While gene manipulation in these marine mavens seems correlated with intelligence, there is insufficient evidence to infer causality. A mystery lingers.

Quality Control

The cell places a high priority on ensuring that translation produces proteins that accurately reflect the corresponding genetic information. To this end, quality control can be seen at every step in translation where errors might accumulate. ~ Michael Ibba & Dieter Söll

In complex organisms, hundreds of thousands of different proteins are constantly being produced to replace degraded ones. A lot can go wrong in producing proteins, and regularly does. Preventing putting defective proteins on the job can be critical to health.

Protein production quality control is termed nonsense-mediated mRNA decay (NMD). As suggested by its name, NMD focuses on recognizing defective messenger RNA, and efficiently degrading it so that pathetic proteins are not produced.

Messenger RNAs exist in many different configurations in cells, including a stable closed-loop conformation. ~ Indian geneticist Srivathsan Adivarahan

For quality control, each mRNA carries a specific protein, termed up-frameshift1 (UPF1). UPF1 is normally removed from the mRNA by the ribosome that processes the protein formula the mRNA carries. But if a ribosome finds the mRNA suspicious, it lets UPF1 stick, thus tagging the mRNA as defective. The ribosome then recruits enzymes to break the bad mRNA down.

Quality control is also applied to ribosomes fresh off the assembly line in the cell nucleolus, before they are exported to the cytoplasm for production work. To ensure that a ribosome has been successfully assembled, a protein border guard does not let the ribosome pass until an enzyme acting as export inspector gives the go-ahead.

Prokaryotic Adaptive Immunity

In 1987, Japanese molecular biologist Yoshizumi Ishino noticed an oddity in an E. coli gene he was studying. It had short, repeating sequences of nucleotides, with 2 repeaters having unique sequences (now known as spacers) between them.

It took 2 decades for geneticists to figure out what Ishino’s discovery meant. In 2007, researchers showed that the genic repeaters and spacers served as part of an adaptive immune system, herein called pais (an acronym for prokaryotic adaptive immune system). (The prokaryotic adaptive immune system, encapsulated as pais, has hitherto been awkwardly known as CRISPR/Cas. The gene editing tool called CRISPR/Cas9 is covered at the end of the chapter.) Microbes evolved innumerable such immune systems which work in slightly different ways.

Overall, prokaryotes appear to have evolved a nucleic acid-based “immunity” system. ~ French American geneticist Rodolphe Barrangou et al in 2007

Prokaryotes have ever been plagued by viruses. To remember the experience (if they live through it), they preserve the remnants of encountered viral villains within a DNA profile (a spacer bookended by 2 repeater caps).

Prokaryotes can store information in specific loci in their DNA to remember encounters with invaders (such as bacteriophages – viruses that infect bacteria). ~ Israeli microbiologists Rea Globus & Udi Qimron

Spacers are read by specific enzymes that then cut out any exogenous matching DNA they find, which left untouched would spell an infection.
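The remember-then-recognize logic can be sketched as a small data structure. This is a deliberately crude illustration: exact string matching stands in for the enzymes' reading of a spacer, the spacer length of 8 is arbitrary, and the class and method names (`SpacerArray`, `acquire`, `cut_sites`) are invented for the example.

```python
class SpacerArray:
    """Toy model of a microbe's stored memory of past invaders."""

    def __init__(self, spacer_length: int = 8):
        self.spacer_length = spacer_length
        self.spacers = []

    def acquire(self, viral_dna: str) -> None:
        """Store the leading fragment of an invader's DNA as a new spacer."""
        fragment = viral_dna[: self.spacer_length]
        if fragment not in self.spacers:
            self.spacers.append(fragment)

    def cut_sites(self, incoming_dna: str) -> list:
        """Return every position where a stored spacer matches incoming DNA."""
        sites = []
        for spacer in self.spacers:
            start = incoming_dna.find(spacer)
            while start != -1:
                sites.append(start)
                start = incoming_dna.find(spacer, start + 1)
        return sorted(sites)

memory = SpacerArray()
phage = "ATGCGTACCGGATTA"      # hypothetical viral sequence
memory.acquire(phage)          # the microbe survives and remembers

print(memory.cut_sites("GGG" + phage))        # the same virus returns
print(memory.cut_sites("TTTTCCCCAAAAGGGG"))   # an unfamiliar sequence
```

A returning virus is matched and marked for cutting, while a sequence the microbe has never met passes unrecognized, which is why such memory only pays off when the same viruses recur.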

Pais is powerful, but not all microbes have one. 90% of archaea have a pais, but only 35% of bacteria do.

Pais is useful when microbes encounter enough variety of viruses to make adaptive memory worthwhile. But if there is too much viral variety, or viruses are rapidly adapting, pais won’t help, because a microbe might never encounter the same virus again.

All known microbes that live in super-hot environments have pais, as the environment is a fairly stable ecosystem, with a middling viral diversity, which means pais might help.

No immunity comes without a cost. ~ Israeli microbial geneticist Rotem Sorek

Pais has downsides. Microbes may accidentally make spacers from bits of their own DNA, creating an auto-destruct sequence. This rarely happens, as there are built-in preventative mechanisms against it.

(Incoming viral DNA is linear, facilitating its recognition as foreign. A microbe’s genophore is protected because of its circular form. But should a sequence break off and become linear for too long, such as during a stalled replication process, there is a risk of the DNA being taken as alien and encapsulated as a spacer.)

Viruses can fight back against pais, morphing into unrecognizable forms. Alternately, they may develop counter weaponry.

The bacterium Pseudomonas aeruginosa, which resides in soil and water, and can cause dangerous infections in macrobes, has a vigorous pais. Some viruses are not in the least fazed by it. That’s because those viruses have wily proteins that gum up P. aeruginosa’s pais.

Viral anti-pais measures are so common that geneticists are left wondering how many pais are truly active. There is a tremendous diversity in how vigorously microbes employ their pais as an immune response.

Some E. coli carry a pais that they leave switched off. Why bother? Microbes decide what cellular baggage they keep. They could pitch their pais if it made no difference – thus it must, even if seemingly inactive.

There are many mysteries about pais. For one, spacers should reflect the individual story of the viruses a microbe has encountered. Some do, but most seem generic, and the contents of many remain a conundrum.

Is it the case that there is a huge, unknown amount of viral dark matter in the world? ~ Eugene Koonin

One bacteriophage (a virus that infects bacteria) carries its own pais with it, using it to fight the bacterial defense system that the virus encounters upon infection. The viral pais smartly chops up the segment of bacterial DNA that normally inhibits phage infection.

Beyond the problematic fight against viruses, it’s not always smart for a prokaryote to keep out foreign DNA, which may contain the makings of a useful trait.

Microbes that lack pais are not helpless. Far from it. As much as 10% of the genome of a pais-poor prokaryote may be dedicated to other hawkish defense systems.

Plus, a prokaryote may acquire a pais as conditions warrant. Prokaryotes are prodigious acquirers of environmentally available gene packages, through horizontal gene transfer (HGT). As a form of community altruism, bacteria commonly produce and exude helpful HGT packages for others, as well as picking up on such actionable intel when seeking a solution to their own problems.

Pais may serve as more than just an immune system. Spacers sometimes act to silence genetic expression. By selectively silencing genes, a bacterium may stop making molecules on its surface that are readily detected by a macrobe that the bacterium is intent on infecting. Without a pais in place, the bacterium would blow its cover and be killed.

This is a fairly versatile system that can be used for different things. ~ Russian geneticist Konstantin Severinov


“The genome is a highly sensitive organ of the cell that monitors its activities and corrects common errors, senses unusual and unexpected events, and responds to them, often by restructuring.” ~ American cytogeneticist Barbara McClintock

A genome is (the idea of) the total complement of genes in a cell or organism. If a gene is a recipe, a genome is a recipe book.

Different cell types express different portions of their genome. ~ Gordon Tomkins et al. in 1969

It was long supposed that all cells in an organism had the same genome, as the above quote suggests. But that is not so. Multicellular organisms comprise a population of cells, each with its own personal genome (pergenome). Even cells of the same type have their own pergenome.

Prokaryotes have a flexible genome that can change during a single life cycle. This can happen because prokaryotes can readily pick up new genetic material.

Chromosomal mosaicism – genetic variation among cells – can occur by a variety of means, including errors during chromosome segregation or DNA replication, copying variations, gene rearrangement, single-nucleotide variation, or other instabilities.

Such mutations can occur at any stage of development: in stem cells, in differentiating cells, and in somatic cells (which are nominally terminally differentiated). The genetic makeup of a multicellular organism is multifarious.

Over evolutionary time, all organisms selectively incorporate alien genetic material. Human DNA includes gene packages from at least 8 retroviruses. Some of these viral genes are essential to human reproduction.

◊ ◊ ◊

A genome comes in no particular order. While genome structure is surmised to be significant, it is more likely to have been preserved simply by inertia.

“Intuitively, you wouldn’t believe that just by chance things would be conserved for 500 million years.” ~ French molecular biologist Daniel Chourrout

The number of genes in an organism is a meaningless statistic, especially when comparing organisms in the same kingdom. Some prokaryotes have thousands of genome copies (polyploidy).

For multicellular eukaryotes, only a fraction of a genome is actively employed. Most of a genome is kept as a historical reference: a database of possibilities for the future from the experiences of the past. This legacy information is accessed as needed.

Plants commonly experiment genetically. They may duplicate their genome, with the original serving as a reference, and the copy as a testbed. For instance, 70 million years ago, the tomato triplicated its genome: keeping a preserved master copy and generating 2 spare copies to adaptively mutate. One result was the birth of the potato, a tuber-producing evolutionary offspring.

“Replication is like a mirror that reflects the evolutionary history of living beings: the first genes to be replicated are the oldest, while those that replicate later on are the youngest.” ~ Spanish biologist Alfonso Valencia

In replicating a genome, the most valued, conserved genes are copied first. Newer genes, in evolutionarily active regions, are copied afterwards.

“The regions that replicate late also have a compact and inaccessible structure; they are hidden zones in the genome that act as evolutionary laboratories, where these genes can acquire new functions without affecting essential processes in the organism.” ~ Spanish biologist David de Juan