What's the biggest obligate anaerobic organism discovered till now?

Besides the many anaerobic single-celled organisms, some annelid worms are obligate anaerobes, at least in their early developmental stages. Some other respiration pathways (such as uranium or iron reduction) probably limit the maximum size an organism can reach, because their substrates occur at low concentrations in nature. Given that they yield less energy than oxygen respiration, what size limits do sulfate reduction, sulfur reduction and methanogenesis impose? What is the biggest known organism, extinct or alive, that is an obligate anaerobe?

Three worms have been found in the sediment of the Mediterranean seafloor that not only live without oxygen but also do not tolerate exposure to it. They are metazoans; for more details, see either the report (reference 1) or the original article (reference 2). They reach a size of about 1 mm.


  1. Scientists discover first multicellular life that doesn't need oxygen
  2. The first metazoa living in permanently anoxic conditions


Infectious diseases kill millions of people each year, but the search for treatments is hampered by the fact that laboratory mice are not susceptible to some human viruses, including killers like human immunodeficiency virus (HIV). For decades, researchers have turned to mice whose immune systems have been “humanized” to respond in a manner similar to humans.

Now a team at Princeton University has developed a comprehensive way to evaluate how immune responses of humanized mice measure up to actual humans. The research team looked at the mouse and human immune responses to one of the strongest vaccines known, a yellow fever vaccine called YFV-17D. The comparison of these “vaccine signatures” showed that a humanized mouse newly developed at Princeton shares significant immune-system responses with humans. The study was published in the journal Nature Communications.

“Understanding immune responses to human pathogens and potential vaccines remains challenging due to differences in the way our human immune system responds to stimuli, as compared to for example that of conventional mice, rats or other animals,” said Alexander Ploss, associate professor of molecular biology at Princeton. “Until now a rigorous method for testing the functionality of the human immune system in such a model has been missing. Our study highlights an experimental paradigm to address this gap.”

Humanized mice have been used in infectious disease research since the late 1980s. Yet without rigorous comparisons, researchers know little about how well the mice predict human responses such as the production of infection-fighting cells and antibodies.

To address this issue, researchers exposed the mice to the YFV-17D vaccine, which is made from a weakened, or attenuated, living yellow fever virus. Vaccines protect against future infection by provoking the production of antibodies and immune-system cells.

In previous work, the researchers explored the effect of YFV-17D on conventional humanized mice. But the researchers found that the mice responded only weakly. This led them to develop a mouse with responses that are more similar to those of humans.

To do so, the researchers introduced additional human genes for immune system components — such as molecules that detect foreign invaders and chemical messengers called cytokines — so that the complexity of the engrafted human immune system reflected that of humans. They found that the new mice have responses to YFV-17D that are very similar to the responses seen in humans. For example, the pattern of gene expression that occurs in response to YFV-17D in the mice shared significant similarities to that of humans. This signature gene expression pattern, reflected in the “transcriptome,” or total readout of all of the genes of the organism, translated into better control of the yellow fever virus infection and to immune responses that were more specific to yellow fever.

The researchers also looked at two other types of immune responses: the cellular responses, involving production of cytotoxic T cells and natural killer cells that attack and kill infected cells, and the production of antibodies specific to the virus. By evaluating these three types of responses – transcriptomic, cellular, and antibody – in both mice and humans, the researchers produced a reliable platform for evaluating how well the mice can serve as proxies for humans.

Florian Douam, a postdoctoral research associate and the first author on the study, hopes that the new testing platform will help researchers explore exactly how vaccines induce immunity against pathogens, which in many cases is not well understood.

“Many vaccines have been generated empirically without profound knowledge of how they induce immunity,” Douam said. “The next generation of mouse models, such as the one we introduced in our study, offer unprecedented opportunities for investigating the fundamental mechanisms that define the protective immunity induced by live-attenuated vaccines.”

Mice bearing human cells or human tissues have the potential to aid research on treatments for many diseases that infect humans but not other animals, such as – in addition to HIV – Epstein-Barr virus, human T-cell leukemia virus, and Kaposi sarcoma-associated herpesvirus.

“Our study highlights the importance of human biological signatures for guiding the development of mouse models of disease,” said Ploss. “It also highlights a path toward developing better models for human immune responses.”

The study involved contributions from Florian Douam, Gabriela Hrebikova, Jenna Gaska, Benjamin Winer and Brigitte Heller in Princeton University’s Department of Molecular Biology; Robert Leach, Lance Parsons and Wei Wang in Princeton University’s Lewis Sigler Institute for Integrative Genomics; Bruno Fant at the University of Pennsylvania; Carly G. K. Ziegler and Alex K. Shalek of Massachusetts Institute of Technology and Harvard Medical School; and Alexander Ploss in Princeton University’s Department of Molecular Biology.

The research was supported by the National Institutes of Health (NIH, R01AI079031 and R01AI107301, to A.P.) and an Investigator in Pathogenesis Award by the Burroughs Wellcome Fund (to A.P.). Additionally, A.K.S. was supported by the Searle Scholars Program, the Beckman Young Investigator Program, the NIH (1DP2OD020839, 5U24AI118672, 1U54CA217377, 1R33CA202820, 2U19AI089992, 1R01HL134539, 2RM1HG006193, 2R01HL095791, P01AI039671), and the Bill & Melinda Gates Foundation (OPP1139972). C.G.K.Z. was supported by a grant from the National Institute of General Medical Sciences (NIGMS, T32GM007753). J.M.G. and B.Y.W. were supported by a pre-doctoral training grant from the NIGMS (T32GM007388). B.Y.W. was also a recipient of a pre-doctoral fellowship from the New Jersey Commission on Cancer Research.

By Catherine Zandonella, Office of the Dean for Research



Kid: “Mommy he just offered me drugs.”

Mother: “Oh really? What are drugs?”

Drugs are simply substances taken into the body which can affect our activities and our mental state.

Famous quote, “Physically present, mentally absent”.

There are many types of drugs, classified by their effects on the body and by their forms. These are some of them:

  1. Narcotics
  2. Stimulants/amphetamines
  3. Marijuana
  4. Cocaine/crack
  5. Sedatives/barbiturates
  6. Hallucinogens
  7. Antibiotics

Narcotics are widely used in the medical field as pain relievers. Heroin, opium, codeine and morphine belong to this type, but they are now widely abused for fun. Narcotics are extremely addictive, both physically and psychologically. Continued use of these drugs can lead to liver disease, tetanus, anemia, pneumonia, and other serious effects.

Stimulants such as Dexedrine and methamphetamine (also known as “crystal meth”) increase alertness and physical activity. These drugs are addictive. Amphetamines increase the heart rate as well as the breathing rate. On first taking the drug, users feel restless and anxious, become moody and excitable, and have a false sense of confidence.

Consuming large amounts of this kind of drug can cause a condition called amphetamine psychosis, in which a person experiences hallucinations, delusions, and mental confusion. An overdose of amphetamine may cause cardiac arrhythmias, headaches, convulsions, hypertension, coma and death.

Marijuana smoke contains many of the same cancer-causing agents that are found in tobacco smoke. Even a low dose of marijuana can impair coordination, reaction time, reasoning and judgment, which makes driving under the influence of this drug extremely dangerous. Although a fatal overdose is unlikely, high doses can still be dangerous.


Cocaine is an extremely addictive stimulant. It gives users an intense but very short-lived feeling of euphoria. Because the feeling is short-lived, users typically take the drug again and again to try to recapture that “high”. Physical effects of this drug include increases in blood pressure, heart rate and body temperature. Snorting cocaine can severely damage the nasal membranes over time.


Sedatives are also widely used in the medical field, where they give the patient a feeling of relaxation, but they have been abused by many people as well. Sedatives cause slurred speech and disorientation.


Some examples of hallucinogens are LSD, DMT, mescaline, PCP, and psilocybin. These drugs have very unpredictable effects. The user may experience morbid hallucinations and may feel panicked, confused, or paranoid.


Antibiotics are drugs that are used to treat bacterial infections. They kill the bacteria that cause disease or stop them from multiplying. Some examples of antibiotics are penicillin and amoxicillin. Although these drugs are widely used, they cannot kill viruses, because viruses lack the bacterial structures that antibiotics target.

Last updated
August 7, 2003

Here, GNN posts abstracts of scientific papers on whole genome sequences that have been reported by GenomeNewsNetwork.

Agrobacterium tumefaciens C58; Anabaena sp. strain PCC 7120; Anopheles gambiae PEST; Arabidopsis thaliana
Bacillus anthracis; Bacteroides thetaiotaomicron VPI-5482; Bifidobacterium longum NCC2705; Blochmannia floridanus; Brucella melitensis 16M;
Brucella suis 1330 strain; Buchnera aphidicola [symbiont of Schizaphis graminum (Sg)]; Buchnera sp. APS
Caulobacter crescentus; Chlorobium tepidum TLS; Ciona intestinalis; Clostridium perfringens strain 13; Coxiella burnetii RSA 493
D - E - F
Drosophila melanogaster; Encephalitozoon cuniculi (GB-M1); Enterococcus faecalis; Escherichia coli O157:H7; Fugu rubripes;
Fusobacterium nucleatum strain ATCC 25586
H - L
Halobacterium sp. NRC-1; Homo sapiens; Lactococcus lactis ssp. lactis IL1403
Methanococcoides burtonii; Methanogenium frigidum; Methanopyrus kandleri AV19; Methanosarcina acetivorans C2A; Mycobacterium leprae; Mycoplasma pulmonis UAB CTIP
N - O
Neisseria meningitidis MC58 (serogroup B); Neisseria meningitidis Z2491; Neurospora crassa; Oceanobacillus iheyensis HTE831; Oryza sativa L. ssp. indica;
Oryza sativa L. ssp. japonica
Pasteurella multocida, Pm70; Plasmodium falciparum 3D7; Plasmodium yoelii yoelii; Pseudomonas aeruginosa PAO1; Pseudomonas putida KT2440; Pyrobaculum aerophilum IM2
Ralstonia solanacearum strain GMI1000; Rhodopirellula baltica; Rickettsia conorii strain Malish 7
Salmonella enterica serovar Typhi CT18; Salmonella enterica serovar Typhimurium LT2; Schizosaccharomyces pombe; Shewanella oneidensis MR-1; Shigella flexneri 2a; Strawberry mottle virus;
Streptococcus agalactiae strain 2603 V/R; Streptococcus group A strain MGAS8232; Streptococcus mutans UA159; Streptococcus pneumoniae (serotype 4); Streptococcus pneumoniae strain R6; Streptococcus pyogenes M1;
Streptomyces avermitilis ATCC31267; Streptomyces coelicolor A3(2); Sulfolobus solfataricus P2; Sulfolobus tokodaii strain7
T - V - W
Thermoanaerobacter tengcongensis MB4(T); Thermoplasma acidophilum; Tropheryma whipplei TW08/27; Vibrio cholerae El Tor N16961; Wigglesworthia glossinidia brevipalpis
X - Y
Xanthomonas axonopodis pv. citri (strain 306); Xanthomonas campestris pv. campestris (strain ATCC33913); Xylella fastidiosa 9a5c strain (citrus); Xylella fastidiosa Dixon strain (almond);
Xylella fastidiosa Ann-1 strain (oleander); Yersinia pestis strain CO92

Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely has a nutritional basis: Blochmannia is able to supply nitrogen and sulfur compounds to the host while it takes advantage of the host metabolic machinery. Remarkably, these bacteria lack all known genes involved in replication initiation (dnaA, priA, and recA). The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far. Comparative analysis of the five known genomes from insect endosymbiotic bacteria reveals they share only 313 genes, a number that may be close to the minimum gene set necessary to sustain endosymbiotic life.

Proc Natl Acad Sci U S A. 2003 Aug 5;100(16):9388-93.

We generated draft genome sequences for two cold-adapted Archaea, Methanogenium frigidum and Methanococcoides burtonii, to identify genotypic characteristics that distinguish them from Archaea with a higher optimal growth temperature (OGT). Comparative genomics revealed trends in amino acid and tRNA composition, and structural features of proteins. Proteins from the cold-adapted Archaea are characterized by a higher content of noncharged polar amino acids, particularly Gln and Thr and a lower content of hydrophobic amino acids, particularly Leu. Sequence data from nine methanogen genomes (OGT 15-98 °C) were used to generate 1111 modeled protein structures. Analysis of the models from the cold-adapted Archaea showed a strong tendency in the solvent-accessible area for more Gln, Thr, and hydrophobic residues and fewer charged residues. A cold shock domain (CSD) protein (CspA homolog) was identified in M. frigidum, two hypothetical proteins with CSD-folds in M. burtonii, and a unique winged helix DNA-binding domain protein in M. burtonii. This suggests that these types of nucleic acid binding proteins have a critical role in cold-adapted Archaea. Structural analysis of tRNA sequences from the Archaea indicated that GC content is the major factor influencing tRNA stability in hyperthermophiles, but not in the psychrophiles, mesophiles or moderate thermophiles. Below an OGT of 60 °C, the GC content in tRNA was largely unchanged, indicating that any requirement for flexibility of tRNA in psychrophiles is mediated by other means. This is the first time that comparisons have been performed with genome data from Archaea spanning the growth temperature extremes from psychrophiles to hyperthermophiles.

Genome Res. 2003 Jul;13(7):1580-8.
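
The abstract above relies on GC content as a measure of tRNA stability. As a rough illustration of that quantity only, here is a minimal Python sketch that computes the GC fraction of a sequence. The function name `gc_content` and both example sequences are hypothetical and are not taken from the paper; the authors' actual analysis pipeline is not described here.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA or RNA sequence."""
    seq = sequence.upper()
    # Count only unambiguous bases so that gaps or N's do not dilute the ratio.
    counted = [base for base in seq if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no A, C, G, T or U bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


# Hypothetical toy sequences for illustration, not real tRNA genes.
at_rich = "AUUAGCUAUAUUAAGC"
gc_rich = "GCGGCCGUAGCGCGGC"

print(f"AU-rich example: {gc_content(at_rich):.3f}")
print(f"GC-rich example: {gc_content(gc_rich):.3f}")
```

A higher value means more G-C base pairs, each of which forms three hydrogen bonds rather than two, which is the usual reason GC-rich stems are expected to be more heat stable.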

Pirellula sp. strain 1 ("Rhodopirellula baltica") is a marine representative of the globally distributed and environmentally important bacterial order Planctomycetales. Here we report the complete genome sequence of a member of this independent phylum. With 7.145 megabases, Pirellula sp. strain 1 has the largest circular bacterial genome sequenced so far. The presence of all genes required for heterolactic acid fermentation, key genes for the interconversion of C1 compounds, and 110 sulfatases were unexpected for this aerobic heterotrophic isolate. Although Pirellula sp. strain 1 has a proteinaceous cell wall, remnants of genes for peptidoglycan synthesis were found. Genes for lipid A biosynthesis and homologues to the flagellar L- and P-ring protein indicate a former Gram-negative type of cell wall. Phylogenetic analysis of all relevant markers clearly affiliates the Planctomycetales to the domain Bacteria as a distinct phylum, but a deepest branching is not supported by our analyses.

Proc Natl Acad Sci U S A. 2003 Jul 8;100(14):8298-303.

The 1,995,275-bp genome of Coxiella burnetii, Nine Mile phase I RSA493, a highly virulent zoonotic pathogen and category B bioterrorism agent, was sequenced by the random shotgun method. This bacterium is an obligate intracellular acidophile that is highly adapted for life within the eukaryotic phagolysosome. Genome analysis revealed many genes with potential roles in adhesion, invasion, intracellular trafficking, host-cell modulation, and detoxification. A previously uncharacterized 13-member family of ankyrin repeat-containing proteins is implicated in the pathogenesis of this organism. Although the lifestyle and parasitic strategies of C. burnetii resemble that of Rickettsiae and Chlamydiae, their genome architectures differ considerably in terms of presence of mobile elements, extent of genome reduction, metabolic capabilities, and transporter profiles. The presence of 83 pseudogenes displays an ongoing process of gene degradation. Unlike other obligate intracellular bacteria, 32 insertion sequences are found dispersed in the chromosome, indicating some plasticity in the C. burnetii genome. These analyses suggest that the obligate intracellular lifestyle of C. burnetii may be a relatively recent innovation.

Proc Natl Acad Sci U S A 2003 Apr 29;100(9):5455-60.

Neurospora crassa is a central organism in the history of twentieth-century genetics, biochemistry and molecular biology. Here, we report a high-quality draft sequence of the N. crassa genome. The approximately 40-megabase genome encodes about 10,000 protein-coding genes, more than twice as many as in the fission yeast Schizosaccharomyces pombe and only about 25% fewer than in the fruitfly Drosophila melanogaster. Analysis of the gene set yields insights into unexpected aspects of Neurospora biology including the identification of genes potentially associated with red light photobiology, genes implicated in secondary metabolism, and important differences in Ca(2+) signalling as compared with plants and animals. Neurospora possesses the widest array of genome defence mechanisms known for any eukaryotic organism, including a process unique to fungi called repeat-induced point mutation (RIP). Genome analysis suggests that RIP has had a profound impact on genome evolution, greatly slowing the creation of new genes through genomic duplication and resulting in a genome with an unusually low proportion of closely related genes.

Nature 2003 Apr 24;422(6934):859-68.

The complete genome sequence of Enterococcus faecalis V583, a vancomycin-resistant clinical isolate, revealed that more than a quarter of the genome consists of probable mobile or foreign DNA. One of the predicted mobile elements is a previously unknown vanB vancomycin-resistance conjugative transposon. Three plasmids were identified, including two pheromone-sensing conjugative plasmids, one encoding a previously undescribed pheromone inhibitor. The apparent propensity for the incorporation of mobile elements probably contributed to the rapid acquisition and dissemination of drug resistance in the enterococci.

Science 2003 Mar 28;299(5615):2071-4.

The human gut is colonized with a vast community of indigenous microorganisms that help shape our biology. Here, we present the complete genome sequence of the Gram-negative anaerobe Bacteroides thetaiotaomicron, a dominant member of our normal distal intestinal microbiota. Its 4779-member proteome includes an elaborate apparatus for acquiring and hydrolyzing otherwise indigestible dietary polysaccharides and an associated environment-sensing system consisting of a large repertoire of extracytoplasmic function sigma factors and one- and two-component signal transduction systems. These and other expanded paralogous groups shed light on the molecular mechanisms underlying symbiotic host-bacterial relationships in our intestine.

Science 2003 Mar 28;299(5615):2074-6.

BACKGROUND: Whipple's disease is a rare multisystem chronic infection, involving the intestinal tract as well as various other organs. The causative agent, Tropheryma whipplei, is a Gram-positive bacterium about which little is known. Our aim was to investigate the biology of this organism by generating and analysing the complete DNA sequence of its genome. METHODS: We isolated and propagated T. whipplei strain TW08/27 from the cerebrospinal fluid of a patient diagnosed with Whipple's disease. We generated the complete sequence of the genome by the whole genome shotgun method, and analysed it with a combination of automatic and manual bioinformatic techniques. FINDINGS: Sequencing revealed a condensed 925,938 bp genome with a lack of key biosynthetic pathways and a reduced capacity for energy metabolism. A family of large surface proteins was identified, some associated with large amounts of non-coding repetitive DNA, and an unexpected degree of sequence variation. INTERPRETATION: The genome reduction and lack of metabolic capabilities point to a host-restricted lifestyle for the organism. The sequence variation indicates both known and novel mechanisms for the elaboration and variation of surface structures, and suggests that immune evasion and host interaction play an important part in the lifestyle of this persistent bacterial pathogen.

Lancet 2003 Feb 22;361(9358):637-44.

Pseudomonas putida is a metabolically versatile saprophytic soil bacterium that has been certified as a biosafety host for the cloning of foreign genes. The bacterium also has considerable potential for biotechnological applications. Sequence analysis of the 6.18 Mb genome of strain KT2440 reveals diverse transport and metabolic systems. Although there is a high level of genome conservation with the pathogenic Pseudomonad Pseudomonas aeruginosa (85% of the predicted coding regions are shared), key virulence factors including exotoxin A and type III secretion systems are absent. Analysis of the genome gives insight into the non-pathogenic nature of P. putida and points to potential new applications in agriculture, biocatalysis, bioremediation and bioplastic production.

Environ Microbiol 2002 Dec;4(12):799-808.

The first chordates appear in the fossil record at the time of the Cambrian explosion, nearly 550 million years ago. The modern ascidian tadpole represents a plausible approximation to these ancestral chordates. To illuminate the origins of chordate and vertebrates, we generated a draft of the protein-coding portion of the genome of the most studied ascidian, Ciona intestinalis. The Ciona genome contains approximately 16,000 protein-coding genes, similar to the number in other invertebrates, but only half that found in vertebrates. Vertebrate gene families are typically found in simplified form in Ciona, suggesting that ascidians contain the basic ancestral complement of genes involved in cell signaling and development. The ascidian genome has also acquired a number of lineage-specific innovations, including a group of genes engaged in cellulose metabolism that are related to those in bacteria and fungi.

Science 2002 Dec 13;298(5601):2157-67.

We have sequenced the genome of Shigella flexneri serotype 2a, the most prevalent species and serotype that causes bacillary dysentery or shigellosis in man. The whole genome is composed of a 4 607 203 bp chromosome and a 221 618 bp virulence plasmid, designated pCP301. While the plasmid shows minor divergence from that sequenced in serotype 5a, striking characteristics of the chromosome have been revealed. The S. flexneri chromosome has, astonishingly, 314 IS elements, more than 7-fold over those possessed by its close relatives, the non-pathogenic K12 strain and enterohemorrhagic O157:H7 strain of Escherichia coli. There are 13 translocations and inversions compared with the E. coli sequences, all involve a segment larger than 5 kb, and most are associated with deletions or acquired DNA sequences, of which several are likely to be bacteriophage-transmitted pathogenicity islands. Furthermore, S. flexneri, resembling another human-restricted enteric pathogen, Salmonella typhi, also has hundreds of pseudogenes compared with the E. coli strains. All of these could be subjected to investigations towards novel preventative and treatment strategies against shigellosis.

Nucleic Acids Res 2002 Oct 15;30(20):4432-41.

Streptococcus mutans is the leading cause of dental caries (tooth decay) worldwide and is considered to be the most cariogenic of all of the oral streptococci. The genome of S. mutans UA159, a serotype c strain, has been completely sequenced and is composed of 2,030,936 base pairs. It contains 1,963 ORFs, 63% of which have been assigned putative functions. The genome analysis provides further insight into how S. mutans has adapted to surviving the oral environment through resource acquisition, defense against host factors, and use of gene products that maintain its niche against microbial competitors. S. mutans metabolizes a wide variety of carbohydrates via nonoxidative pathways, and all of these pathways have been identified, along with the associated transport systems whose genes account for almost 15% of the genome. Virulence genes associated with extracellular adherent glucan production, adhesins, acid tolerance, proteases, and putative hemolysins have been identified. Strain UA159 is naturally competent and contains all of the genes essential for competence and quorum sensing. Mobile genetic elements in the form of IS elements and transposons are prominent in the genome and include a previously uncharacterized conjugative transposon and a composite transposon containing genes for the synthesis of antibiotics of the gramicidin/bacitracin family; however, no bacteriophage genomes are present.

Proc Natl Acad Sci U S A 2002 Oct 23 [epub ahead of print].
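
The figures quoted for S. mutans UA159 (2,030,936 base pairs, 1,963 ORFs, 63% with putative functions) can be turned into a few derived quantities. The short Python sketch below does that arithmetic. The variable names are illustrative, and because the 63% figure is already rounded in the abstract, the ORF counts derived from it are approximate.

```python
# Figures quoted in the abstract above.
genome_bp = 2_030_936
orf_count = 1_963
fraction_with_function = 0.63  # rounded percentage from the abstract

# Approximate split of ORFs by annotation status.
orfs_with_function = round(orf_count * fraction_with_function)
orfs_without_function = orf_count - orfs_with_function

# Gene density: ORFs per kilobase, and average genome length per ORF.
orfs_per_kb = orf_count / (genome_bp / 1000)
mean_bp_per_orf = genome_bp / orf_count

print(f"ORFs with putative function:    ~{orfs_with_function}")
print(f"ORFs without assigned function: ~{orfs_without_function}")
print(f"ORFs per kb: {orfs_per_kb:.2f}")
print(f"Genome length per ORF: {mean_bp_per_orf:.0f} bp")
```

Roughly one ORF per kilobase is typical of compact bacterial genomes, so these numbers are in line with what the abstract describes.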

Bifidobacteria are Gram-positive prokaryotes that naturally colonize the human gastrointestinal tract (GIT) and vagina. Although not numerically dominant in the complex intestinal microflora, they are considered as key commensals that promote a healthy GIT. We determined the 2.26-Mb genome sequence of an infant-derived strain of Bifidobacterium longum, and identified 1,730 possible coding sequences organized in a 60%-GC circular chromosome. Bioinformatic analysis revealed several physiological traits that could partially explain the successful adaptation of this bacterium to the colon. An unexpectedly large number of the predicted proteins appeared to be specialized for catabolism of a variety of oligosaccharides, some possibly released by rare or novel glycosyl hydrolases acting on "nondigestible" plant polymers or host-derived glycoproteins and glycoconjugates. This ability to scavenge from a large variety of nutrients likely contributes to the competitiveness and persistence of bifidobacteria in the colon. Many genes for oligosaccharide metabolism were found in self-regulated modules that appear to have arisen in part from gene duplication or horizontal acquisition. Complete pathways for all amino acids, nucleotides, and some key vitamins were identified; however, routes for Asp and Cys were atypical. More importantly, genome analysis provided insights into the reciprocal interactions of bifidobacteria with their hosts. We identified polypeptides that showed homology to most major proteins needed for production of glycoprotein-binding fimbriae, structures that could possibly be important for adhesion and persistence in the GIT. We also found a eukaryotic-type serine protease inhibitor (serpin) possibly involved in the reported immunomodulatory activity of bifidobacteria.

Proc Natl Acad Sci U S A 2002 Oct 15 [epub ahead of print].

Shewanella oneidensis is an important model organism for bioremediation studies because of its diverse respiratory capabilities, conferred in part by multicomponent, branched electron transport systems. Here we report the sequencing of the S. oneidensis genome, which consists of a 4,969,803-base pair circular chromosome with 4,758 predicted protein-encoding open reading frames (CDS) and a 161,613-base pair plasmid with 173 CDSs. We identified the first Shewanella lambda-like phage, providing a potential tool for further genome engineering. Genome analysis revealed 39 c-type cytochromes, including 32 previously unidentified in S. oneidensis, and a novel periplasmic [Fe] hydrogenase, which are integral members of the electron transport system. This genome sequence represents a critical step in the elucidation of the pathways for reduction (and bioremediation) of pollutants such as uranium (U) and chromium (Cr), and offers a starting point for defining this organism's complex electron transport systems and metal ion-reducing capabilities.

Nat Biotechnol 2002 Oct 7 [epub ahead of print].
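
The S. oneidensis abstract reports sizes and CDS counts separately for the chromosome and the plasmid. The Python sketch below totals them and compares coding density across the two replicons. The `Replicon` class and its field names are assumptions made for this illustration; only the four numbers come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replicon:
    """One DNA molecule of the genome (hypothetical helper type)."""
    name: str
    length_bp: int
    cds_count: int

    @property
    def cds_per_kb(self) -> float:
        # Coding density expressed as predicted CDSs per kilobase.
        return self.cds_count / (self.length_bp / 1000)


# Values quoted in the abstract above.
chromosome = Replicon("chromosome", 4_969_803, 4_758)
plasmid = Replicon("plasmid", 161_613, 173)
replicons = [chromosome, plasmid]

total_bp = sum(r.length_bp for r in replicons)
total_cds = sum(r.cds_count for r in replicons)
plasmid_share = plasmid.length_bp / total_bp

for r in replicons:
    print(f"{r.name}: {r.length_bp:,} bp, {r.cds_per_kb:.2f} CDS per kb")
print(f"total: {total_bp:,} bp, {total_cds:,} CDSs")
print(f"plasmid share of genome: {plasmid_share:.1%}")
```

Keeping each replicon as its own record makes it easy to add further plasmids or a second chromosome without changing the summary code.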

Anopheles gambiae is the principal vector of malaria, a disease that afflicts more than 500 million people and causes more than 1 million deaths each year. Tenfold shotgun sequence coverage was obtained from the PEST strain of A. gambiae and assembled into scaffolds that span 278 million base pairs. A total of 91% of the genome was organized in 303 scaffolds; the largest scaffold was 23.1 million base pairs. There was substantial genetic variation within this strain, and the apparent existence of two haplotypes of approximately equal frequency ("dual haplotypes") in a substantial fraction of the genome likely reflects the outbred nature of the PEST strain. The sequence produced a conservative inference of more than 400,000 single-nucleotide polymorphisms that showed a markedly bimodal density distribution. Analysis of the genome sequence revealed strong evidence for about 14,000 protein-encoding transcripts. Prominent expansions in specific families of proteins likely involved in cell adhesion and immunity were noted. An expressed sequence tag analysis of genes regulated by blood feeding provided insights into the physiological adaptations of a hematophagous insect.

Science 2002 Oct 4;298(5591):129-49.

The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date. Genes involved in antigenic variation are concentrated in the subtelomeric regions of the chromosomes. Compared to the genomes of free-living eukaryotic microbes, the genome of this intracellular parasite encodes fewer enzymes and transporters, but a large proportion of genes are devoted to immune evasion and host-parasite interactions. Many nuclear-encoded proteins are targeted to the apicoplast, an organelle involved in fatty-acid and isoprenoid metabolism. The genome sequence provides the foundation for future studies of this organism, and is being exploited in the search for new drugs and vaccines to fight malaria.

Nature 2002 Oct 3;419(6906):498-511.

Species of malaria parasite that infect rodents have long been used as models for malaria disease research. Here we report the whole-genome shotgun sequence of one species, Plasmodium yoelii yoelii, and comparative studies with the genome of the human malaria parasite Plasmodium falciparum clone 3D7. A synteny map of 2,212 P. y. yoelii contiguous DNA sequences (contigs) aligned to 14 P. falciparum chromosomes reveals marked conservation of gene synteny within the body of each chromosome. Of about 5,300 P. falciparum genes, more than 3,300 P. y. yoelii orthologues of predominantly metabolic function were identified. Over 800 copies of a variant antigen gene located in subtelomeric regions were found. This is the first genome sequence of a model eukaryotic parasite, and it provides insight into the use of such systems in the modelling of Plasmodium biology and disease.

Nature 2002 Oct 3;419(6906):512-9.

Oceanobacillus iheyensis HTE831 is an alkaliphilic and extremely halotolerant Bacillus-related species isolated from deep-sea sediment. We present here the complete genome sequence of HTE831 along with analyses of genes required for adaptation to highly alkaline and saline environments. The genome consists of 3.6 Mb, encoding many proteins potentially associated with roles in regulation of intracellular osmotic pressure and pH homeostasis. The candidate genes involved in alkaliphily were determined based on comparative analysis with three Bacillus species and two other Gram-positive species. Comparison with the genomes of other major Gram-positive bacterial species suggests that the backbone of the genus Bacillus is composed of approximately 350 genes. This second genome sequence of an alkaliphilic Bacillus-related species will be useful in understanding life in highly alkaline environments and microbial diversity within the ubiquitous bacilli.

Nucleic Acids Res 2002 Sep 15;30(18):3927-35.

The 3.31-Mb genome sequence of the intracellular pathogen and potential bioterrorism agent, Brucella suis, was determined. Comparison of B. suis with Brucella melitensis has defined a finite set of differences that could be responsible for the differences in virulence and host preference between these organisms, and indicates that phage have played a significant role in their divergence. Analysis of the B. suis genome reveals transport and metabolic capabilities akin to soil/plant-associated bacteria. Extensive gene synteny between B. suis chromosome 1 and the genome of the plant symbiont Mesorhizobium loti emphasizes the similarity between this animal pathogen and plant pathogens and symbionts. A limited repertoire of genes homologous to known bacterial virulence factors were identified.

Proc Natl Acad Sci U S A 2002 Sep 23 [epub ahead of print].

Xylella fastidiosa (Xf) causes wilt disease in plants and is responsible for major economic and crop losses globally. Owing to the public importance of this phytopathogen we embarked on a comparative analysis of the complete genome of Xf pv citrus and the partial genomes of two recently sequenced strains of this species: Xf pv almond and Xf pv oleander, which cause leaf scorch in almond and oleander plants, respectively. We report a reanalysis of the previously sequenced Xf 9a5c (CVC, citrus) strain and the two "gapped" Xf genomes revealing ORFs encoding critical functions in pathogenicity and conjugative transfer. Second, a detailed whole-genome functional comparison was based on the three sequenced Xf strains, identifying the unique genes present in each strain, in addition to those shared between strains. Third, an "in silico" cellular reconstruction of these organisms was made, based on a comparison of their core functional subsystems that led to a characterization of their conjugative transfer machinery, identification of potential differences in their adhesion mechanisms, and highlighting of the absence of a classical quorum-sensing mechanism. This study demonstrates the effectiveness of comparative analysis strategies in the interpretation of genomes that are closely related.

Proc Natl Acad Sci U S A 2002 Sep 17;99(19):12403-12408.

The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the other completely sequenced genomes identified genes specific to the streptococci and to S. agalactiae. These in silico analyses, combined with comparative genome hybridization experiments between the sequenced serotype V strain 2603 V/R and 19 S. agalactiae strains from several serotypes using whole-genome microarrays, revealed the genetic heterogeneity among S. agalactiae strains, even of the same serotype, and provided insights into the evolution of virulence mechanisms.

Proc Natl Acad Sci U S A 2002 Sep 17;99(19):12391-12396.

Many insects that rely on a single food source throughout their developmental cycle harbor beneficial microbes that provide nutrients absent from their restricted diet. Tsetse flies, the vectors of African trypanosomes, feed exclusively on blood and rely on one such intracellular microbe for nutritional provisioning and fecundity. As a result of co-evolution with hosts over millions of years, these mutualists have lost the ability to survive outside the sheltered environment of their host insect cells. We present the complete annotated genome of Wigglesworthia glossinidia brevipalpis, which is composed of one chromosome of 697,724 base pairs (bp) and one small plasmid, called pWig1, of 5,200 bp. Genes involved in the biosynthesis of vitamin metabolites, apparently essential for host nutrition and fecundity, have been retained. Unexpectedly, this obligate's genome bears hallmarks of both parasitic and free-living microbes, and the gene encoding the important regulatory protein DnaA is absent.

Nat Genet 2002 Sep 3 [epub ahead of print].

The compact genome of Fugu rubripes has been sequenced to over 95% coverage, and more than 80% of the assembly is in multigene-sized scaffolds. In this 365-megabase vertebrate genome, repetitive DNA accounts for less than one-sixth of the sequence, and gene loci occupy about one-third of the genome. As with the human genome, gene loci are not evenly distributed, but are clustered into sparse and dense regions. Some "giant" genes were observed that had average coding sequence sizes but were spread over genomic lengths significantly larger than those of their human orthologs. Although three-quarters of predicted human proteins have a strong match to Fugu, approximately a quarter of the human proteins had highly diverged from or had no pufferfish homologs, highlighting the extent of protein evolution in the 450 million years since teleosts and mammals diverged. Conserved linkages between Fugu and human genes indicate the preservation of chromosomal segments from the common vertebrate ancestor, but with considerable scrambling of gene order.

Science 2002 Jul 25 [epub ahead of print].

Comparison of two fully sequenced genomes of Buchnera aphidicola, the obligate endosymbionts of aphids, reveals the most extreme genome stability to date: no chromosome rearrangements or gene acquisitions have occurred in the past 50 to 70 million years, despite substantial sequence evolution and the inactivation and loss of individual genes. In contrast, the genomes of their closest free-living relatives, Escherichia coli and Salmonella spp., are more than 2000-fold more labile in content and gene order. The genomic stasis of B. aphidicola, likely attributable to the loss of phages, repeated sequences, and recA, indicates that B. aphidicola is no longer a source of ecological innovation for its hosts.

Science 2002 Jun 28;296(5577):2376-9.

The complete genome of the green-sulfur eubacterium Chlorobium tepidum TLS was determined to be a single circular chromosome of 2,154,946 bp. This represents the first genome sequence from the phylum Chlorobia, whose members perform anoxygenic photosynthesis by the reductive tricarboxylic acid cycle. Genome comparisons have identified genes in C. tepidum that are highly conserved among photosynthetic species. Many of these have no assigned function and may play novel roles in photosynthesis or photobiology. Phylogenomic analysis reveals likely duplications of genes involved in biosynthetic pathways for photosynthesis and the metabolism of sulfur and nitrogen as well as strong similarities between metabolic processes in C. tepidum and many Archaeal species.

Proc Natl Acad Sci U S A 2002 Jul 9;99(14):9509-14.

The genus Xanthomonas is a diverse and economically important group of bacterial phytopathogens, belonging to the gamma-subdivision of the Proteobacteria. Xanthomonas axonopodis pv. citri (Xac) causes citrus canker, which affects most commercial citrus cultivars, resulting in significant losses worldwide. Symptoms include canker lesions, leading to abscission of fruit and leaves and general tree decline. Xanthomonas campestris pv. campestris (Xcc) causes black rot, which affects crucifers such as Brassica and Arabidopsis. Symptoms include marginal leaf chlorosis and darkening of vascular tissue, accompanied by extensive wilting and necrosis. Xanthomonas campestris pv. campestris is grown commercially to produce the exopolysaccharide xanthan gum, which is used as a viscosifying and stabilizing agent in many industries. Here we report and compare the complete genome sequences of Xac and Xcc. Their distinct disease phenotypes and host ranges belie a high degree of similarity at the genomic level. More than 80% of genes are shared, and gene order is conserved along most of their respective chromosomes. We identified several groups of strain-specific genes, and on the basis of these groups we propose mechanisms that may explain the differing host specificities and pathogenic processes.

Nature 2002 May 23;417(6887):459-63.

Thermoanaerobacter tengcongensis is a rod-shaped, gram-negative, anaerobic eubacterium that was isolated from a freshwater hot spring in Tengchong, China. Using a whole-genome-shotgun method, we sequenced its 2,689,445-bp genome from an isolate, MB4(T) (Genbank accession no. AE008691). The genome encodes 2588 predicted coding sequences (CDS). Among them, 1764 (68.2%) are classified according to homology to other documented proteins, and the rest, 824 CDS (31.8%), are functionally unknown. One of the interesting features of the T. tengcongensis genome is that 86.7% of its genes are encoded on the leading strand of DNA replication. Based on protein sequence similarity, the T. tengcongensis genome is most similar to that of Bacillus halodurans, a mesophilic eubacterium, among all fully sequenced prokaryotic genomes up to date. Computational analysis on genes involved in basic metabolic pathways supports the experimental discovery that T. tengcongensis metabolizes sugars as principal energy and carbon source and utilizes thiosulfate and element sulfur, but not sulfate, as electron acceptors. T. tengcongensis, as a gram-negative rod by empirical definitions (such as staining), shares many genes that are characteristics of gram-positive bacteria whereas it is missing molecular components unique to gram-negative bacteria. A strong correlation between the G + C content of tDNA and rDNA genes and the optimal growth temperature is found among the sequenced thermophiles. It is concluded that thermophiles are a biologically and phylogenetically divergent group of prokaryotes that have converged to sustain extreme environmental conditions over evolutionary timescale.

Genome Res 2002 May;12(5):689-700.
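
The split between classified and functionally unknown coding sequences reported for T. tengcongensis can be checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are taken from the abstract, while the function and variable names are hypothetical.

```python
# Counts taken from the T. tengcongensis abstract; all names are illustrative.
TOTAL_CDS = 2588
CLASSIFIED = 1764
UNKNOWN = 824


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


classified_pct = percent(CLASSIFIED, TOTAL_CDS)
unknown_pct = percent(UNKNOWN, TOTAL_CDS)

# The two categories should account for every predicted CDS.
assert CLASSIFIED + UNKNOWN == TOTAL_CDS
print(classified_pct, unknown_pct)
```

The same helper could be applied to any of the other annotation breakdowns in this list, provided the category counts are mutually exclusive.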

Comparison of the whole-genome sequence of Bacillus anthracis isolated from a victim of a recent bioterrorist anthrax attack with a reference reveals 60 new markers that include single nucleotide polymorphisms, indels and tandem repeats. Genome comparison detected four high-quality SNPs between the two sequenced B. anthracis chromosomes and seven differences between different preparations of the reference genome. These markers have been tested on a collection of anthrax isolates, and were found to divide these samples into distinct families. These results demonstrate that genome-based analysis of microbial pathogens will provide a powerful new tool for investigation of infectious disease outbreaks.

Science 2002 May 8 [epub ahead of print].

Streptomyces coelicolor is a representative of the group of soil-dwelling, filamentous bacteria responsible for producing most natural antibiotics used in human and veterinary medicine. Here we report the 8,667,507 base pair linear chromosome of this organism, containing the largest number of genes so far discovered in a bacterium. The 7,825 predicted genes include more than 20 clusters coding for known or predicted secondary metabolites. The genome contains an unprecedented proportion of regulatory genes, predominantly those likely to be involved in responses to external stimuli and stresses, and many duplicated gene sets that may represent 'tissue-specific' isoforms operating in different phases of colonial development, a unique situation for a bacterium. An ancient synteny was revealed between the central 'core' of the chromosome and the whole chromosome of pathogens Mycobacterium tuberculosis and Corynebacterium diphtheriae. The genome sequence will greatly increase our understanding of microbial life in the soil as well as aiding the generation of new drug candidates by genetic engineering.

Nature 2002 May 9;417(6885):141-147.

Methanogenesis, the biological production of methane, plays a pivotal role in the global carbon cycle and contributes significantly to global warming. The majority of methane in nature is derived from acetate. Here we report the complete genome sequence of an acetate-utilizing methanogen, Methanosarcina acetivorans C2A. Methanosarcineae are the most metabolically diverse methanogens, thrive in a broad range of environments, and are unique among the Archaea in forming complex multicellular structures. This diversity is reflected in the genome of M. acetivorans. At 5,751,492 base pairs it is by far the largest known archaeal genome. The 4524 open reading frames code for a strikingly wide and unanticipated variety of metabolic and cellular capabilities. The presence of novel methyltransferases indicates the likelihood of undiscovered natural energy sources for methanogenesis, whereas the presence of single-subunit carbon monoxide dehydrogenases raises the possibility of nonmethanogenic growth. Although motility has not been observed in any Methanosarcineae, a flagellin gene cluster and two complete chemotaxis gene clusters were identified. The availability of genetic methods, coupled with its physiological and metabolic diversity, makes M. acetivorans a powerful model organism for the study of archaeal biology. [Sequence, data, annotations, and analyses are available at The sequence data described in this paper have been submitted to the GenBank data library under accession no. AE010299.]

Genome Res 2002 Apr;12(4):532-42.

We have determined the complete 1,694,969-nt sequence of the GC-rich genome of Methanopyrus kandleri by using a whole direct genome sequencing approach. This approach is based on unlinking of genomic DNA with the ThermoFidelase version of M. kandleri topoisomerase V and cycle sequencing directed by 2'-modified oligonucleotides (Fimers). Sequencing redundancy (3.3x) was sufficient to assemble the genome with less than one error per 40 kb. Using a combination of sequence database searches and coding potential prediction, 1,692 protein-coding genes and 39 genes for structural RNAs were identified. M. kandleri proteins show an unusually high content of negatively charged amino acids, which might be an adaptation to the high intracellular salinity. Previous phylogenetic analysis of 16S RNA suggested that M. kandleri belonged to a very deep branch, close to the root of the archaeal tree. However, genome comparisons indicate that, in both trees constructed using concatenated alignments of ribosomal proteins and trees based on gene content, M. kandleri consistently groups with other archaeal methanogens. M. kandleri shares the set of genes implicated in methanogenesis and, in part, its operon organization with Methanococcus jannaschii and Methanothermobacter thermoautotrophicum. These findings indicate that archaeal methanogens are monophyletic. A distinctive feature of M. kandleri is the paucity of proteins involved in signaling and regulation of gene expression. Also, M. kandleri appears to have fewer genes acquired via lateral transfer than other archaea. These features might reflect the extreme habitat of this organism.

Proc Natl Acad Sci U S A 2002 Apr 2;99(7):4644-4649.

We have produced a draft sequence of the rice genome for the most widely cultivated subspecies in China, Oryza sativa L. ssp. indica, by whole-genome shotgun sequencing. The genome was 466 megabases in size, with an estimated 46,022 to 55,615 genes. Functional coverage in the assembled sequences was 92.0%. About 42.2% of the genome was in exact 20-nucleotide oligomer repeats, and most of the transposons were in the intergenic regions between genes. Although 80.6% of predicted Arabidopsis thaliana genes had a homolog in rice, only 49.4% of predicted rice genes had a homolog in A. thaliana. The large proportion of rice genes with no recognizable homologs is due to a gradient in the GC content of rice coding sequences.

Science 2002 Apr 5;296(5565):79-92.

The genome of the japonica subspecies of rice, an important cereal and model monocot, was sequenced and assembled by whole-genome shotgun sequencing. The assembled sequence covers 93% of the 420-megabase genome. Gene predictions on the assembled sequence suggest that the genome contains 32,000 to 50,000 genes. Homologs of 98% of the known maize, wheat, and barley proteins are found in rice. Synteny and gene homology between rice and the other cereal genomes are extensive, whereas synteny with Arabidopsis is limited. Assignment of candidate rice orthologs to Arabidopsis genes is possible in many cases. The rice genome sequence provides a foundation for the improvement of cereals, our most important crops.

Science 2002 Apr 5;296(5565):92-100.

Acute rheumatic fever (ARF), a sequela of group A Streptococcus (GAS) infection, is the most common cause of preventable childhood heart disease worldwide. The molecular basis of ARF and the subsequent rheumatic heart disease are poorly understood. Serotype M18 GAS strains have been associated for decades with ARF outbreaks in the U.S. As a first step toward gaining new insight into ARF pathogenesis, we sequenced the genome of strain MGAS8232, a serotype M18 organism isolated from a patient with ARF. The genome is a circular chromosome of 1,895,017 bp, and it shares 1.7 Mb of closely related genetic material with strain SF370 (a sequenced serotype M1 strain). Strain MGAS8232 has 178 ORFs absent in SF370. Phages, phage-like elements, and insertion sequences are the major sources of variation between the genomes. The genomes of strain MGAS8232 and SF370 encode many of the same proven or putative virulence factors. Importantly, strain MGAS8232 has genes encoding many additional secreted proteins involved in human-GAS interactions, including streptococcal pyrogenic exotoxin A (scarlet fever toxin) and two uncharacterized pyrogenic exotoxin homologues, all phage-associated. DNA microarray analysis of 36 serotype M18 strains from diverse localities showed that most regions of variation were phages or phage-like elements. Two epidemics of ARF occurring 12 years apart in Salt Lake City, UT, were caused by serotype M18 strains that were genetically identical, or nearly so. Our analysis provides a critical foundation for accelerated research into ARF pathogenesis and a molecular framework to study the plasticity of GAS genomes.

Proc Natl Acad Sci U S A 2002 Apr 2;99(7):4668-4673.

We present a complete DNA sequence and metabolic analysis of the dominant oral bacterium Fusobacterium nucleatum. Although not considered a major dental pathogen on its own, this anaerobe facilitates the aggregation and establishment of several other species including the dental pathogens Porphyromonas gingivalis and Bacteroides forsythus. The F. nucleatum strain ATCC 25586 genome was assembled from shotgun sequences and analyzed using the ERGO bioinformatics suite. The genome contains 2.17 Mb encoding 2,067 open reading frames, organized on a single circular chromosome with 27% GC content. Despite its taxonomic position among the gram-negative bacteria, several features of its core metabolism are similar to that of gram-positive Clostridium spp., Enterococcus spp., and Lactococcus spp. The genome analysis has revealed several key aspects of the pathways of organic acid, amino acid, carbohydrate, and lipid metabolism. Nine very-high-molecular-weight outer membrane proteins are predicted from the sequence, none of which has been reported in the literature. More than 137 transporters for the uptake of a variety of substrates such as peptides, sugars, metal ions, and cofactors have been identified. Biosynthetic pathways exist for only three amino acids: glutamate, aspartate, and asparagine. The remaining amino acids are imported as such or as di- or oligopeptides that are subsequently degraded in the cytoplasm. A principal source of energy appears to be the fermentation of glutamate to butyrate. Additionally, desulfuration of cysteine and methionine yields ammonia, H(2)S, methyl mercaptan, and butyrate, which are capable of arresting fibroblast growth, thus preventing wound healing and aiding penetration of the gingival epithelium. The metabolic capabilities of F. nucleatum revealed by its genome are therefore consistent with its specialized niche in the mouth.

J Bacteriol 2002 Apr;184(7):2005-18.

An isolate of Strawberry mottle virus (SMoV) was transferred from Fragaria vesca to Nicotiana occidentalis and Chenopodium quinoa by mechanical inoculation. Electron micrographs of infected tissues showed the presence of isometric particles of approximately 28 nm in diameter. SMoV-associated tubular structures were also conspicuous, particularly in the plasmodesmata of C. quinoa. DsRNA extraction of SMoV-infected N. occidentalis yielded two bands of 6.3 and 7.8 kbp which were cloned and sequenced. Gaps in the sequence, including the 5' and 3' ends, were filled using RT-PCR and RACE. The genome of SMoV was found to consist of RNA1 and RNA2 of 7036 and 5619 nt, respectively, excluding a poly(A) tail. Each RNA encodes one polyprotein and has a 3' non-coding region of approximately 1150 nt. The polyprotein of RNA1 contains regions with identities to helicase, viral genome-linked protein, protease and polymerase (RdRp), and shares its closest similarity with RNA1 of the tentative nepovirus Satsuma dwarf virus (SDV). The polyprotein of RNA2 displayed some similarity to the large coat protein domain of SDV and related viruses. Phylogenetic analysis of the RdRp region showed that SMoV falls into a separate group containing SDV, Apple latent spherical virus, Naval orange infectious mottling virus and Rice tungro spherical virus. Given the size of RNA2 and the presence of a long 3' non-coding region, SMoV is more typical of a nepovirus, although atypically for a nepovirus it is aphid transmissible. We propose that SMoV is a tentative member of an SDV-like lineage of picorna-like viruses.

J Gen Virol 2002 Jan;83(Pt 1):229-39.

We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended control regions. Some 43% of the genes contain introns, of which there are 4,730. Fifty genes have significant similarity with human disease genes; half of these are cancer related. We identify highly conserved genes important for eukaryotic cell organization including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing. These genes may have originated with the appearance of eukaryotic life. Few similarly conserved genes that are important for multicellular organization were identified, suggesting that the transition from prokaryotes to eukaryotes required more new genes than did the transition from unicellular to multicellular organization.

Nature 2002 Feb 21;415(6874):871-880.

Ralstonia solanacearum is a devastating, soil-borne plant pathogen with a global distribution and an unusually wide host range. It is a model system for the dissection of molecular determinants governing pathogenicity. We present here the complete genome sequence and its analysis of strain GMI1000. The 5.8-megabase (Mb) genome is organized into two replicons: a 3.7-Mb chromosome and a 2.1-Mb megaplasmid. Both replicons have a mosaic structure providing evidence for the acquisition of genes through horizontal gene transfer. Regions containing genetically mobile elements associated with the percentage of G+C bias may have an important function in genome evolution. The genome encodes many proteins potentially associated with a role in pathogenicity. In particular, many putative attachment factors were identified. The complete repertoire of type III secreted effector proteins can be studied. Over 40 candidates were identified. Comparison with other genomes suggests that bacterial plant pathogens and animal pathogens harbour distinct arrays of specialized type III-dependent effectors.

Nature 2002 Jan 31;415(6871):497-502.

Clostridium perfringens is a Gram-positive anaerobic spore-forming bacterium that causes life-threatening gas gangrene and mild enterotoxaemia in humans, although it colonizes as normal intestinal flora of humans and animals. The organism is known to produce a variety of toxins and enzymes that are responsible for the severe myonecrotic lesions. Here we report the complete 3,031,430-bp sequence of C. perfringens strain 13 that comprises 2,660 protein coding regions and 10 rRNA genes, showing pronounced low overall G + C content (28.6%). The genome contains typical anaerobic fermentation enzymes leading to gas production but no enzymes for the tricarboxylic acid cycle or respiratory chain. Various saccharolytic enzymes were found, but many enzymes for amino acid biosynthesis were lacking in the genome. Twenty genes were newly identified as putative virulence factors of C. perfringens, and we found a total of five hyaluronidase genes that will also contribute to virulence. The genome analysis also proved an efficient method for finding four members of the two-component VirR/VirS regulon that coordinately regulates the pathogenicity of C. perfringens. Clearly, C. perfringens obtains various essential materials from the host by producing several degradative enzymes and toxins, resulting in massive destruction of the host tissues.

Proc Natl Acad Sci U S A 2002 Jan 15 [epub ahead of print]

We determined and annotated the complete 2.2-megabase genome sequence of Pyrobaculum aerophilum, a facultatively aerobic nitrate-reducing hyperthermophilic (Topt = 100°C) crenarchaeon. Clues were found suggesting explanations of the organism's surprising intolerance to sulfur, which may aid in the development of methods for genetic studies of the organism. Many interesting features worthy of further genetic studies were revealed. Whole genome computational analysis confirmed experiments showing that P. aerophilum (and perhaps all crenarchaea) lack 5' untranslated regions in their mRNAs and thus appear not to use a ribosome-binding site (Shine-Dalgarno)-based mechanism for translation initiation at the 5' end of transcripts. Inspection of the lengths and distribution of mononucleotide repeat-tracts revealed some interesting features. For instance, it was seen that mononucleotide repeat-tracts of Gs (or Cs) are highly unstable, a pattern expected for an organism deficient in mismatch repair. This result, together with an independent study on mutation rates, suggests a "mutator" phenotype.

Proc Natl Acad Sci U S A 2002 Jan 15 [epub ahead of print]

The nucleotide sequence of the entire genome of a filamentous cyanobacterium, Anabaena sp. strain PCC 7120, was determined. The genome of Anabaena consisted of a single chromosome (6,413,771 bp) and six plasmids, designated pCC7120alpha (408,101 bp), pCC7120beta (186,614 bp), pCC7120gamma (101,965 bp), pCC7120delta (55,414 bp), pCC7120epsilon (40,340 bp), and pCC7120zeta (5,584 bp). The chromosome bears 5368 potential protein-encoding genes, four sets of rRNA genes, 48 tRNA genes representing 42 tRNA species, and 4 genes for small structural RNAs. The predicted products of 45% of the potential protein-encoding genes showed sequence similarity to known and predicted proteins of known function, and 27% to translated products of hypothetical genes. The remaining 28% lacked significant similarity to genes for known and predicted proteins in the public DNA databases. More than 60 genes involved in various processes of heterocyst formation and nitrogen fixation were assigned to the chromosome based on their similarity to the reported genes. One hundred and ninety-five genes coding for components of two-component signal transduction systems, nearly 2.5 times as many as those in Synechocystis sp. PCC 6803, were identified on the chromosome. Only 37% of the Anabaena genes showed significant sequence similarity to those of Synechocystis, indicating a high degree of divergence of the gene information between the two cyanobacterial strains.

DNA Res 2001 Oct 31;8(5):205-13.
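
The Anabaena abstract gives the chromosome and six plasmids separately but not their combined size. A minimal sketch for totalling them follows; the replicon sizes are copied from the abstract, and the dictionary layout and variable names are assumptions made for illustration.

```python
# Replicon sizes (bp) as listed in the Anabaena sp. PCC 7120 abstract.
# The dictionary layout and variable names are illustrative, not from the paper.
replicons = {
    "chromosome": 6_413_771,
    "pCC7120alpha": 408_101,
    "pCC7120beta": 186_614,
    "pCC7120gamma": 101_965,
    "pCC7120delta": 55_414,
    "pCC7120epsilon": 40_340,
    "pCC7120zeta": 5_584,
}

# Total carried on plasmids only, then the whole genome.
plasmid_total = sum(
    size for name, size in replicons.items() if name != "chromosome"
)
genome_total = sum(replicons.values())

print(f"plasmids: {plasmid_total:,} bp")
print(f"genome:   {genome_total:,} bp")
```

On these figures the plasmids contribute roughly 0.8 Mb of a genome of about 7.2 Mb.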

Brucella melitensis is a facultative intracellular bacterial pathogen that causes abortion in goats and sheep and Malta fever in humans. The genome of B. melitensis strain 16M was sequenced and found to contain 3,294,935 bp distributed over two circular chromosomes of 2,117,144 bp and 1,177,787 bp encoding 3,197 ORFs. By using the bioinformatics suite ERGO, 2,487 (78%) ORFs were assigned functions. The origins of replication of the two chromosomes are similar to those of other α-proteobacteria. Housekeeping genes, including those involved in DNA replication, transcription, translation, core metabolism, and cell wall biosynthesis, are distributed on both chromosomes. Type I, II, and III secretion systems are absent, but genes encoding sec-dependent, sec-independent, and flagella-specific type III, type IV, and type V secretion systems as well as adhesins, invasins, and hemolysins were identified. Several features of the B. melitensis genome are similar to those of the symbiotic Sinorhizobium meliloti.

Proc Natl Acad Sci U S A. 2002 Jan 8;99(1):443-448.

The 5.67-megabase genome of the plant pathogen Agrobacterium tumefaciens C58 consists of a circular chromosome, a linear chromosome, and two plasmids. Extensive orthology and nucleotide colinearity between the genomes of A. tumefaciens and the plant symbiont Sinorhizobium meliloti suggest a recent evolutionary divergence. Their similarities include metabolic, transport, and regulatory systems that promote survival in the highly competitive rhizosphere; differences are apparent in their genome structure and virulence gene complement. Availability of the A. tumefaciens sequence will facilitate investigations into the molecular basis of pathogenesis and the evolutionary divergence of pathogenic and symbiotic lifestyles.

Science. 2001 Dec 14;294(5550):2317-2323.

Agrobacterium tumefaciens is a plant pathogen capable of transferring a defined segment of DNA to a host plant, generating a gall tumor. Replacing the transferred tumor-inducing genes with exogenous DNA allows the introduction of any desired gene into the plant. Thus, A. tumefaciens has been critical for the development of modern plant genetics and agricultural biotechnology. Here we describe the genome of A. tumefaciens strain C58, which has an unusual structure consisting of one circular and one linear chromosome. We discuss genome architecture and evolution and additional genes potentially involved in virulence and metabolic parasitism of host plants.

Science. 2001 Dec 14;294(5550):2323-2328.

Streptomyces avermitilis is a soil bacterium that carries out not only a complex morphological differentiation but also the production of secondary metabolites, one of which, avermectin, is commercially important in human and veterinary medicine. The major interest in this genus Streptomyces is the diversity of its production of secondary metabolites as an industrial microorganism. A major factor in its prominence as a producer of the variety of secondary metabolites is its possession of several metabolic pathways for biosynthesis. Here we report sequence analysis of S. avermitilis, covering 99% of its genome. At least 8.7 million base pairs exist in the linear chromosome; this is the largest bacterial genome sequence, and it provides insights into the intrinsic diversity of the production of the secondary metabolites of Streptomyces. Twenty-five kinds of secondary metabolite gene clusters were found in the genome of S. avermitilis. Four of them are concerned with the biosyntheses of melanin pigments, in which two clusters encode tyrosinase and its cofactor, another two encode an ochronotic pigment derived from homogentisic acid, and another polyketide-derived melanin. The gene clusters for carotenoid and siderophore biosyntheses are composed of seven and five genes, respectively. There are eight kinds of gene clusters for type-I polyketide compound biosyntheses, and two clusters are involved in the biosyntheses of type-II polyketide-derived compounds. Furthermore, a polyketide synthase that resembles phloroglucinol synthase was detected. Eight clusters are involved in the biosyntheses of peptide compounds that are synthesized by nonribosomal peptide synthetases. These secondary metabolite clusters are widely located in the genome but half of them are near both ends of the genome. The total length of these clusters occupies about 6.4% of the genome.

Proc Natl Acad Sci U S A 2001 Oct 9;98(21):12215-20.

Microsporidia are obligate intracellular parasites infesting many animal groups. Lacking mitochondria and peroxisomes, these unicellular eukaryotes were first considered a deeply branching protist lineage that diverged before the endosymbiotic event that led to mitochondria. The discovery of a gene for a mitochondrial-type chaperone combined with molecular phylogenetic data later implied that microsporidia are atypical fungi that lost mitochondria during evolution. Here we report the DNA sequences of the 11 chromosomes of the approximately 2.9-megabase (Mb) genome of Encephalitozoon cuniculi (1,997 potential protein-coding genes). Genome compaction is reflected by reduced intergenic spacers and by the shortness of most putative proteins relative to their eukaryote orthologues. The strong host dependence is illustrated by the lack of genes for some biosynthetic pathways and for the tricarboxylic acid cycle. Phylogenetic analysis lends substantial credit to the fungal affiliation of microsporidia. Because the E. cuniculi genome contains genes related to some mitochondrial functions (for example, Fe-S cluster assembly), we hypothesize that microsporidia have retained a mitochondrion-derived organelle.

Nature 2001 Nov 22;414(6862):450-3.

The complete genomic sequence of an aerobic thermoacidophilic crenarchaeon, Sulfolobus tokodaii strain7 which optimally grows at 80 degrees C, at low pH, and under aerobic conditions, has been determined by the whole genome shotgun method with slight modifications. The genomic size was 2,694,756 bp long and the G + C content was 32.8%. The following RNA-coding genes were identified: a single 16S-23S rRNA cluster, one 5S rRNA gene and 46 tRNA genes (including 24 intron-containing tRNA genes). The repetitive sequences identified were SR-type repetitive sequences, long dispersed-type repetitive sequences and Tn-like repetitive elements. The genome contained 2826 potential protein-coding regions (open reading frames, ORFs). By similarity search against public databases, 911 (32.2%) ORFs were related to functional assigned genes, 921 (32.6%) were related to conserved ORFs of unknown function, 145 (5.1%) contained some motifs, and remaining 849 (30.0%) did not show any significant similarity to the registered sequences. The ORFs with functional assignments included the candidate genes involved in sulfide metabolism, the TCA cycle and the respiratory chain. Sequence comparison provided evidence suggesting the integration of plasmid, rearrangement of genomic structure, and duplication of genomic regions that may be responsible for the larger genomic size of the S. tokodaii strain7 genome. The genome contained eukaryote-type genes which were not identified in other archaea and lacked the CCA sequence in the tRNA genes. The result suggests that this strain is closer to eukaryotes among the archaea strains so far sequenced. The data presented in this paper are also available on the internet homepage (

DNA Res 2001 Aug 31;8(4):123-40.
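The ORF breakdown reported for S. tokodaii can be checked with a few lines of arithmetic. The sketch below is only an illustration: the category labels and the function name are made up for this example, while the counts and the total come from the abstract.

```python
# ORF counts per similarity category, as reported in the abstract.
# The category labels are shorthand chosen for this example.
orf_categories = {
    "functionally assigned": 911,
    "conserved, unknown function": 921,
    "motif only": 145,
    "no significant similarity": 849,
}
total_orfs = 2826  # potential protein-coding regions reported


def category_percentages(counts, total):
    """Return each category's share of the total, in percent (1 decimal)."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


percentages = category_percentages(orf_categories, total_orfs)
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The four categories add up to the reported total of 2826 ORFs, and the computed shares agree with the percentages quoted in the abstract.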

Salmonella enterica serovar Typhi (S. typhi) is the aetiological agent of typhoid fever, a serious invasive bacterial disease of humans with an annual global burden of approximately 16 million cases, leading to 600,000 fatalities. Many S. enterica serovars actively invade the mucosal surface of the intestine but are normally contained in healthy individuals by the local immune defence mechanisms. However, S. typhi has evolved the ability to spread to the deeper tissues of humans, including liver, spleen and bone marrow. Here we have sequenced the 4,809,037-base pair (bp) genome of a S. typhi (CT18) that is resistant to multiple drugs, revealing the presence of hundreds of insertions and deletions compared with the Escherichia coli genome, ranging in size from single genes to large islands. Notably, the genome sequence identifies over two hundred pseudogenes, several corresponding to genes that are known to contribute to virulence in Salmonella typhimurium. This genetic degradation may contribute to the human-restricted host range for S. typhi. CT18 harbours a 218,150-bp multiple-drug-resistance incH1 plasmid (pHCM1), and a 106,516-bp cryptic plasmid (pHCM2), which shows recent common ancestry with a virulence plasmid of Yersinia pestis.

Nature 2001 Oct 25;413(6858):848-52.

Salmonella enterica subspecies I, serovar Typhimurium (S. typhimurium), is a leading cause of human gastroenteritis, and is used as a mouse model of human typhoid fever. The incidence of non-typhoid salmonellosis is increasing worldwide, causing millions of infections and many deaths in the human population each year. Here we sequenced the 4,857-kilobase (kb) chromosome and 94-kb virulence plasmid of S. typhimurium strain LT2. The distribution of close homologues of S. typhimurium LT2 genes in eight related enterobacteria was determined using previously completed genomes of three related bacteria, sample sequencing of both S. enterica serovar Paratyphi A (S. paratyphi A) and Klebsiella pneumoniae, and hybridization of three unsequenced genomes to a microarray of S. typhimurium LT2 genes. Lateral transfer of genes is frequent, with 11% of the S. typhimurium LT2 genes missing from S. enterica serovar Typhi (S. typhi), and 29% missing from Escherichia coli K12. The 352 gene homologues of S. typhimurium LT2 confined to subspecies I of S. enterica, which contains most mammalian and bird pathogens, are useful for studies of epidemiology, host specificity and pathogenesis. Most of these homologues were previously unknown, and 50 may be exported to the periplasm or outer membrane, rendering them accessible as therapeutic or vaccine targets.

Nature 2001 Oct 25;413(6858):852-56.

Rickettsia conorii is an obligate intracellular bacterium that causes Mediterranean spotted fever in humans. We determined the 1,268,755-nucleotide complete genome sequence of R. conorii, containing 1374 open reading frames. This genome exhibits 804 of the 834 genes of the previously determined R. prowazekii genome plus 552 supplementary open reading frames and a 10-fold increase in the number of repetitive elements. Despite these differences, the two genomes exhibit a nearly perfect colinearity that allowed the clear identification of different stages of gene alterations with gene remnants and 37 genes split in 105 fragments, of which 59 are transcribed. A 38-kilobase sequence inversion was dated shortly after the divergence of the genus.

Science 2001 Sep 14;293(5537):2093-8.

The Gram-negative bacterium Yersinia pestis is the causative agent of the systemic invasive infectious disease classically referred to as plague, and has been responsible for three human pandemics: the Justinian plague (sixth to eighth centuries), the Black Death (fourteenth to nineteenth centuries) and modern plague (nineteenth century to the present day). The recent identification of strains resistant to multiple drugs and the potential use of Y. pestis as an agent of biological warfare mean that plague still poses a threat to human health. Here we report the complete genome sequence of Y. pestis strain CO92, consisting of a 4.65-megabase (Mb) chromosome and three plasmids of 96.2 kilobases (kb), 70.3 kb and 9.6 kb. The genome is unusually rich in insertion sequences and displays anomalies in GC base-composition bias, indicating frequent intragenomic recombination. Many genes seem to have been acquired from other bacteria and viruses (including adhesins, secretion systems and insecticidal toxins). The genome contains around 150 pseudogenes, many of which are remnants of a redundant enteropathogenic lifestyle. The evidence of ongoing genome fluidity, expansion and decay suggests Y. pestis is a pathogen that has undergone large-scale genetic flux and provides a unique insight into the ways in which new and highly virulent pathogens evolve.

Nature 2001 Oct 4;413(6855):523-7.

Streptococcus pneumoniae is among the most significant causes of bacterial disease in humans. Here we report the 2,038,615-bp genomic sequence of the gram-positive bacterium S. pneumoniae R6. Because the R6 strain is avirulent and, more importantly, because it is readily transformed with DNA from homologous species and many heterologous species, it is the principal platform for investigation of the biology of this important pathogen. It is also used as a primary vehicle for genomics-based development of antibiotics for gram-positive bacteria. In our analysis of the genome, we identified a large number of new uncharacterized genes predicted to encode proteins that either reside on the surface of the cell or are secreted. Among those proteins there may be new targets for vaccine and antibiotic development.

J Bacteriol 2001 Oct;183(19):5709-17.

The scarcity of usable nitrogen frequently limits plant growth. A tight metabolic association with rhizobial bacteria allows legumes to obtain nitrogen compounds by bacterial reduction of dinitrogen (N2) to ammonium (NH4+). We present here the annotated DNA sequence of the alpha-proteobacterium Sinorhizobium meliloti, the symbiont of alfalfa. The tripartite 6.7-megabase (Mb) genome comprises a 3.65-Mb chromosome, and 1.35-Mb pSymA and 1.68-Mb pSymB megaplasmids. Genome sequence analysis indicates that all three elements contribute, in varying degrees, to symbiosis and reveals how this genome may have emerged during evolution. The genome sequence will be useful in understanding the dynamics of interkingdom associations and of life in soil environments.

Science 2001 Jul 27;293(5530):668-72.

The 2,160,837-base pair genome sequence of an isolate of Streptococcus pneumoniae, a Gram-positive pathogen that causes pneumonia, bacteremia, meningitis, and otitis media, contains 2236 predicted coding regions; of these, 1440 (64%) were assigned a biological role. Approximately 5% of the genome is composed of insertion sequences that may contribute to genome rearrangements through uptake of foreign DNA. Extracellular enzyme systems for the metabolism of polysaccharides and hexosamines provide a substantial source of carbon and nitrogen for S. pneumoniae and also damage host tissues and facilitate colonization. A motif identified within the signal peptide of proteins is potentially involved in targeting these proteins to the cell surface of low-guanine/cytosine (GC) Gram-positive species. Several surface-exposed proteins that may serve as potential vaccine candidates were identified. Comparative genome hybridization with DNA arrays revealed strain differences in S. pneumoniae that could contribute to differences in virulence and antigenicity.

Science 2001 Jul 20;293(5529):498-506.

Mycoplasma pulmonis is a wall-less eubacterium belonging to the Mollicutes (trivial name, mycoplasmas) and responsible for murine respiratory diseases. The genome of strain UAB CTIP is composed of a single circular 963,879 bp chromosome with a G + C content of 26.6 mol%, i.e. the lowest reported among bacteria, Ureaplasma urealyticum apart. This genome contains 782 putative coding sequences (CDSs) covering 91.4% of its length. A function could be assigned to 486 CDSs, whilst 92 matched the gene sequences of hypothetical proteins, leaving 204 CDSs without significant database match. The genome contains a single set of rRNA genes and only 29 tRNA genes. The replication origin oriC was localized by sequence analysis and by using the G + C skew method. Sequence polymorphisms within stretches of repeated nucleotides generate phase-variable protein antigens, whilst a recombinase gene is likely to catalyse the site-specific DNA inversions in major M. pulmonis surface antigens. Furthermore, a hemolysin, secreted nucleases and a glyco-protease are predicted virulence factors. Surprisingly, several of the genes previously reported to be essential for a self-replicating minimal cell are missing in the M. pulmonis genome, although this one is larger than the other mycoplasma genomes fully sequenced until now.

Nucleic Acids Res. 2001 May 15;29(10):2145-53.
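The M. pulmonis abstract mentions two sequence statistics, G + C content and G + C skew. The following is a minimal sketch of how such quantities are commonly computed, not the authors' actual pipeline. It assumes the widespread convention that skew is (G - C)/(G + C) per window and that the minimum of the cumulative skew lies near the replication origin. The function names, the window size and the toy sequence are all hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cumulative_gc_skew(seq, window=1000):
    """Running sum of (G - C) / (G + C), computed over consecutive windows."""
    seq = seq.upper()
    running = 0.0
    skews = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows without any G or C contribute nothing to the skew.
        running += (g - c) / (g + c) if (g + c) else 0.0
        skews.append(running)
    return skews


def predicted_origin_window(seq, window=1000):
    """Index of the window where the cumulative skew reaches its minimum."""
    skews = cumulative_gc_skew(seq, window)
    return skews.index(min(skews))


# Toy sequence: a C-rich stretch followed by a G-rich stretch.
toy = "C" * 3000 + "G" * 3000
print(cumulative_gc_skew(toy))
print(predicted_origin_window(toy))
```

On a real chromosome the sequence would be read from a FASTA file and the window would be chosen relative to genome length. The predicted position is only approximate and, as in the abstract, is normally cross-checked against other sequence features.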

The genome of the crenarchaeon Sulfolobus solfataricus P2 contains 2,992,245 bp on a single chromosome and encodes 2,977 proteins and many RNAs. One-third of the encoded proteins have no detectable homologs in other sequenced genomes. Moreover, 40% appear to be archaeal-specific, and only 12% and 2.3% are shared exclusively with bacteria and eukarya, respectively. The genome shows a high level of plasticity with 200 diverse insertion sequence elements, many putative nonautonomous mobile elements, and evidence of integrase-mediated insertion events. There are also long clusters of regularly spaced tandem repeats. Different transfer systems are used for the uptake of inorganic and organic solutes, and a wealth of intracellular and extracellular proteases, sugar, and sulfur metabolizing enzymes are encoded, as well as enzymes of the central metabolic pathways and motility proteins. The major metabolic electron carrier is not NADH as in bacteria and eukarya but probably ferredoxin. The essential components required for DNA replication, DNA repair and recombination, the cell cycle, transcriptional initiation and translation, but not DNA folding, show a strong eukaryal character with many archaeal-specific features. The results illustrate major differences between crenarchaea and euryarchaea, especially for their DNA replication mechanism and cell cycle processes and their translational apparatus.

Proc Natl Acad Sci U S A. 2001 Jul 3;98(14):7835-7840.

The 1,852,442-bp sequence of an M1 strain of Streptococcus pyogenes, a Gram-positive pathogen, has been determined and contains 1,752 predicted protein-encoding genes. Approximately one-third of these genes have no identifiable function, with the remainder falling into previously characterized categories of known microbial function. Consistent with the observation that S. pyogenes is responsible for a wider variety of human disease than any other bacterial species, more than 40 putative virulence-associated genes have been identified. Additional genes have been identified that encode proteins likely associated with microbial "molecular mimicry" of host characteristics and involved in rheumatic fever or acute glomerulonephritis. The complete or partial sequence of four different bacteriophage genomes is also present, with each containing genes for one or more previously undiscovered superantigen-like proteins. These prophage-associated genes encode at least six potential virulence factors, emphasizing the importance of bacteriophages in horizontal gene transfer and a possible mechanism for generating new strains with increased pathogenic potential.

Proc Natl Acad Sci U S A 2001 Apr 10;98(8):4658-63.

Lactococcus lactis is a nonpathogenic AT-rich gram-positive bacterium closely related to the genus Streptococcus and is the most commonly used cheese starter. It is also the best-characterized lactic acid bacterium. We sequenced the genome of the laboratory strain IL1403, using a novel two-step strategy that comprises diagnostic sequencing of the entire genome and a shotgun polishing step. The genome contains 2,365,589 base pairs and encodes 2310 proteins, including 293 protein-coding genes belonging to six prophages and 43 insertion sequence (IS) elements. Nonrandom distribution of IS elements indicates that the chromosome of the sequenced strain may be a product of recent recombination between two closely related genomes. A complete set of late competence genes is present, indicating the ability of L. lactis to undergo DNA transformation. Genomic sequence revealed new possibilities for fermentation pathways and for aerobic respiration. It also indicated a horizontal transfer of genetic information from Lactococcus to gram-negative enteric bacteria of Salmonella-Escherichia group. [The sequence data described in this paper has been submitted to the GenBank data library under accession no. AE005176.]

Genome Res 2001 May;11(5):731-53.

The complete genome sequence of Caulobacter crescentus was determined to be 4,016,942 base pairs in a single circular chromosome encoding 3,767 genes. This organism, which grows in a dilute aquatic environment, coordinates the cell division cycle and multiple cell differentiation events. With the annotated genome sequence, a full description of the genetic network that controls bacterial differentiation, cell growth, and cell cycle progression is within reach. Two-component signal transduction proteins are known to play a significant role in cell cycle progression. Genome analysis revealed that the C. crescentus genome encodes a significantly higher number of these signaling proteins (105) than any bacterial genome sequenced thus far. Another regulatory mechanism involved in cell cycle progression is DNA methylation. The occurrence of the recognition sequence for an essential DNA methylating enzyme that is required for cell cycle regulation is severely limited and shows a bias to intergenic regions. The genome contains multiple clusters of genes encoding proteins essential for survival in a nutrient-poor habitat. Included are those involved in chemotaxis, outer membrane channel function, degradation of aromatic ring compounds, and the breakdown of plant-derived carbon sources, in addition to many extracytoplasmic function sigma factors, providing the organism with the ability to respond to a wide range of environmental fluctuations. C. crescentus is, to our knowledge, the first free-living alpha-class proteobacterium to be sequenced and will serve as a foundation for exploring the biology of this group of bacteria, which includes the obligate endosymbiont and human pathogen Rickettsia prowazekii, the plant pathogen Agrobacterium tumefaciens, and the bovine and human pathogen Brucella abortus.

Proc Natl Acad Sci U S A 2001 Mar 27;98(7):4136-41.

We present here the complete genome sequence of a common avian clone of Pasteurella multocida, Pm70. The genome of Pm70 is a single circular chromosome 2,257,487 base pairs in length and contains 2,014 predicted coding regions, 6 ribosomal RNA operons, and 57 tRNAs. Genome-scale evolutionary analyses based on pairwise comparisons of 1,197 orthologous sequences between P. multocida, Haemophilus influenzae, and Escherichia coli suggest that P. multocida and H. influenzae diverged approximately 270 million years ago and the gamma subdivision of the proteobacteria radiated about 680 million years ago. Two previously undescribed open reading frames, accounting for approximately 1% of the genome, encode large proteins with homology to the virulence-associated filamentous hemagglutinin of Bordetella pertussis. Consistent with the critical role of iron in the survival of many microbial pathogens, in silico and whole-genome microarray analyses identified more than 50 Pm70 genes with a potential role in iron acquisition and metabolism. Overall, the complete genomic sequence and preliminary functional analyses provide a foundation for future research into the mechanisms of pathogenesis and host specificity of this important multispecies pathogen.

Proc Natl Acad Sci U S A 2001 Mar 13;98(6):3460-5.

Leprosy, a chronic human neurological disease, results from infection with the obligate intracellular pathogen Mycobacterium leprae, a close relative of the tubercle bacillus. Mycobacterium leprae has the longest doubling time of all known bacteria and has thwarted every effort at culture in the laboratory. Comparing the 3.27-megabase (Mb) genome sequence of an armadillo-derived Indian isolate of the leprosy bacillus with that of Mycobacterium tuberculosis (4.41 Mb) provides clear explanations for these properties and reveals an extreme case of reductive evolution. Less than half of the genome contains functional genes but pseudogenes, with intact counterparts in M. tuberculosis, abound. Genome downsizing and the current mosaic arrangement appear to have resulted from extensive recombination events between dispersed repetitive sequences. Gene deletion and decay have eliminated many important metabolic activities including siderophore production, part of the oxidative and most of the microaerophilic and anaerobic respiratory chains, and numerous catabolic systems and their regulatory circuits.

Nature 2001 Feb 22;409(6823):1007-11.

A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies, a whole-genome assembly and a regional chromosome assembly, were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

Science 2001 Feb 16;291(5507):1304-51.
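The coverage and polymorphism figures in the human genome abstract are related by simple arithmetic, worked through in the sketch below. The variable names are illustrative. Dividing the read total by the 2.91-billion bp consensus length is an assumption made here for a rough check, and the abstract's own 5.11-fold figure presumably rests on a slightly different genome-size denominator.

```python
# Figures quoted in the abstract.
shotgun_read_bp = 14.8e9       # total bp of whole-genome shotgun sequence
consensus_bp = 2.91e9          # euchromatic consensus length
shotgun_coverage = 5.11        # fold coverage reported for the shotgun reads
public_coverage = 2.9          # fold coverage from the shredded public data
bp_per_difference = 1250       # 1 bp difference per 1250 bp on average

# Rough check (assumption): coverage = sequenced bases / consensus length.
approx_coverage = shotgun_read_bp / consensus_bp

# Effective coverage when both data sources are combined.
effective_coverage = shotgun_coverage + public_coverage

# Expected number of differences between two random haploid genomes,
# taken over the consensus length only.
expected_differences = consensus_bp / bp_per_difference

print(f"approximate shotgun coverage: {approx_coverage:.2f}x")
print(f"effective coverage: {effective_coverage:.2f}x")
print(f"expected differences: {expected_differences:,.0f}")
```

The combined coverage comes out at roughly eightfold, in line with the abstract, and two random haploid genomes would be expected to differ at a little over two million positions across the consensus sequence.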

The bacterium Escherichia coli O157:H7 is a worldwide threat to public health and has been implicated in many outbreaks of haemorrhagic colitis, some of which included fatalities caused by haemolytic uraemic syndrome. Close to 75,000 cases of O157:H7 infection are now estimated to occur annually in the United States. The severity of disease, the lack of effective treatment and the potential for large-scale outbreaks from contaminated food supplies have propelled intensive research on the pathogenesis and detection of E. coli O157:H7 (ref. 4). Here we have sequenced the genome of E. coli O157:H7 to identify candidate genes responsible for pathogenesis, to develop better methods of strain detection and to advance our understanding of the evolution of E. coli, through comparison with the genome of the non-pathogenic laboratory strain E. coli K-12 (ref. 5). We find that lateral gene transfer is far more extensive than previously anticipated. In fact, 1,387 new genes encoded in strain-specific clusters of diverse sizes were found in O157:H7. These include candidate virulence factors, alternative metabolic capacities, several prophages and other new functions—all of which could be targets for surveillance.

Nature 2001 Jan 25;409(6819):529-33.

The flowering plant Arabidopsis thaliana is an important model system for identifying genes and determining their functions. Here we report the analysis of the genomic sequence of Arabidopsis. The sequenced regions cover 115.4 megabases of the 125-megabase genome and extend into centromeric regions. The evolution of Arabidopsis involved a whole-genome duplication, followed by subsequent gene loss and extensive local gene duplications, giving rise to a dynamic genome enriched by lateral gene transfer from a cyanobacterial-like ancestor of the plastid. The genome contains 25,498 genes encoding proteins from 11,000 families, similar to the functional diversity of Drosophila and Caenorhabditis elegans—the other sequenced multicellular eukaryotes. Arabidopsis has many families of new proteins but also lacks several common protein families, indicating that the sets of common proteins have undergone differential expansion and contraction in the three multicellular eukaryotes. This is the first complete genome sequence of a plant and provides the foundations for more comprehensive comparison of conserved processes in all eukaryotes, identifying a wide range of plant-specific gene functions and establishing rapid systematic ways to identify genes for crop improvement.

Nature 2000 Dec 14;408(6814):796-815.

We report the complete sequence of an extreme halophile, Halobacterium sp. NRC-1, harboring a dynamic 2,571,010-bp genome containing 91 insertion sequences representing 12 families and organized into a large chromosome and 2 related minichromosomes. The Halobacterium NRC-1 genome codes for 2,630 predicted proteins, 36% of which are unrelated to any previously reported. Analysis of the genome sequence shows the presence of pathways for uptake and utilization of amino acids, active sodium-proton antiporter and potassium uptake systems, sophisticated photosensory and signal transduction pathways, and DNA replication, transcription, and translation systems resembling more complex eukaryotic organisms. Whole proteome comparisons show the definite archaeal nature of this halophile with additional similarities to the Gram-positive Bacillus subtilis and other bacteria. The ease of culturing Halobacterium and the availability of methods for its genetic manipulation in the laboratory, including construction of gene knockouts and replacements, indicate this halophile can serve as an excellent model system among the archaea.

Proc Natl Acad Sci U S A 2000 Oct 24;97(22):12176-81.

Thermoplasma acidophilum is a thermoacidophilic archaeon that thrives at 59 degrees C and pH 2, and was isolated from self-heating coal refuse piles and solfatara fields. Species of the genus Thermoplasma do not possess a rigid cell wall, but are only delimited by a plasma membrane. Many macromolecular assemblies from Thermoplasma, primarily proteases and chaperones, have been pivotal in elucidating the structure and function of their more complex eukaryotic homologues. Our interest in protein folding and degradation led us to seek a more complete representation of the proteins involved in these pathways by determining the genome sequence of the organism. Here we have sequenced the 1,564,905-base-pair genome in just 7,855 sequencing reactions by using a new strategy. The 1,509 open reading frames identify Thermoplasma as a typical euryarchaeon with a substantial complement of bacteria-related genes; however, evidence indicates that there has been much lateral gene transfer between Thermoplasma and Sulfolobus solfataricus, a phylogenetically distant crenarchaeon inhabiting the same environment. At least 252 open reading frames, including a complete protein degradation pathway and various transport proteins, resemble Sulfolobus proteins most closely.

Nature 2000 Sep 28;407(6803):508-13.

Almost all aphid species (Homoptera, Insecta) have 60-80 huge cells called bacteriocytes, within which are round-shaped bacteria that are designated Buchnera. These bacteria are maternally transmitted to eggs and embryos through host generations, and the mutualism between the host and the bacteria is so obligate that neither can reproduce independently. Buchnera is a close relative of Escherichia coli, but it contains more than 100 genomic copies per cell, and its genome size is only a seventh of that of E. coli. Here we report the complete genome sequence of Buchnera sp. strain APS, which is composed of one 640,681-base-pair chromosome and two small plasmids. There are genes for the biosyntheses of amino acids essential for the hosts in the genome, but those for non-essential amino acids are missing, indicating complementarity and syntrophy between the host and the symbiont. In addition, Buchnera lacks genes for the biosynthesis of cell-surface components, including lipopolysaccharides and phospholipids, regulator genes and genes involved in defence of the cell. These results indicate that Buchnera is completely symbiotic and viable only in its limited niche, the bacteriocyte.

Nature 2000 Sep 7;407(6800):81-6.

Pseudomonas aeruginosa is a ubiquitous environmental bacterium that is one of the top three causes of opportunistic human infections. A major factor in its prominence as a pathogen is its intrinsic resistance to antibiotics and disinfectants. Here we report the complete sequence of P. aeruginosa strain PAO1. At 6.3 million base pairs, this is the largest bacterial genome sequenced, and the sequence provides insights into the basis of the versatility and intrinsic drug resistance of P. aeruginosa. Consistent with its larger genome size and environmental adaptability, P. aeruginosa contains the highest proportion of regulatory genes observed for a bacterial genome and a large number of genes involved in the catabolism, transport and efflux of organic compounds as well as four potential chemotaxis systems. We propose that the size and complexity of the P. aeruginosa genome reflect an evolutionary adaptation permitting it to thrive in diverse environments and resist the effects of a variety of antimicrobial substances.

Nature 2000 Aug 31;406(6799):959-64.

Here we determine the complete genomic sequence of the Gram-negative, gamma-Proteobacterium Vibrio cholerae El Tor N16961 to be 4,033,460 base pairs (bp). The genome consists of two circular chromosomes of 2,961,146 bp and 1,072,314 bp that together encode 3,885 open reading frames. The vast majority of recognizable genes for essential cell functions (such as DNA replication, transcription, translation and cell-wall biosynthesis) and pathogenicity (for example, toxins, surface antigens and adhesins) are located on the large chromosome. In contrast, the small chromosome contains a larger fraction (59%) of hypothetical genes compared with the large chromosome (42%), and also contains many more genes that appear to have origins other than the gamma-Proteobacteria. The small chromosome also carries a gene capture system (the integron island) and host 'addiction' genes that are typically found on plasmids; thus, the small chromosome may have originally been a megaplasmid that was captured by an ancestral Vibrio species. The V. cholerae genomic sequence provides a starting point for understanding how a free-living, environmental organism emerged to become a significant human bacterial pathogen.

Nature 2000 Aug 3;406(6795):477-83.

Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes a range of economically important plant diseases. Here we report the complete genome sequence of X. fastidiosa clone 9a5c, which causes citrus variegated chlorosis, a serious disease of orange trees. The genome comprises a 52.7% GC-rich 2,679,305-base-pair (bp) circular chromosome and two plasmids of 51,158 bp and 1,285 bp. We can assign putative functions to 47% of the 2,904 predicted coding regions. Efficient metabolic functions are predicted, with sugars as the principal energy and carbon source, supporting existence in the nutrient-poor xylem sap. The mechanisms associated with pathogenicity and virulence involve toxins, antibiotics and ion sequestration systems, as well as bacterium-bacterium and bacterium-host interactions mediated by a range of proteins. Orthologues of some of these proteins have only been identified in animal and human pathogens; their presence in X. fastidiosa indicates that the molecular basis for bacterial pathogenicity is both conserved and independent of host. At least 83 genes are bacteriophage-derived and include virulence-associated genes from other bacteria, providing direct evidence of phage-mediated horizontal gene transfer.

Nature 2000 Jul 13;406(6792):151-7.

Neisseria meningitidis causes bacterial meningitis and is therefore responsible for considerable morbidity and mortality in both the developed and the developing world. Meningococci are opportunistic pathogens that colonize the nasopharynges and oropharynges of asymptomatic carriers. For reasons that are still mostly unknown, they occasionally gain access to the blood, and subsequently to the cerebrospinal fluid, to cause septicaemia and meningitis. N. meningitidis strains are divided into a number of serogroups on the basis of the immunochemistry of their capsular polysaccharides; serogroup A strains are responsible for major epidemics and pandemics of meningococcal disease, and therefore most of the morbidity and mortality associated with this disease. Here we have determined the complete genome sequence of a serogroup A strain of Neisseria meningitidis, Z2491. The sequence is 2,184,406 base pairs in length, with an overall G+C content of 51.8%, and contains 2,121 predicted coding sequences. The most notable feature of the genome is the presence of many hundreds of repetitive elements, ranging from short repeats, positioned either singly or in large multiple arrays, to insertion sequences and gene duplications of one kilobase or more. Many of these repeats appear to be involved in genome fluidity and antigenic variation in this important human pathogen.

Nature 2000 Mar 30;404(6777):502-6.

The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

Science 2000 Mar 24;287(5461):2185-95.

The 2,272,351-base pair genome of Neisseria meningitidis strain MC58 (serogroup B), a causative agent of meningitis and septicemia, contains 2158 predicted coding regions, 1158 (53.7%) of which were assigned a biological role. Three major islands of horizontal DNA transfer were identified; two of these contain genes encoding proteins involved in pathogenicity, and the third island contains coding sequences only for hypothetical proteins. Insights into the commensal and virulence behavior of N. meningitidis can be gleaned from the genome, in which sequences for structural proteins of the pilus are clustered and several coding regions unique to serogroup B capsular polysaccharide synthesis can be identified. Finally, N. meningitidis contains more genes that undergo phase variation than any pathogen studied to date, a mechanism that controls their expression and contributes to the evasion of the host immune system.


Eleanor Gaunt, B.Sc.

A systematic literature survey suggests that there are 1399 species of human pathogen. Of these, 87 were first reported in humans in the years since 1980. The new species are disproportionately viruses, have a global distribution, and are mostly associated with animal reservoirs. Their emergence is often driven by ecological changes, especially with how human populations interact with animal reservoirs. Here, we review the process of pathogen emergence over both ecological and evolutionary time scales by reference to the “pathogen pyramid.” We also consider the public health implications of the continuing emergence of new pathogens, focusing on the importance of international surveillance.


In this review, we will be particularly concerned with species of pathogen that have recently been reported to be associated with an infectious disease in humans for the first time. As discussed more fully below, not all such pathogens (possibly very few of them) will be truly “new,” at least in the sense that the pathogen has only recently discovered us rather than we have only recently discovered the pathogen. This focus on novel pathogens differs somewhat from the more general topic of “emerging infectious diseases,” which is often taken to include previously rare diseases which are now on the increase, and sometimes diseases once considered to be in decline but which are now resurgent (the so-called “re-emerging” diseases). However, our focus does fairly reflect one of the major public health concerns of the early 21st century, the possible emergence of new pathogen species and novel variants (OSI 2006).

At first glance, a pre-occupation with yet-to-emerge disease problems may seem extravagant, given the massive and all too immediate health burdens imposed by malaria, tuberculosis, measles, and other familiar examples. An obvious counterargument is the relatively recent advent of HIV-1, unrecognized less than a generation ago and yet now one of the world’s biggest killers. As we shall discuss, the great majority of novel pathogens have not caused public problems on anything like this scale. However, AIDS (reinforced by knowledge of other plagues occurring throughout human history—see Diamond 2002) reminds us that the possibility that they could do so is real. In the early stages of the emergence of a new disease, it is a possibility that all too often cannot easily be dismissed as current concerns about H5N1 influenza A virus attest. A second reason for concern is that outbreaks of new diseases, and the public reaction to them, can cause economic and political shocks far greater than might be anticipated. The 2003 SARS epidemic, for example, resulted in fewer than 1000 deaths but cost the global economy many billions of dollars (King et al. 2006). Variant CJD, which has caused just over 100 deaths mostly confined to the UK, has had a global economic impact of a similar magnitude. Moreover a better understanding of the natural history of the emergence of new infectious diseases should inform our ability to combat them and, as the 2003 SARS epidemic illustrated, rapid, coordinated intervention can be highly effective.

Pathogen Diversity

Surveys of Pathogen Species

Although the existence of pathogens has been recognized for centuries, the first comprehensive list of human pathogen species was not published until 2001 (Taylor, Latham, and Woolhouse 2001). This list was generated from a comprehensive review of the secondary literature available at the time (see Taylor, Latham, and Woolhouse 2001 for full details). Each entry was a distinct species known to be infectious to and capable of causing disease in humans under natural transmission conditions. Species only known to cause infection through deliberate laboratory exposure were excluded. Species only known to cause disease in immuno-compromised patients and species only associated with a single human case of infection (e.g., Zika virus) were included. Ectoparasites such as ticks and leeches were not included. The 2001 list included species names that appeared in either (1) a text book published within the previous 10 years, or (2) standard web-based taxonomy browsers (see below), or (3) an ISI Web of Science citation index search covering the preceding 10 years. In subsequent work (e.g., Woolhouse and Gowtage-Sequeira 2005) NCBI taxonomies were used throughout (

This methodology has the advantage that it is (or, at least, aspires to be) systematic, transparent and reproducible by other researchers. However, it does have its limitations and two of these in particular are worth highlighting. First, the criterion “capable of causing disease” has been variously interpreted and not all text book reports of disease-causing organisms can be confirmed from the primary literature. Second, some taxonomies have been revised since 2001, altering which pathogen variants are regarded as “species.” Further revisions can reasonably be anticipated. More fundamentally, using the species as the unit of analysis ignores a wealth of important and interesting variation that occurs within species in traits such as virulence factors, antigenicity, host specificity or antibiotic resistance. Moreover, what is meant by “species” may differ from one group to another; some pathogens have complex subspecific taxonomies (e.g., Salmonella enterica, Listeria monocytogenes, human rhinoviruses, Candiru virus complex, Trypanosoma brucei complex), making direct comparisons of different “species” potentially problematic. With these caveats noted, however, a survey of recognized species represents a natural starting point for investigations of the diversity of human pathogens.

Surveys of New Pathogen Species

A subset of human pathogen species of special interest here is those that have only recently been discovered. In this context, “recently” is taken (arbitrarily) as meaning from 1980 onwards and “discovered” means recognized as causing infection and disease in humans. Thus, there are several possible reasons for a pathogen to appear in the list of “new” species.

Strictly speaking, only the first of these possibilities constitutes an “emerging” infectious disease as defined earlier. In practice, however, most post-1980 pathogens probably fall into categories (2) to (5). For example, phylogenetic evidence has demonstrated clearly that the evolutionary origins of the human immunodeficiency viruses pre-date their discoveries in the 1980s by at least several decades (van Heuverswyn et al. 2006).

To provide a more complete picture of new pathogens the list of species described above was supplemented in early 2007 by searching the WHO, CDC, and ProMed web sites and the primary literature.

Results of Pathogen Surveys

Based on the above methodologies an updated version of the previously reported surveys generates a list of 1399 species of human pathogen. The most diverse group is the bacteria (over 500 species) with fungi, helminths and viruses making up most of the remainder (Table 5-2).


Table 5-2. Numbers of Pathogen Species by Taxonomic Category.

Of these 1399 species of human pathogen, 87 have been discovered from 1980 onwards (Table 5-3). The composition of the subset of new species is very different from the full list. New species are dominated by viruses, and there are relatively few bacteria, fungi or helminths (Table 5-2). Within these broad categories certain taxa stand out: human retroviruses were not reported until 1980; most of the new fungi are microsporidia; and almost half the new bacteria are rickettsia. Although the over-representation of viruses is highly statistically significant (odds ratio (OR) = 18.0, P < 0.001), it is not clear that (excluding retroviruses) particular kinds of viruses have special status. Single-stranded RNA viruses make up the largest subset of new species (45 species) but are only marginally over-represented. Similarly, bunyaviruses are the largest single family but are also only marginally over-represented in the list of new viruses.


Table 5-3. Dates of First Reports of Human Infection with Novel Pathogen Species.

In summary, since 1980 new human pathogen species have been discovered at an average rate of over 3 per year. Almost 75% of these have been virus species even though viruses still represent a small fraction (less than 14%) of all recognized human pathogen species.
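The headline figures in this summary can be checked with a few lines of arithmetic. The sketch below assumes a 27-year discovery window (1980 through 2006, because the survey was updated in early 2007); that window length is an assumption for illustration, not a number stated in the text. The virus bounds are simply the quoted percentages applied to the quoted totals.

```python
# Back-of-envelope check of the summary figures quoted above.
NEW_SPECIES = 87             # species first reported in humans from 1980 onwards
ALL_SPECIES = 1399           # all recognized human pathogen species
WINDOW_YEARS = 2007 - 1980   # ASSUMPTION: 1980 through 2006, i.e. 27 years


def discovery_rate(new_species, years):
    """Average number of newly reported species per year."""
    return new_species / years


rate = discovery_rate(NEW_SPECIES, WINDOW_YEARS)

# "Almost 75%" of new species and "less than 14%" of all species are viruses,
# so these products are upper bounds on the corresponding species counts.
max_new_viruses = 0.75 * NEW_SPECIES
max_all_viruses = 0.14 * ALL_SPECIES

print(f"{rate:.1f} new species per year")
print(f"fewer than {max_new_viruses:.0f} new virus species")
print(f"fewer than {max_all_viruses:.0f} virus species overall")
```

Under that assumed window the rate comes out a little above 3 per year, which is consistent with the "over 3 per year" stated above.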

Geographic Origins of Novel Pathogens

For those pathogen species discovered in the post-1980 period, the geographic location of the first reported human case(s) can often be determined from the primary literature, at least to within specific countries and often to specific regions or municipalities. However, this is not possible for all new pathogen species. For example, although the early history of HIV-1 has been exhaustively investigated, the exact origin of the first reported human case remains unclear (Barre-Sinoussi et al. 1983). Similarly, the only reported human case of European bat lyssavirus 2 could have resulted from exposure in Finland, Switzerland or Malaysia (Lumio et al. 1986). Moreover, some new human pathogens were already endemic or ubiquitous in the human population when they were first discovered; examples include human metapneumovirus and human bocavirus. For those pathogens which were discovered previously, but were only recently associated with human disease (such as commensals which have become pathogenic in patients immunosuppressed due to infection with HIV), the geographic origin is taken as the location in which the patient became sick (if the patient was not reported as having recent travel history).

Figure 5-6 shows a map of the points of origin of the first human cases of disease caused by 51 of the 87 pathogen species discovered since 1980. Data of this kind must be interpreted cautiously, not least because of likely ascertainment bias (variable likelihood of detection and identification of novel pathogens) in different parts of the world. Nonetheless, Figure 5-6 does make the important point that the emergence of new pathogens shows a truly global pattern, with multiple incidents being reported from every continent except Antarctica (with other gaps apparent in, for example, the Middle East and central Asia). There is no striking tendency for new pathogens to be more likely to be reported from tropical rather than temperate regions, or from less developed regions, or from more densely populated regions.


Figure 5-6. World map indicating points of origin of the first reported human cases of disease caused by 51 novel pathogen species since 1980. Locations are identified to municipality or region (occasionally country), jiggled as necessary to avoid overlap.

Process of Pathogen Emergence

Reservoirs of Infection

Relatively few human pathogens are known solely as human pathogens. The remainder also occur in other contexts: as commensals or free-living in the wider environment or as infections of hosts other than humans.

Overall, probably no more than 50 to 100 species are specialist human pathogens. These range from major killers such as Plasmodium falciparum, mumps virus, Treponema pallidum, smallpox and HIV-1 to those causing more minor problems such as the human adenoviruses and rhinoviruses.

Hundreds of species which can cause human disease occur naturally as “commensals” found on the skin, on mucosal surfaces, or in the gut. They are normally benign but are sometimes pathogenic, for example if introduced into the blood system via a wound or in association with AIDS or other immunosuppressive conditions. Examples include the streptococci and Candida spp.

Several hundred human pathogen species have environmental reservoirs; these are referred to as “sapronoses.” Examples include Bacillus anthracis, Legionella pneumophila, and Cryptococcus neoformans. Here, we do not take sapronotic to include pathogens which are transmitted via the fecal-oral route or via a free-living stage of a complex parasite life cycle. Most sapronoses are bacteria or fungi, plus some protozoa, and cause sporadic infections of humans. Few are highly transmissible (directly or indirectly) between humans, an important exception being Vibrio cholerae. Some human pathogens (e.g., Listeria spp.) are both sapronotic and zoonotic.

Many more pathogens (over 800 species) are capable of infecting animal hosts other than humans. These range from species where humans are largely incidental hosts (such as rabies or Bartonella henselae) to species in which the main reservoir (sensu Haydon et al. 2002) is the human population and animals may be largely incidental hosts, that is, the so-called “reverse zoonoses” such as Schistosoma haematobium, rubella virus, Mycobacterium tuberculosis, or Necator americanus. We refer to all of these as “zoonotic,” following the World Health Organization’s definition of zoonoses as “diseases or infections which are naturally transmitted between vertebrate animals and humans.” In contrast to some other authors (e.g., Hubalek 2003) we do not consider pathogens with invertebrate reservoirs, and especially pathogens which are transmitted by arthropod vectors, as zoonotic. Note that the WHO definition does not include human pathogen species which recently evolved from animal pathogens, such as HIV-1. Nor does it include pathogens with complex life cycles where vertebrate animals are involved only as intermediate hosts with humans as the sole definitive host. It does, however, include reverse zoonoses.

Few of the 87 new human pathogen species in Table 5-3 are commensals or sapronoses. The great majority (around 80%) are associated with nonhuman vertebrate reservoirs (e.g., SARS coronavirus, vCJD agent and Borrelia burgdorferi) and most of the remainder appear to be long-standing human pathogens which have only recently been identified (e.g., Hepatitis G virus). Even some of the nonzoonotic pathogens, notably HIV-1 and HIV-2, are recently evolved from pathogens of nonhuman vertebrates (Keele et al. 2006). Compared with human pathogen species reported before 1980 the new species are statistically significantly more likely to be associated (or, at least, are more likely to be known to be associated) with a nonhuman animal reservoir (OR = 2.75, P < 0.001).

The reservoirs of the new, zoonotic human pathogens are mainly mammals, although a small number are associated with birds (Figure 5-7). However, the reservoirs include a wide range of mammal groups with ungulates, carnivores, and rodents most frequently involved, but also bats, primates, marsupials and occasionally other taxa (Figure 5-7). These observations must be interpreted with some caution because our knowledge of the host range of many pathogens is still incomplete. Nevertheless, the data available give the impression that taxonomic relatedness is less important than ecological opportunity as a determinant of the reservoirs of novel human pathogens. Homo sapiens as a species is classified within primates and, beyond that, the most closely related major groups are the rodents and lagomorphs. Ungulates, carnivores, and bats are more distant relatives. One related observation is that emerging human pathogens are especially likely to have a broad host range which includes more than one of these groups (Woolhouse and Gowtage-Sequeira 2005).


Figure 5-7. Counts of recently discovered human pathogen species (see Table 5-3) associated with various categories of non-human animal reservoirs. Some pathogen species are associated with more than one category of reservoir. These data should be regarded as no ...

Drivers of Pathogen Emergence

As discussed earlier, not all the pathogens in the list of new species should be regarded as truly emerging; some have only recently been identified as the causative agents of established infectious diseases. However, for 30 or more of the new species the literature suggests various drivers deemed to be associated with their emergence at the present time. These drivers can be considered within a framework originally suggested by the Institute of Medicine (IOM 2003), noting that this framework was devised with reference to all emerging and re-emerging infectious diseases, not just newly discovered pathogen species.

The most commonly cited drivers fall within the following IOM categories: economic development and land use; human demographics and behavior; international travel and commerce; changing ecosystems; human susceptibility; and hospitals. Economic development and land use, and especially changes in economic development and land use, are associated with the emergence of pathogens such as Nipah virus and Borrelia burgdorferi through activities such as intensification of farming and forest encroachment respectively. Human demographics and behavior, and especially changes in human demographics and behavior, are associated with the emergence of pathogens such as HIV-1 and Hepatitis C virus through activities such as sexual activity and intravenous drug use. International travel and trade are increasing as part of the process of globalization and are associated with the emergence of pathogens such as SARS coronavirus. Changing ecosystems covers unintended consequences of human activities such as desertification, pollution, and climate change and is associated with the emergence of pathogens such as the hantaviruses. Broadly speaking, the set of drivers listed so far are all “ecological” in nature: they are to do with the ways that humans interact with their wider environment (especially with other vertebrate animals both domestic and wild), providing opportunities for pathogens to infect humans, and with the ways that humans interact with each other, providing opportunities for pathogens to spread within human populations. A particular concern (implicit but not highlighted in the IOM’s list) is increasing use of “exotic” animal species, whether as food, farm animals or pets, and the trade that accompanies this.

The other most commonly cited drivers are to do with human population health. Human susceptibility is particularly important in the context of coinfections associated with AIDS (e.g., several species of microsporidia) but also covers the effects of malnutrition and other immunosuppressive conditions. The hospitals category covers iatrogenic transmission (e.g., vCJD), and xenotransplantation (e.g., baboon cytomegalovirus), as well as nosocomial infections (e.g., Ebola viruses and Rotavirus C).

Other categories listed by the IOM—such as “intent to harm”—have not been or are not commonly cited as associated with the emergence of novel human pathogen species. Among these is the category “microbial adaptation and change,” an observation that we expand on below.

Transmission and Disease

The 87 new species of human pathogen are associated with public health problems of hugely variable magnitudes. At one extreme is HIV-1 which has killed an estimated 25 million people since it was first reported in 1983, with 40 million more currently infected (UNAIDS 2007). HIV-1 has a high transmission potential within many human populations (combining transmission mainly by sexual contact or by needle-sharing associated with intravenous drug use with an infectious period of several years) and is highly pathogenic (with a case fatality rate close to 100% in the absence of treatment). At the other extreme, Menangle virus is known to have infected only 2 farm workers in which it may have caused a mild febrile illness (Chant et al. 1998). Menangle virus does not appear to be highly infectious to or transmissible between humans and has not so far been associated with severe disease. In the following section we consider the kinds of epidemiological and biological differences that underlie the vast difference in public health impacts between pathogens such as HIV-1 and pathogens such as Menangle virus.

Pathogen Pyramid

A useful aid to conceptualizing the process of pathogen emergence is the pathogen pyramid. The concept of the pathogen pyramid was first put forward by Wolfe et al. (2004) and developed further in Wolfe, Dunavan, and Diamond (2007). A very similar framework but with a more formal mathematical underpinning was adopted by Woolhouse, Haydon, and Antia (2005). The pyramid we use here has four levels corresponding to exposure, infection, transmission, and epidemic spread (Figure 5-8). Wolfe, Dunavan, and Diamond (2007) subdivided epidemic spread into (in their terminology): Stages 4a, b, and c, infectious diseases that exist in animals but with different balances of animal-to-human and human-to-human spread (where Stage 4c corresponds to reverse zoonoses as defined above) and Stage 5, pathogens exclusive to humans (corresponding to specialist human pathogens as defined above).


Figure 5-8. The pathogen pyramid (adapted from Wolfe, Dunavan, and Diamond 2007). Each level represents a different degree of interaction between pathogens and humans, ranging from exposure through to epidemic spread. Some pathogens are able to progress from one ...

Level 1: Exposure

The first stage of the emergence of a new pathogen is the exposure of humans to that pathogen. Exposure requires “contact” between humans and the pathogen reservoir (which may be animal or environmental; exposure to commensals is implicit). The nature of “contact” is determined by the mode of transmission of the pathogen, e.g., animal bite, contamination of food with fecal material, blood-feeding by arthropod vectors or exposure to aerosols. The only barrier to exposure is insufficient overlap between habitats occupied by humans and habitats occupied by the pathogen. Changes in human ecology, particularly patterns of land use and interactions with animal reservoirs, are likely to change our exposure to potential new pathogens, as are changes in the ecology of the pathogens, their reservoirs or their vectors, e.g., as a result of climate change or other kinds of environmental change.

We do not know how many potential human pathogen species there are which we have not yet been exposed to, but we do know that human pathogens make up only a fraction of the known biodiversity of viruses, bacteria, fungi, protozoa and helminths, which in turn probably makes up only a fraction of the biodiversity which exists (Dykhuizen 1998).

Level 2: Infection

The second stage of pathogen emergence is reached if the pathogen proves capable of infecting humans, possibly causing disease. As reviewed above, we know of 1399 species that have reached this stage. Others may have done so but have yet to be identified. Others may do so in the future but, to date, we have had no or insufficient exposure to them. Clearly, there will often be significant biological barriers (referred to as species barriers) preventing organisms infecting other kinds of host from infecting humans. We do not, for example, share any pathogens with plants, very few with invertebrates, and only a small number with cold-blooded vertebrates (e.g., Salmonella spp. in reptiles and amphibians, Mermin et al. 2004; helminth infections from fish, Chai et al. 2005). In contrast, we share many more of our pathogens with birds, and we share more than half with other species of mammal.

Indeed, the species barrier (at least between humans and other mammals) may not be as profound as is sometimes implied. According to Cleaveland et al. (2001) over 500 different species of pathogen are known to occur in domestic livestock and as many as 40% of these are zoonotic. The same authors report for domestic carnivores (dogs and cats) that almost 400 pathogen species are known, of which almost 70% are zoonotic. These data imply that, given the opportunities for exposure to pathogens that proximity to domestic animals must surely provide, many pathogens, perhaps even a majority, are capable of crossing the species barrier and infecting humans.

As suggested by the IOM (2003) report, an important contributor to the ability of a new pathogen to infect humans is variation in human susceptibility. In some cases this variation might have a genetic basis; for example, apparently pre-existing genetic variation in human susceptibility to HIV (Arien, Vanham, and Arts 2007). More commonly, phenotypic variation in the human population will be important, particularly factors which compromise the human immune system. The most striking examples come from the wide range of opportunistic infections associated with the immunosuppressive effects of HIV infection; these include several pathogen species, such as the microsporidia Brachiola algerae and Enterocytozoon bieneusi, which were first recognized in AIDS patients.

Level 3: Transmission

The third stage of pathogen emergence is reached if a pathogen that can infect humans also proves capable of transmission from one human to another. Transmission in this context need not be direct (e.g., by aerosol spread or sexual contact); it might be indirect (e.g., via contamination of food) or via an arthropod vector. The requirement is simply that an infection of one human leads ultimately to an infection of another.

In most cases the barriers preventing transmission will be biological, often reflecting tissue tropisms within the human host since pathogens normally need to access the gut, upper respiratory tract, urogenital tract or (especially for vector-borne infections) blood in order to be able to exit the body. However, sometimes such barriers can be overcome by changes in human behavior. The two best examples concern prion diseases. Kuru is only transmitted through cannibalism, which is extremely rare in most human societies. vCJD is not transmissible between humans except iatrogenically as a result of surgical procedures or blood transfusions.

Again, these barriers to human-to-human transmission are far from insuperable. Although information is lacking for many pathogen species (Taylor, Latham, and Woolhouse 2001), the literature suggests that a substantial minority (at least 500 species, over one third of the total, and possibly many more) are transmissible between humans.

Level 4: Epidemic Spread

The fourth and, in our version, final level of the pathogen pyramid is reached if a pathogen is sufficiently transmissible within the human population to cause major epidemics or pandemics and/or to become endemic, without the involvement of the original reservoir. This represents a quantitative rather than qualitative distinction and it can be made more formally precise by reference to the concept of the basic reproduction number, R0. R0 can be defined as the average number of secondary cases of infection produced when a primary case is introduced into a large population of previously unexposed hosts (adapted from Anderson and May 1991). The distinction between Level 3 and Level 4 pathogens can be expressed in terms of R0. If R0 is less than one then, on average, a single primary case will fail to replace itself and although there may be chains of transmission these will be self-limiting; this corresponds to Level 3. On the other hand, if R0 is greater than one then, on average, a single primary case will produce more than one secondary case and, at least initially, there will be an exponential increase in the number of cases and ultimately a major epidemic is possible; this corresponds to Level 4. (A proviso is that, even if R0 > 1, stochastic extinction of the infection chain is quite possible, especially in the early stages of the epidemic when numbers of cases are low; see May, Gupta, and McLean 2001.)
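The R0 threshold described above can be illustrated with a small simulation. The sketch below is not taken from the cited papers: it assumes a simple branching process in which every case independently infects a Poisson-distributed number of others with mean R0, and it treats any chain that reaches an arbitrary cap of 500 cases as a major epidemic. The function names, the cap and the offspring distribution are all illustrative assumptions.

```python
import math
import random


def poisson(mean, rng):
    """Draw a Poisson variate by Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def outbreak_size(r0, rng, cap=500):
    """Total cases in one chain started by a single primary case.

    ASSUMPTION: a chain that reaches `cap` cases is counted as a major epidemic
    and the simulation of that chain stops there.
    """
    total = active = 1
    while active and total < cap:
        new_cases = sum(poisson(r0, rng) for _ in range(active))
        total += new_cases
        active = new_cases
    return min(total, cap)


def summarize(r0, runs=2000, cap=500, seed=1):
    """Return (fraction of chains reaching the cap, mean size of self-limiting chains)."""
    rng = random.Random(seed)
    sizes = [outbreak_size(r0, rng, cap) for _ in range(runs)]
    minor = [size for size in sizes if size < cap]
    major_fraction = (runs - len(minor)) / runs
    mean_minor = sum(minor) / len(minor) if minor else float("nan")
    return major_fraction, mean_minor


if __name__ == "__main__":
    for r0 in (0.5, 0.9, 2.0):
        major_fraction, mean_minor = summarize(r0)
        print(f"R0={r0}: major epidemics {major_fraction:.1%}, "
              f"mean self-limiting chain {mean_minor:.1f} cases")
```

Under these assumptions, chains with R0 below one essentially always die out, with a mean size near 1/(1 - R0), while with R0 = 2 a noticeable minority of introductions (about one in five for Poisson-distributed secondary cases) still go extinct by chance, which is the proviso noted in the text.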

The barriers between Level 3 and Level 4 are both biological and epidemiological. The biological barriers are to do with pathogen infectivity, host susceptibility, the infectiousness of the infected host and for how long the host is infectious (whether this is terminated by recovery or death). The epidemiological barriers are to do with the rate and pattern of contacts between infectious and susceptible hosts. Here again, the nature of a “contact” reflects the mode of transmission of the pathogen (see above). The rate and pattern of contacts can increase, and hence R0 can increase, independently of the pathogen, as a result of shifts in host demography or behavior. In the context of human hosts such shifts could constitute changes in factors such as population density (e.g., urbanization), living conditions, water supply and sanitation, patterns of travel and migration, or sexual behavior and intravenous drug use, depending on the specific pathogen involved. These might be augmented by changes in host susceptibility due to the kinds of factors listed earlier. Clearly, for the same pathogen R0 can vary considerably from one human population to another. Similarly, different strains of the same pathogen species may have very different R0 values in humans, e.g., different subtypes of influenza A virus.
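A common textbook simplification makes the split between biological and epidemiological barriers concrete: R0 is approximated as the product of a contact rate, a per-contact transmission probability and the duration of infectiousness. The sketch below uses that simplification with made-up parameter values; neither the function name nor the numbers come from the source.

```python
def basic_reproduction_number(contacts_per_day, transmission_prob, infectious_days):
    """Approximate R0 as contact rate x per-contact transmission probability x duration.

    ASSUMPTION: homogeneous mixing and constant infectiousness; real pathogens
    and populations are more complicated than this.
    """
    return contacts_per_day * transmission_prob * infectious_days


# Hypothetical pathogen: same biology in both settings, only the contact rate differs.
baseline = basic_reproduction_number(contacts_per_day=2.0,
                                     transmission_prob=0.1,
                                     infectious_days=4.0)
crowded = basic_reproduction_number(contacts_per_day=4.0,
                                    transmission_prob=0.1,
                                    infectious_days=4.0)

print(f"baseline R0 = {baseline:.2f}, crowded R0 = {crowded:.2f}")
```

In this illustration, doubling the contact rate alone moves the hypothetical pathogen from below the threshold of one to above it, with no change in the pathogen itself.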

In principle, this barrier might seem quite fragile: the kinds of changes in host demography and behavior alluded to above are certainly occurring. In practice, it is not clear how many species of human pathogen have reached Level 4, since we have estimates of R0 values within human populations for only a handful of them. Based on earlier studies (Taylor, Latham and Woolhouse 2001; Woolhouse and Gowtage-Sequeira 2005) a plausible estimate is that 100 to 150 pathogen species are capable of causing major outbreaks within human populations, with half to two-thirds of these being specialist human pathogens and the remainder also occurring in animal reservoirs or the wider environment. This implies considerable attrition between levels 3 and 4 of the pathogen pyramid.

Status of New Pathogens

We can now consider where the 87 new human pathogen species fit within the pathogen pyramid. It is immediately clear that the majority of them are at Level 2: they can infect humans but are rarely, if at all, transmitted between humans. Examples include Borrelia burgdorferi, vCJD agent, most of the hantaviruses and Ehrlichia spp. At the other extreme, although there are a number that appear to be at Level 4, most of these are pathogens which are probably long established in human populations but have only recently been recognized, such as human metapneumovirus or hepatitis C virus. Only a very small number are likely to be recent additions to the repertoire of Level 4 human pathogens, namely HIV-1, HIV-2 and, arguably before its spread was contained, SARS coronavirus. In between, at Level 3, there is a significant minority of new pathogens that are somewhat transmissible between humans but which have so far been restricted to relatively minor outbreaks. These include Andes virus, human torovirus and some Encephalitozoon spp. For these species the value of the basic reproduction number R0 is of particular interest, especially if it lies close to one, the threshold for potential epidemic spread. R0 can be estimated from data on the distribution of outbreak sizes as follows.

The quantitative analysis of outbreak data used to estimate R0 is based upon a methodology developed by Jansen et al. (2003) for measles case data from the UK (to monitor the effect of changes in childhood vaccination coverage). Here, we apply the technique (see also Matthews and Woolhouse 2005) to data on human outbreaks of Andes virus (see Figure 5-9 for details). Andes virus is an emerging South American hantavirus and there are concerns that, unusually for hantaviruses, it can be transmitted directly between humans (Wells et al. 1997). Most reports of Andes virus represent sporadic cases (i.e., outbreaks of size 1) but clusters of cases also occur, ranging in size from 2 to 20 (Figure 5-9). This pattern—many small outbreaks and a few larger ones—is typical of a wide range of infectious diseases (Woolhouse, Taylor, and Haydon 2001). The best estimate of R0 based on these data lies in the range 0.22 to 0.37. This is well below one and in reality is likely to be an over-estimate since at least some of the clusters of cases may reflect exposure to a common source rather than, as is assumed in the analysis, person-to-person spread. However, the analysis does suggest that occasional larger outbreaks will occur (the R0 estimates are consistent with up to 1 in 200 outbreaks being of size 10 or more) without necessarily implying that there has been a major change in Andes virus epidemiology. This same approach can be applied to other “Level 3” pathogens to determine how close they are to reaching Level 4 of the pyramid (cf. Jansen et al. 2003).
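The idea of estimating R0 from outbreak sizes can be sketched as follows. This is not the model used in the analysis described above (its details are not given here); it is a minimal illustration that assumes a subcritical branching process with Poisson-distributed secondary cases, for which the final outbreak size follows a Borel distribution and the maximum-likelihood estimate of R0 is 1 - 1/(mean outbreak size). The function names and the example counts are hypothetical and are not the Andes virus data.

```python
import math
from typing import Sequence


def borel_pmf(n: int, r0: float) -> float:
    """P(final outbreak size = n), assuming Poisson offspring with mean 0 < r0 < 1."""
    log_p = -r0 * n + (n - 1) * math.log(r0 * n) - math.lgamma(n + 1)
    return math.exp(log_p)


def estimate_r0(outbreak_sizes: Sequence[int]) -> float:
    """Maximum-likelihood R0 under the Borel model: 1 - 1 / (mean outbreak size)."""
    mean_size = sum(outbreak_sizes) / len(outbreak_sizes)
    return 1.0 - 1.0 / mean_size


def prob_size_at_least(k: int, r0: float) -> float:
    """P(final outbreak size >= k) under the same assumed model."""
    return 1.0 - sum(borel_pmf(n, r0) for n in range(1, k))


# Hypothetical outbreak sizes: many sporadic cases and a few small clusters.
hypothetical_sizes = [1] * 40 + [2] * 6 + [3] * 3 + [5]

if __name__ == "__main__":
    r0_hat = estimate_r0(hypothetical_sizes)
    print(f"estimated R0 ~ {r0_hat:.3f}")
    print(f"P(outbreak size >= 10) ~ {prob_size_at_least(10, r0_hat):.2e}")
```

Even with R0 well below one, the tail probability is not zero, which mirrors the observation that occasional larger outbreaks are expected without any change in the underlying epidemiology. If some clusters reflect a common source rather than person-to-person spread, an estimate of this kind would be biased upward, as noted in the text.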


Figure 5-9. Analysis of Andes virus outbreaks. Frequencies of outbreaks of different sizes (grey bars) are compared with the fit of a statistical model to the data (open bars). Outbreak data are taken from Wells et al. (1997) and Lazaro et al. (2007).

Evolution and Emergence

So far we have examined the emergence of new species of human pathogens over time scales of a few decades. However, the origins of many human pathogens are considerably more ancient, extending back over time scales of thousands to millions of years. This process has been reviewed by, among others, Weiss (2001), Diamond (2002), and Wolfe, Dunavan, and Diamond (2007). Of particular interest here are examples of pathogens which have emerged in human populations as a result of successfully crossing the species barrier from an animal reservoir and reaching Level 4 status. Any analysis must be prefaced by the observation that we have good evidence for the origins of only a small minority of pathogens, plausible hypotheses (usually based on the epidemiologies of related species) for some of the remainder, and no information at all for the majority. Wolfe, Dunavan, and Diamond (2007) have proposed that this lack is addressed by a research program they term an “origins initiative”. That said, 16 examples of putative species jumps are listed in Table 5-4. Inspection of this list suggests two tentative observations. First, although a variety of different kinds of pathogen are listed including several species of bacteria and protozoa, the majority are viruses. Second, a variety of different animal reservoirs are involved: primates, ungulates, rodents and birds. Wolfe, Dunavan, and Diamond (2007) point out that primates are much better represented in this list than might be expected given their much more modest role as reservoirs of modern zoonoses. This may reflect both the much greater ecological overlap between humans and other primates in the distant past and the notion that pathogens of our closest relatives are more likely to be epidemiologically successful in humans. The latter idea is supported by the observation that two of the most recent examples of successful species jumps—HIV-1 and HIV-2—have primate origins (Keele et al. 2006). 
Similarly, several human pathogens with much deeper evolutionary origins, perhaps even pre-dating Homo sapiens as a distinct species, are also most closely related to modern primate pathogens. Examples include the hepatitis B and G viruses (Simmonds 2001). It is worth noting that species jumps can occur in both directions. For example, it is thought that Mycobacterium bovis, predominantly a cattle pathogen, evolved from the human pathogen M. tuberculosis (Brosch et al. 2002).


Table 5-4. List of Human Pathogens Which Have Successfully Crossed the Species Barrier and Proved Capable of Epidemic Spread and, in Some Cases, Endemic Persistence in Human Populations. The Original Hosts Have Been Identified with Varying Degrees of Certainty.

HIV-1 and HIV-2 illustrate that the evolution of new species of pathogen is an ongoing process. Both are sufficiently divergent from their closest relatives (SIVcpz and SIVsmg respectively) in terms of both their genome sequences and their biologies to be regarded as distinct species. This has probably occurred within the last 100 years. In a nonhuman context, over even shorter time scales we have seen the evolution of another new species of pathogen, canine parvovirus (CPV), associated with a cat virus, feline panleukopenia virus (FPV), jumping into dogs (Parrish and Kawaoka 2005). CPV has spread to dog populations around the world in only a few years.

All of these examples concern RNA viruses, and RNA viruses differ from pathogens with DNA genomes in having far higher nucleotide substitution rates and so the potential for rapid adaptation to new host species (Holmes and Rambaut 2004). The importance of this kind of genetic lability has been explored by Antia et al. (2003) using simple mathematical models. These authors suggested that the potential for successful adaptation (which they defined as becoming sufficiently transmissible that R0 in humans became greater than one) is sensitive both to the size of initial outbreaks (determined mainly by the initial R0 value) and, especially, to the rate of genetic change and the genetic distance to be traveled. As discussed earlier, the initial R0 value is a function not only of pathogen biology but also of features of human demography and behavior which promote transmission and thus the kinds of changes in these mentioned above have the potential to increase the likelihood of the evolution of new human pathogens.

The successful adaptation of a nonhuman pathogen to humans is itself a highly stochastic process. This is illustrated by the early evolution of the human immunodeficiency viruses (see Van Heuverswyn et al. 2006). There is phylogenetic evidence for numerous introductions of SIVs into human populations; most of these failed to become established (Arien et al. 2007) and only HIV-1 M subtype C has become truly pandemic.

This pattern raises the question of where, in practice, the relevant genetic changes that allow a pathogen to successfully invade a human population occur. Antia et al.’s analysis focuses on the process of adaptation within the human population. However, it may be that genetic change within the original reservoir (whether animal or environmental) is also critical for producing variants which are capable of infecting humans in the first place. With a handful of exceptions, such as the simian immunodeficiency viruses, we typically have very little information on the genetic and functional diversity of human pathogens or their immediate ancestors in nonhuman reservoirs.

This is a potentially important topic for future research, but a reasonable working hypothesis, supported by our knowledge of the origins of HIV, is that genetic variation in nonhuman pathogen populations does occasionally and incidentally produce human infective variants, and this explains why so many novel human pathogens are RNA viruses (Woolhouse, Taylor, and Haydon 2001). This idea is further supported by the observation that RNA viruses tend to have broader host ranges than DNA viruses (Cleaveland, Laurenson, and Taylor 2001; Woolhouse, Taylor, and Haydon 2001), implying that they can more easily adapt to new host species.

The implication of the preceding discussion is that pathogen evolution is not only an important driver of progression up the pathogen pyramid over long time scales but that, especially for RNA viruses, this process may be relevant over much shorter time scales as well. In addition, we note that evolution is clearly a key driver of the emergence of new variants of existing human pathogen species, with potentially significant epidemiological consequences. This is evident in the generation of antibiotic resistant bacteria and chloroquine resistant malaria, as well as variants expressing novel virulence factors (e.g., E. coli O157) or with distinct pathogenicities (e.g., H5N1 influenza A).

Finally, we note that an important feature of new pathogens is that they have not been previously subject to evolutionary constraints on their virulence (i.e., the degree of harm they do to the host) in the new host (Ebert 1998). Moreover, the new host may make only a small contribution to the epidemiology of the pathogen (Level 3 of the pathogen pyramid), or even none at all [if] it is an epidemiological “dead end” in the sense that, although infection can occur, there is no onward transmission of infection (Level 2). In such cases evolutionary constraints on pathogen virulence may be weakened or absent (Woolhouse, Taylor and Haydon 2001). Putting these observations together, it is unsurprising that many new human pathogens (e.g., Nipah and Ebola viruses, some hantaviruses, SARS coronavirus, and HIV-1) are very virulent, as indicated by their high case-fatality rates.

Public Health Implications

Future Emergence Events

It seems likely that the kinds of ecological changes that have been associated with pathogen emergence in the recent past (see IOM 2003) will continue to occur in the immediate future, e.g., continued deforestation for agriculture, intensification of livestock production, globalization, bush meat trade, urbanization, and so on. In that case, we can reasonably anticipate the reporting of yet more new species of human pathogen (currently happening at a rate of over 3 per year—Table 5-3) in the immediate future as well.

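The survey of new pathogen species reported since 1980 suggests the kinds of pathogens that are most likely to emerge in the future. Four characteristics are expected to be particularly important:

  • a preponderance of RNA viruses
  • pathogens with nonhuman animal reservoirs
  • pathogens with a broad host range
  • pathogens with some (perhaps initially limited) potential for human-human transmission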

The above criteria are certainly not intended as absolute predictors of pathogen emergence; a good historical counterexample is syphilis (new to the Old World in the late 15th century; its origins remain disputed, but it is a bacterium not associated with nonhuman reservoirs; see Weiss 2001). Even so, it is helpful to have some indication of what kinds of new pathogen we are most likely to encounter.


The first line of defense against any emerging pathogen is its rapid detection and identification. Recent practical experience with BSE and SARS demonstrates that rapid detection and identification leading to the rapid introduction of preventive measures can prove highly effective in combating outbreaks of novel diseases (Wilesmith 1994; Stohr 2003). Moreover, computer simulation studies motivated by concerns about the possible emergence of pandemic influenza suggest that only if a new strain is detected in the very earliest stages and interventions are put in place extremely promptly is there any realistic prospect of curtailing an epidemic (Ferguson et al. 2006).

Surveillance for novel pathogens, however, does present some particular challenges. Initially, this is likely to depend on clinical observation, such as the reporting of clusters of cases of disease with unusual symptoms. Internet surveillance for reports of unusual disease outbreaks is also possible and, in the longer term, generic diagnostic tools (for example, lab-on-a-chip tests for all known human viruses) should become available (OSI 2006).

The map of reports of new pathogen species (Figure 5-6) argues strongly that surveillance needs to be global, especially considering the unprecedented rates of international travel and trade that can allow new infectious diseases, such as SARS, to spread around the world over time scales of days or weeks. Pathogen emergence is an international problem.


Another key lesson from surveying novel pathogens is the importance of animal reservoirs in the emergence of new infectious diseases. One implication of this is that surveillance in reservoir populations is likely to be an effective tool for monitoring risks to humans (Cleaveland, Meslin, and Breiman 2007). On top of this, it may often be the case that most scientific knowledge of the basic biology of an unusual human pathogen lies, at least initially, with the veterinary community rather than the medical community. Palmarini (2007) lists a number of examples of this: infectious cancers, retroviruses, lentiviruses, transmissible spongiform encephalopathies, rotaviruses, and papilloma viruses. To this list could be added coronaviruses and ehrlichiosis. More generally, it is now widely recognized that humans share the majority of their pathogens with other animals (Taylor, Latham, and Woolhouse 2001).

Together, these observations underline the importance of close linkages between medical and veterinary researchers, resonating with the “one medicine” concept originally put forward by Schwabe (1969) and seeming especially appropriate in the context of emerging infectious diseases.

However, understanding the process of emergence requires much more than an understanding of the basic biology of the host-pathogen interaction, important though this undoubtedly is. A theme of this review has been the importance of ecological factors for the emergence of new pathogens. But we have used “ecological” to cover a very wide range of environmental, agricultural, entomological, demographic, behavioral, cultural, economic, and sociological drivers of pathogen emergence. In specific contexts these could include the bush meat trade (associated with the emergence of HIV and SARS), livestock feed production (associated with BSE/vCJD) or changes in pig farming practices (associated with Nipah virus). These examples emphasize that disease emergence is a multi-disciplinary problem and needs to be understood at a number of scientific levels. Collaborations need to be developed not just between the human and animal health branches of the biomedical research community but also with researchers covering a much wider range of disciplines.


The pathogen pyramid provides a useful conceptual framework for thinking about the process of the emergence of a new species of human pathogen. However, it is immediately clear that at each level of the pyramid there are some important gaps in our knowledge.

First, we still have very little idea of the diversity of pathogens to which humans are being or could be exposed. Systematic surveys across a range of possible sources of new pathogens (notably other mammal species) using techniques such as shotgun sequencing are possible in principle, and would provide this information.

Establishing a priori which pathogens are capable of infecting humans is even more challenging. A first step would be to identify the cell receptors used by the 189 recognized species of human virus. At present, we have this information for only around half of the virus species.

Estimating the transmission potential of a new pathogen within the human population can only be achieved by closely monitoring initial outbreaks. Analysis of such data can provide some early warning of crucial epidemiological changes (as illustrated by the analysis of measles data mentioned above). Real time analysis of epidemic data can also provide timely estimates of the transmission potential (see Lipsitch et al. 2003 for application to the SARS epidemic) which can help inform control efforts. On the other hand, for many of the rarer human pathogens we do not currently know whether or not they are transmissible between humans (Woolhouse 2002).

It is extremely likely that we will encounter new species of human pathogen in the near future. We urgently need the scientific and logistic capacity to rapidly detect and evaluate the threat that new pathogens present and to intervene quickly and effectively wherever necessary. Experience of SARS provides some encouragement that, given adequate resources, efforts to combat emerging pathogens can be successful, but further challenges lie ahead.

5 Biodegradation of PPG

As the use of PPG is expected to increase in the future, studies of its biodegradation are as important as those for PEG. The susceptibility of PPG to biological degradation has not been well characterized, although several groups have reported microbial assimilation of the monomer, 1,2-propylene glycol, which is supplied by the petrochemical industry at low cost. Fincher and Payne (1962) noted that a PEG-utilizing isolate could assimilate 1,2-propylene glycol and dimer as a sole carbon and energy source. Meanwhile, our PEG-utilizing isolates (Kawai, 1987), or those isolated by Watson and Jones (1977), did not grow on dimer or PPG. Neither did the anaerobic PEG-utilizing bacteria isolated by Schink and Stieb (1983) and Dwyer and Tiedje (1986) degrade PPG.

PPG-utilizing bacteria were isolated by an enrichment culture containing PPG 2000 or 4000 from soils or activated sludges acclimatized to PPG 2000 or 4000 for a few months under aerobic conditions (Kawai et al., 1977). As the culture medium became turbid after vigorous shaking, the cells were collected by centrifugation and resuspended in distilled water; the turbidity of the cell suspension was then measured using its optical density at 610 nm. In a preliminary study, Tween 20 was used to emulsify a PPG medium homogeneously, but it was found later that sonication would emulsify PPG uniformly (unpublished data). Strain No. 7 was the most favorable; it was identified as Corynebacterium sp., but later re-identified as Stenotrophomonas maltophilia based on 16S rRNA homology (Tachibana et al., 2002). The strain grew on various PPGs (diol and triol types, Mn 670∼4000), monomer and dimer, but did not assimilate PEGs (Table 3). The strain also grew on a few PEG-PPG copolymers that contained a larger amount of PPG than PEG; given the weight ratio of PPG to PEG (approximately 10:1), perhaps one or both of the terminal hydroxyl groups of PPG were not blocked by PEG and were available to the organism. In contrast, the block copolymers Epan 485 and 785 – both of which contained a greater amount of PEG than PPG – were utilized by a PEG 20,000-utilizing consortium E-1, which cannot utilize PPG (Kawai, 1992). Epan 450 and 750 seemed to be toxic for both PPG- and PEG-utilizing bacteria.

  • * Re-identified as Stenotrophomonas maltophilia (Tachibana et al., 2002).
  • b Reproduced from Kawai et al. (1977).

The aerobic metabolism of PPG by the strain was studied using dimer – dipropylene glycol (DPG) – as a model substrate for biodegradation, since PPG contains molecules of different molecular weights (Kawai et al., 1985). As commercially obtained PPG is randomly polymerized from optically active 1,2-propylene oxide, the resultant polymer must include atactic structures, and DPG must include (in theory) several structural and optical isomers, as shown in Figure 5. These isomers were separated by gas chromatography on a PEG 20M column (0.25 mm×25 m). R,R- and S,S-isomers were eluted together as a single peak, and R,S- and S,R-isomers as another single peak (Figure 6). The area ratios of peaks II to III and IV to V were almost equal by either total ion monitoring or selective ion monitoring on GC/MS analysis. From their mass spectra, relative quantities of five peaks and retention times, peak I was assigned to be structural isomer A, II and III to be B, and IV and V to be C (Figure 7). In peak I, optical isomers could not be separated; peaks II and III are diastereomers (the R,R-S,S and R,S-S,R complexes); peaks IV and V are also diastereomers (the R,R-S,S complex and the meso form). The ratio of structural isomers A, B, and C was 36.3, 48.5, and 15.2%, respectively.

Presumed structural and optical isomers in chemically synthesized DPG. (Modified from Kawai et al., 1985.)

Isolation and identification of isomers in DPG by GC/MS. (A) Total ion monitoring; (B) selective ion monitoring. (Modified from Kawai et al., 1985.)

Mass spectra of isomers contained in DPG. (Modified from Kawai et al., 1985.)

When the intact cells of the strain No. 7 were incubated with DPG, degradation of DPG depended on the shaking conditions, as in a nonshaken culture DPG was barely degraded. Taken together with the results of a culture filtrate or cell-free extract, this suggested that DPG was not metabolized by a hydrolytic reaction, but by an oxidative reaction. With vigorous shaking on a reciprocal shaker, over 90% of DPG was consumed within 23 h, but traces of metabolites were accumulated in the reaction mixtures. With moderate shaking, the degradation rate was slower, but considerable amounts of metabolic products were accumulated (M1 and M2). Hence, the reaction was carried out at 30°C for 20–50 h with moderate shaking (60–70 rpm). Metabolites (M1 and M2) were characterized by GC/MS analysis using a capillary column (0.25 mm×50 m) (Figure 8). M1 corresponded to 1,2-propylene glycol. M2 was further separated into two peaks, M-2 and M-2′. From the mass spectra of the two peaks, M-2 and M-2′ seemed to correspond to OC(CH3)CH2OCH2CH(CH3)OH and OC(CH3)CH2OCH(CH3)CH2OH, respectively. The fourth small peak, M3 was found on capillary GC, and considered to be OC(CH3)CH2OCH2CO(CH3), based on its elution position and mass spectrum. The residual isomers in the reaction supernatant (30-h incubation) were analyzed. Isomer A (peak I) was degraded by 51.4%, while the diastereomers of isomer B (peaks II and III) were equally degraded by 40.7%, and those of isomer C (peaks IV and V) were also equally degraded by 18.5%. This result supports the assumption that compounds corresponding to peaks II and III, and IV and V have the same structure, respectively. These results indicated that secondary alcohol groups were preferentially oxidized, with the bacterium utilizing all structural and optical isomers included in the dimer, as well as isotactic and atactic structures. 
Generally speaking, the stereospecificity of microorganisms and their enzymes is strictly controlled, but in this case the bacterium either has several stereospecific enzymes for optical isomers, or a nonstereospecific enzyme.
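As a rough worked example, the per-isomer figures can be combined into an overall estimate of DPG consumed in the 30-h incubation. The sketch below assumes that the starting DPG had the isomer composition reported earlier (36.3, 48.5, and 15.2% for A, B, and C) and that the stated degradation percentage of each isomer applies equally to both of its diastereomer peaks; neither assumption is stated by the source for this particular experiment, and the names `ISOMERS` and `overall_degradation` are hypothetical.

```python
# (share of total DPG in %, fraction of that isomer degraded in %) per structural isomer.
# Values are taken from the text; pairing them this way is an assumption.
ISOMERS = {
    "A": (36.3, 51.4),
    "B": (48.5, 40.7),
    "C": (15.2, 18.5),
}


def overall_degradation(isomers: dict) -> float:
    """Composition-weighted mean degradation (%) across structural isomers."""
    total_share = sum(share for share, _ in isomers.values())
    weighted = sum(share * degraded for share, degraded in isomers.values())
    return weighted / total_share


if __name__ == "__main__":
    print(f"overall DPG degraded ~ {overall_degradation(ISOMERS):.1f}%")  # about 41.2%
```

Under these assumptions roughly two-fifths of the DPG would have been consumed at 30 h, with isomer A contributing disproportionately relative to isomer C.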

Mass spectra of metabolic products from DPG. (Modified from Kawai et al., 1985.)

PPG was not degraded by either a culture filtrate or a cell-free extract, but could be degraded by intact cells and/or cell debris. Hence, PPG was not metabolized by extracellular enzymes or a hydrolase, but possibly by intracellular enzymes including membrane-bound enzymes. The intracellular metabolism of PPG was supported by the finding that bacterial cells entrapped in polyacrylamide gels degraded PPG efficiently (Kawai, 1987). The PPG-degrading activity of the cell-free extracts prepared from DPG-grown cells was investigated, but because of clouding due to the PPG attached to the cells the activity of the cell-free extract prepared from PPG-grown cells could not be measured. DCIP- and phenazine methosulfate (PMS)-dependent dehydrogenase (PPG-DH) activities were detected with cell-free extracts, and these must be linked with a respiratory chain of the organism. The effects of side or main chain structures on the growth of the PPG-utilizing strain were examined (Table 4). As the bacterium grew well on PBG 400 and 2000, a methyl group in PPG was replaceable by an ethyl group in PBG. The microorganism grew on polyglycerines to some extent, but not on polyglycidols, PEG, PTMG or polyvinyl alcohol. These results indicated that the bacterium recognized an ether oxygen adjacent to two or three carbon chains and a hydrophobic side group such as a methyl or ethyl group.

Polyglycidol 13 300 ( R )

  • a NG, no growth. * Re-identified as Stenotrophomonas maltophilia (Tachibana et al., 2002). Reproduced from Kawai (1993).

As PEG and PPG-monoalkyl derivatives may be assimilated in a similar manner to free PEG and PPG, yet dialkyl PEG/monoalkyl PPG acetate cannot be utilized by PEG/PPG-utilizers, at least one free alcohol group is necessary for metabolism (Kawai, 1993). These results – intracellular PPG-DH, a need for the terminal free alcohol group for growth, and oxidized metabolic products – suggested that PPG is incorporated into cells at least through the outer membranes, and is oxidatively metabolized. The principal mechanism is possibly similar to that for PEG, in that the terminal oxidation precedes the ether cleavage.

Several PPG dehydrogenase (PPG-DH) activities were found in PPG-utilizing S. maltophilia (Tachibana et al., 2002). During growth on PPG 2000, three PPG-DH peaks appeared at 36 h, 7 days, and 9 days: the majority (88%) of the first peak at the early logarithmic phase was localized in the cytoplasm. In the second peak (the highest) at the stationary phase of growth, activity was found in the membrane (54%), the periplasm (34%), and the cytoplasm (12%). The third peak may not contribute significantly to the assimilation of PPG, because PPG was already consumed in 9 days. As well as differing in their localization and induction times, these PPG-DHs also showed differences in their specificity towards electron acceptors. Further characterization of these enzymes is eagerly awaited.

Why Is the Black Sea Important?

There are two principal reasons why the Black Sea is uniquely important to science. First, it is an ideal place in which to study redox processes. Redox processes occur in sediments all over the world's oceans, but they are confined to very narrow sedimentary bands of millimeter or centimeter intervals that cannot be readily separated from one another. In contrast, the biogeochemical processes of the redox gradient in the Black Sea are spread over meters so samples can be readily collected for each redox process of interest. Furthermore, the broad redox gradient of the Black Sea is highly stable (5).

The second major reason for the study of the Black Sea's redox zone relates to astrobiology, the study of life in the universe. The Black Sea is analogous to the oceans of Earth's Proterozoic period (6). During this period, which occurred from ≈2.3 to 1.0 giga-annum B.P., cyanobacterial oxygen was produced but was insufficient to saturate the world's oceans. The result was aerobic surface waters overlying anoxic waters beneath, a structure that the Black Sea mimics today. The study of the Black Sea will also provide important clues to better understanding the evolution of life and metabolism on other planetary bodies such as Jupiter's moon Europa, which may have anoxic oceans.

Microbial Evolution and Co-Adaptation: A Tribute to the Life and Scientific Legacies of Joshua Lederberg: Workshop Summary.

Emerging infections, as defined by Stephen Morse of Columbia University in his contribution to this chapter, are infections that are rapidly increasing in incidence or geographic range, including such previously unrecognized diseases as HIV/AIDS, severe acute respiratory syndrome (SARS), Ebola hemorrhagic fever, and Nipah virus encephalitis. Among his many contributions to efforts to recognize and address the threat of emerging infections, Lederberg co-chaired the committees that produced two landmark Institute of Medicine (IOM) reports, Emerging Infections: Microbial Threats to Health in the United States (IOM, 1992) and Microbial Threats to Health (IOM, 2003), which provided a crucial framework for understanding the drivers of infectious disease emergence (Box WO-3 and Figure WO-13). As the papers in this chapter demonstrate, this framework continues to guide research to elucidate the origins of emerging infectious threats, to inform the analysis of recent patterns of disease emergence, and to identify risks for future disease emergence events so as to enable early detection and response in the event of an outbreak, and perhaps even predict its occurrence.

In the chapter’s first paper, Morse describes two distinct stages in the emergence of infectious diseases: the introduction of a new infection to a host population, and the establishment within and dissemination from this population. He considers the vast and largely uncharacterized “zoonotic pool” of possible human pathogens and the increasing opportunities for infection presented by ecological upheaval and globalization. Using hantavirus pulmonary syndrome and H5N1 influenza as examples, Morse demonstrates how zoonotic pathogens gain access to human populations. While many zoonotic pathogens periodically infect humans, few become adept at transmitting or propagating themselves, Morse observes. Human activity, however, is making this transition increasingly easy by creating efficient pathways for pathogen transmission around the globe. “We know what is responsible for emerging infections, and should be able to prevent them,” he concludes, through global surveillance, diagnostics, research, and above all, the political will to make them happen.

The authors of the chapter’s second paper, workshop presenter Mark Woolhouse and Eleanor Gaunt of the University of Edinburgh, draw several general conclusions about the ecological origins of novel human pathogens based on their analysis of human pathogen species discovered since 1980. Using a rigorous, formal methodology, Woolhouse and Gaunt produced and refined a catalog of the nearly 1,400 recognized human pathogen species. A subset of 87 species have been recognized since 1980 and are currently thought to be “novel” pathogens. The authors note four attributes of these novel pathogens that they expect will describe most future emergent microbes: a preponderance of RNA viruses; pathogens with nonhuman animal reservoirs; pathogens with a broad host range; and pathogens with some (perhaps initially limited) potential for human-human transmission.

Like Morse, Woolhouse and Gaunt consider the challenges faced by novel pathogens to become established in a new host population and achieve efficient transmission, conceptualizing Morse’s observation that “many are called but few are chosen” in graphic form, as a pyramid. It depicts the approximately 1,400 pathogens capable of infecting humans, of which 500 are capable of human-to-human transmission, and among which fewer than 150 have the potential to cause epidemic or endemic disease. Evolution, over a range of time scales, drives pathogens up the pyramid. The paper concludes with a discussion of the public health implications of the pyramid model, which suggests that ongoing global ecological change will continue to produce novel infectious diseases at or near the current rate of three per year.

In contrast to other contributors to this chapter, who focus on what, why, and where infectious diseases emerge, Jonathan Eisen, of the University of California, Davis, considers how new functions and processes evolve to generate novel pathogens. Eisen investigates the origin of microbial novelty by integrating evolutionary analyses with studies of genome sequences, a field he terms “phylogenomics.” In his essay, he illustrates the results of such analyses in a series of “phylogenomic tales” that describe the use of phylogenomics to predict the function of uncharacterized genes in a variety of organisms, and in elucidating the genetic basis of a complex symbiotic relationship involving three species.

Knowledge of microbial genomes, and the functions they encode, is severely limited, Eisen observes. Among 40 phyla of bacteria, for example, most of the available genomic sequences were from only three phyla; sequencing of Archaea and Eukaryote genomes has proceeded in a similarly sporadic manner. To fill these gaps in our knowledge of the “tree of life,” his group has begun an initiative called the Genomic Encyclopedia of Bacteria and Archaea. Eisen describes this effort and advocates the further integration of information on microbial phylogeny, genetic sequence, and gene function with biogeographical data, in order to produce a “field guide to microbes.”

The chapter’s final paper, by Peter Daszak of the Consortium for Conservation Medicine, Wildlife Trust, makes the leap from knowing how infectious diseases emerge to predicting where, and under what circumstances, an emergent disease event is likely to occur. Daszak presents several examples of his group’s efforts to build predictive approaches to infectious disease emergence based on a thorough understanding of the underlying ecology. These include constructing a model to predict relative risks for Nipah virus reemergence in Malaysia, where a 1999 outbreak devastated a thriving pig farming industry; identifying likely sources by which West Nile virus could spread to Hawaii, the Galapagos, and Barbados; and determining likely reservoirs of H5N1 influenza for specific geographic locations worldwide.

Daszak’s group constructed a database of emerging infectious disease “events” first reported in human populations between 1940 and 2004, which they have used to examine correspondences between events and ecological variables, such as human population density and wildlife diversity, in a geographical context. These analyses have revealed “hotspots” for infectious disease emergence. Daszak discusses the implications of hotspot location for global infectious disease surveillance, and describes how he and coworkers have used their knowledge of hotspots to target surveillance for Nipah virus in India, and also to discover a virus with zoonotic potential in Bangladesh.

Uses in studying pathogenicity and epidemiology

There are a number of laboratories using PM technology in novel ways to delve into various aspects of bacterial pathogenicity and the related issue of epidemiology. Diverse pathogens have been analyzed, including E. coli, Salmonella enterica, P. aeruginosa and Pseudomonas syringae, Enterobacter (now Cronobacter) sakazakii, Yersinia pestis, Vibrio cholerae, Campylobacter jejuni, Helicobacter pylori, S. aureus, Listeria monocytogenes, Mycobacterium sp., C. burnetii, and Legionella pneumophila.

One of the most interesting and productive series of works has come from studies of the highly clonal pathogen, S. enterica serovar Enteritidis. Different strains of this pathogen share 99.99% genomic identity yet, surprisingly, they vary greatly in their pathogenic properties. Guard-Bouldin and colleagues used PM technology to compare two strains of the same phage type (PT13a), with clearly distinct biological properties: one is a biofilm-forming strain and a good colonizer of chickens but does not infect eggs, whereas the second does not form biofilms but does infect eggs. Coinfection with both subtypes causes the most serious infections and disease spread (Guard-Bouldin, 2004). In spite of the very close genomic relatedness of these strains and the inability of genetic typing methods (DNA microarray hybridization, pulsed-field gel electrophoresis, and ribotyping) to detect significant polymorphisms (Morales, 2005, 2006), PM technology quickly uncovered many phenotypic differences between these two strains (the egg-infecting strain is more metabolically active) and provided essential information that has led to the elucidation of 447 small-scale genetic polymorphisms that appear to be important hot spots for genetic change such as the d-serine operon.

Interestingly, the d-serine operon was found to be a hot spot for genetic alteration in another highly clonal pathogen, E. coli O157. By analyzing a large collection of strains from foodborne outbreaks, Cebula and colleagues (Mukherjee, 2006) first found a sucrose-positive, d-serine-negative phenotype common to most O157 strains and subsequently confirmed that a sucrose operon had inserted into the d-serine operon. Genetic analysis of this genome region showed that it was a hot spot for genetic mosaicism. Cebula's group concluded that phenotypic analysis is a very useful tool in strain attribution (Bochner, 2008). In a comparative PM analysis of the O157 strain from the summer 2006 spinach outbreak in the United States, they showed (Mukherjee, 2008) that it had a rare N-acetyl-d-galactosamine-negative phenotype, which had only been found once previously. Both the Cebula and Guard-Bouldin laboratories have shown that, especially with clonal pathogens, it can be easier, more efficient, and more productive to go from phenotype back to genotype, instead of starting with genomic analyses. One other approach demonstrating the usefulness of analyzing phenotypes of natural isolates is the work of Hutkins and colleagues (Durso, 2004), where PM analysis of C metabolism showed significant differences between commensal strains of E. coli from cattle vs. O157:H7 strains. These differences in metabolic capabilities are likely to contribute to colonization capabilities in various environments.

An area of particular interest is the use of PM technology to examine changes in the physiology of a bacterium peculiar to stages of pathogenic adaptation in vivo. Preston and colleagues (Rico & Preston, 2008) analyzed phenotypic changes of the plant pathogen P. syringae, growing in laboratory culture media vs. growing in one of its environmental media: tomato apoplastic fluid. As previously mentioned, Omsland and Heinzen (Bochner, 2008) investigated the metabolic phenotypic properties of C. burnetii extracted from culture inside of mammalian cells. In a work on a related pathogen, L. pneumophila, which can also survive and grow in macrophages by evading the endocytic-lysosomal destruction pathway, Swanson and colleagues produced two novel and exciting findings. First (Sauer, 2005), they used PMs to show that phtA (phagosomal transporter defective) mutants require l-threonine for replication and that this amino acid triggers differentiation of the cell from a motile transmissive form to the nonmotile replicative form. More recently (Dalebroux, 2008; Edwards, 2008), this group reported using PM technology to screen a flaA–gfp fusion strain (an indicator of differentiation back to the motile transmissive state) under hundreds of culture conditions and discovered that growth arrest and the transition to the motile transmissive form is triggered by carboxylic acids, especially short-chain fatty acids.

Cool Job: One green chemist is mining zoo dung for biological helpers

Michelle O’Malley, 37, is a chemical and biological engineer at the University of California, Santa Barbara. Her team is looking to mine microbes from animal wastes in a search for making “greener” products.

October 11, 2019 at 5:50 am

You might call this group of lab members a poop patrol. Sometimes, they hang out at the Santa Barbara Zoo waiting for certain goats and sheep to do their business. But they’re not there just as a cleanup crew. To them, these droppings are more than waste. They’re the source of microbes that might one day become the route to greener fuels and chemicals.

Michelle O’Malley, 37, leads this group. She’s a chemical and biological engineer at the University of California campus across town. Her team is hunting for fungi that live in the digestive tract of plant-eating animals, such as sheep, goats, cows, giraffes and elephants. As anaerobes (AN-uh-roabs), those fungi can only live in the absence of oxygen. Together with some anaerobic bacteria, these fungi can break down grass and other plants. Along the way, they release sugars and other nutrients.

These particular microbial helpers do not usually show up in the human gut. That’s why much of the fibrous parts of plants that we eat goes undigested. It passes through our guts, largely unaltered, exiting as wastes out the other end.

Here at the zoo, the researchers are focusing on San Clemente Island goats and Navajo-Churro sheep. “It can be hard to tell the difference between goat and sheep poop,” notes O’Malley. So it helps to “watch the donation take place.”

Once collected, their pellets go to the lab. There, team members coax out the fungi that enable these animals to digest certain plants.

O’Malley had to learn what she calls “very old-school technology” to grow the finicky fungi in her lab. Then she turned to tracking down the distinctive plant-degrading enzymes that these fungi make. Her big-picture plan is to help society move away from fossil fuels, such as petroleum. In their place, she hopes to find more sustainable ways to make chemicals and fuels. Her raw materials could be agricultural leftovers — such as corn stover and wheat straw, for example. In the past, such leftover plant materials have often been viewed as waste because people can’t eat them.

Fungi point to helpful enzymes

The fibrous parts of plants are made of lignocellulose (Lig-no-SEL-yu-loas). That chemical, in turn, is made from smaller compounds. Among them are two sugars: cellulose and hemicellulose. And those sugars are rich in carbon. Carbon is also a major ingredient in most fuels and many other chemicals and drugs.

O’Malley’s team would like to mine lignocellulose for its carbon. The problem is that the fibrous parts of plants also contain lignin. It’s a structural material that serves “to keep microbes and their enzymes out” of plant cell walls, O’Malley explains. So lignin makes it difficult to get to the sugars in lignocellulose.

Industrial chemists have found ways to chemically or physically remove lignin. But those processes often are costly, toxic and wasteful (as lignin itself contains valuable chemicals).

Some fungi have a better approach. And that’s what drew O’Malley’s attention to plant-eating animals and their poop. It turns out, O’Malley says, that certain anaerobic fungi found in the guts of these animals could give industry a greener way to break down cellulose, hemicellulose — and even lignin.

After a goat’s grassy lunch, certain anaerobic fungi burrow into the plant cell walls. There they release enzymes that break down lignocellulose — lignin and all. These challenging fungi have a 10-syllable name: Neocallimastigomycota.

O’Malley studied under chemical engineer Anne Robinson. That was back when she was in graduate school at Carnegie Mellon University in Pittsburgh, Pa. Robinson isn’t surprised her former student is working with such a challenging microbe. She recalls her student as being “very unafraid to tackle problems” and “able to recognize the interesting or unusual result.”

One fungus: Many, many digestive enzymes

After graduate school, O’Malley contacted scientists who had worked with anaerobic fungi. Most had abandoned those studies. The microbes just proved too difficult to work with. Then Michael Theodorou invited her to work with him. He had pioneered research on such microbes. Today he works at Harper Adams University in Newport, England. Back then, though, Theodorou was in Wales. And there he taught O’Malley how to isolate and grow these fungi.

Their challenge was feeding the fungi what they needed to grow, all the while keeping oxygen out.

Her team now begins with roll tubes. Think of them as 3-D petri dishes that can support growth across their inner walls, all in an oxygen-free environment. Carbon dioxide and a food source with digestive fluids are added to the closed tubes. Next, her team rolls the tubes to get an even coat on the internal walls. After adding a fungi-rich poop slurry, they roll the tubes again. If the process works, fungal colonies grow.

“All of this requires a lot of careful, coordinated, quick movements,” O’Malley says. It’s “a lost art.”

In her UC Santa Barbara lab, O’Malley has been isolating fungi from zoo samples and studying their enzymes. Until now, “nobody really knew their true power,” she says. Those fungi (with a mouthful of a name) turn out to have genes to make the largest number of biomass-degrading enzymes known. That’s something her team reported in Science three years ago.

The researchers have now partnered these anaerobic fungi with brewers’ yeast (Saccharomyces cerevisiae). That yeast is a mainstay of the biochemical industry.

O’Malley’s group showed that the fungi efficiently broke down lignocellulose in reed canary grass. That freed the sugars to be converted to other products by the yeast. O’Malley and colleagues shared their findings, last year, in Biotechnology and Bioengineering.

With the goal of unleashing these powers for the biotechnology industry, O’Malley and her group are exploring whether it makes sense to harvest the enzymes from the fungi. Maybe they should just insert fungal DNA into yeast and bacteria. That could turn them into enzyme-making machines.

Figuring out the ideal way to break down lignocellulose “has been a really [unsolveable] problem for a long time,” says Michael Betenbaugh. This biochemical engineer works at Johns Hopkins University in Baltimore, Md. O’Malley “kind of forged out on her own,” he says. Her trick, he adds, was “looking for these unusual microbes that have been doing [it] for millennia.”

Power Words

3-D Short for three-dimensional. This term is an adjective for something that has features that can be described in three dimensions: height, width and length.

anaerobe An organism able to live in the absence of oxygen.

anaerobic Occurring in the absence of oxygen. Anaerobic reactions take place in oxygen-free locations.

bacteria (singular: bacterium) Single-celled organisms. These dwell nearly everywhere on Earth, from the bottom of the sea to inside other living organisms (such as plants and animals). Bacteria are one of the three domains of life on Earth.

biochemical (adj.) Referring to something made and used within living things.

bioengineering The application of technology for the beneficial manipulation of living things. Researchers in this field use the principles of biology and the techniques of engineering to design organisms or products that can mimic, replace or augment the chemical or physical processes present in existing organisms. This field includes researchers who genetically modify organisms, including microbes. It also includes researchers who design medical devices such as artificial hearts and artificial limbs. Someone who works in this field is known as a bioengineer.

biomass Matter that contains carbon and can be used as a fuel, especially in a power station for the generation of electricity. Plants are a kind of biomass.

carbon The chemical element having the atomic number 6. It is the physical basis of all life on Earth. Carbon exists freely as graphite and diamond. It is an important part of coal, limestone and petroleum, and is capable of self-bonding, chemically, to form an enormous number of chemically, biologically and commercially important molecules.

carbon dioxide (or CO2) A colorless, odorless gas produced by all animals when the oxygen they inhale reacts with the carbon-rich foods that they’ve eaten. Carbon dioxide also is released when organic matter burns (including fossil fuels like oil or gas). Carbon dioxide acts as a greenhouse gas, trapping heat in Earth’s atmosphere. Plants convert carbon dioxide into oxygen during photosynthesis, the process they use to make their own food.

cell The smallest structural and functional unit of an organism. Typically too small to see with the unaided eye, it consists of a watery fluid surrounded by a membrane or wall. Depending on their size, animals are made of anywhere from thousands to trillions of cells. Most organisms, such as yeasts, molds, bacteria and some algae, are composed of only one cell.

cellulose A type of fiber found in plant cell walls. It is formed by chains of glucose molecules.

cereals Plants in the grass family that provide an edible seed, which serves as a food staple (such as wheat, barley, corn, oats and rice).

chemical A substance formed from two or more atoms that unite (bond) in a fixed proportion and structure. For example, water is a chemical made when two hydrogen atoms bond to one oxygen atom. Its chemical formula is H2O. Chemical also can be an adjective to describe properties of materials that are the result of various reactions between different compounds.

chemical engineer A researcher who uses chemistry to solve problems related to the production of food, fuel, medicines and many other products.

colleague Someone who works with another; a co-worker or team member.

compound (often used as a synonym for chemical) A compound is a substance formed when two or more chemical elements unite (bond) in fixed proportions. For example, water is a compound made of two hydrogen atoms bonded to one oxygen atom. Its chemical symbol is H2O.

defecate To discharge solid waste from the body.

digest (noun: digestion) To break down food into simple compounds that the body can absorb and use for growth. Some sewage-treatment plants harness microbes to digest (or degrade) wastes so that the breakdown products can be recycled for use elsewhere in the environment.

digestive tract The tissues and organs through which foods enter and move through the body. In people, these organs include the esophagus, stomach, intestines, rectum and anus. Foods are digested (broken down) and absorbed along the way. Any materials not used will exit as wastes (feces and urine).

DNA (short for deoxyribonucleic acid) A long, double-stranded and spiral-shaped molecule inside most living cells that carries genetic instructions. It is built on a backbone of phosphorus, oxygen, and carbon atoms. In all living things, from plants and animals to microbes, these instructions tell cells which molecules to make.

dung The feces of animals, also known as manure.

environment The sum of all of the things that exist around some organism or the process and the condition those things create. Environment may refer to the weather and ecosystem in which some animal lives, or, perhaps, the temperature and humidity (or even the placement of things in the vicinity of an item of interest).

enzymes Molecules made by living things to speed up chemical reactions.

fiber Something whose shape resembles a thread or filament. (in nutrition) Components of many fibrous plant-based foods. These so-called non-digestible fibers tend to come from cellulose, lignin, and pectin, all plant constituents that resist breakdown by the body’s digestive enzymes.

fossil fuel Any fuel, such as coal, petroleum (crude oil) or natural gas, that has developed within the Earth over millions of years from the decayed remains of bacteria, plants or animals.

fuel Any material that will release energy during a controlled chemical or nuclear reaction. Fossil fuels (coal, natural gas and petroleum) are a common type that liberate their energy through chemical reactions that take place when heated (usually to the point of burning).

fungus (plural: fungi) One of a group of single- or multiple-celled organisms that reproduce via spores and feed on living or decaying organic matter. Examples include mold, yeasts and mushrooms.

gene (adj. genetic) A segment of DNA that codes, or holds instructions, for a cell’s production of a protein. Offspring inherit genes from their parents. Genes influence how an organism looks and behaves.

graduate school A university program that offers advanced degrees, such as a Master’s or PhD degree. It’s called graduate school because it is started only after someone has already graduated from college (usually with a four-year degree).

gut An informal term for the gastrointestinal tract, especially the intestines.

hemicellulose A comparatively soft fiber found in a plant’s cell walls.

lignin A natural substance that helps strengthen the cell walls of plants. Although lignin is made from a large number of sugar molecules, which should provide energy, livestock can’t digest this material because of the way its sugars are chemically bonded together.

microbe Short for microorganism. A living thing that is too small to see with the unaided eye, including bacteria, some fungi and many other organisms such as amoebas. Most consist of a single cell.

millennia (singular: millennium) Thousands of years.

nutrient A vitamin, mineral, fat, carbohydrate or protein that a plant, animal or other organism requires as part of its food in order to survive.

oxygen A gas that makes up about 21 percent of Earth's atmosphere. All animals and many microorganisms need oxygen to fuel their growth (and metabolism).

petroleum A thick flammable liquid mixture of hydrocarbons. Petroleum is a fossil fuel mainly found beneath the Earth’s surface. It is the source of the chemicals used to make gasoline, lubricating oils, plastics and many other products.

stover Corn stalks, once the ears have been harvested. This fibrous material is used to feed livestock.

sustainable An adjective to describe the use of resources in such a way that they will continue to be available long into the future.

technology The application of scientific knowledge for practical purposes, especially in industry, or the devices, processes and systems that result from those efforts.

toxic Poisonous or able to harm or kill cells, tissues or whole organisms. The measure of risk posed by such a poison is its toxicity.

tract A particular, well-defined area. It can be a patch of land, such as the area on which a house is located. Or it can be a bit of real estate in the body. For instance, important parts of an animal’s body will include its respiratory tract (lungs and airways), reproductive tract (gonads and hormone systems important to reproduction) and gastro-intestinal tract (the stomach and intestines, or organs responsible for moving food, digesting it, absorbing it and eliminating wastes).

Wales One of the three components of Great Britain (the other two being England and Scotland). It’s also part of the United Kingdom (whose other members include England, Scotland and Northern Ireland).

waste Any materials that are left over from biological or other systems that have no value, so they can be disposed of as trash or recycled for some new use.

yeast One-celled fungi that can ferment carbohydrates (like sugars), producing carbon dioxide and alcohol. They also play a pivotal role in making many baked products rise.