3.3: Escherichia coli is a small prokaryotic cell - Biology

In this course, we will be using the proteobacterium E. coli to propagate copies of MET genes on plasmids, which are small circles of DNA that replicate in the E. coli cytoplasm. Unlike the E. coli strains that you hear about in food-borne outbreaks, our lab strain has been engineered for use in molecular biology labs and is unable to colonize the human intestinal tract. E. coli are particularly useful to molecular biologists because they rapidly grow to very high densities in laboratory culture media, reaching 1-10 billion cells per mL. Although E. coli are 10-100 times smaller than yeast cells, their sheer numbers and their distinct motion render them visible with the light microscope.

Leica DM500 Light microscope

To observe E. coli with any detail, you will need to use the 100X lens, which is also known as an oil immersion lens. This is the longest, most powerful and most expensive lens on the microscope, requiring extra care when using it. As the name implies, the 100X lens is immersed in a drop of oil on the slide. Immersion oil has the same refractive index as glass, so it prevents light from bending as it enters the lens. The oil should be removed immediately from the 100X lens after use. Oil should NEVER touch the 4X, 10X or 40X lenses, which are destroyed by the oil.
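
The index-matching argument can be checked with Snell's law, n1 sin(θ1) = n2 sin(θ2). The sketch below is only an illustration: the refractive indices (1.515 for glass and immersion oil, 1.000 for air) are typical textbook values assumed here, not figures from this course, and `refraction_angle` is a hypothetical helper name.

```python
import math

# Assumed, typical refractive indices (not taken from the course text).
N_GLASS = 1.515
N_OIL = 1.515
N_AIR = 1.000


def refraction_angle(incidence_deg, n_from, n_to):
    """Apply Snell's law, n_from * sin(t1) = n_to * sin(t2).

    Returns the refraction angle in degrees, or None when the ray is
    totally internally reflected and never leaves the first medium.
    """
    s = n_from * math.sin(math.radians(incidence_deg)) / n_to
    if s > 1.0:
        return None
    return math.degrees(math.asin(s))


# Compare a ray leaving the glass slide into oil versus into air.
for angle in (10, 30, 60):
    into_oil = refraction_angle(angle, N_GLASS, N_OIL)
    into_air = refraction_angle(angle, N_GLASS, N_AIR)
    print(angle, into_oil, into_air)
```

Under these assumptions a ray passes from glass into oil without changing direction, while the same ray entering air is bent away from the lens, and steep rays are lost entirely to total internal reflection.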

Vol 313, Issue 5788
11 August 2006

By Jean-Philippe Nougayrède, Stefan Homburg, Frédéric Taieb, Michèle Boury, Elzbieta Brzuszkiewicz, Gerhard Gottschalk, Carmen Buchrieser, Jörg Hacker, Ulrich Dobrindt, Eric Oswald

Science 11 Aug 2006: 848-851

Microbes that normally live in the gut produce a small molecule that slows the turnover of the gut lining by damaging host DNA, possibly enhancing their colonization.


Antibiotic-tolerant persisters are often implicated in treatment failure of chronic and relapsing bacterial infections, but the underlying molecular mechanisms have remained elusive. Controversies revolve around the relative contribution of specific genetic switches called toxin–antitoxin (TA) modules and global modulation of cellular core functions such as slow growth. Previous studies on uropathogenic Escherichia coli observed impaired persister formation for mutants lacking the pasTI locus that had been proposed to encode a TA module. Here, we show that pasTI is not a TA module and that the supposed toxin PasT is instead the bacterial homolog of mitochondrial protein Coq10 that enables the functionality of the respiratory electron carrier ubiquinone as a “lipid chaperone.” Consistently, pasTI mutants show pleiotropic phenotypes linked to defective electron transport such as decreased membrane potential and increased sensitivity to oxidative stress. We link impaired persister formation of pasTI mutants to a global distortion of cellular stress responses due to defective respiration. Remarkably, the ectopic expression of human coq10 largely complements the respiratory defects and decreased persister levels of pasTI mutants. Our work suggests that PasT/Coq10 has a central role in respiratory electron transport that is conserved from bacteria to humans and sustains bacterial tolerance to antibiotics.


Fixation and embedding of cells.

Escherichia coli strain RP437 was used as the control for wild-type experiments. Plasmids pHSe5.tsrQQQQ and pHSe5.tsrQEQE were used to produce Tsr in HCB721 cells, which do not express the chemotaxis-related proteins Tar, Tsr, Trg, Tap, CheA, CheW, CheR, and CheB, as described previously (32). Harvested cells were fixed at room temperature for 2.5 h in a mixture of 2% paraformaldehyde and 0.2% glutaraldehyde in the presence of 60 mM piperazine-N,N′-bis(2-ethanesulfonic acid) (PIPES), 50 mM HEPES (pH 6.9), 4 mM MgCl2, and 20 mM EGTA. The fixed cells were collected by centrifugation and resuspended in prewarmed 0.1 M phosphate buffer containing 12% gelatin. After the cell-containing gelatin pellets were solidified on ice, they were cut into 1-mm cubes and incubated with a solution containing 2.3 M sucrose and 0.1 M sodium phosphate buffer (pH 7.4). Cubes of gelatin were frozen on the surfaces of aluminum pins by plunging them into liquid nitrogen and were sectioned with a cryoultramicrotome at [temperature illegible in source]. Labeling with anti-Tsr antibody (which specifically reacts with the conserved signaling domain) and protein A-gold and subsequent embedding in methyl cellulose with uranyl acetate were carried out as described previously (22).

Electron microscopy.

The projection images shown in Fig. 1 were recorded with a Gatan 2K charge-coupled device camera mounted on a Tecnai 12 electron microscope (FEI Corporation, Hillsboro, Oreg.) equipped with an LaB6 filament operating at 120 kV. For tomography, a series of images was recorded at room temperature with the aid of a Gatan 2K charge-coupled device (magnification, ∼[illegible],500) by tilting the specimen from [illegible]° to 70° in increments of 0.5° in a Tecnai F30 microscope equipped with a field emission gun tip operating at 300 kV. Images were recorded at underfocus values that were between 2 and 3 μm along the tilt axis. A back-projection algorithm, as implemented in the IMOD reconstruction package (15), was used to convert the information present in the series of tilted projection images into three-dimensional density maps.

Preparation of Tween 80-extracted membranes.

Membrane preparations (16) isolated on sucrose gradients were typically incubated with Tween 80 at a protein/Tween 80 molar ratio of 0.004 for about 4 h as described previously (32).

Segmentation and rendering.

The tomogram was segmented in the environment of the program Amira (TGS Inc., San Diego, Calif.) by marking all regions in the volume where the bilayer (white lines in the slice) could be visualized clearly in three dimensions. An isosurface was created by tracing the path of the bilayer in each slice of the tomogram. Structural models of the two types of receptor assemblies shown in Fig. 3d and e were docked onto the isosurface by using the program 3dsmax (DISCREET, Montreal, Quebec, Canada). The coordinates for the receptor dimer were the coordinates in the model described by Kim et al. (14) and kindly provided by Sung-Hou Kim. Starting from the model of the dimer, a variety of plausible higher-order arrangements, such as the trimer of dimers shown in Fig. 3d and e, were then generated by using the electron microscopic images as a guide. One set of the plausible arrangements is shown in Fig. 3.

(a to c) Negatively stained electron micrographs obtained from a Tween 80 extract of cells expressing Tsr. The extract preserved the overall appearance of the membrane morphologies in the tomogram, and the images show in more detail the organization of the zippered structures in cross section (a) and top view (b), as well as rounded vesicular structures (c). Scale bars = 50 nm. (d) Molecular model for packing in the zippered regions based on the atomic model for the structure of Tsr shown in Fig. 1a and the electron crystallographic analysis of the crystalline regions (32). The packing in the crystalline areas of the membrane (b) roughly corresponds to a lattice with the following constants: a, [illegible] Å; b, [illegible] Å; and γ, [illegible]°. The cross-sectional area of each unit in the crystal is [illegible],000 Å², which can accommodate at most three Tsr dimers, based on the presence of four helices per Tsr monomer at the wider, periplasmic end of the receptor (20) and the known values of about 180 Å² that characterize the cross-sectional areas of helices in well-packed two-dimensional crystalline arrays (12). The array shown is one of many possible arrangements of the three dimers and is presented primarily to provide an indication of the density of receptor packing in the membrane. The white regions at the center of the zippered structure in panel a are interpreted as high-density regions where the cytoplasmic ends of the Tsr trimer are presumed to be interdigitated. The depth of the interdigitation is consistent with the proportional width of this region in the micrograph in panel a. The average internal spacing of the zippered structures in tomographic slices such as the slice shown in Fig. 2c is 31.5 ± 2.5 nm (averaged over 26 separate measurements). This compares well with the value for the same internal spacing in these negatively stained specimens (31.3 ± 0.6 nm [averaged over 10 measurements], as reported by Weis et al. [32]).
(e) Molecular model for packing in the vesicular regions (sectional view). The outer white ring of the circular structure in panel c is interpreted as the surface of the membrane region enclosing the volume, and the less clear inner ring is interpreted as a high-density region where the cytoplasmic ends of Tsr come together.
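
The packing argument in panel d (four helices per monomer, two monomers per dimer, about 180 Å² per helix) can be turned into a small calculation. This is a sketch only: the unit-cell area is illegible in the source, so the 5,000 Å² passed in below is a placeholder chosen for illustration, and `max_dimers_per_unit` is a hypothetical helper name.

```python
def max_dimers_per_unit(unit_area, helices_per_monomer=4, helix_area=180.0):
    """Upper bound on receptor dimers that fit in one lattice unit.

    unit_area   -- cross-sectional area of the unit, in square angstroms
    helix_area  -- area occupied by one well-packed helix (about 180 A^2)
    Each dimer contributes 2 * helices_per_monomer helices.
    """
    area_per_dimer = 2 * helices_per_monomer * helix_area
    return int(unit_area // area_per_dimer)


# Placeholder area: the real value is illegible in the source text.
EXAMPLE_UNIT_AREA = 5000.0
print(max_dimers_per_unit(EXAMPLE_UNIT_AREA))
```

With these numbers each dimer needs 1,440 Å², so any unit area between 4,320 and 5,760 Å² gives the same bound of three dimers.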

3. Pathway for phospholipid biosynthesis in E. coli

3.1. Synthesis of phosphatidic acid

As in eukaryotic cells, PA is the precursor to all the glycerol-based (as distinguished from the Lipid A core of lipopolysaccharide (LPS)) phospholipids of E. coli. PA is synthesized in two sequential steps employing long chain acyl-CoA or acyl-ACP (acyl carrier protein) for acylation first at the sn-1 and then the sn-2 position, catalyzed by the plsB [24] and plsC [25] gene products, respectively. PA [26] synthesis in bacteria is covered in detail elsewhere in this issue. Kennedy and many of his trainees have been the central figures in defining the pathway (see Fig. 3), metabolism and function of phospholipids in E. coli. This basic framework, along with the early studies of phospholipid synthesis in mammals, has served as a starting point for extending the understanding of phospholipid metabolism in Archaea [1, 27], yeast [28], plants [29] and somatic cells [30, 31], which can be accessed in numerous reviews, some of which are noted above.

Pathways for synthesis of phospholipids in E. coli. The numbered steps are carried out by the following enzymes, with their respective gene products named in parentheses:
1. CDP-diacylglycerol synthase (CdsA)
2. phosphatidylserine synthase (PssA)
3. phosphatidylserine decarboxylase (Psd)
4. phosphatidylglycerophosphate synthase (PgsA)
5. phosphatidylglycerophosphate phosphatase (Pgp) encoded by three genes
6. cardiolipin synthase (Cls) encoded by three genes with one substrate being PG in all three cases and the second substrate for and [ClsC] indicated by the brackets. Definitive identification of the second substrate for ClsB has not been established
7. phosphatidylglycerol:pre-membrane derived oligosaccharide (MDO) sn-glycerol-1-P transferase (MDO synthase)
8. diacylglycerol kinase (DgkA)

3.2. Formation of phosphatidylethanolamine and phosphatidylglycerol

In 1963 Kanfer and Kennedy [32] noted that rigorous characterization of bacterial phospholipids and the pathways leading to their biosynthesis were understudied relative to what was known for mammalian systems. They first labeled growing cells of E. coli with ³²Pi for increasing times from 30 sec to 30 min, which was followed by a chase of label. The lipid fraction was isolated, and the incorporation of label into the mild alkaline deacylation products of phospholipids was quantified after chromatographic separation. PE, phosphatidylserine (PS), PA and phosphatidylglycerol (PG) were all detected at early time points. Label in PA and PS steadily decreased with longer labeling times, consistent with their being intermediates to other phospholipids. Label in PE remained stable for several generations during the chase, indicating that the phosphate moiety does not turn over to other products. It was later shown that the fatty acid at the sn-1 position of PE [33, 34] is used to acylate the N-terminal amino acid of outer membrane lipoproteins, followed by the reformation of PE by a specific acyltransferase [35]. An important observation was that label in PG was not stable during the chase, indicating turnover to water-soluble products due to degradation or to some other phosphate-labeled compound. Cardiolipin (CL) had not yet been identified in E. coli. More interesting was the fate of much of this label in the synthesis of membrane-derived oligosaccharide (MDO) of the periplasmic space, a study that Kennedy returned to in the 1970s.

Next came the establishment of the pathway for PE and PG biosynthesis [36]. Involvement of CDP-ethanolamine was ruled out, but the conversion of PS to PE by decarboxylation was confirmed [37] to be the same as already seen in animal cells [38], while the formation of PS was quite different in E. coli. Building on the role of CDP-DAG in PI synthesis, Kanfer and Kennedy tried the liponucleotide as substrate with L-serine for the synthesis of PS followed by its decarboxylation. In mammalian cells, PS is made by exchange of L-serine with the hydrophilic head group of either PE or PC by two separate enzymes [39]. Thus far CDP-DAG-dependent PS synthesis is unique to bacteria and yeast and appears to be absent in higher eukaryotes except for its presence in wheat [40]. Use of sn-glycerol-3-phosphate in place of serine resulted in the formation of PG but not PG-phosphate (PGP), which they presumed was acted on by a phosphatase, as had been shown for the same reaction in chicken liver [41]. Inactivation of the PGP phosphatase in crude extracts by sulfhydryl reagents demonstrated the intermediate formation of PGP [42]. It would be nearly 50 years before the primary PGP phosphatase was identified [43]. In 1968 Carter demonstrated the synthesis of CDP-DAG from CTP and PA by a particulate fraction of E. coli [44].

3.3. Cardiolipin synthesis is different in E. coli than in eukaryotic cells

Of the major phospholipids of E. coli, only the synthesis of CL remained to be established. CL was first isolated and characterized from beef heart in pursuit of the substance in alcohol extracts that reacted with sera from patients with syphilis [45]. Pangborn started with 15 beef hearts for the initial alcohol extract followed by CdCl2 precipitation. After solubilization of the precipitate with petroleum ether and about 12 more extraction/precipitation steps, 5 g of pure CL was isolated. Although Pangborn proposed a structure for CL, it was not until 1958 that Macfarlane established the correct structure [46]. Kennedy’s group [47] demonstrated the incorporation of sn-[2-³H]glycerol-3-phosphate into PG, PGP and CL by a particulate fraction of E. coli dependent on the presence of CDP-DAG. They established that PGP conversion to PG was a prerequisite for CL synthesis, since use of sn-[2-³H]glycerol-3-[³²P] as substrate failed to incorporate label into CL. However, the strong stimulation by CDP-DAG of the incorporation of radiolabeled PG into CL led the investigators to draw the wrong conclusion about the mechanism by which E. coli makes CL.

In virtually all simple and complex eukaryotes, CL is made by the displacement of CMP from CDP-DAG by the free hydroxyl of PG at the sn-3 position of glycerol, as initially shown by van Deenen’s group [48]. However, as was later shown by Hirschberg and Kennedy [49], E. coli condenses two PG molecules to make CL with the release of glycerol. Considerable evidence had accumulated from E. coli and other bacteria in support of a non-CDP-DAG-dependent pathway for CL synthesis (see [49]). Notable among the evidence was the continued formation of CL in the absence of significant metabolic energy, the release of glycerol during CL synthesis and the incorporation of labeled PG into CL in the absence of CDP-DAG. In a series of elegant single and double label experiments, Hirschberg and Kennedy [49] established that CDP-DAG stimulated the formation of CL but did not directly participate in the reaction. They also ruled out any exchange reactions between PG and existing lipids in the crude membrane preparations. With the most common route to CL formation in eukaryotes established as proceeding through CDP-DAG, a clear division between prokaryotes and eukaryotes appeared to exist. However, as will be discussed later, this line has become blurred: neither pathway is completely restricted to prokaryotes or eukaryotes, and a third pathway for CL synthesis was recently found in E. coli [50]. Again, it was not until 2012 that all three genes encoding CL synthases in E. coli were accounted for [50].
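
The two routes to CL described above can be compared with a simple moiety-level balance. The sketch below is a bookkeeping illustration, not a chemical model: it treats each lipid as a bag of parts (phosphatidyl group, glycerol head group, CMP), ignores protons and water, and all names in the code are chosen here for illustration.

```python
from collections import Counter

# Moiety-level composition of each species (protons and water ignored).
PG = Counter(phosphatidyl=1, glycerol=1)       # phosphatidylglycerol
CL = Counter(phosphatidyl=2, glycerol=1)       # cardiolipin
CDP_DAG = Counter(phosphatidyl=1, CMP=1)       # CDP-diacylglycerol
GLYCEROL = Counter(glycerol=1)
CMP = Counter(CMP=1)


def total(species):
    """Sum the moieties of a list of species."""
    out = Counter()
    for s in species:
        out += s
    return out


def is_balanced(reactants, products):
    """True when both sides contain the same moieties."""
    return total(reactants) == total(products)


# E. coli route: two PG condense to CL, releasing glycerol.
print(is_balanced([PG, PG], [CL, GLYCEROL]))
# Eukaryotic route: PG displaces CMP from CDP-DAG.
print(is_balanced([PG, CDP_DAG], [CL, CMP]))
```

Both routes balance, and the difference lies in the by-product: glycerol in the condensation of two PG molecules, CMP in the CDP-DAG-dependent reaction. This is why glycerol release was informative evidence for the bacterial pathway.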

By the late 1960’s and early 1970’s, the basic outline for the synthesis of the major phospholipids in bacteria and somatic cells had been established. What followed was an era of enzyme purification, establishment of enzymological properties and the identification of genes encoding the enzymes. Around 1969 to 1974 the Kennedy lab was populated by a group of medical students and postdoctoral fellows who initiated many of the above studies and became leaders in their fields after departing. Bill Wickner initiated the purification of the first membrane-associated enzyme that carried out a step in phospholipid metabolism. He went on to successfully purify the complex E. coli replication machinery in Arthur Kornberg’s lab followed by his own work that defined how proteins are inserted into and translocated across the E. coli membrane. Chris Raetz began purification of several enzymes of phospholipid metabolism and then during his postdoc with Herb Tabor developed novel methods to isolate mutants in phospholipid metabolism. During his independent years he defined the “Raetz Pathway” for the synthesis of the membrane embedded core of LPS. Carlos Hirschberg determined how CL was made in E. coli and then in his independent career defined many of the important steps in synthesis of the carbohydrate moieties of glycoproteins. Ed Dennis studied several aspects of phospholipid metabolism in Tetrahymena and then went on to be a leader in studying the role of phospholipases in cell signaling. I was fortunate to be in the lab at the time where my interests in analysis of phospholipid biosynthetic enzymes began and was followed by pursuing the underlying genetics of these enzymes and finally studying the role of phospholipids in cell function.


The work of Kiwako Sakabe, Reiji Okazaki and Tsuneko Okazaki provided experimental evidence supporting the hypothesis that DNA replication is a discontinuous process. Previously, it was commonly accepted that replication was continuous in both the 3' to 5' and 5' to 3' directions. The labels 3' and 5' refer to specifically numbered carbons on the deoxyribose ring in nucleic acids and indicate the orientation, or directionality, of a strand. In 1967, Tsuneko Okazaki and Toru Ogawa pointed out that no known mechanism supported continuous replication in the 3' to 5' direction; DNA polymerase, a replication enzyme, was known to synthesize only 5' to 3'. The team hypothesized that if replication were discontinuous, short strands of DNA synthesized at the replicating point could be attached in the 5' to 3' direction to the older strand. [5]

To distinguish the method of replication used by DNA experimentally, the team pulse-labeled newly replicated areas of Escherichia coli chromosomes, denatured, and extracted the DNA. A large number of radioactive short units meant that the replication method was likely discontinuous. The hypothesis was further supported by the discovery of polynucleotide ligase, an enzyme that links short DNA strands together. [6]

In 1968, Reiji and Tsuneko Okazaki gathered additional evidence of nascent DNA strands. They hypothesized that if discontinuous replication, involving short DNA chains linked together by polynucleotide ligase, is the mechanism used in DNA synthesis, then "newly synthesized short DNA chains would accumulate in the cell under conditions where the function of ligase is temporarily impaired." E. coli were infected with bacteriophage T4 that produce temperature-sensitive polynucleotide ligase. The cells infected with the T4 phages accumulated a large number of short, newly synthesized DNA chains, as predicted in the hypothesis, when exposed to high temperatures. This experiment further supported the Okazakis' hypothesis of discontinuous replication and linkage by polynucleotide ligase. It disproved the notion that short chains were produced during the extraction process as well. [7]

The Okazakis' experiments provided extensive information on the replication process of DNA and the existence of short, newly synthesized DNA chains that later became known as Okazaki fragments.

Two pathways have been proposed to process Okazaki fragments: the short flap pathway and the long flap pathway.

Short Flap Pathway

In the short flap pathway in eukaryotes, the lagging strand of DNA is primed at short intervals. Only one nuclease, FEN1, is involved in the short pathway. Pol δ frequently encounters the downstream primed Okazaki fragment and displaces the RNA/DNA initiator primer into a 5′ flap. The FEN1 5′-3′ endonuclease recognizes the displaced 5′ flap and cleaves it, creating a substrate for ligation. In this way the Pol α-synthesized primer is removed. Studies of FEN1 suggest a "tracking" model in which the nuclease moves from the 5′ end of the flap to its base to perform cleavage. Pol δ does not possess a nuclease activity to cleave the displaced flap. FEN1 cleaves short flaps immediately after they form. Cleavage is inhibited when the 5′ end of the DNA flap is blocked either with a complementary primer or with a biotin-conjugated streptavidin moiety. DNA ligase seals the nick made by FEN1, creating a functional continuous double strand of DNA. PCNA stimulates the enzymatic functions of both FEN1 and DNA ligase, and this interaction is crucial for proper ligation of the lagging DNA strand. Sequential strand displacement and cleavage by Pol δ and FEN1, respectively, helps to remove the entire initiator RNA before ligation. Many rounds of displacement and cleavage are required to remove the initiator primer. The flap that is created is thus processed and matured by the short flap pathway.
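
The repeated displace-and-cleave cycle can be pictured with a toy model. The sketch below is purely illustrative: it assumes every cycle displaces a flap of fixed length that FEN1 removes completely, and the primer and flap lengths (10 nt and 3 nt) are hypothetical numbers, not values from the text.

```python
def short_flap_cycles(primer_len, flap_len):
    """Simulate removal of an initiator primer by short-flap processing.

    Each cycle stands for one strand-displacement step by Pol delta
    followed by one FEN1 cut. Returns a list of (nt_cut, nt_remaining)
    tuples, one per cycle; ligation would follow the last cycle.
    """
    if primer_len < 0 or flap_len <= 0:
        raise ValueError("primer_len must be >= 0 and flap_len > 0")
    remaining = primer_len
    events = []
    while remaining > 0:
        cut = min(flap_len, remaining)  # the last flap may be shorter
        remaining -= cut
        events.append((cut, remaining))
    return events


# Hypothetical example: a 10-nt primer removed in 3-nt flaps.
for cut, left in short_flap_cycles(10, 3):
    print(f"FEN1 removes {cut} nt, {left} nt of primer left")
```

In this toy model the number of cycles grows as the flaps get shorter, which matches the statement that many displacement and cleavage reactions are needed before the nick can be ligated.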

Long Flap Pathway

In some cases, FEN1 engages for only a short period of time and disengages from the replication complex. This delays cleavage, so the flaps displaced by Pol δ become long. When a flap reaches a sufficient length, RPA can bind it stably. RPA-bound flaps are refractory to FEN1 cleavage and require another nuclease for processing, which has been identified as DNA2. Defects in DNA2 are reported to be suppressed by FEN1 overexpression. DNA2 has been shown to work with FEN1 to process long flaps. DNA2 can dissociate RPA from a long flap using a mechanism like that of FEN1: it binds the flap and threads the 5′ end of the flap. The nuclease cleaves the flap, making it too short to bind RPA, and the shortened flap is then available for FEN1 cleavage and ligation. This is known as the long flap pathway. DNA2 can also substitute for FEN1 as a backup nuclease activity, but the process is not efficient.

Alternate pathway

Until recently, there were only two known pathways to process Okazaki fragments. However, current investigations have concluded that a new pathway for Okazaki fragment processing exists. This alternate pathway involves Pol δ with Pif1, which perform the same flap removal process as Pol δ and FEN1. [8]

Primase

Primase adds RNA primers onto the lagging strand, which allows synthesis of Okazaki fragments from 5' to 3'. However, primase creates RNA primers at a much lower rate than that at which DNA polymerase synthesizes DNA on the leading strand. DNA polymerase on the lagging strand also has to be continually recycled to construct Okazaki fragments following RNA primers. This makes the speed of lagging strand synthesis much lower than that of the leading strand. To solve this, primase acts as a temporary stop signal, briefly halting the progression of the replication fork during DNA replication. This molecular process prevents the leading strand from overtaking the lagging strand. [9]

DNA polymerase δ

New DNA is made during this phase by enzymes that synthesize DNA in the 5′ to 3′ direction. DNA polymerase is essential for both the leading strand, which is made as a continuous strand, and the lagging strand, which is made in small pieces. Lagging-strand synthesis involves extension of the newly synthesized fragment and displacement of the RNA and DNA segment. Synthesis occurs in three phases with two different polymerases, DNA polymerase α-primase and DNA polymerase δ. The process starts when the clamp loader displaces polymerase α-primase from the RNA and DNA primer and loads the sliding clamp onto the DNA. DNA polymerase δ then assembles into its holoenzyme form and synthesis begins. Synthesis continues until the 5′ end of the previous Okazaki fragment is reached. Okazaki fragment processing then proceeds to join the newly synthesized fragment to the lagging strand. A final function of DNA polymerase δ is to supplement the 5′ flap endonuclease activity of FEN1/RAD27. The rad27-p allele is lethal in most combinations but was viable with the rad27-p polymerase and exo1. Both rad27-p polymerase and exo1 show strong synergistic increases in CAN1 duplication mutations. This mutation is viable only because of the double-strand break repair genes RAD50, RAD51 and RAD52. RAD27/FEN1 creates nicks between adjacent Okazaki fragments while minimizing the amount of strand displacement in the lagging strand.

DNA ligase I

During lagging strand synthesis, DNA ligase I connects the Okazaki fragments, following replacement of the RNA primers with DNA nucleotides by DNA polymerase δ. Okazaki fragments that are not ligated can cause double-strand breaks, which cleave the DNA. [10] Since only a small number of double-strand breaks are tolerated, and only a small number can be repaired, enough ligation failures could be lethal to the cell.

Further research implicates the supplementary role of proliferating cell nuclear antigen (PCNA) to DNA ligase I's function of joining Okazaki fragments. When the PCNA binding site on DNA ligase I is inactive, DNA ligase I's ability to connect Okazaki fragments is severely impaired. Thus, a proposed mechanism follows: after a PCNA-DNA polymerase δ complex synthesizes Okazaki fragments, the DNA polymerase δ is released. Then, DNA ligase I binds to the PCNA, which is clamped to the nicks of the lagging strand, and catalyzes the formation of phosphodiester bonds. [11] [12] [13]

Flap endonuclease 1

Flap endonuclease 1 (FEN1) is responsible for processing Okazaki fragments. It works with DNA polymerase to remove the RNA primer of an Okazaki fragment and can remove the 5' ribonucleotide and 5' flaps when DNA polymerase displaces the strands during lagging strand synthesis. The removal of these flaps involves a process called nick translation and creates a nick for ligation. Thus, FEN1's function is necessary to Okazaki fragment maturation in forming a long continuous DNA strand. Likewise, during DNA base repair, the damaged nucleotide is displaced into a flap and subsequently removed by FEN1. [14] [15]

Dna2 endonuclease

The structure and properties of Dna2 endonuclease are not well characterized, but its substrates can be described as single-stranded DNA (ssDNA) with free ends. Dna2 endonuclease is essential for cleaving long DNA flaps that escape FEN1 during Okazaki fragment processing. Dna2 endonuclease is responsible for the removal of the initiator RNA segment on Okazaki fragments. Dna2 endonuclease also has a pivotal role in processing the intermediates created during diverse DNA metabolic processes and functions in telomere maintenance. [16] [17] [18] [19] [20]

Dna2 endonuclease becomes active when a terminal RNA segment is attached at the 5′ end, because it translocates in the 5′ to 3′ direction. In the presence of the single-stranded DNA-binding protein RPA, the DNA 5′ flaps become too long, and the nicks no longer fit as substrates for FEN1. This prevents FEN1 from removing the 5′ flaps. Thus, Dna2's role is to reduce the 3′ end of these fragments, making it possible for FEN1 to cut the flaps and making Okazaki fragment maturation more efficient. During Okazaki fragment processing, the Dna2 helicase and endonuclease activities are inseparable. Dna2 endonuclease does not depend on the 5′-tailed fork structure for its activity. Unproductive binding has been known to create blocks to FEN1 cleavage and tracking. It is known that ATP reduces activity but promotes the release of the 3′-end label. Studies have suggested a new model in which Dna2 endonuclease and FEN1 are each partially responsible for Okazaki fragment maturation. [19] [17] [16] [21]

Newly synthesized DNA fragments on the lagging strand, known as Okazaki fragments, are joined by DNA ligase to form a new strand of DNA. Two strands are created when DNA is synthesized. The leading strand is synthesized continuously, and its elongation exposes the template that is used for the lagging strand (Okazaki fragments). During DNA replication, the RNA and DNA primers are removed from the lagging strand so that adjacent Okazaki fragments can be joined. Because the process is so common, Okazaki fragment maturation takes place around a million times during one round of DNA replication. For maturation to occur, the RNA primers that initiate each fragment must be processed so that the fragments can be ligated. Each primer serves as a building block for the synthesis of DNA in the lagging strand. On the lagging-strand template, the polymerase synthesizes in the direction opposite to the movement of the replication fork. Because synthesis on this template is discontinuous, it produces Okazaki fragments. Defects in the maturation of Okazaki fragments can potentially cause strands in the DNA to break and cause different forms of chromosome abnormality. These chromosomal mutations can affect the appearance, the number of sets, or the number of individual chromosomes. Since chromosome number is fixed for each species, such changes can also alter the DNA and cause defects in the gene pool of that species.

Okazaki fragments are present in both prokaryotes and eukaryotes. [22] DNA molecules in eukaryotes differ from the circular molecules of prokaryotes in that they are larger and usually have multiple origins of replication. This means that each eukaryotic chromosome is composed of many replicating units of DNA with multiple origins of replication. In comparison, prokaryotic DNA has only a single origin of replication. In eukaryotes, the replication forks, which are numerous all along the DNA, form "bubbles" in the DNA during replication. Replication forks form at specific points called autonomously replicating sequences (ARS). Eukaryotes have a clamp loader complex and a six-unit clamp called the proliferating cell nuclear antigen. [23] The efficient movement of the replication fork also relies critically on the rapid placement of sliding clamps at newly primed sites on the lagging DNA strand by ATP-dependent clamp loader complexes. This means that the piecewise generation of Okazaki fragments can keep up with the continuous synthesis of DNA on the leading strand. These clamp loader complexes are characteristic of all eukaryotes and account for some of the minor differences in the synthesis of Okazaki fragments in prokaryotes and eukaryotes. [24] The lengths of Okazaki fragments in prokaryotes and eukaryotes are different as well. Prokaryotes have Okazaki fragments that are considerably longer than those of eukaryotes. Eukaryotes typically have Okazaki fragments that are 100 to 200 nucleotides long, whereas fragments in prokaryotic E. coli can be 2,000 nucleotides long. The reason for this discrepancy is unknown.
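
The fragment lengths quoted above translate directly into fragment counts. The sketch below is a rough estimate under two assumptions that are not in the text: an E. coli chromosome of about 4.6 million base pairs, and total lagging-strand synthesis equal to one genome length (two forks, each copying half the chromosome with one lagging strand). The function name is hypothetical.

```python
def okazaki_fragment_count(lagging_strand_nt, fragment_nt):
    """Approximate number of Okazaki fragments needed to cover a
    lagging strand of the given length (rounded up)."""
    if fragment_nt <= 0:
        raise ValueError("fragment_nt must be positive")
    return -(-lagging_strand_nt // fragment_nt)  # integer ceiling


# Assumption: E. coli chromosome of ~4.6 Mb, not stated in the text.
ECOLI_GENOME_BP = 4_600_000

# Fragment lengths from the text: up to 2,000 nt in E. coli,
# 100 to 200 nt in eukaryotes (150 nt used as a midpoint).
print(okazaki_fragment_count(ECOLI_GENOME_BP, 2000))
print(okazaki_fragment_count(1_000_000, 150))  # per Mb of eukaryotic DNA
```

Under these assumptions E. coli needs on the order of a few thousand fragments per round of replication, while a eukaryote needs several thousand for every million base pairs copied.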

Each eukaryotic chromosome is composed of many replicating units of DNA with multiple origins of replication. In comparison, the prokaryotic E. coli chromosome has only a single origin of replication. Replication in prokaryotes occurs in the cytoplasm, and this all begins the replication that is formed of about 100 to 200 or more nucleotides. Eukaryotic DNA molecules have a significantly larger number of replicons, about 50,000 or more; however, replication does not occur at the same time on all of the replicons. In eukaryotes, DNA replication takes place in the nucleus. A plethora of replication forks form in just one replicating DNA molecule, and the start of DNA replication is moved away by the multi-subunit protein. This replication is slow, and sometimes only about 100 nucleotides per second are added.

We take from this that prokaryotic cells are simpler in structure: they have no nucleus or organelles and very little DNA, in the form of a single chromosome. Eukaryotic cells have a nucleus, multiple organelles, and more DNA arranged in linear chromosomes. Size is another difference between prokaryotic and eukaryotic cells. The average eukaryotic cell has about 25 times more DNA than a prokaryotic cell does. Replication occurs much faster in prokaryotic cells than in eukaryotic cells; bacteria sometimes take only 40 minutes, while animal cells can take up to 400 hours. Eukaryotes also have a distinct mechanism for replicating the telomeres at the ends of their linear chromosomes. Prokaryotes have circular chromosomes, which have no ends to synthesize. Prokaryotes have a short replication process that occurs continuously; eukaryotic cells, on the other hand, only undertake DNA replication during the S-phase of the cell cycle.
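The 40-minute figure implies a fork speed that can be compared with the roughly 100 nucleotides per second quoted for eukaryotes. The Python sketch below is a back-of-the-envelope estimate, not a measurement. It assumes an E. coli chromosome of about 4.6 million bp, copied by two forks that start at a single origin and move at constant speed; the names are our own.

```python
# Implied replication fork speed for a bacterial chromosome.
# Assumptions: two forks from one origin, constant speed, each fork
# copies half of the chromosome.

def fork_speed_nt_per_s(genome_bp: float, minutes: float, forks: int = 2) -> float:
    """Average nucleotides copied per second by each fork."""
    seconds = minutes * 60
    return genome_bp / forks / seconds

ECOLI_GENOME_BP = 4.6e6          # approximate chromosome size
REPLICATION_MINUTES = 40         # replication time quoted in the text
EUKARYOTIC_NT_PER_S = 100        # approximate eukaryotic rate quoted in the text

ecoli_speed = fork_speed_nt_per_s(ECOLI_GENOME_BP, REPLICATION_MINUTES)
speed_ratio = ecoli_speed / EUKARYOTIC_NT_PER_S

print(f"Implied E. coli fork speed: about {ecoli_speed:.0f} nt/s")
print(f"Ratio to the eukaryotic rate: about {speed_ratio:.1f}x")
```

Under these assumptions each bacterial fork moves at close to 1,000 nucleotides per second, about ten times the eukaryotic rate.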

The similarities lie in the steps of DNA replication. In both prokaryotes and eukaryotes, the DNA is unwound by an enzyme called DNA helicase, and new strands are created by enzymes called DNA polymerases. Both follow a similar pattern, called semi-conservative replication, in which the individual strands of DNA are produced in different directions, giving a leading and a lagging strand. The lagging strands are synthesized as Okazaki fragments that are soon joined together. In both groups of organisms, new DNA strands begin with short stretches of RNA.

Medical concepts associated with Okazaki fragments

Although cells undergo multiple steps in order to ensure there are no mutations in the genetic sequence, sometimes specific deletions and other genetic changes during Okazaki fragment maturation go unnoticed. Because Okazaki fragments are the set of nucleotides for the lagging strand, any alteration including deletions, insertions, or duplications from the original strand can cause a mutation if it is not detected and fixed. Other causes of mutations include problems with the proteins that aid in DNA replication. For example, a mutation related to primase affects RNA primer removal and can make the DNA strand more fragile and susceptible to breaks. Another mutation concerns polymerase α, which impairs the editing of the Okazaki fragment sequence and incorporation of the protein into the genetic material. Both alterations can lead to chromosomal aberrations, unintentional genetic rearrangement, and a variety of cancers later in life. [25]

In order to test the effects of the protein mutations on living organisms, researchers genetically altered lab mice to be homozygous for another mutation in a protein related to DNA replication, flap endonuclease 1, or FEN1. The results varied based on the specific gene alterations. The homozygous knockout mutant mice experienced a "failure of cell proliferation" and "early embryonic lethality" (27). The mice with the mutations F343A and F344A (also known as FFAA) died directly after birth due to complications including pancytopenia and pulmonary hypoplasia. This is because the FFAA mutation prevents FEN1 from interacting with PCNA (proliferating cell nuclear antigen), consequently not allowing it to complete its purpose during Okazaki fragment maturation. The interaction with this protein is considered to be the key molecular function in FEN1's biological function. The FFAA mutation causes defects in RNA primer removal and long-base pair repair, both of which cause many breaks in the DNA. Under careful observation, cells homozygous for FFAA FEN1 mutations seem to display only partial defects in maturation, meaning mice heterozygous for the mutation would be able to survive into adulthood, despite sustaining multiple small nicks in their genomes. Inevitably, however, these nicks prevent future DNA replication because the break causes the replication fork to collapse and causes double-strand breaks in the actual DNA sequence. In time, these nicks also cause full chromosome breaks, which could lead to severe mutations and cancers. Other mutations have been implemented with altered versions of polymerase α, leading to similar results. [25]


In many bacteria, the chromosome is a single covalently closed (circular) double-stranded DNA molecule that encodes the genetic information in a haploid form. The size of the DNA varies from 500,000 to several million base-pairs (bp) encoding from 500 to several thousand genes depending on the organism. The chromosomal DNA is present in cells in a highly condensed, organized form called nucleoid (nucleus-like), which is not encased by a nuclear membrane as in eukaryotic cells. The isolated nucleoid contains 80% DNA, 10% protein, and 10% RNA by weight [1, 2]. In this exposition, we review our current knowledge about (i) how chromosomal DNA becomes the nucleoid, (ii) the factors involved therein, (iii) what is known about its structure, and (iv) how some of the DNA structural aspects influence gene expression, using the gram-negative bacterium Escherichia coli as a model system. We also highlight some related issues that need to be resolved. This exposition is an extension of past reviews on the subject [3, 4].

There are two essential aspects of nucleoid formation: condensation of a large DNA into a small cellular space and functional organization of DNA in a three-dimensional form [5, 6]. The haploid circular chromosome in E. coli consists of ~4.6 × 10⁶ bp. If DNA is relaxed in the B form, it would have a circumference of ~1.5 millimeters (0.332 nm × 4.6 × 10⁶) (Fig 1A). However, a large DNA molecule such as the E. coli chromosomal DNA does not remain a straight rigid molecule in a suspension. Brownian motion will generate curvature and bends in DNA. The maximum length up to which a double-helical DNA remains straight by resisting the bending enforced by Brownian motion is ~50 nm or 150 bp, which is called the persistence length. Thus, pure DNA becomes substantially condensed without any additional factors; at thermal equilibrium, it assumes a random coil form. The random coil of E. coli chromosomal DNA (Fig 1B) would occupy a volume (4/3 π r³) of ~523 μm³, calculated from the radius of gyration (Rg = (√N a)/√6), where a is the Kuhn length (2 × persistence length) and N is the number of Kuhn length segments in the DNA (total length of the DNA divided by a). Although DNA is already condensed in the random coil form, it still cannot assume the volume of the nucleoid, which is less than a micron (Fig 1C). Thus, the inherent property of DNA is not sufficient: additional factors must help condense DNA further on the order of ~10³ (volume of the random coil divided by the nucleoid volume).

The second essential aspect of nucleoid formation is the functional arrangement of DNA. Chromosomal DNA is not only condensed but also functionally organized in a way that is compatible with DNA transaction processes such as replication, recombination, segregation, and transcription (Fig 1C). Almost five decades of research, beginning in 1971 [1], has shown that the final form of the nucleoid arises from a hierarchical organization of DNA. At the smallest scale (1 kb or less), nucleoid-associated DNA architectural proteins condense and organize DNA by bending, looping, bridging or wrapping DNA. At a larger scale (10 kb or larger), DNA forms plectonemic loops, a braided form of DNA induced by supercoiling. At the megabase scale, the plectonemic loops coalesce into six spatially organized domains (macrodomains), which are defined by more frequent physical interactions among DNA sites within the same macrodomain than between different macrodomains [7]. Long- and short-range DNA-DNA connections formed within and between the macrodomains contribute to condensation and functional organization. Finally, the nucleoid is a helical ellipsoid with regions of highly condensed DNA at the longitudinal axis [8–10]. We discuss these organizational features of the nucleoid and their molecular basis below.
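The random-coil estimate can be reproduced numerically. The Python sketch below is an illustration only, using the values given in the text (a rise of 0.332 nm per bp, 4.6 million bp, and a 50 nm persistence length); the variable names are our own. Without rounding, the radius of gyration comes out slightly above 5 μm and the sphere volume slightly above the quoted figure, which matches a radius rounded to 5 μm.

```python
import math

# Values taken from the text.
RISE_NM_PER_BP = 0.332      # axial rise per base pair in B-form DNA
GENOME_BP = 4.6e6           # E. coli chromosome size
PERSISTENCE_NM = 50.0       # persistence length of double-helical DNA

# Contour length (circumference) of the relaxed chromosome.
contour_nm = RISE_NM_PER_BP * GENOME_BP
contour_mm = contour_nm / 1e6

# Kuhn length a = 2 x persistence length; N = number of Kuhn segments.
kuhn_nm = 2 * PERSISTENCE_NM
n_segments = contour_nm / kuhn_nm

# Radius of gyration of the random coil: Rg = sqrt(N) * a / sqrt(6).
rg_nm = math.sqrt(n_segments) * kuhn_nm / math.sqrt(6)
rg_um = rg_nm / 1000

# Volume of a sphere with radius Rg, unrounded and with Rg rounded to 5 um.
volume_um3 = 4 / 3 * math.pi * rg_um ** 3
volume_rounded_um3 = 4 / 3 * math.pi * round(rg_um) ** 3

print(f"Contour length: {contour_mm:.2f} mm")
print(f"Kuhn segments:  {n_segments:.0f}")
print(f"Rg:             {rg_um:.2f} um")
print(f"Coil volume:    {volume_um3:.0f} um^3 (unrounded Rg)")
print(f"Coil volume:    {volume_rounded_um3:.0f} um^3 (Rg rounded to 5 um)")
```

Dividing either volume by a nucleoid volume of less than one cubic micron gives a compaction factor on the order of 10³, as stated in the text.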

A. An illustration of an open conformation of the circular genome of E. coli. Arrows represent bi-directional DNA replication. The genetic position of the origin of bi-directional DNA replication (oriC) and the site of chromosome decatenation (dif) in the replication termination region (ter) are marked. Colors represent specific segments of DNA as discussed in C. B. An illustration of a random coil form adopted by the pure circular DNA of E. coli at thermal equilibrium without supercoils and additional stabilizing factors [5, 6]. C. A cartoon of the chromosome of a newly born E. coli cell. The genomic DNA is not only condensed by 1000-fold compared to its pure random coil form but is also spatially organized. oriC and dif are localized in the mid-cell, and specific regions of the DNA indicated by colors in A organize into spatially distinct domains. Six spatial domains have been identified in E. coli. Four domains (Ori, Ter, Left, and Right) are structured and two (NS-right and NS-left) are non-structured (see section 4 of the main text for details). The condensed and organized form of the DNA together with its associated proteins and RNAs is called the nucleoid. Drawings are not to scale with each other.

DNA condensation and organization by nucleoid-associated proteins (NAPs)

In eukaryotes, genomic DNA is condensed in the form of a repeating array of DNA-protein particles called nucleosomes [11–13]. Each nucleosome consists of ~146 bp of DNA wrapped around an octameric complex of the histone proteins. Although bacteria do not have histones, they possess a group of DNA binding proteins referred to as nucleoid-associated proteins (NAPs) that are functionally analogous to histones in a broad sense. NAPs are highly abundant and constitute a significant proportion of the protein component of the nucleoid [14].

A distinctive characteristic of NAPs is their ability to bind DNA in both a specific (either sequence- or structure-specific) and non-sequence specific manner. As a result, NAPs are dual function proteins. The specific binding of NAPs is mostly involved in gene-specific transcription, DNA replication, recombination, and repair. At the peak of their abundance, the number of molecules of many NAPs is several orders of magnitude higher than the number of specific binding sites in the genome. Therefore, it is reasoned that NAPs bind to the chromosomal DNA mostly in the non-sequence specific mode and it is this mode that is crucial for chromosome compaction. It is noteworthy that the so-called non-sequence specific binding of a NAP may not be completely random. There could be low sequence specificity and/or structural specificity due to sequence-dependent DNA conformation or DNA conformation created by other NAPs.

Although molecular mechanisms of how NAPs condense DNA in vivo are not well understood, based on the extensive in vitro studies it appears that NAPs participate in chromosome compaction via the following mechanisms: NAPs induce and stabilize bends in DNA, thus aid in DNA condensation by reducing the persistence length (Fig 2A). NAPs condense DNA by bridging, wrapping, and bunching that could occur between nearby DNA segments or distant DNA segments of the chromosome (Fig 2C, 2D and 2E). Another mechanism by which NAPs participate in chromosome compaction is by constraining negative supercoils in DNA thus contributing to the topological organization of the chromosome (see section 3).

DNA organization by nucleoid-associated proteins (NAPs). A straight or curved grey line depicts DNA, and a blue sphere depicts a NAP. A. NAPs organize DNA by bending it. For example, IHF causes sharp DNA bending (bending angle > 160°) upon binding to a specific site, whereas HU introduces flexible bends (bend angles vary between 10–180°). IHF also induces flexible bends at non-sequence-specific sites similar to those induced by HU. Fis bends DNA at a 60–75° angle. B. In contrast to bending, NAPs can also cause straightening or stiffening of DNA. For example, H-NS spreads along DNA, and as a result, DNA becomes stiff. HU also causes stiffening of DNA at high concentrations (μm range). C. Simultaneous binding of a contiguous tract of NAP molecules (left) or a single NAP molecule (right) to a pair of adjacent or distant DNA sites results in DNA bridging. In an example of DNA bridging, a tract of laterally-bound H-NS molecules bridges two adjacent DNA sites. D. DNA bunching or bundling refers to DNA organization in which lateral multimerization of HU triggered by the non-sequence-specific binding brings several parallel DNA segments together, like in a bunch of flowers. E. NAP molecules bound adjacent to each other can wrap DNA by coherent bending. Fis molecules bound at tandem sites may organize DNA in this manner.

There are at least 12 NAPs identified in E. coli [15]. Here, we focus on the most extensively studied NAPs, HU, IHF, H-NS, and Fis. Their abundance and DNA binding properties are summarized in Tables 1 and 2. Current models of how each NAP condenses and organizes DNA are discussed in detail below.

Histone-like protein from E. coli strain U93 (HU) is an evolutionarily conserved protein in bacteria [28, 29]. HU exists in E. coli as homo- and heterodimers of two subunits, HUα and HUβ, sharing 69% amino acid identity [30]. Although it is referred to as a histone-like protein, close functional relatives of HU in eukaryotes are high-mobility group (HMG) proteins, and not histones [31, 32]. HU is a non-sequence specific DNA binding protein. It binds with low affinity to any linear DNA. However, it preferentially binds with high affinity to structurally distorted DNA (Table 2) [19, 33–37]. Examples of distorted DNA substrates include cruciform DNA, bulged DNA, and dsDNA containing a single-stranded break such as nicks, gaps, or forks. Furthermore, HU specifically binds and stabilizes a protein-mediated DNA loop [38]. In the structurally specific DNA binding mode, HU recognizes a common structural motif defined by bends or kinks created by distortion [17, 18, 39], whereas it binds to a linear DNA by locking the phosphate backbone [40]. While the high-affinity structurally-specific binding is required for specialized functions of HU such as site-specific recombination, DNA repair, DNA replication initiation, and gene regulation [41–43], it appears that the low-affinity general binding is involved in DNA condensation [40]. In chromatin-immunoprecipitation coupled with DNA sequencing (ChIP-Seq), HU does not reveal any specific binding events. Instead, it displays a uniform binding across the genome, presumably reflecting its mostly weak, non-sequence specific binding, thus masking the high-affinity binding in vivo (Fig 3).

A. The circular layout of the E. coli genome (as shown in Fig 1A) additionally depicting the genome occupancy of indicated NAPs in the growth phase. B. The genome occupancy of indicated NAPs in the stationary phase. The genome layout is the same as in A. The genome occupancy of each NAP, determined by ChIP-Seq, is plotted as a histogram (bin size 300 bp) in which the bar height is indicative of relative binding enrichment. The figures were prepared in Circos/0.69–6 using the data from [46, 47].

In strains lacking HU, the nucleoid is "decondensed", consistent with a role of HU in DNA compaction [44]. The following in vitro studies suggest possible mechanisms of how HU might condense and organize DNA in vivo. Not only does HU bind stably to distorted DNA with bends, but it also induces flexible bends even in a linear DNA at less than 100 nM concentration (Fig 2A) [45]. In contrast, HU shows the opposite architectural effect on DNA at higher physiologically-relevant concentrations [40, 45]. It forms rigid nucleoprotein filaments, causing the straightening of DNA rather than bending (Fig 2B). The filaments can further form a DNA network (DNA bunching) expandable both laterally and medially because of the HU-HU multimerization triggered by the non-sequence-specific DNA binding (Fig 2D) [40].

How are these behaviors of HU relevant inside the cell? The formation of filaments requires high-density binding of HU on DNA, one HU dimer per 9–20 bp DNA [40, 45]. But there is only one HU dimer every ~150 bp of the chromosomal DNA based on the estimated abundance of 30,000 HU dimers per cell (4,600,000 bp / 30,000) [16]. This indicates that flexible bends are more likely to occur in vivo. The flexible bending would cause condensation due to a reduction in the persistence length of DNA as shown by magnetic tweezers experiments [45], which allow studying condensation of a single DNA molecule by a DNA binding protein [48]. However, because of the cooperativity, the rigid filaments and networks could form in some regions in the chromosome. The filament formation alone does not induce condensation [45], but DNA networking or bunching can substantially contribute to condensation by bringing distant or nearby chromosome segments together [40].
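The spacing argument above is simple arithmetic, sketched below in Python with the numbers given in the text. The sketch assumes a single chromosome copy per cell and that every HU dimer is bound to DNA. Both are simplifications, and the variable names are our own.

```python
# How densely could HU coat the chromosome?
GENOME_BP = 4_600_000        # E. coli chromosome size
HU_DIMERS_PER_CELL = 30_000  # estimated abundance quoted in the text

# Average spacing if the dimers were spread evenly over one chromosome.
bp_per_dimer = GENOME_BP / HU_DIMERS_PER_CELL

# Filament formation needs one dimer per 9-20 bp.
FILAMENT_BP_PER_DIMER_MIN = 9
FILAMENT_BP_PER_DIMER_MAX = 20

# Dimers needed to coat the whole chromosome as a filament.
dimers_needed_min = GENOME_BP / FILAMENT_BP_PER_DIMER_MAX
dimers_needed_max = GENOME_BP / FILAMENT_BP_PER_DIMER_MIN

# Fraction of the chromosome the available dimers could coat as filament.
coverage_min = HU_DIMERS_PER_CELL * FILAMENT_BP_PER_DIMER_MIN / GENOME_BP
coverage_max = HU_DIMERS_PER_CELL * FILAMENT_BP_PER_DIMER_MAX / GENOME_BP

print(f"Average spacing: one dimer per {bp_per_dimer:.0f} bp")
print(f"Dimers needed for full coating: {dimers_needed_min:,.0f} "
      f"to {dimers_needed_max:,.0f}")
print(f"Coatable fraction: {coverage_min:.1%} to {coverage_max:.1%}")
```

Under these assumptions the available HU could coat only a small fraction of the chromosome at filament density, consistent with the suggestion that filaments form only in some regions.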

Integration host factor (IHF) is structurally almost identical to HU [49] but behaves differently from HU in many aspects. Unlike HU, which preferentially binds to a structural motif regardless of the sequence, IHF preferentially binds to a specific DNA sequence even though the specificity arises through the sequence-dependent DNA structure and deformability. The specific binding of IHF at cognate sites bends DNA sharply, by a bend angle of >160° [49]. The cognate sequence motif occurs about 3000 times in the E. coli genome [47]. The estimated abundance of IHF in the growth phase is about 6000 dimers per cell (Table 1). Assuming that one IHF dimer binds to a single motif and that the nucleoid contains more than one genome equivalent during the exponential growth phase, most of the IHF molecules would occupy specific sites in the genome and likely only condense DNA by inducing sharp bending (Fig 2A).

Besides preferential binding to a specific DNA sequence, IHF also binds to DNA in a non-sequence specific manner with the affinities similar to HU (Table 2). The role of the non-specific binding of IHF in DNA condensation appears to be critical in the stationary phase because the IHF abundance increases by five-fold in the stationary phase (Table 1) and the additional IHF dimers would likely bind the chromosomal DNA non-specifically [16, 50, 51]. Unlike HU, IHF does not form thick rigid filaments at higher concentrations. Instead, its non-specific binding also induces DNA bending albeit the degree of bending is much smaller than that at specific sites and is similar to the flexible bending induced by HU in a linear DNA at low concentrations [52]. In vitro, the bending induced by non-specific binding of IHF can cause DNA condensation and promotes the formation of higher-order nucleoprotein complexes depending on the concentrations of potassium chloride and magnesium chloride [52]. Whether the higher-order DNA organization by IHF occurs in vivo needs further investigation.

A feature that distinguishes histone-like or heat-stable nucleoid structuring protein (H-NS) [53–56] from other NAPs is the ability to switch from the homodimeric form at relatively low concentrations (<1 × 10⁻⁵ M) to an oligomeric state at higher levels [57, 58]. Because of its oligomerization properties, H-NS spreads laterally along AT-rich DNA in a nucleation reaction, where high-affinity sites function as nucleation centers [21, 59, 60]. The spreading of H-NS on DNA results in two opposite outcomes depending on the magnesium concentration in the reaction (Fig 2C). At low magnesium concentration (< 2 mM), H-NS forms rigid nucleoprotein filaments, whereas it forms inter- and intra-molecular bridges at higher magnesium concentrations (> 5 mM) [61–65]. The formation of rigid filaments results in the straightening of DNA (Fig 2B) with no condensation, whereas the bridging causes substantial DNA folding [64]. Analysis of H-NS binding in the genome by ChIP-Seq assays provided indirect evidence for the spreading of H-NS on DNA in vivo. H-NS binds selectively to 458 regions in the genome [46]. Although H-NS has been demonstrated to prefer curved DNA formed by repeated A-tracts in DNA sequences [59, 66], the basis of the selective binding is the presence of a conserved sequence motif found in AT-rich regions (Table 2) [20]. More importantly, the frequent occurrence of the sequence motif within an H-NS binding region, which can reinforce the cooperative protein-protein interactions, and the unusually long length of the binding region are consistent with the spreading of the protein. Which of the two outcomes, filament formation or DNA bridging, is prevalent in vivo? If the physiological concentration of magnesium inside cells is uniformly low (< 5 mM) [67], H-NS would form rigid nucleoprotein filaments in vivo. Alternatively, if there is an uneven distribution of magnesium in the cell, it could promote both DNA bridging and stiffening but in different regions of the nucleoid.

Furthermore, H-NS is best known as a global gene silencer that preferentially inhibits transcription of horizontally transferred genes and it is the rigid filament that leads to gene silencing [68, 69]. Taken together, it appears that the formation of rigid filaments is the most likely outcome of H-NS-DNA interactions in vivo that leads to gene silencing but does not induce DNA condensation. Consistently, the absence of H-NS does not change the nucleoid volume [70]. However, E. coli may experience high-magnesium concentration under some environmental conditions. In such conditions, H-NS can switch from its filament inducing form to the bridge inducing form that contributes to DNA condensation and organization.

Factor for Inversion Stimulation (Fis) is a sequence-specific DNA binding protein that binds to specific DNA sequences containing a 15-bp symmetric motif (Table 2) [24, 25, 71]. Like IHF, Fis induces DNA bending at cognate sites. The ability to bend DNA is apparent in the structure of the Fis homodimer. A Fis homodimer possesses two helix-turn-helix (HTH) motifs, one from each monomer. An HTH motif typically recognizes the DNA major groove. However, the distance between the DNA recognition helices of the two HTH motifs in the Fis homodimer is 25 Å, which is ~8 Å shorter than the pitch of a canonical B-DNA, indicating that the protein must bend or twist DNA to bind stably [72, 73]. Consistently, the crystal structure of Fis-DNA complexes shows that the distance between the recognition helices remains unchanged whereas DNA curves in the range of 60–75° bend angles [25]. There are 1464 Fis binding regions distributed across the E. coli genome, and a binding motif, identified computationally, matches the known 15-bp motif [46, 74]. Specific binding of Fis at such sites would induce bends in DNA and thus contribute to DNA condensation by reducing the persistence length of DNA. Furthermore, many Fis binding sites occur in tandem, such as those in the stable RNA promoters, e.g., the P1 promoter of the rRNA operon rrnB. The coherent bending by Fis at the tandem sites is likely to create a DNA micro-loop (Fig 2E) that can further contribute to DNA condensation [75].

Besides high-affinity specific binding to cognate sites, Fis can bind to a random DNA sequence (Table 2). The non-specific DNA binding is significant because Fis is as abundant as HU in the growth phase (Table 1). Therefore, most Fis molecules are expected to bind DNA in a non-sequence specific manner. Magnetic tweezers experiments show that this non-specific binding of Fis can contribute to DNA condensation and organization [76, 77]. Fis causes mild condensation of a single DNA molecule at <1 mM but induces substantial folding through the formation of DNA loops of an average size of ~800 bp at >1 mM. The loops in magnetic tweezers experiments are distinct from the micro-loops created by coherent DNA bending at cognate sites, as they require the formation of high-density DNA-protein complexes achieved by sequence-independent binding. Although the occurrence of such loops in vivo remains to be demonstrated, high-density binding of Fis may occur in vivo through the concerted action of both specific and non-specific binding. The in-tandem occurrence of specific sites might initiate a nucleation reaction similar to that of H-NS, and then non-specific binding would lead to the formation of localized high-density Fis arrays. The bridging between these localized regions (Fig 2C) can create large DNA loops [77]. Fis is exclusively present in the growth phase and not in the stationary phase [78, 79]. Thus, any role in chromosomal condensation by Fis must be specific to growing cells.

DNA condensation and organization by nucleoid-associated RNAs (naRNAs)

Early studies examining the effect of RNase A treatment on isolated nucleoids indicated that RNA participated in the stabilization of the nucleoid in the condensed state [80]. Moreover, treatment with RNase A disrupted the DNA fibers into thinner fibers, as observed by atomic force microscopy of the nucleoid using the “on-substrate lysis procedure” [81]. These findings demonstrated the participation of RNA in the nucleoid structure, but the identity of the RNA molecule(s) remained unknown until recently [44]. Most of the studies on HU focused on its DNA binding. However, HU also binds to dsRNA and RNA-DNA hybrids with a lower affinity similar to that with a linear dsDNA [82]. Moreover, HU preferentially binds to RNA containing secondary structures and an RNA-DNA hybrid in which the RNA contains a nick or overhang [82, 83]. The binding affinities of HU with these RNA substrates are similar to those with which it binds to distorted DNA. An immunoprecipitation of HU-bound RNA coupled to reverse transcription and microarray (RIP-Chip) study as well as an analysis of RNA from purified intact nucleoids identified nucleoid-associated RNA molecules that interact with HU [44]. Several of them are non-coding RNAs, and one such RNA, named naRNA4 (nucleoid-associated RNA 4), is encoded in a repetitive extragenic palindrome (REP325). In a strain lacking REP325, the nucleoid is decondensed as it is in a strain lacking HU [44]. naRNA4 most likely participates in DNA condensation by connecting DNA segments in the presence of HU [84]. Recent studies provide insights into the molecular mechanism of how naRNA4 establishes DNA-DNA connections. The RNA targets regions of DNA containing cruciform structures and forms an RNA-DNA complex that is critical for establishing DNA-DNA connections [85]. Surprisingly, although HU helps in the formation of the complex, it is not present in the final complex, indicating its potential role as a catalyst (chaperone). The nature of the RNA-DNA complex remains puzzling because the formation of the complex does not involve extensive Watson/Crick base pairing but is sensitive to RNase H, which cleaves RNA in an RNA-DNA hybrid. Moreover, the complex binds to an antibody specific to RNA-DNA hybrids.

DNA condensation and organization by supercoiling


Because of its helical structure, a double-stranded DNA molecule becomes topologically constrained in the covalently closed circular form which eliminates the rotation of the free ends [86]. The number of times the two strands cross each other in a topologically constrained DNA is called the linking number (Lk), which is equivalent to the number of helical turns or twists in a circular molecule (Fig 4). The Lk of a topological DNA remains invariant, no matter how the DNA molecule is deformed, as long as neither strand is broken.

A. A linear double-stranded DNA becomes a topologically constrained molecule if the two ends are covalently joined, forming a circle. Rules of DNA topology are explained using such a molecule (ccc-DNA) in which a numerical parameter called the linking number (Lk) defines the topology. Lk is a mathematical sum of two geometric parameters, twist (Tw) and writhe (Wr). A twist is the crossing of two strands, and writhe is coiling of the DNA double helix on its axis that requires bending. Lk is always an integer and remains invariant no matter how much the two strands are deformed. It can only be changed by introducing a break in one or both DNA strands by DNA metabolic enzymes called topoisomerases. B. A torsional strain created by a change in Lk of a relaxed, topologically constrained DNA manifests in the form of DNA supercoiling. A decrease in Lk (Lk<Lk0) induces negative supercoiling whereas an increase in Lk (Lk>Lk0) induces positive supercoiling. Only negative supercoiling is depicted here. For example, if a cut is introduced into a ccc-DNA and four turns are removed before rejoining the two strands, the DNA becomes negatively supercoiled with a decrease in the number of twists or writhe or both. Writhe can adopt two types of geometric structures called plectoneme and toroid. Plectonemes are characterized by the interwinding of the DNA double helix and an apical loop, whereas spiraling of DNA double helix around an axis forms toroids.

The Lk of DNA in the relaxed form is defined as Lk0. For any DNA, Lk0 can be calculated by dividing the length (in bp) of the DNA by the number of bp per helical turn, which is 10.4 bp for relaxed B-form DNA. Any deviation from Lk0 causes supercoiling in DNA. A decrease in the linking number (Lk<Lk0) creates negative supercoiling (Fig 4) whereas an increase in the linking number (Lk>Lk0) creates positive supercoiling (see [87, 88] for more detail of supercoiling).

The supercoiled state (when Lk is not equal to Lk0) results in a transition in DNA structure that can manifest as a change in the number of twists (underwinding in negatively supercoiled DNA, >10.4 bp per turn; overwinding in positively supercoiled DNA, <10.4 bp per turn) and/or in the formation of writhes, called supercoils (Fig 4). Thus, Lk is mathematically defined as a sign-dependent sum of the two geometric parameters, twist and writhe. A quantitative measure of supercoiling that is independent of the size of DNA molecules is the supercoiling density (σ), where σ = ΔLk/Lk0.
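The definitions of Lk0 and σ can be made concrete with a small calculation. The Python sketch below is illustrative only: the 5,200-bp circular plasmid is a hypothetical molecule chosen so that Lk0 is a round number, the removal of four turns follows the example in the Fig 4 legend, and the function names are our own.

```python
BP_PER_TURN = 10.4  # helical repeat of relaxed B-form DNA

def relaxed_linking_number(length_bp: float,
                           bp_per_turn: float = BP_PER_TURN) -> float:
    """Lk0: DNA length divided by the number of bp per helical turn."""
    return length_bp / bp_per_turn

def supercoiling_density(lk: float, lk0: float) -> float:
    """sigma = (Lk - Lk0) / Lk0, independent of the size of the molecule."""
    return (lk - lk0) / lk0

# Hypothetical 5,200-bp circular plasmid (not from the text).
plasmid_bp = 5200
lk0 = relaxed_linking_number(plasmid_bp)

# Cut, remove four turns, and rejoin: Lk drops by 4.
lk = lk0 - 4
sigma = supercoiling_density(lk, lk0)

print(f"Lk0 = {lk0:.1f}, Lk = {lk:.1f}, sigma = {sigma:.4f}")
```

A negative σ indicates negative supercoiling. Removing the same four turns from a larger molecule would give a σ closer to zero, which is why σ rather than ΔLk is used to compare molecules of different sizes.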

Writhes can adopt two structures: plectoneme and solenoid or toroid (Fig 4). A plectonemic structure arises from the interwinding of the helical axis (Fig 4). Toroidal supercoils originate when DNA forms several spirals around an axis that do not intersect with each other, like those in a telephone cord. The writhes in the plectonemic form are right- and left-handed in negatively or positively supercoiled DNA, respectively. The handedness of the toroidal supercoils is opposite to that of plectonemes. Both plectonemes and toroidal supercoils can be either in a free form or restrained in a bound form with proteins. The best example of bound toroidal supercoiling in biology is the eukaryotic nucleosome, in which DNA wraps around histones (Fig 5) [12].

A. A bacterial genome organizes as plectonemic supercoils. Half of the supercoils are present in free form, and nucleoid-associated proteins (NAPs), shown as colored spheres, restrain the remaining half. B. In contrast, a eukaryotic genome organizes as toroidal supercoils, induced by the wrapping of DNA around histone proteins (orange color). An octamer of histones with 146 bp of wrapped DNA is referred to as a nucleosome, and the genome organizes into a repeating array of nucleosomes.

The E. coli genome is organized as plectonemic supercoils.

In most bacteria, DNA is present in a supercoiled form. The circular nature of the E. coli chromosome makes it a topologically constrained molecule that is mostly negatively supercoiled, with an estimated average supercoiling density (σ) of -0.05 [89]. In eukaryotic chromatin, DNA is found mainly in the toroidal form that is restrained and defined by histones through the formation of nucleosomes. In contrast, in the E. coli nucleoid, about half of the chromosomal DNA is organized in the form of free, plectonemic supercoils (Fig 5) [90–92]. The remaining DNA is restrained in either the plectonemic form or alternative forms (see section 5.3.3), including but not limited to the toroidal form, by interaction with proteins such as NAPs. Thus, plectonemic supercoils represent the effective supercoiling of the E. coli genome that is responsible for its condensation and organization. Both plectonemic and toroidal supercoiling aid in DNA condensation. Notably, because plectonemic structures branch, they condense DNA less than toroidal structures do: a DNA molecule of the same size and supercoiling density is more compact in a toroidal form than in a plectonemic form. In addition to condensing DNA, supercoiling aids in DNA organization. It promotes DNA disentanglement by reducing the probability of catenation [93]. Supercoiling also helps bring two distant sites of DNA into proximity, thereby promoting a potential functional interaction between different segments of DNA.

Sources of supercoiling in E. coli.

Three factors contribute to generating and maintaining chromosomal DNA supercoiling in E. coli: (i) activities of topoisomerases, (ii) the act of transcription, and (iii) NAPs.


Topoisomerases are a particular category of DNA metabolic enzymes that create or remove supercoiling by breaking and then re-ligating DNA strands [94]. E. coli possesses four topoisomerases (Table 3). DNA gyrase introduces negative supercoiling in the presence of ATP and removes positive supercoiling in the absence of ATP [95]. Across all forms of life, DNA gyrase is the only topoisomerase that can create negative supercoiling, and it is because of this unique ability that bacterial genomes possess free negative supercoils. DNA gyrase is found in all bacteria but is absent from higher eukaryotes. In contrast, Topo I opposes DNA gyrase by relaxing negatively supercoiled DNA [96, 97]. There is genetic evidence to suggest that a balance between the opposing activities of DNA gyrase and Topo I is responsible for maintaining a steady-state level of average negative superhelicity in E. coli [98]. Both enzymes are essential for E. coli survival. A null strain of topA, the gene encoding Topo I, survives only because of the presence of suppressor mutations in the genes encoding DNA gyrase. These mutations reduce gyrase activity, suggesting that the excess negative supercoiling caused by the absence of Topo I is compensated by reduced negative supercoiling activity of DNA gyrase. Topo III is dispensable in E. coli and is not known to have any role in supercoiling [99]. The primary function of Topo IV is to resolve sister chromosomes. However, it has also been shown to contribute to the steady-state level of negative supercoiling by relaxing negative supercoiling together with Topo I [100, 101].


The twin supercoiling domain model proposed by Liu and Wang argues that unwinding of the DNA double helix during transcription induces supercoiling in DNA, as shown in Fig 6 [102]. According to their model, a transcribing RNA polymerase (RNAP) sliding along DNA forces the DNA to rotate on its helical axis. A hindrance to the free rotation of DNA might arise from a topological constraint, causing the DNA in front of RNAP to become over-twisted (positively supercoiled) and the DNA behind RNAP to become under-twisted (negatively supercoiled) (Fig 6). It has been found that a topological constraint is not needed, because RNAP generates sufficient torque to cause supercoiling even in a linear DNA template [103]. If DNA is already negatively supercoiled, this action relaxes the existing negative supercoils ahead of RNAP before causing a buildup of positive supercoils there, and introduces more negative supercoils behind RNAP. In principle, DNA gyrase and Topo I should remove excess positive and negative supercoils, respectively, but if the RNAP elongation rate exceeds the turnover of the two enzymes, transcription contributes to the steady-state level of supercoiling.
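To give a rough sense of scale for the twin supercoiling domain model: in the idealized limit where RNAP cannot rotate at all and no topoisomerase acts, each helical turn that is transcribed adds about one positive supercoil ahead and one negative supercoil behind. The sketch below makes that bookkeeping explicit. The function name and the 1,040-bp example are illustrative assumptions, not values from the cited studies.

```python
def twin_domain_supercoils(transcribed_bp, bp_per_turn=10.4):
    """Idealized twin-domain bookkeeping.

    Assumes a non-rotating RNAP and no topoisomerase activity, so every
    transcribed helical turn becomes +1 turn ahead and -1 turn behind.
    Returns the change in linking number on each side of RNAP.
    """
    turns = transcribed_bp / bp_per_turn
    return {"ahead": +turns, "behind": -turns}


# Hypothetical example: RNAP transcribes a 1,040-bp stretch.
delta = twin_domain_supercoils(1040)
print(delta)  # roughly +100 turns ahead and -100 turns behind
```

In a cell the real values would be smaller, since RNAP can rotate to some extent and DNA gyrase and Topo I continuously remove the excess.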

A. An example of topologically constrained DNA. A grey bar represents a topological constraint, e.g., a protein or a membrane anchor. B. Accommodation of RNA polymerase for transcription initiation results in the opening of the DNA double helix. C. An elongating RNA polymerase complex cannot rotate around the helical axis of DNA. Therefore, removal of helical turns by RNA polymerase causes overwinding of the topologically constrained DNA ahead and underwinding of the DNA behind, generating positively and negatively supercoiled DNA, respectively. Supercoiling can manifest either as a change in the number of twists, as shown in C, or as plectonemic writhe, as shown in D.

Control of supercoiling by NAPs.

In the eukaryotic chromatin, DNA is rarely present in the free supercoiled form because nucleosomes restrain almost all negative supercoiling through the tight binding of DNA to histones. Similarly, in E. coli, nucleoprotein complexes formed by NAPs restrain half of the supercoiling density of the nucleoid [89, 92]. In other words, if a NAP dissociates from a nucleoprotein complex, the DNA would adopt the free, plectonemic form. DNA binding of HU, Fis, and H-NS has been experimentally shown to restrain negative supercoiling in a relaxed but topologically constrained DNA [104–108]. They can do so either by changing the helical pitch of DNA or generating toroidal writhes by DNA bending and wrapping (Fig 2). Alternatively, NAPs can preferentially bind to and stabilize other forms of the underwound DNA such as cruciform structures and branched plectonemes. Fis has been reported to organize branched plectonemes through its binding to cross-over regions (Fig 5) and HU preferentially binds to cruciform structures [108].

NAPs also regulate DNA supercoiling indirectly. Fis can modulate supercoiling by repressing the transcription of the genes encoding DNA gyrase [109]. There is genetic evidence to suggest that HU controls supercoiling levels by stimulating DNA gyrase and reducing the activity of Topo I [110, 111]. In support of the genetic studies, HU was shown to stimulate DNA gyrase-catalyzed decatenation of DNA in vitro [112]. It is unclear mechanistically how HU modulates the activities of DNA gyrase and Topo I. HU might physically interact with DNA gyrase and Topo I, or the DNA organization activities of HU, such as DNA bending, may facilitate the action of DNA gyrase and inhibit that of Topo I.

Plectonemic supercoils organize into multiple topological domains.

One of the striking features of the nucleoid is that plectonemic supercoils are organized into multiple topological domains (Fig 7) [113]. In other words, a single cut in one domain will only relax that domain and not the others. A topological domain forms because of a supercoiling-diffusion barrier. Independent studies employing different methods have reported that the topological domains vary in size from 10 to 400 kb [91, 113, 114]. The random placement of barriers commonly observed in these studies seems to explain the wide variability in domain size.
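The link between random barrier placement and variable domain size can be illustrated with a small simulation. This is only a sketch under stated assumptions: the genome size of 4.6 Mb is an approximate figure for E. coli K-12, the number of barriers (100) is arbitrary, and barriers are placed uniformly at random, which is a simplification rather than a result from the cited studies.

```python
import random


def random_domain_sizes(genome_bp, n_barriers, rng):
    """Place barriers uniformly at random on a circular chromosome.

    Returns the sizes (in bp) of the domains between consecutive barriers.
    """
    positions = sorted(rng.sample(range(genome_bp), n_barriers))
    sizes = [b - a for a, b in zip(positions, positions[1:])]
    # Close the circle: the last domain runs from the last barrier to the first.
    sizes.append(genome_bp - positions[-1] + positions[0])
    return sizes


GENOME_BP = 4_600_000   # assumption: approximate E. coli K-12 genome size
N_BARRIERS = 100        # assumption: arbitrary illustrative number of barriers

sizes = random_domain_sizes(GENOME_BP, N_BARRIERS, random.Random(0))
print(min(sizes), max(sizes), sum(sizes) // len(sizes))
```

The mean domain size is fixed by the number of barriers (46 kb here), yet individual domains differ widely in size, which is the qualitative behavior described above.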

A. An illustration of a single topological domain of a supercoiled DNA. A single double-stranded cut anywhere would be sufficient to relax the supercoiling tension of the entire domain. B. An illustration of multiple topological domains in a supercoiled DNA molecule. The presence of supercoiling-diffusion barriers segregates a supercoiled DNA molecule into multiple topological domains. Hypothetical supercoiling-diffusion barriers are represented as green spheres. As a result, a single double-stranded cut will only relax one topological domain and not the others. Plectonemic supercoils of DNA within the E. coli nucleoid are organized into several topological domains, but only four domains with different numbers of supercoils are shown for simplicity.

Although the identities of domain barriers remain to be established, possible mechanisms responsible for the formation of the barriers include: (i) A domain barrier could form when a protein with an ability to restrain supercoils simultaneously binds to two distinct sites on the chromosome, forming a topologically isolated DNA loop or domain. It has been experimentally demonstrated that protein-mediated looping in supercoiled DNA can create a topological domain [115, 116]. NAPs such as H-NS and Fis are potential candidates, based on their DNA looping abilities and the distribution of their binding sites. (ii) Bacterial interspersed mosaic elements (BIMEs) also appear as potential candidates for domain barriers. BIMEs are palindromic repeat sequences that are usually found between genes. A BIME has been shown to impede supercoiling diffusion in a synthetically designed topological cassette inserted in the E. coli chromosome [117]. There are ~600 BIMEs distributed across the genome, possibly dividing the chromosome into 600 topological domains [118]. (iii) Barriers could also result from the attachment of DNA to the cell membrane through a protein that binds to both DNA and membrane, or through nascent transcription and translation of membrane-anchored proteins (see section 5.2). (iv) Transcription activity can generate supercoiling-diffusion barriers. An actively transcribing RNAP has been shown to block the dissipation of plectonemic supercoils, thereby forming a supercoiling-diffusion barrier [119–121].

Spatial organization of the nucleoid

Chromosomal interaction domains.

In recent years, the advent of a molecular method called chromosome conformation capture (3C) has allowed the spatial organization of chromosomes to be studied at high resolution in both bacteria and eukaryotes [122]. 3C and its version coupled with deep sequencing (Hi-C) [123] determine the physical proximity, if any, between any two genomic loci in 3D space (Fig 8A and 8B). High-resolution contact maps of bacterial chromosomes, including the E. coli chromosome, have revealed that a bacterial chromosome is segmented into many highly self-interacting regions called chromosomal interaction domains (CIDs) (Fig 8B) [124–126]. CIDs are equivalent to the topologically associating domains (TADs) observed in many eukaryotic chromosomes [127], suggesting that the formation of CIDs is a general phenomenon of genome organization. Two characteristics define CIDs or TADs. First, genomic regions of a CID physically interact with each other more frequently than with genomic regions outside that CID or with those of a neighboring CID. Second, a boundary between CIDs prevents physical interactions between genomic regions of two neighboring CIDs.

A. Chromosome conformation capture (3C) methods probe 3D genome organization by quantifying physical interactions between genomic loci that are nearby in 3D space but may be far apart in the linear genome. A genome is cross-linked with formaldehyde to preserve physical contacts between genomic loci. Subsequently, the genome is digested with a restriction enzyme. In the next step, DNA ligation is carried out at dilute DNA concentrations to favor intra-molecular ligation (between cross-linked fragments that are brought into physical proximity by 3D genome organization). The frequency of ligation events between distant DNA sites reflects a physical interaction. In the 3C method, ligation junctions are detected by semi-quantitative PCR amplification, in which amplification efficiency is a rough estimate of the frequency of pairwise physical contact between the genomic regions of interest. The 3C method probes a physical interaction between two specific regions identified a priori, whereas its Hi-C version detects physical interactions between all possible pairs of genomic regions simultaneously. In the Hi-C method, digested ends are filled in with a biotinylated adaptor before ligation. Ligated fragments are sheared and then enriched by a biotin pull-down. Ligation junctions are then detected and quantified by paired-end next-generation sequencing. B. Hi-C data are typically represented in the form of a two-dimensional matrix in which the x-axis and y-axis represent the genomic coordinates. The genome is usually divided into bins of a fixed size, e.g., 5 kb. The size of the bins essentially defines the contact resolution. Each entry in the matrix, mij, represents the number of chimeric sequencing reads mapped to genomic loci in bins i and j. A quantification of the reads (represented as a heatmap) denotes the relative frequency of contacts between genomic loci of bins i and j.
A prominent feature of the heatmap is a diagonal line that appears due to more frequent physical interaction between loci that are very close to each other in the linear genome. The intensity away from this diagonal line represents the relative frequency of physical interaction between loci that are far away from each other in the linear genome. Triangles of high intensity along the diagonal line represent highly self-interacting chromosomal interaction domains (CIDs) that are separated by a boundary region with fewer interactions. C. In many bacterial species, including E. coli, it appears that supercoiled topological domains organize as CIDs. Plectonemic supercoiling promotes a high level of interaction among genomic loci within a CID, and a plectoneme-free region (PFR), created by high transcription activity, acts as a CID boundary. Nucleoid-associated proteins, depicted as closed circles, stabilize the supercoiling-mediated interactions. The actively transcribing RNA polymerase (depicted as a green sphere) in the PFR blocks dissipation of supercoiling between the two domains and thus acts as a supercoiling-diffusion barrier. The size of the CIDs ranges between 30 and 400 kb. Several triangles (CIDs) merge to form a bigger triangle that represents a macrodomain. In other words, CIDs of a macrodomain physically interact with each other more frequently than with CIDs of a neighboring macrodomain or with genomic loci outside of that macrodomain. A macrodomain may comprise several CIDs. For simplicity, a macrodomain comprising only two CIDs is shown.
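The binning step described in panel B can be sketched as follows. The read-pair coordinates are made up, and the function name contact_matrix is hypothetical. Real Hi-C pipelines also map, filter, and normalize reads, all of which this sketch omits.

```python
def contact_matrix(read_pairs, genome_bp, bin_bp=5000):
    """Count chimeric read pairs per pair of fixed-size genomic bins.

    read_pairs: iterable of (position_a, position_b) in bp, one per read pair.
    Returns a symmetric matrix m where m[i][j] is the number of read pairs
    linking bin i and bin j.
    """
    n_bins = -(-genome_bp // bin_bp)  # ceiling division
    m = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in read_pairs:
        i, j = pos_a // bin_bp, pos_b // bin_bp
        m[i][j] += 1
        if i != j:
            m[j][i] += 1  # keep the matrix symmetric
    return m


# Toy data (hypothetical): three read pairs on a 20-kb genome, 5-kb bins.
pairs = [(100, 200), (100, 12000), (12500, 300)]
matrix = contact_matrix(pairs, genome_bp=20_000, bin_bp=5000)
for row in matrix:
    print(row)
```

Smaller bins give finer contact resolution but fewer reads per entry, which is why the bin size is usually chosen to match sequencing depth.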

The E. coli chromosome was found to consist of 31 CIDs in the growth phase [124]. The size of the CIDs ranged from 40 to ~300 kb. It appears that a supercoiling-diffusion barrier responsible for segregating plectonemic DNA loops into topological domains functions as a CID boundary in E. coli and many other bacteria. In other words, the presence of a supercoiling-diffusion barrier defines the formation of CIDs. Findings from the Hi-C probing of chromosomes in E. coli [124], Caulobacter crescentus [125], and Bacillus subtilis [126] converge on a model that CIDs form because plectonemic looping together with DNA organization activities of NAPs promotes physical interactions among genomic loci, and a CID boundary consists of a plectoneme-free region (PFR) that prevents these interactions (Fig 8C). A PFR is created due to high transcription activity because the helical unwinding of DNA by actively transcribing RNAP restrains plectonemic supercoils. As a result, supercoil dissipation is also blocked, creating a supercoiling-diffusion barrier. Indirect evidence for this model comes from an observation that CIDs of bacterial chromosomes, including the E. coli chromosome, display highly transcribed genes at their boundaries, indicating a role of transcription in the formation of a CID boundary [124, 125]. More direct evidence came from a finding that the placement of a highly transcribed gene at a position where no boundary was present created a new CID boundary in the C. crescentus chromosome [125]. However, not all CID boundaries correlated with highly transcribed genes in the E. coli chromosome, suggesting that other unknown factors are also responsible for the formation of CID boundaries and supercoiling-diffusion barriers.
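The two defining properties of a CID (frequent contacts inside, few contacts across a boundary) suggest a simple way to locate boundaries in a contact matrix. The sketch below uses an insulation-style score, a common heuristic in Hi-C analysis, which is not necessarily the method used in the studies cited here. The toy matrix, the window size, and the function name are assumptions for illustration.

```python
def insulation_profile(matrix, window=2):
    """Mean contact count across each bin boundary.

    For the boundary between bin k-1 and bin k, average the contacts between
    up to `window` bins on the left and up to `window` bins on the right.
    Low values indicate a candidate domain boundary.
    """
    n = len(matrix)
    profile = []
    for k in range(1, n):
        left = range(max(0, k - window), k)
        right = range(k, min(n, k + window))
        total = sum(matrix[i][j] for i in left for j in right)
        profile.append(total / (len(left) * len(right)))
    return profile


# Toy contact matrix (hypothetical): two 3-bin domains, 10 contacts within a
# domain and 1 contact between domains.
toy = [[10 if (i < 3) == (j < 3) else 1 for j in range(6)] for i in range(6)]

profile = insulation_profile(toy)
boundary = profile.index(min(profile)) + 1  # boundary lies before this bin
print(profile, boundary)
```

On this toy matrix the score is lowest at the junction between the two blocks, that is, between bins 2 and 3 (0-based).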


Plectonemic DNA loops organized as topological domains or CIDs appear to coalesce further to form large spatially distinct domains called macrodomains [128, 129]. In E. coli, macrodomains were initially identified as large segments of the genome whose DNA markers localized together (co-localized) in fluorescence in situ hybridization (FISH) studies. A large genomic region (~1 Mb) covering the oriC (origin of chromosome replication) locus co-localized and was called the Ori macrodomain. Likewise, a large genomic region (~1 Mb) covering the replication terminus region (ter) co-localized and was called the Ter macrodomain. Macrodomains were later identified based on how frequently pairs of lambda att sites that were inserted at various distant locations in the chromosome recombined with each other. In this recombination-based method, a macrodomain was defined as a large genomic region whose DNA sites can primarily recombine with each other, but not with sites outside of that macrodomain. The recombination-based method confirmed the Ori and Ter macrodomains that were identified in FISH studies and identified two additional macrodomains [7, 130].

The two additional macrodomains were formed by the additional ~1-Mb regions flanking the Ter and were referred to as Left and Right. These four macrodomains (Ori, Ter, Left, and Right) comprised most of the genome, except for two genomic regions flanking the Ori (Fig 1). These two regions (NS-L and NS-R) were more flexible and non-structured compared to a macrodomain, as DNA sites in them recombined with DNA sites located in macrodomains on both sides. The genetic position of oriC appears to dictate the formation of macrodomains, because repositioning of oriC by genetic manipulation results in the reorganization of macrodomains. For example, genomic regions closest to oriC always behave as an NS region regardless of DNA sequence, and regions further away behave as macrodomains [131].

The Hi-C technique further confirmed a hierarchical spatial organization of CIDs in the form of macrodomains [124]. In other words, CIDs of a macrodomain physically interacted with each other more frequently than with CIDs of a neighboring macrodomain or with genomic loci outside of that macrodomain (Fig 8B and 8C). The Hi-C data showed that the E. coli chromosome was partitioned into two distinct domains. The region surrounding ter formed an insulated domain that overlapped with the previously identified Ter macrodomain. DNA-DNA contacts in this domain occurred only in the range of up to ~280 kb. The rest of the chromosome formed a single domain whose genomic loci exhibited contacts in the range of >280 kb. While most of the contacts in this domain were restricted to a maximum distance of ~500 kb, there were two loose regions whose genomic loci formed contacts at even greater distances (up to ~1 Mb). These loose regions corresponded to the previously identified flexible and less-structured regions (NS). The boundaries of the insulated domain encompassing ter and the two loose regions identified by the Hi-C method segmented the entire chromosome into six regions that correspond with the four macrodomains and two NS regions defined by recombination-based assays. Thus, the two approaches were in good agreement with one another.

Proteins that drive macrodomain formation

A search for protein(s) responsible for macrodomain formation led to the identification of Macrodomain Ter protein (MatP) [27]. MatP almost exclusively binds in the Ter macrodomain by recognizing a 13-bp motif called the macrodomain ter sequence (matS) (Table 2). There are 23 matS sites in the Ter macrodomain, on average one site every 35 kb. Further evidence of MatP binding in the Ter macrodomain comes from fluorescence imaging of MatP. Discrete MatP foci were observed that co-localized with Ter macrodomain DNA markers [27]. A strong enrichment of ChIP-Seq signal in the Ter macrodomain also corroborates the preferential binding of MatP to this macrodomain (Fig 9).

A circular layout of the E. coli genome depicting genome-wide occupancy of MatP and MukB in E. coli. The innermost circle depicts the E. coli genome. The regions of the genome which organize as spatial domains (macrodomains) in the nucleoid are indicated as colored bands. The genome occupancy of each protein, determined by ChIP-Seq, is plotted as a histogram in outside circles (bin size 300 bp) in which the bar height is indicative of relative binding enrichment. The figure was prepared in circos/0.69–6 using the processed ChIP-Seq data from [132].

MatP condenses DNA in the Ter macrodomain: the lack of MatP increased the distance between two fluorescent DNA markers located 100 kb apart in the Ter macrodomain. Furthermore, MatP is a critical player in insulating the Ter macrodomain from the rest of the chromosome [124]. It promotes DNA-DNA contacts within the Ter macrodomain but prevents contacts between the DNA loci of the Ter domain and those of flanking regions. How does MatP condense DNA and promote DNA-DNA contacts? The experimental results are conflicting. MatP can form a DNA loop between two matS sites in vitro, and its DNA looping activity depends on MatP tetramerization. Tetramerization occurs via coiled-coil interactions between two MatP molecules bound to DNA [133]. One obvious model based on the in vitro results is that MatP promotes DNA-DNA contacts in vivo by bridging matS sites (Fig 10A). However, although MatP connected distant sites in Hi-C studies, it did not specifically connect the matS sites [124]. Furthermore, a MatP mutant that was unable to form tetramers behaved like the wild type. These results argue against the matS bridging model for Ter organization, leaving the mechanism of MatP action elusive. One possibility is that MatP spreads to nearby DNA segments from its primary matS binding site and bridges distant sites via a mechanism that does not depend on tetramerization.

A. A matS-bridging model for DNA organization in the Ter macrodomain by MatP. MatP recognizes a 13-bp signature DNA sequence called matS that is present exclusively in the Ter macrodomain. There are 23 matS sites separated from one another by an average of 35 kb. MatP binds to a matS site as a dimer, and the tetramerization of the DNA-bound dimers bridges matS sites, forming large DNA loops. B. The architecture of the E. coli MukBEF complex. The complex is formed by protein-protein interactions between MukB (blue), MukF (dark orange) and MukE (light orange). MukB, which belongs to the family of structural maintenance of chromosomes (SMC) proteins, forms a dimer (monomers are shown in dark and light blue) consisting of an ATPase head domain and a 100-nm-long intramolecular coiled-coil with a hinge region in the middle. Because of the flexibility of the hinge region, MukB adopts the characteristic V-shape of the SMC family. MukF also tends to exist as a dimer because of the strong dimerization affinity between monomers [134, 135]. The C-terminal domain of MukF can interact with the head domain of MukB, while its central domain can interact with MukE. Two molecules of MukE and one molecule of MukF associate with each other independently of MukB to form a trimeric complex (MukE2F). Since MukF tends to exist in a dimeric form, the dimerization of MukF results in an elongated hexameric complex (MukE2F)2 [134]. In the absence of ATP, the (MukE2F)2 complex binds to the MukB head domains through the C-terminal domain of MukF to form a symmetric MukBEF complex (shown on the left). The stoichiometry of the symmetric complex is B2(E2F)2. ATP binding between the MukB head domains forces the detachment of one MukF molecule and two MukE molecules [134, 136]. As a result, an asymmetric MukBEF complex of the stoichiometry B2(E2F)1 is formed.
Since MukF readily dimerizes, MukF dimerization can potentially join two ATP-bound asymmetric molecules, resulting in the formation of a dimer of dimers with the stoichiometry B4(E2F)2 (shown on the right). The stoichiometry of the MukBEF complex in vivo is estimated to be B4(E2F)2, suggesting that a dimer of dimers is the functional unit in vivo [137]. C. A model for loop extrusion by a MukBEF dimer of dimers. A dimer of dimers loads onto DNA (depicted as a grey line) through the DNA binding domains of MukB. MukB has been shown to bind DNA via its hinge region and the top region of its head domain [134, 138]. The translocation of the complex away from its loading site then extrudes DNA loops. The loops are extruded in a rock-climbing manner by the coordinated opening and closing of the MukBEF ring through the MukB head disengagement that occurs due to coordinated ATP hydrolysis in the two dimers [137]. Dark and light blue circles represent ATP binding and hydrolysis events, respectively. MukE is not shown in the complex for simplicity.


MukB belongs to a family of ATPases called structural maintenance of chromosome proteins (SMCs), which participate in higher-order chromosome organization in eukaryotes [139]. Two MukB monomers associate via continuous antiparallel coiled-coil interaction forming a 100-nm long rigid rod. A flexible hinge region occurs in the middle of the rod [140, 141]. Due to the flexibility of the hinge region, MukB adopts a characteristic V-shape of the SMC family (Fig 10B). The non-SMC subunits associating with MukB are MukE and MukF. The association closes the V formation, resulting in large ring-like structures (Fig 10B). MukE and MukF are encoded together with MukB in the same operon in E. coli [142]. The deletion of either subunit results in the same phenotype suggesting that the MukBEF complex is the functional unit in vivo [137]. DNA binding activities of the complex reside in the MukB subunit, whereas MukE and MukF modulate MukB activity.

The MukBEF complex, together with Topo IV, is required for decatenation and repositioning of newly replicated oriCs [132, 143–146]. The role of MukBEF is not restricted to DNA replication [147]. It organizes and condenses DNA even in non-replicating cells. The recent high-resolution chromosome conformation map of a MukB-depleted E. coli strain reveals that MukB participates in the formation of DNA-DNA interactions on the entire chromosome, except in the Ter macrodomain [124]. How is MukB prevented from acting in the Ter macrodomain? MatP physically interacts with MukB, thus preventing MukB from localizing to the Ter macrodomain [132]. This is evident in the DNA binding of MatP and MukB in the Ter macrodomain. MatP DNA binding is enriched in the Ter macrodomain, whereas MukB DNA binding is reduced compared to the rest of the genome (Fig 9). Furthermore, in a strain already lacking MatP, the absence of MukB causes a reduction in DNA contacts throughout the chromosome, including the Ter macrodomain [124]. This result agrees with the view that MatP displaces MukB from the Ter domain.

How does the MukBEF complex function to organize the E. coli chromosome? According to the current view, SMC complexes organize chromosomes by extruding DNA loops [148]. SMC complexes translocate along DNA to extrude loops in a cis-manner (on the same DNA molecule), wherein the size of loops depends on the processivity of the complex. SMC complexes from different organisms differ in the mechanism of loop extrusion [148]. Single-molecule fluorescence microscopy of MukBEF in E. coli suggests that the minimum functional unit in vivo is a dimer of dimers (Fig 10B) [137]. This unit is formed by the joining of two ATP-bound MukBEF complexes through MukF-mediated dimerization. MukBEF localizes in the cell as 1–3 clusters that are elongated parallel to the long axis of the cell. Each cluster contains an average of ~8–10 dimers of dimers. According to the current model, MukBEF extrudes DNA loops in a "rock-climbing" manner (Fig 10C) [137, 149]. A dimer of dimers releases one segment of DNA and captures a new DNA segment without dissociating from the chromosome. Besides DNA looping, a link between negative supercoiling and in vivo MukBEF function, together with the ability of the MukB subunit to constrain negative supercoils in vitro, suggests that MukBEF organizes DNA by generating supercoils [150–152]. A full understanding of the molecular mechanism of MukBEF action in vivo warrants further investigation.

Spatial organization of the nucleoid by NAPs and naRNAs.

In addition to contributing to chromosome compaction by bending, bridging, and looping DNA at a smaller scale (~1 kb), NAPs participate in DNA condensation and organization by promoting long-range DNA-DNA contacts. Two NAPs, Fis and HU, emerged as the key players in promoting long-range DNA-DNA contacts that occur throughout the chromosome [124]. It remains to be studied how the DNA organization activities of Fis and HU, which are well understood at a smaller scale (~1 kb), result in the formation of long-range DNA-DNA interactions. Nonetheless, some of the HU-mediated DNA interactions require the presence of naRNA4 [84]. naRNA4 also participates in making long-range DNA contacts. HU catalyzes some of the contacts, not all, suggesting that RNA participates with other NAPs in forming DNA contacts. HU also appears to act together with MukB to promote long-range DNA-DNA interactions. This view is based on observations that the absence of either HU or MukB caused a reduction in the same DNA-DNA contacts [124]. It is unclear how MukB and HU potentially act together in promoting DNA-DNA interactions. The two proteins may interact physically. Alternatively, while MukBEF extrudes large DNA loops, HU condenses and organizes those loops by mechanisms described in section 2.1.1.

Spatial organization of the nucleoid by functional relatedness of genes.

There are reports that functionally-related genes of E. coli are physically together in 3-D space within the chromosome even though they are far apart by genetic distance. Spatial proximity of functionally-related genes not only makes the biological functions more compartmentalized and efficient but would also contribute to the folding and spatial organization of the nucleoid. A recent study using fluorescent markers for detection of specific DNA loci examined pairwise physical distances between the seven rRNA operons that are genetically separated from each other (by as much as two million bp). It reported that all of the operons, except rrnC, were in physical proximity [153, 154]. Surprisingly, 3C-seq studies did not reveal the physical clustering of rrn operons [124], contradicting the results of the fluorescence-based study. Therefore, further investigation is required to resolve these contradicting observations. In another example, GalR forms an interaction network of GalR binding sites that are scattered across the chromosome [155]. GalR is a transcriptional regulator of the galactose regulon comprised of genes encoding enzymes for transport and metabolism of the sugar D-galactose [156]. GalR exists in only one to two foci in cells [155] and can self-assemble into large ordered structures [157]. Therefore, it appears that DNA-bound GalR multimerizes to form long-distance interactions.

Nucleoid global shape and structure

The nucleoid is a helical ellipsoid, radially confined in the cell.

Conventional transmission electron microscopy (TEM) of chemically fixed E. coli cells portrayed the nucleoid as an irregularly shaped organelle. However, wide-field fluorescence imaging of live nucleoids in 3D revealed a discrete, ellipsoid shape (Fig 11) [4, 8–10]. An overlay of a phase-contrast image of the cell and the fluorescence image of the nucleoid showed that the nucleoid closely abuts the cell periphery only in the radial dimension, along its entire length. This finding indicates the radial confinement of the nucleoid [8]. A detailed examination of the 3D fluorescence image after cross-sectioning perpendicular to its long axis further revealed two global features of the nucleoid: curvature and longitudinal, high-density regions. Examining the chirality of the centerline of the nucleoid by connecting the center of intensity of each cross-section showed that the overall nucleoid shape is curved (Fig 11A) [158]. The fluorescence intensity distribution in the cross-sections revealed a density substructure, consisting of curved, high-density regions or bundles at the central core, and low-density regions at the periphery (Fig 11B) [8, 9]. One implication of the radial confinement is that it determines the curved shape of the nucleoid. According to one model, the nucleoid is forced to bend because it is confined in a cylindrical E. coli cell whose radius is smaller than its bendable length (persistence length) [8]. This model was supported by observations that removal of the cell wall or inhibition of cell wall synthesis increased the radius of the cell and resulted in a concomitant increase in the helical radius and a decrease in the helical pitch of the nucleoid [8].

Fig 11. A. A cartoon of an E. coli cell with a curved nucleoid (dark grey). The curved path of the centroids, denoted by a black line, emphasizes the curved shape of the nucleoid. The cartoon is based on a figure in [8]. B. Cross-sectioning of the E. coli nucleoid visualized by HU-mCherry. Fluorescence intensity is taken as a proxy for DNA density and is represented by a color scale from blue to red in increasing order. Data were taken from [9].

Connections between nucleoid and cell membrane.

An expansion force due to connections between DNA within the nucleoid and the cell membrane (DNA-membrane connections) appears to function in opposition to condensation forces to maintain an optimal condensation level of the nucleoid. Cell-fractionation and electron microscopy studies first indicated the possibility of DNA-membrane connections [159, 160]. There are now several known examples of DNA-membrane connections. Transertion is a mechanism of concurrent transcription, translation, and insertion of nascent membrane proteins that forms transient DNA-membrane contacts [161]. Transertion of two membrane proteins, LacY and TetA, has been demonstrated to cause the repositioning of chromosomal loci toward the membrane [162]. Another mechanism of nucleoid-membrane connection is direct contact between membrane-anchored transcription regulators and their target sites in the chromosome. One example of such a transcription regulator in E. coli is CadC. CadC contains a periplasmic sensory domain and a cytoplasmic DNA binding domain. Sensing of an acidic environment by its periplasmic sensory domain stimulates the DNA binding activity of CadC, which then activates transcription of its target genes [163]. The membrane localization of genes regulated by a membrane-anchored transcription regulator is yet to be demonstrated. Nonetheless, activation of target genes in the chromosome by these regulators is expected to result in a nucleoid-membrane contact, albeit a dynamic one. Besides these examples, the chromosome is also specifically anchored to the cell membrane through protein-protein interactions between DNA-bound proteins, e.g., SlmA and MatP, and the divisome [164, 165].

Since membrane-protein encoding genes are distributed throughout the genome, dynamic DNA-membrane contacts through transertion can act as a nucleoid expansion force. This expansion force would function in opposition to condensation forces to maintain an optimal condensation level. The formation of highly condensed nucleoids upon the exposure of E. coli cells to chloramphenicol, which blocks translation, provides support for the expansion force of transient DNA-membrane contacts formed through transertion. The round shape of overly-condensed nucleoids after chloramphenicol treatment also suggests a role for transertion-mediated DNA-membrane contacts in defining the ellipsoid shape of the nucleoid.

Growth-phase dependent nucleoid dynamics

The nucleoid reorganizes in stationary phase cells, suggesting that the nucleoid structure is highly dynamic and determined by the physiological state of cells. A comparison of high-resolution contact maps of the nucleoid revealed that the long-range contacts in the Ter macrodomain increased in the stationary phase, compared to the growth phase [124]. Furthermore, CID boundaries in the stationary phase were different from those found in the growth phase. Finally, nucleoid morphology undergoes a massive transformation during prolonged stationary phase [166]; the nucleoid exhibits ordered, toroidal structures [167].

Growth-phase specific changes in nucleoid structure could be brought about by a change in levels of nucleoid-associated DNA architectural proteins (the NAPs and the Muk subunits), supercoiling, and transcription activity. The abundance of NAPs and the Muk subunits changes according to the bacterial growth cycle. Fis and the starvation-induced DNA binding protein Dps, another NAP, are almost exclusively present in the growth phase and the stationary phase, respectively. Fis levels rise upon entry into the exponential phase and then rapidly decline while cells are still in the exponential phase, reaching levels that are undetectable in the stationary phase [168]. While Fis levels start to decline, levels of Dps start to rise and reach a maximum in the stationary phase [16]. The dramatic transition in nucleoid structure observed in the prolonged stationary phase has been mainly attributed to Dps. It forms DNA/crystalline assemblies that act to protect the nucleoid from DNA damaging agents present during starvation [167].

HU, IHF, and H-NS are present in both the growth phase and the stationary phase [16]. However, their abundance changes significantly such that HU and Fis are the most abundant NAPs in the growth phase, whereas IHF and Dps become the most abundant NAPs in the stationary phase [16]. HUαα is the predominant form in the early exponential phase, whereas the heterodimeric form HUαβ predominates in the stationary phase, with minor amounts of homodimers [169]. This transition has functional consequences regarding nucleoid structure because HUαα and HUαβ appear to organize and condense DNA differently: both form filaments, but only HUαα can bring multiple DNA segments together (DNA bunching) to form a DNA network [40]. The copy number of MukB increases two-fold in the stationary phase [136, 147]. An increase in the number of MukB molecules could influence the processivity of the MukBEF complex as a DNA loop extruding factor, resulting in larger loops or a greater number of loops.

Supercoiling can act in a concerted manner with DNA architectural proteins to reorganize the nucleoid. The overall supercoiling level decreases in the stationary phase, and supercoiling exhibits a different pattern at the regional level [170]. Changes in supercoiling can alter the topological organization of the nucleoid. Furthermore, because a chromosomal region of high transcription activity forms a CID boundary, changes in transcription activity during different growth phases could alter the formation of CID boundaries, and thus the spatial organization of the nucleoid. It is possible that changes in CID boundaries observed in the stationary phase could be due to the high expression of a different set of genes in the stationary phase compared to the growth phase [124].

Nucleoid structure and gene expression

NAPs and gene expression.

The E. coli chromosome structure and gene expression appear to influence each other reciprocally. On the one hand, a correlation of a CID boundary with high transcription activity indicates that chromosome organization is driven by transcription. On the other hand, the 3D structure of DNA within the nucleoid at every scale may be linked to gene expression. First, it has been shown that a reorganization of the 3D architecture of the nucleoid in E. coli can dynamically modulate the cellular transcription pattern [171]. A mutant of HUα strongly condensed the nucleoid by increasing the positive superhelicity of the chromosomal DNA. Consequently, many genes were repressed, and many quiescent genes were expressed. Besides, there are many specific cases in which protein-mediated local architectural changes (Fig 2) alter gene transcription. For example, the formation of rigid nucleoprotein filaments by H-NS blocks RNAP access to the promoter, thus preventing gene transcription [172]. Through gene silencing, H-NS acts as a global repressor, preferentially inhibiting transcription of horizontally transferred genes [20, 46]. In another example, the specific binding of HU at the gal operon facilitates the formation of a DNA loop that keeps the gal operon repressed in the absence of the inducer [173]. The topologically distinct DNA micro-loop created by coherent bending of DNA by Fis at stable RNA promoters activates transcription. DNA bending by IHF differentially controls transcription from the two tandem promoters of the ilvGMEDA operon in E. coli [174, 175]. It is noteworthy that specific topological changes by NAPs not only regulate gene transcription but are also involved in other processes such as DNA replication initiation, recombination, and transposition. In contrast to specific gene regulation, how higher-order chromosome structure and its dynamics influence gene expression globally at the molecular level remains to be worked out.

DNA supercoiling and gene expression.

A two-way interconnectedness exists between DNA supercoiling and gene transcription [176]. Negative supercoiling of the promoter region can stimulate transcription by facilitating promoter melting and by increasing the DNA binding affinity of a protein regulator. Stochastic bursts of transcription appear to be a general characteristic of highly expressed genes, and supercoiling levels of the DNA template contribute to transcriptional bursting [177]. According to the twin supercoiling domain model, transcription-induced supercoiling can influence transcription of other nearby genes through a supercoiling relay. One such example is the activation of the leu-500 promoter [176]. Supercoiling not only mediates gene-specific changes, but it also mediates large-scale changes in gene expression [178]. Global gene expression analysis in response to a loss of supercoiling identified 306 supercoiling-sensitive genes (SSGs) in E. coli [178]. SSGs are present throughout the chromosome and encode proteins with diverse functions. The topological organization of the nucleoid could allow independent expression of SSGs in different topological domains. A genome-scale map of unrestrained supercoiling showed that genomic regions have different steady-state supercoiling densities, indicating that the level of supercoiling differs in individual topological domains [170]. As a result, a change in supercoiling can result in domain-specific expression levels of SSGs, depending on the degree of supercoiling in each domain.

The effect of supercoiling on gene expression can be mediated by NAPs that directly or indirectly influence supercoiling. The effect of HU on gene expression appears to involve a change in supercoiling and perhaps a higher-order DNA organization [179]. A positive correlation between DNA gyrase binding and upregulation of the genes caused by the absence of HU suggests that changes in supercoiling are responsible for differential expression [47]. HU was also found to be responsible for a positional effect on gene expression by insulating transcriptional units by constraining transcription-induced supercoiling [180]. Point mutations in HUα dramatically changed the gene expression profile of E. coli, altering its morphology, physiology, and metabolism [171]. As a result, the mutant strain was more invasive of mammalian cells [181]. This dramatic effect was concomitant with nucleoid compaction and increased positive supercoiling. In contrast to the wild-type dimer, the mutant protein is an octamer that wraps DNA on its surface in a right-handed manner, restraining positive supercoils as opposed to wild-type HU [182]. These studies show that amino acid substitutions in HU can have a dramatic effect on nucleoid structure, and consequently, on gene expression, which in turn results in significant phenotypic changes.

Although HU appears to control gene expression by modulating supercoiling density, the exact molecular mechanism remains unknown. Since MukB and HU have emerged as critical players in long-range DNA interactions, it will be worthwhile to compare the effect of each of these two proteins on global gene expression. The impact of MukB on gene expression is yet to be analyzed.

Current state and future perspective

Although studies of isolated nucleoids began in the 1970s, a highly resolved and complete structure of the bacterial nucleoid is still not available. It remains a significant challenge, even at 10 kb resolution, to accurately describe the hierarchical organization of chromosomal DNA that results in the formation of a functional nucleoid. Single-nucleotide resolution is even more difficult. However, we have significantly improved our understanding of the structural properties of the nucleoid. In summary, chromosomal DNA is a supercoiled molecule that folds into plectonemic loops. The supercoiling density is created and maintained by two topoisomerases (DNA gyrase and Topo I) with opposing functions. The NAPs organize DNA through DNA bending, bridging and looping activities. This organization, together with transcriptional activities, leads to higher-order organization of DNA into topologically and spatially distinct chromosomal interaction domains (microdomains) that are further organized into large macrodomains. NAPs and macrodomain-specific proteins MukBEF and MatP appear to collaborate in the spatial organization. Supercoiling and topological domain formation also contribute to the proper regulation of gene transcription. The inherent bending of DNA due to Brownian motion, together with plectonemic supercoiling and the DNA organization activities of NAPs and MukBEF, provides condensation forces that sufficiently reduce the volume of the chromosomal DNA to fit within the cell volume. This process imparts a hierarchical organization, and the condensed chromosomal DNA results in a functional nucleoid with a helical ellipsoid shape.

Advances in imaging technologies can pave the way for direct visualization of higher-order nucleoid structure in vivo at high resolution. Recent electron microscopy methods that use high-pressure freezing and cryo-sectioning to preserve native ultrastructure [183], use improved DNA detection techniques for sharp contrast [184], and allow 3D visualization through cross-sectioning of cells with ion beams followed by 3D reconstruction [185] hold promise for driving new understanding of the 3D arrangement of the entire chromosomal DNA at high resolution. Although many issues regarding nucleoid fine structure and function remain to be resolved, some of the most interesting are the following:

  1. Is there a growth/environment dependent defined 3D structure of the DNA within the nucleoid?
  2. How are environmental cues transmitted to alter nucleoid structure?
  3. What is the nature of the supercoiling diffusion barriers that segregate the nucleoid into independent topological domains?
  4. What are the molecular mechanisms by which distant DNA-DNA contacts are made?
  5. What are the molecular mechanisms for the attachment of the nucleoid to the membrane?

Crosslinking of elongation factor EF-G to the 50S ribosomal subunit of Escherichia coli


Immobilization of Escherichia coli Cells by Use of the Antimicrobial Peptide Cecropin P1

FIG. 1. Immobilization of E. coli O157:H7 on cecropin P1-coated microplate wells. Cells were immobilized in wells containing 33 ± 1 pmol (○), 26 ± 3 pmol (•), and 20 ± 1 pmol (▪) of cecropin P1.

FIG. 2. Immobilization of E. coli K-12 on cecropin P1. Cells were immobilized in microplate wells containing 40 ± 1 pmol (○), 31 ± 1 pmol (•), and 25 ± 1 pmol (▪) of cecropin P1.

FIG. 3. Comparison of binding curves for E. coli O157:H7 (○) and K-12 (•) with cecropin P1. Data were plotted as absorbance at 450 nm per picomole of cecropin per slope versus log CFU. The number of picomoles of cecropin reflects the amount of peptide coupled to the plate well surface, whereas the slope is the slope of the center of the binding curve for each strain binding to anti-E. coli antibody-horseradish peroxidase conjugate.

FIG. 4. pH profiles for E. coli O157:H7 (A) and K-12 (B) immobilized on cecropin P1.

FIG. 5. Effect of ionic strength on the binding of E. coli O157:H7 (A) and K-12 (B) to cecropin P1.


At present, several SARS-CoV-2 vaccine candidates have been developed using the RBD of the SARS-CoV-2 S protein as the antigen. The RBD antigen has mainly been expressed in eukaryotic cell expression systems, such as mammalian cells, insect cells, and yeast cells. However, RBD-2 derived from eukaryotic cells suffers from high cost and low expression levels. Compared with RBD-2, RBD-1 expressed by E. coli is highly scalable at low cost. In the present study, the product yield of RBD-1 was 13.3 mg/L by flask culture. In contrast, the product yield of RBD-2 in mammalian cells (HEK-293T) was 5 mg/L by cell culture [30]. However, the effectiveness of RBD-1 derived from E. coli needed to be evaluated.

In the present study, RBD-1 was expressed by E. coli in the form of inclusion bodies. The inclusion bodies were dissolved in 6 M guanidine hydrochloride and renatured in the presence of 0.5 M L-arginine. The renatured RBD was purified on a Ni Sepharose Fast Flow column. RBD was thus obtained by a one-step affinity purification process with high purity and high purification yield. Compared with RBD expressed by HEK293 cells (RBD-2), RBD expressed by E. coli (RBD-1) lacks glycosylation. The absence of glycosylation correlated with the decreased size of RBD-1, which may shorten the duration of RBD in serum. If RBD were formulated and used in nanoparticle vaccines, the size effect of RBD could be neglected.

The structure of RBD-1 was investigated by CD, fluorescence and FT-IR. CD suggested that the major β-sheet content of RBD-1 was almost unaltered. Fluorescence spectroscopy suggested that the tertiary structure of RBD-1 was slightly changed. FT-IR spectroscopy revealed that RBD-1 lacked glycosylation and showed a slight structural alteration. SPR analysis suggested that RBD-1 could strongly bind ACE2 with a KD of 2.98 × 10⁻⁸ M. Thus, our study is of practical significance for ensuring the effectiveness of RBD for clinical application.
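To give a feel for what a KD in this range means, the following sketch computes the equilibrium fraction of ACE2 occupied at a given free RBD-1 concentration. It assumes a simple 1:1 binding model with free ligand in excess, which is an illustrative assumption and not necessarily the model fitted in the SPR analysis; the function name `fraction_bound` is hypothetical. Only the KD value is taken from the text.

```python
KD_RBD1_ACE2 = 2.98e-8  # mol/L, the KD reported in the text for RBD-1 binding to ACE2


def fraction_bound(free_ligand_molar: float, kd_molar: float = KD_RBD1_ACE2) -> float:
    """Equilibrium fraction of receptor occupied under an assumed 1:1 binding model.

    fraction = [L] / (KD + [L]), where [L] is the free ligand concentration.
    """
    if free_ligand_molar < 0:
        raise ValueError("ligand concentration must be non-negative")
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return free_ligand_molar / (kd_molar + free_ligand_molar)


# Occupancy at a few illustrative free RBD-1 concentrations (in mol/L).
for concentration in (2.98e-9, 2.98e-8, 2.98e-7):
    print(f"{concentration:.2e} M -> {fraction_bound(concentration):.3f}")
```

Under this model, half of the receptor is occupied when the free ligand concentration equals KD, so a KD of about 30 nM implies substantial binding at nanomolar RBD-1 concentrations.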

In summary, RBD-1 was successfully expressed in E. coli and purified by Ni affinity chromatography. RBD-1 was structurally characterized and compared with RBD expressed by HEK293 cells (RBD-2). The secondary and tertiary structures of RBD-1 were largely maintained. Moreover, RBD-1 could strongly bind ACE2 with a KD of 2.98 × 10⁻⁸ M. Thus, RBD-1 is expected to be applicable to vaccine design, drug screening, and virus test kits.
