The mechanisms underlying antigenic variation and maintenance of genomic integrity in Mycoplasma pneumoniae and Mycoplasma genitalium

Mycoplasma pneumoniae and Mycoplasma genitalium are important causative agents of infections in humans. Like all other mycoplasmas, these species possess genomes that are significantly smaller than those of other prokaryotes. Moreover, both organisms possess an exceptionally compact set of DNA recombination and repair-associated genes. These genes, however, are sufficient to generate antigenic variation by means of homologous recombination between specific repetitive genomic elements. At the same time, these mycoplasmas have likely evolved strategies to maintain the stability and integrity of their ‘minimal’ genomes. Previous studies have indicated that there are considerable differences between mycoplasmas and other bacteria in the composition of their DNA recombination and repair machinery. However, the complete repertoire of activities executed by the putative recombination and repair enzymes encoded by Mycoplasma species is not yet fully understood. In this paper, we review the current knowledge on the proteins that likely form part of the DNA repair and recombination pathways of two of the most clinically relevant Mycoplasma species, M. pneumoniae and M. genitalium. The characterization of these proteins will help to define the minimal enzymatic requirements for creating bacterial genetic diversity (antigenic variation) on the one hand, while maintaining genomic integrity on the other.


Introduction
Mycoplasma pneumoniae and Mycoplasma genitalium are pathogenic bacteria that cause significant health problems in the human population. These pathogens are genetically very similar (Himmelreich et al. 1997), and belong to the Mollicutes class of bacteria. While the genome of M. pneumoniae is significantly longer than that of M. genitalium (816 kb versus 580 kb) (Fraser et al. 1995; Himmelreich et al. 1996), the ~479 orthologous proteins encoded by these species display ~67% identity on average (Himmelreich et al. 1997).
M. pneumoniae causes both upper respiratory tract infections (RTIs), such as pharyngitis and tracheobronchitis, and lower RTIs, such as pneumonia. M. pneumoniae infection and transmission occur in both endemic and epidemic settings, in developed as well as developing countries. It has been reported that M. pneumoniae is responsible for up to 40% of the cases of community-acquired bacterial pneumonia (CABP) found during epidemic situations, many of which concern children (Waites et al. 2017). In addition, this bacterium was found to induce extrapulmonary infections as well as post-infectious, immune-mediated diseases, such as Guillain-Barre syndrome (Meyer Sauteur et al. 2014a). Interestingly, M. pneumoniae can also be found in the respiratory tract of asymptomatic children (Spuesens et al. 2013, 2016), which complicates both the diagnosis and treatment of M. pneumoniae infections in this patient group (Meyer Sauteur et al. 2014b; Spuesens et al. 2014).
Communicated by Michael Berney.
M. genitalium is an etiological agent of various diseases of the human reproductive tract, such as non-gonococcal urethritis (NGU) in men, and cervicitis, endometritis, pelvic inflammatory disease (PID) and tubal-factor infertility in women (Deborde et al. 2019; McGowin and Totten 2017). The prevalence of M. genitalium infections is about 1.3-3.9% in the general population and significantly higher in specific groups such as female commercial sex workers (15.9%) (Baumann et al. 2018). M. genitalium infections are also relatively common in human immunodeficiency virus (HIV)-infected patients (Deborde et al. 2019). Therefore, the inclusion of M. genitalium screening and treatment interventions in HIV prevention strategies has been considered (Napierala Mavedzenge et al. 2015).
As human pathogens, Mycoplasma spp. continuously evolve due to external pressures exerted by the host immune system as well as the use of antibiotic drugs. Among the most common strategies employed by pathogenic microorganisms to evade immune surveillance and control by the host is antigenic variation. Antigenic variation is successfully employed by various bacterial pathogens, including Neisseria spp., Mycoplasma spp., and Treponema pallidum. It is also effectively used by viral pathogens, such as influenza virus (Shao et al. 2017). Antigenic variation leads to continuous alterations or modifications of the surface molecules that are mainly targeted by the host immune systems. Consequently, the humoral (antibody) response generated against the previous ("old") surface molecules cannot effectively recognize and neutralize the modified ("new") molecules, allowing the pathogen to persist in the infected host for a prolonged period of time (Dehon and McGowin 2017; Qin et al. 2019). Antigenic variation is also hypothesized to be involved in the repeated epidemics caused by M. pneumoniae (Dumke et al. 2008). In this species, as well as in M. genitalium, this antigenic variation is predicted to be generated through homologous recombination between specific, repetitive DNA elements that are dispersed throughout their genomes (Rocha and Blanchard 2002). These repetitive elements encode the antigenic surface molecules P1 and MgPa of M. pneumoniae and M. genitalium, respectively (Fig. 1).
Fig. 1 a There are ten variants of RepMP2/3 and each variant is labeled 'a' to 'j' in blue. For RepMP4, there are eight variants and each variant is labeled 'a' to 'h' in red. The drawing is a modification of figures that were previously published by Spuesens et al. (2011). b The MgPa operon of M. genitalium contains two variable genes, mgpB (ORF MG191) and mgpC (ORF MG192). There are three MgPar homologous regions within the MG191 gene, indicated as repeat regions B (orange), EF (yellow), and G (green), while there is only one, large MgPar homologous region within the MG192 gene, indicated as repeat region JKLM (light blue). Invariable, conserved regions within each gene are indicated in black. Nine homologous MgPar sequences, containing diverse copies of mgpB- and mgpC-associated homologous regions, are present in the genome of M. genitalium. Homologous sequences are indicated in the same color. The drawing is a modification of a figure that was previously published by McGowin and Totten (2017).
While antigenic variation through homologous recombination may be crucial for the propagation of Mycoplasma species in human populations, a major issue for these organisms is the maintenance of the integrity of their genomes, in particular because these genomes are much more compact than the genomes of most other bacterial taxa. Interestingly, the DNA repair systems that are involved in maintaining genome integrity in M. pneumoniae and M. genitalium may also be involved in the aforementioned recombination between repetitive DNA elements. Consequently, the DNA repair machinery of these organisms may be directly involved in antigenic variation.
The aim of this paper is to review our current knowledge on the genes and proteins that likely form part of the DNA repair and recombination pathways of M. pneumoniae and M. genitalium. Several previous studies have indicated that considerable differences exist between mycoplasmas and other bacteria in the composition of their sets of DNA recombination and repair-related genes. Thus, characterization of these genes and the encoded proteins will help to define the minimal enzymatic requirements for creating bacterial genetic diversity (antigenic variation) on the one hand, while maintaining genomic integrity on the other.

Mechanism of antigenic variation and the role of repetitive element recombination
Mycoplasmas are the smallest known self-replicating organisms, both regarding cellular dimensions and genome size (Wilson and Collier 1976). As mentioned above, the genomes of M. pneumoniae (strain M129) and M. genitalium (strain G-37T) are only 816 kb and 580 kb in length, respectively (Fraser et al. 1995; Himmelreich et al. 1996). By comparison, the genome of the 'model' bacterium Escherichia coli strain K12 is 4,639 kb in length (Blattner et al. 1997). It is remarkable that, in spite of their limited sizes, the genomes of M. pneumoniae and M. genitalium consist of a significant portion of repeated DNA elements, which constitute approximately 8% and 4% of the genome, respectively (Himmelreich et al. 1996; Peterson et al. 1993). These multiple repeated elements, which are dispersed throughout the genome, display a high level of sequence similarity, but are not identical. In M. pneumoniae, these are referred to as RepMP (RepMP2/3 and RepMP4) elements, while in M. genitalium they are termed MgPa repeats (MgPars) (Fraser et al. 1995; Peterson et al. 1995; Ruland et al. 1990; Su et al. 1988).
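As a rough worked example, the genome sizes and repeat fractions quoted above can be converted into absolute amounts of repeated DNA. The sketch below uses only the figures cited in this section; the function and variable names are illustrative, and the results are approximations because the repeat fractions themselves are rounded.

```python
# Genome sizes (kb) and approximate repeat fractions as quoted in the text.
GENOMES_KB = {"M. pneumoniae M129": 816, "M. genitalium G-37": 580}
REPEAT_FRACTION = {"M. pneumoniae M129": 0.08, "M. genitalium G-37": 0.04}
E_COLI_K12_KB = 4639  # E. coli K12 genome size quoted in the text


def repeat_content_kb(genome_kb: float, fraction: float) -> float:
    """Approximate amount of repeated DNA (kb) for a given genome size."""
    return genome_kb * fraction


for name, size_kb in GENOMES_KB.items():
    repeats_kb = repeat_content_kb(size_kb, REPEAT_FRACTION[name])
    fold = E_COLI_K12_KB / size_kb
    print(f"{name}: ~{repeats_kb:.0f} kb of repeats; "
          f"E. coli K12 genome is ~{fold:.1f}x larger")
```

Under these figures, roughly 65 kb of the M. pneumoniae genome and roughly 23 kb of the M. genitalium genome would consist of repeated elements.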
Interestingly, some of the copies of the repeated DNA elements were found to form part of genes encoding antigenic surface proteins. Among these proteins are the cytadhesins P1 of M. pneumoniae (Himmelreich et al. 1996) and MgPa of M. genitalium (Aparicio et al. 2018; Fraser et al. 1995). It has been demonstrated that the repeated DNA elements (including those in the P1 and MgPa operons) can undergo homologous DNA recombination, which may result in sequence variation (i.e. antigenic variation) of the P1 and MgPa proteins (Ma et al. 2007; Peterson et al. 1995; Ruland et al. 1990; Spuesens et al. 2009).

Basic mechanisms of homologous DNA recombination
For general DNA editing and repair, bacteria can employ homologous DNA recombination, in which a stretch of DNA is eventually exchanged with an identical (or almost identical) sequence that originates either from another site of the genome or from extrachromosomal DNA. Much of our knowledge of homologous recombination is derived from studies performed in E. coli (Kowalczykowski et al. 1994). However, the basic mechanisms as well as genes or proteins involved in this process are conserved across eubacteria (Kuzminov 1999).
The initial step of homologous recombination requires the introduction of single-strand breaks or nicks at corresponding regions of two homologous DNA sequences by a specific endonuclease (Fig. 2). Subsequently, the two homologous single strands from each DNA molecule exchange positions in a reciprocal manner, resulting in basepair formation between the transferred (donor) and recipient DNA (heteroduplex formation). This reciprocal strand exchange produces a so-called Holliday junction (or Chi structure), a point at which two single DNA strands from two homologous double-stranded (ds) DNA molecules exchange. Migration of the Holliday junction (branch migration) allows further extension of the stretch of heteroduplex DNA. As the final step, resolution (cutting) of the Holliday junction by a resolvase enzyme, and subsequent ligation of the remaining single-strand nicks by DNA ligase, will produce two intact, recombined DNA molecules (Fig. 2a) (Bianco et al. 1998; Kowalczykowski et al. 1994).
In E. coli, homologous recombination requires a coordinated action of more than 20 different proteins. Of these, RecA is involved in the initial step of pairing of the two homologous DNA sequences and single-strand invasion onto the other DNA template. While RuvA and RuvB are proteins that are necessary for branch migration, the resolution of the Holliday junction is executed by RuvC (Eggleston and West 1996;Kowalczykowski et al. 1994).
Although the aforementioned process of homologous recombination involves the reciprocal, bidirectional exchange of DNA strands, homologous recombination can also be non-reciprocal and unidirectional, through a process termed gene conversion. In gene conversion, the original recipient sequences are replaced by the donor sequences and are eventually lost from the bacterial genome (Fig. 2b) (Bianco et al. 1998; Chen et al. 2007).
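The difference in outcome between reciprocal exchange and gene conversion can be illustrated with a toy string model. This is a deliberately simplified sketch: it ignores strands, Holliday junctions and the enzymes involved, assumes two equal-length homologous sequences, and uses made-up sequences and hypothetical function names.

```python
from typing import Tuple


def reciprocal_exchange(a: str, b: str, start: int, end: int) -> Tuple[str, str]:
    """Swap the segment [start, end) between two homologous sequences."""
    if len(a) != len(b):
        raise ValueError("toy model assumes equal-length sequences")
    new_a = a[:start] + b[start:end] + a[end:]
    new_b = b[:start] + a[start:end] + b[end:]
    return new_a, new_b


def gene_conversion(donor: str, recipient: str, start: int, end: int) -> Tuple[str, str]:
    """Copy the donor segment [start, end) into the recipient.

    The donor is returned unchanged; the recipient's original segment is lost.
    """
    if len(donor) != len(recipient):
        raise ValueError("toy model assumes equal-length sequences")
    converted = recipient[:start] + donor[start:end] + recipient[end:]
    return donor, converted


# Invented example sequences that differ at four positions.
donor = "ACGTTGCAAC"
recipient = "TCGATGGAAG"
print(reciprocal_exchange(donor, recipient, 3, 7))  # both molecules change
print(gene_conversion(donor, recipient, 3, 7))      # only the recipient changes
```

In the reciprocal case both variants of the exchanged segment are retained (in swapped locations), whereas in the gene conversion case the recipient's original segment disappears and the donor segment is present twice.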

Evidence of homologous recombination-induced antigenic variation in M. pneumoniae and M. genitalium
For M. pneumoniae, it has been hypothesized that homologous recombination between RepMP elements may occur through gene conversion-like processes. This hypothesis was supported by analysis of the RepMP elements from a collection of 23 M. pneumoniae isolates. Sequence analysis of all 10 variants of RepMP2/3 and all 8 variants of RepMP4 from these isolates indicated that one or more 'donor' RepMP variants appeared to have been copied to other ('recipient') RepMP variants, including those located within the P1 operon. The same phenomenon was also observed among RepMP5 variants, resulting in amino acid changes in another surface-exposed cytadhesin protein, named P40 (Spuesens et al. 2011). In addition, analysis of an additional set of 23 M. pneumoniae clinical isolates also showed a significant rate of reorganization through recombination events between these repetitive elements (Lluch-Senar et al. 2015).
Sequence analysis of the M. genitalium mgpB and mgpC genes (the second and third gene within the MgPa operon) provided strong support for the hypothesis that recombination occurs between MgPar sequences, and results in antigenically distinct MgPa variants both in vitro and in clinical isolates (Fookes et al. 2017; Iverson-Cabral et al. 2006, 2007; Ma et al. 2007, 2014). In an experimental chimpanzee model of M. genitalium infection, sequence variation within the mgpC gene initially occurred within five weeks post infection and accumulated progressively (Ma et al. 2015). Interestingly, in the majority of the cases, MgPa recombination events in M. genitalium were found to be caused by reciprocal DNA recombination events, in contrast to the homologous recombination events found in M. pneumoniae, which all appeared to have resulted from gene conversion events. Another difference between the recombination processes in these mycoplasmas is that recombination events occur at a significantly higher frequency in M. genitalium than in M. pneumoniae.
Because both the M. pneumoniae P1 protein and the M. genitalium MgPa protein are highly immunogenic (Razin and Jacobs 1992), the recombination-induced antigenic variation of these proteins may play a crucial role in the pathogenicity of both pathogens and their evasion from the host's immune system. Consequently, antigenic variation may be a critical factor in allowing M. pneumoniae and M. genitalium to persist in infected humans for prolonged periods of time (Atkinson et al. 2008; Hardy et al. 2002; Iverson-Cabral et al. 2006; Vink et al. 2012). In support of this notion, persistent infections with these bacteria were observed in animal models of infection (Hardy et al. 2002; McGowin et al. 2010; Wood et al. 2017). In addition, the antigenic variation of P1 and MgPa may facilitate adaptation of the bacteria to different host microenvironments, because both P1 and MgPa proteins are surface-exposed proteins that function in the attachment of the bacteria to host cells (Ma et al. 2014).

The Mycoplasma genes and proteins involved in DNA recombination
Due to the compact nature of all metabolic pathways in mycoplasmas, it is likely that the enzymatic machinery that governs recombination between repeated DNA elements in M. pneumoniae and M. genitalium largely overlaps with the machinery involved in general DNA recombination and repair in these bacteria. As shown in Table 1, the predicted set of genes and proteins involved in DNA recombination and repair in these mycoplasmas is limited in number. Notably, M. pneumoniae and M. genitalium lack a significant number of enzymes known to be involved in DNA repair in other bacterial classes. These enzymes include LexA, PhrI, PhrII, RecBCD, AddAB, RecFOR, RecQ and RecJ, as well as proteins involved in mismatch repair, such as MutS, MutL, and MutH (Carvalho et al. 2005).
The functions of several of the proteins listed in Table 1 have been investigated. These proteins include SSB (Sluijter et al. 2008), RecA, RuvA (Ingleston et al. 2002; Sluijter et al. 2012), RuvB, and RecU (Sluijter et al. 2010). Remarkably, as detailed below, some of these proteins showed significantly different activities than their homologs from other bacterial classes.
Fig. 2 The basic mechanism of homologous DNA recombination. a Reciprocal homologous recombination. This process is initiated by generation of single-strand breaks or nicks at corresponding regions of two homologous sequences by a specific endonuclease. Subsequently, the two homologous single strands from each DNA molecule exchange positions in a reciprocal manner (crossing over), resulting in heteroduplex formation. This reciprocal strand exchange produces a Holliday junction. Branch migration of the Holliday junction allows further extension of the stretch of heteroduplex DNA. Resolution of the Holliday junction is executed by a resolvase enzyme. As the final step, ligation of the remaining single-strand nicks is performed by DNA ligase, resulting in two intact, recombined DNA molecules. b Non-reciprocal homologous recombination (gene conversion). This process is initiated by the generation of a double-strand break. The resulting 5′ ends are then resected, generating 3′ ssDNA tails. In the example shown, one of the 3′ tails transfers to, and basepairs with, another, homologous DNA molecule, forming a displacement (D)-loop. The newly formed heteroduplex DNA is extended by DNA polymerase. The invading strand, including the newly synthesized stretch of DNA, is then displaced from the template strand and reanneals to the DNA strand it was originally attached to (strand transfer). Remaining single-stranded gaps are subsequently filled by the combined action of DNA polymerase I and DNA ligase.

SSB Mpn
The MPN229 ORF of M. pneumoniae encodes a single-stranded DNA-binding protein (SSB Mpn) consisting of 166 amino acids. Functional characterization of this ~18-kDa protein revealed that it possesses activities similar to those of its counterpart from E. coli. SSB Mpn was reported to form tetramers that efficiently and selectively bind to single-stranded DNA (ssDNA) substrates. This activity was independent of the presence of divalent cations, such as Mg2+ (Sluijter et al. 2008). Importantly, SSB Mpn efficiently supported E. coli Recombinase A (RecA Eco)-promoted homologous DNA recombination.
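The quoted size of SSB Mpn is consistent with a simple back-of-the-envelope estimate. The sketch below assumes the common rule of thumb of roughly 110 Da per amino acid residue, which is an approximation introduced here for illustration and not a value taken from the cited study; the function name is hypothetical.

```python
# Rule-of-thumb average residue mass; an assumption, not a measured value.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Estimate protein mass (kDa) from the number of amino acid residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# SSB Mpn is reported to consist of 166 amino acids.
print(f"{approx_mass_kda(166):.1f} kDa")
```

For 166 residues this gives about 18.3 kDa, in line with the ~18 kDa stated above. The actual mass depends on amino acid composition.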
The M. genitalium counterpart of SSB Mpn, SSB Mge, is encoded by the MG091 gene. Although SSB Mpn and SSB Mge are 61% identical on the amino acid sequence level (Sluijter et al. 2008), the latter protein has not yet been subjected to functional analyses.
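Identity values such as the 61% quoted above are derived from pairwise alignments. The following minimal sketch shows one way such a percentage can be computed from two pre-aligned sequences. It assumes that columns containing a gap in either sequence are excluded from the denominator, which is only one of several conventions in use and not necessarily the one applied in the cited studies. The example sequences are invented and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Invented, pre-aligned toy sequences: 5 identical residues in 7 gap-free columns.
print(f"{percent_identity('MKT-LLVA', 'MRTALLIA'):.1f}%")
```

Because the choice of denominator (gap-free columns, alignment length, or length of the shorter sequence) changes the result, identity values from different sources are only directly comparable when the same convention was used.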

RecA Mpn and RecA Mge
The MPN490 and MG339 ORFs of M. pneumoniae and M. genitalium, respectively, were reported to encode homologs of Recombinase A (RecA) proteins. These ~37-kDa proteins, which were designated RecA Mpn and RecA Mge, respectively, were found to have a high similarity on the amino acid sequence level (79% identity). Functional in vitro studies demonstrated that both RecA Mpn and RecA Mge possess activities similar to those of their counterpart from E. coli, mediating recombination events between homologous DNA substrates in an ATP-, pH- and divalent cation (Mg2+)-dependent manner. Also, the recombinase activity of RecA Mpn and RecA Mge was found to be dependent on the presence of SSB Mpn. In vivo, it was shown that RecA Mge efficiently mediates MgpB and MgpC phase and antigenic variation in M. genitalium. However, [...] (Sluijter et al. 2010). Taken together, M. pneumoniae strains may not encode functional homologs of RecU, which may explain the relatively low level of recombination that is observed within the M. pneumoniae genome (Sluijter et al. 2010).
In conclusion, functional studies of the various proteins that are putatively involved in DNA recombination in Mycoplasma spp. revealed significantly different in vitro activities between these proteins and their homologs from other bacterial classes. These findings emphasize the important notion that the functions or activities of genes or proteins cannot be inferred solely from sequence homology. Moreover, these studies also showed that the DNA recombination machineries of M. pneumoniae and M. genitalium differ markedly from those of other bacteria (Sluijter et al. 2012).
Another important aspect of DNA recombination is its regulation. In 2014, it was shown that MgPar recombination in M. genitalium was positively regulated by a novel putative transcription factor encoded by ORF MG428 (Burgos and Totten 2014). Notably, mutants lacking MG428 were defective in generating mgpBC gene variants (Burgos and Totten 2014). Interestingly, overexpression of MG428 increased expression of ruvA, ruvB, recA, as well as other genes, and was associated with increased mgpBC gene variation (Burgos and Totten 2014; Torres-Puig et al. 2015). These findings highlight the complexity of the regulation of recombination events in M. genitalium, and possibly in M. pneumoniae, that should be extensively explored.
Additionally, it is important to note that homologous recombination can be employed for experimental genome alteration of mycoplasmas. Experimental gene modification through homologous recombination has previously been performed to analyze the function of specific genes or proteins of M. pneumoniae and M. genitalium (Dhandayuthapani et al. 1999; Krishnakumar et al. 2010).

The role of DNA repair machinery to maintain genomic integrity of Mycoplasma spp.
Because maintenance of the integrity of the genome is essential for all living organisms, they all have DNA repair systems in place that both detect and repair DNA lesions. These DNA lesions can arise due to physical triggers (e.g. UV light, extreme temperatures and desiccation) and various chemical agents (e.g. H2O2 and cisplatin) causing chemical modifications of the DNA structure. The most important physical trigger of DNA damage is UV light, which is able to induce the covalent linkage of two adjacent pyrimidines (pyrimidine dimers), including cyclobutane pyrimidine dimers (CPD) and pyrimidine-pyrimidone (6-4) photoproducts [(6-4)PP] (Goosen and Moolenaar 2008). Chemical damage to DNA can be caused by compounds such as cisplatin (cis-platinum (II) diaminodichloride), methylmethanesulfonate, ethylmethanesulfonate, and N-ethyl-N-nitrosourea, leading to lesions like cisplatin cross-links, thymine glycol products, psoralen monoadducts, O6-methylguanine, and abasic sites. For pathogenic bacteria, including Mycoplasma spp., such lesions can also be induced through immune responses of the host (e.g., through reactive oxygen species [ROS] generated by macrophages) and the action of antibiotics.
In Mycoplasma spp., DNA repair is probably also involved in the recombination-induced antigenic variation. As the recombining repeated elements are not identical, heteroduplex DNA formed at the sites of homologous DNA recombination between donor and recipient repeats will also include mismatched base pairs which have to be corrected. Therefore, a DNA repair system is required to correct mismatched bases after each recombination event. This repair system may not only be involved in the recombination of repeated DNA elements, but also in the maintenance of the integrity of the Mycoplasma genomes (Carvalho et al. 2005). This maintenance is especially important for the mycoplasmas, as their genomes (which are often designated as 'minimal genomes') have a limited length and gene content, and do not exhibit significant redundancy of gene or protein functions (Fraser et al. 1995; Glass et al. 2006; Himmelreich et al. 1996).
Various DNA repair systems have evolved in prokaryotes. Of these, nucleotide excision repair (NER) and base excision repair (BER) represent the major DNA repair pathways. Other mechanisms include direct repair by a photolyase (photoreactivation) and repair initiated by UV-damage endonuclease (UVDE) (Goosen and Moolenaar 2008; Hu et al. 2017).
A comparative analysis of the gene content of nine different Mycoplasma species has indicated that both M. pneumoniae and M. genitalium contain genes that are putatively involved in NER. This pathway may be the only 'complete' DNA repair pathway in these species. While these species also have genes potentially involved in BER and recombination repair, they do not encode the full set of BER proteins, as found in other groups of prokaryotes. These isolated genes could complement the NER activity in the mycoplasmas (Carvalho et al. 2005).

The NER pathway in E. coli and Mycoplasma spp.
NER is a universal process that is involved in the repair of a wide variety of DNA lesions produced by different DNA damaging agents. These lesions include single-base modifications (caused by ionizing radiation, psoralen, etc.), intra- and inter-strand cross-links (caused by UV irradiation, mitomycin C, cisplatin, etc.), mismatched bases, and backbone modifications (Van Houten et al. 2005).
Basically, NER is a multi-step process, involving (1) damage recognition, (2) dual incision of a damaged DNA-containing oligomer, (3) removal of the incised oligomer, and (4) repair DNA synthesis, followed by (5) ligation (Hu et al. 2017). In Homo sapiens, NER requires the participation of at least 16 proteins (Mu et al. 1995). Prokaryotes, by contrast, require only three proteins, UvrA, UvrB, and UvrC, which are collectively termed the UvrABC system. The UvrABC system of E. coli has been studied extensively and serves as a model for NER (Van Houten 1990; Van Houten et al. 2005).
In E. coli, the DNA repair pathway is initiated through the formation of a DNA damage-recognition protein complex containing the UvrA and UvrB proteins (Cordone et al. 2011; Verhoeven et al. 2002) (Fig. 3). UvrA plays a key role in damage recognition, since it preferentially binds to damaged DNA in the absence of other NER components (Stracy et al. 2016). UvrA seems to recognize the unwinding and bending of the 'damaged' DNA, as well as aberrations of the global conformation of the double helix, but does not seem to probe the stability of the base interactions. This mechanism allows UvrA to detect various DNA lesions and achieves broad specificity (Jaciuk et al. 2011). UvrA is also required for the loading of UvrB to form the preincision complex at the site of a DNA lesion (Truglio et al. 2006).
After damage identification, the UvrAB complex unwinds the DNA around the lesion, allowing direct access of UvrB to the lesion (Zou and Van Houten 1999). UvrA then loads UvrB onto the damaged DNA site (Stracy et al. 2016). Following dissociation of UvrA from the complex, UvrB forms a stable UvrB-DNA preincision complex (Theis et al. 2000). The preincision complex is subsequently bound by UvrC, which makes incisions on both sides of the damaged DNA. It first makes a cut at the fourth or fifth nucleotide from the 3′ side of the damage, followed by incision at the eighth nucleotide from the 5′ side of the damage (Verhoeven et al. 2000). The resulting 12- to 13-nt fragment is ultimately removed by UvrD (helicase II) and the resulting single-stranded gap is filled by DNA polymerase I. In the final step, DNA ligase seals the remaining nick (Fig. 3).
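The arithmetic behind the 12- to 13-nt excision product can be made explicit with a toy model of the dual incision. This is only a sketch under a stated counting convention chosen to reproduce the fragment sizes given above: the 5′ incision is placed 8 positions upstream of a single damaged nucleotide, and the 3′ incision 4 or 5 positions downstream of the start of the lesion. Real lesions (for example pyrimidine dimers) span more than one nucleotide, and the function name and example sequence are hypothetical.

```python
from typing import Tuple


def excise_fragment(strand: str, lesion_index: int,
                    three_prime_offset: int = 4) -> Tuple[str, str]:
    """Toy model of UvrC dual incision on a strand written 5'->3'.

    Returns (excised_fragment, gapped_strand); '-' marks the
    single-stranded gap left for DNA polymerase I to fill.
    Counting convention is an assumption made for illustration.
    """
    if three_prime_offset not in (4, 5):
        raise ValueError("3' incision offset must be 4 or 5 in this model")
    start = lesion_index - 8                  # 5' incision site
    end = lesion_index + three_prime_offset   # 3' incision site
    if start < 0 or end > len(strand):
        raise ValueError("lesion too close to the end of the strand")
    fragment = strand[start:end]
    gapped = strand[:start] + "-" * len(fragment) + strand[end:]
    return fragment, gapped


# Invented 24-nt strand with a lesion at index 10.
strand = "ACGTACGTACGTACGTACGTACGT"
for offset in (4, 5):
    fragment, gapped = excise_fragment(strand, 10, offset)
    print(len(fragment), fragment, gapped)
```

With these offsets the excised oligomer is 12 or 13 nt long and always contains the damaged position, leaving a gap of the same length to be filled by repair synthesis and sealed by ligation.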
The importance of the proteins involved in NER was previously shown in several studies using mutant bacteria that were unable to express these proteins (LeCuyer et al. 2010). The absence of a single component of the system reduces the survival of bacteria exposed to DNA-damaging agents. For example, Mycobacterium tuberculosis strains that lack either uvrA or uvrB are more sensitive than the wild-type strain to UV light and other DNA-damaging agents, such as mitomycin C and reactive oxygen and nitrogen intermediates (Rossi et al. 2011). Similar results were reported for a Mycobacterium smegmatis uvrA mutant (Cordone et al. 2011). Furthermore, E. coli uvrA−, uvrB− or uvrC− mutants were unable to excise UV-induced pyrimidine dimers from their genomes, and were highly mutable (Howard-Flanders and Boyce 1966; Ishii and Kondo 1975; Kato 1972).

UvrA homologs from M. pneumoniae and M. genitalium
The UvrA proteins belong to the ATP-binding cassette (ABC) superfamily of ATPases (ABC-type ATPases).
Proteins that belong to this superfamily use energy derived from ATP hydrolysis to catalyze a variety of biochemical reactions (Thomas et al. 1986; Wilkens 2015). ABC ATPases share several conserved functional regions in their structures, such as the Walker A/P loop, Q loop, ABC signature, Walker B, D loop, and H loop (Hopfner and Tainer 2003; Thomas et al. 1986; Wilkens 2015). As shown in Fig. 4, those regions are highly conserved among the predicted UvrA proteins from several bacterial species, including the representatives from M. genitalium and M. pneumoniae, UvrA Mge and UvrA Mpn, respectively. The latter two proteins are encoded by ORFs MG421 and MPN619, respectively (Fraser et al. 1995; Himmelreich et al. 1996), and are 85% identical on the amino acid level. To date, the functions of these predicted mycoplasma proteins have not yet been determined.
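Conserved regions such as the Walker A/P loop can be located in a protein sequence with a simple pattern search. The sketch below uses the commonly cited Walker A consensus G-x(4)-G-K-[ST]. It is a crude first-pass screen rather than a substitute for the multiple alignment shown in Fig. 4, since individual proteins may deviate from the consensus. The test sequence is invented and the function name is hypothetical.

```python
import re
from typing import List, Tuple

# Commonly cited Walker A (P loop) consensus: G-x(4)-G-K-[ST].
WALKER_A = re.compile(r"G.{4}GK[ST]")


def find_walker_a(protein: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched motif) for non-overlapping matches."""
    return [(m.start(), m.group()) for m in WALKER_A.finditer(protein.upper())]


# Invented toy sequence containing one consensus match.
print(find_walker_a("MAAAGLSGSGKSTLLAA"))
```

A hit only indicates that a stretch of sequence fits the consensus. Whether the region actually binds nucleotide has to be established experimentally, which, as noted above, has not yet been done for the mycoplasma UvrA homologs.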

UvrB homologs from M. pneumoniae and M. genitalium
As explained above, E. coli UvrB is a specific DNA damage-binding protein with helicase and strand-separating activities that plays a central role in the multistep process of DNA damage recognition and incision. It interacts first with UvrA, then with UvrC, and finally with UvrD and DNA polymerase I to complete the excision repair (Theis et al. 1999). UvrB proteins contain five domains, termed 1a, 1b, 2, 3 and 4, with ATP-binding sites located between domains 1a and 3. These proteins have six helicase motifs, three of which [helicase motif I (Walker A), II (Walker B), and III] are located within domain 1a, whereas the other three (helicase motif IV, V, and VI) are located within domain 3 (Linton 2007; Theis et al. 1999, 2000), indicating that UvrB is a member of the helicase superfamily. As depicted in Fig. 5, the six helicase motifs are more conserved among bacterial species than are the other regions, indicating the important functional role of these motifs. UvrB uses its helicase-like activity to locally unwind DNA at the site of DNA damage (Linton 2007; Machius et al. 1999). The multiple sequence alignment shown in Fig. 5 also indicates the relatively conserved β-hairpin region of UvrB proteins. It has been shown for E. coli UvrB that this region is involved in DNA damage recognition and UvrC-mediated incision (Gordienko and Rupp 1997).
The MG073 and MPN211 ORFs of M. genitalium and M. pneumoniae, respectively, have previously been predicted to encode UvrB homologs (Fraser et al. 1995; Himmelreich et al. 1996). These homologs, termed UvrB Mge and UvrB Mpn, respectively, display 76% identity, while the similarity between UvrB Mge and UvrB Eco is 45%. These Mycoplasma genes have not yet been subjected to functional analyses.

UvrC homologs from M. pneumoniae and M. genitalium
As described above, E. coli UvrC can produce single-strand cuts on either side of a DNA lesion. These cuts are executed by two endonuclease domains located at the N-terminal and C-terminal parts of the protein. These domains are separated by a highly variable linker region (Goosen and Moolenaar 2008). While UvrC can bind the DNA-containing lesions alone, the binding efficiency is increased significantly in complex with UvrB (Uphoff and Sherratt 2017).
Although M. genitalium MG206 was annotated as an ORF that has the potential to encode a homolog of UvrC proteins (Fraser et al. 1995), the MG206-derived amino acid sequence demonstrated only 25% similarity with the UvrC protein from E. coli (UvrC Eco). A significantly higher similarity was observed between UvrC Eco and the UvrC homolog from M. pneumoniae M129 (59%) (Fig. 6). Interestingly, an M. genitalium mutant with a deletion of MG206 was reported to have a growth deficiency as well as an increased sensitivity to UV-induced DNA damage (Burgos et al. 2012), indicating that the MG206-encoded protein (UvrC Mge) has a potential role in DNA repair in M. genitalium.

Functional UvrD homologs from M. pneumoniae and M. genitalium
Interestingly, neither M. pneumoniae nor M. genitalium possesses an obvious uvrD gene homolog in its genome (Table 1) (Carvalho et al. 2005). Instead, these bacteria possess ORFs (MPN340 and MPN341 of M. pneumoniae, and MG244 of M. genitalium; Table 1) that encode PcrA helicases that belong to the same family (superfamily 1, SF1) as UvrD (Estevao et al. 2013). PcrA homologs are found in all gram-positive bacteria, including Bacillus subtilis and Staphylococcus aureus, as well as in bacteria belonging to the Firmicutes and Mollicutes classes (Petit and Ehrlich 2002; Singleton et al. 2007).
M. genitalium MG244 encodes a single PcrA helicase (PcrA Mge ) that represents the ortholog of the M. pneumoniae MPN341-encoded protein termed PcrA Mpn . The second ORF encoding a PcrA homolog in M. pneumoniae, MPN340, is not found in other Mycoplasma spp. Interestingly, the length of this ORF (1,590 bp) is considerably shorter than that of MPN341 (2,148 bp). Sequence analysis of the MPN340-encoded proteins (PcrA2 M129 and PcrA2 FH from subtype 1 and subtype 2 strains, respectively) showed that they lacked a so-called 2B subdomain that is found in most SF1 DNA helicases. Surprisingly, all four proteins were found to have divalent cation- and ATP-dependent DNA helicase activity (Estevao et al. 2013). It is therefore possible that these proteins may be involved in UvrD-like activities in NER in both M. pneumoniae and M. genitalium.
Although the E. coli UvrD protein only has functional counterparts (but not orthologs) in both Mycoplasma spp., the conservation of the other actors of the UvrABC system suggests that the NER pathway in the mycoplasmas may function in a similar way as in E. coli (Carvalho et al. 2005).

Fig. 4
Multiple alignment of UvrA(-like) amino acid sequences. The multiple alignment was generated with the amino acid sequences predicted to be encoded by the following ORFs [the code in parentheses represents the GenBank accession numbers (https://www.ncbi.nlm.nih.gov/)]: Mycoplasma genitalium G37 (NP_073092), Mycoplasma pneumoniae M129 (NP_110308.1), Mycobacterium tuberculosis (P63380.1), Escherichia coli str. K-12 substr. MG1655 (NP_418482.1), Bacillus caldotenax (AAK29748.1), and Thermotoga maritima (Q9WYV0.1). The program Clustal W (https://www.ebi.ac.uk/Tools/msa/clustalw2) was used to generate a multiple alignment of the amino acid sequences. The program BOXSHADE, version 3.21 (https://www.ch.embnet.org/software/BOX_form.html), was used to generate white letters on black boxes (for residues that are identical in at least four out of eight sequences) and white letters on gray boxes (for similar residues). Secondary structure elements are based on the crystal structure of UvrA from T. maritima (Jaciuk et al. 2011) and indicated as colored lines above the sequences (ATP-binding I, red; signature I, pink; UvrB binding, yellow; insertion (DNA binding), purple; linker, black; ATP-binding II, blue; and signature II, cyan)

Fig. 5
… (POA8F8.2). The program Clustal W (https://www.ebi.ac.uk/Tools/msa/clustalw2) was used to generate the multiple alignment of amino acid sequences. The program BOXSHADE, version 3.21 (https://www.ch.embnet.org/software/BOX_form.html), was used to generate white letters on black boxes (for residues that are identical in at least four out of seven sequences) and white letters on gray boxes (for similar residues). The annotation of the helicase motifs I-VI (HM I-VI) and β-hairpin (β-H) is based on the crystal structure of UvrB from B. caldotenax (Theis et al. 1999)

… conserved during evolution. BER fixes small base lesions that do not induce large distortions in the DNA helix structure (Krokan and Bjoras 2013). In short, BER involves four steps of repair: (1) recognition and incision at the abasic site, (2) gap generation, (3) repair synthesis, and (4) DNA ligation.
The pathway is initiated by the search for DNA lesions by specific DNA glycosylases (for example, uracil DNA glycosylase of E. coli). After recognition, damage-specific DNA glycosylases remove the damaged bases by cleaving the N-glycosyl bond between the base and the sugar, which results in an abasic or apurinic/apyrimidinic (AP) site in the DNA. Generation of the gap at this specific lesion is performed by class II AP endonucleases (endonuclease IV and exonuclease III), which specifically cleave at abasic sites, and RecJ protein, which excises a 5′-terminal deoxyribose-phosphate residue. Class I AP lyases can also participate in this process by making incisions at the 3′ side of AP sites. Finally, repair synthesis and ligation are performed by DNA polymerase I and DNA ligase, respectively (Dianov and Lindahl 1994; Kow 1994) (Fig. 7). Failure to remove AP sites in genomic DNA will result in blockade of DNA replication or mutation of the genome.

Fig. 6
… Bacillus mycoides Rock 3-17 (ZP_04158875.1) and Escherichia coli str. K-12 (POA860.1). The program Clustal W (https://www.ebi.ac.uk/Tools/msa/clustalw2) was used to generate the multiple alignment of amino acid sequences. The program BOXSHADE, version 3.21 (https://www.ch.embnet.org/software/BOX_form.html), was used to generate white letters on black boxes (for residues that are identical in at least four out of seven sequences) and white letters on gray boxes (for similar residues)
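The shading rule stated in the alignment figure captions (black boxes for residues identical in at least four of seven sequences, gray boxes for similar residues) is a per-column threshold count. The Python sketch below is a simplified illustration of that rule and not a reimplementation of BOXSHADE: the function name, the similarity groups and the toy sequences are assumptions, and the real program offers further options that are not modeled here.

```python
from collections import Counter

# Toy similarity groups (an assumption for illustration only).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]


def is_similar(a, b):
    """True if both residues fall into the same toy similarity group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def shade_alignment(rows, min_identical):
    """Label each aligned residue: 'B' (black), 'G' (gray) or '.' (unshaded)."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("expected a non-empty alignment with rows of equal length")
    shaded = [[] for _ in rows]
    for column in zip(*rows):
        counts = Counter(res for res in column if res != "-")
        consensus = None
        if counts:
            residue, count = counts.most_common(1)[0]
            if count >= min_identical:
                consensus = residue  # column passes the identity threshold
        for i, res in enumerate(column):
            if consensus is None or res == "-":
                shaded[i].append(".")
            elif res == consensus:
                shaded[i].append("B")
            elif is_similar(res, consensus):
                shaded[i].append("G")
            else:
                shaded[i].append(".")
    return ["".join(cells) for cells in shaded]


# Seven hypothetical aligned fragments (not taken from the figures).
toy_rows = ["MKLA", "MKLA", "MRIA", "MKVG", "MQLS", "M-LT", "MELA"]
shading = shade_alignment(toy_rows, min_identical=4)
```

In this toy alignment the second column stays unshaded because its most frequent residue occurs only three times, which is below the four-of-seven threshold.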

Nfo Mpn and Nfo Mge
In contrast to other bacterial classes, which are known to harbor multiple enzymes coordinately involved in BER, M. pneumoniae and M. genitalium were hypothesized to possess only a single BER-associated enzyme, i.e. a homolog of Nfo (or EndoIV) proteins (Fraser et al. 1995; Himmelreich et al. 1996). The best characterized Nfo protein is the one derived from E. coli (Nfo Eco ). Nfo Eco is a multifunctional protein that is known to participate in BER by recognizing AP sites and cleaving the DNA at the 5′ side of the damaged residue. It also has an intrinsic 3′ → 5′ exonuclease activity (Kerins et al. 2003). In addition to BER, the Nfo protein has been demonstrated to be involved in an alternative, overlapping pathway of DNA repair, termed nucleotide incision repair (NIR). In this system, Nfo performs the initial incision step at various types of oxidative stress-induced DNA damage, providing target sites for DNA polymerase to finalize the repair process (Golan et al. 2010; Ischenko and Saparbaev 2002). Characterization of Nfo homologs derived from other distantly related species, such as Thermus thermophilus (Nfo Tth ), Thermotoga maritima (Nfo Tma ), and Chlamydophila pneumoniae (Nfo Cpn ), demonstrated that they all have activities similar to those of Nfo Eco (Back et al. 2006; Kerins et al. 2003; Liu et al. 2007).
The Nfo homologs from M. pneumoniae (Nfo Mpn ) and M. genitalium (Nfo Mge ) are encoded by ORFs MPN328 and MG235, respectively. Both proteins share a high degree of similarity (65% identity) and are capable of cleaving the phosphodiester bond immediately 5′ to AP sites in damaged DNA. Nfo Mpn and Nfo Mge were also found to possess 3′ → 5′ exonuclease activity in the presence of Mg2+. In addition, both proteins were shown to recognize and remove larger DNA lesions, such as cholesteryl-modified bases in the DNA (Estevao et al. 2014).

Conclusion and future perspectives
Antigenic variation in M. pneumoniae and M. genitalium is likely generated through homologous recombination between specific, repetitive DNA elements that are dispersed throughout the bacterial genomes. Characterization of the complete set of proteins involved in homologous DNA recombination in M. pneumoniae and M. genitalium has indicated that the functional activities of at least some of these proteins differ from those of their counterparts in other bacteria.
Additionally, mycoplasmas have evolved strategies to maintain the integrity of their 'minimal' genomes through an efficient DNA repair system. Comparative genomic analysis indicated that the NER pathway may be the only 'complete' DNA repair pathway in these species. While both human mycoplasmas also harbor genes potentially involved in BER, they do not encode a full set of BER-associated proteins, as found in other bacterial taxa.
Further characterization of the protein repertoire involved in homologous DNA recombination and repair in M. pneumoniae and M. genitalium is important, as it will identify the minimal enzymatic requirements for both generating bacterial genetic diversity (antigenic variation) and maintaining genomic integrity.

Fig. 7
Recognition and repair of damaged DNA by the Base Excision Repair (BER) system. The pathway is initiated by the recognition of a damaged base by DNA glycosylases. Subsequently, the damaged base is removed, resulting in an abasic or apurinic/apyrimidinic (AP) site in the DNA. Then, the deoxyribosyl phosphate backbone is cleaved by AP endonucleases, followed by repair synthesis and ligation (by DNA polymerase I and DNA ligase, respectively)