Remarkable Evolutionary Plasticity of Centromeric Chromatin

Centromeres were familiar to cell biologists in the late 19th century, but for most eukaryotes the basis for centromere specification has remained enigmatic. Much attention has been focused on the cenH3 (CENP-A) histone variant, which forms the foundation of the centromere. To investigate the DNA sequence requirements for centromere specification, we applied a variety of epigenomic approaches, which have revealed surprising diversity in centromeric chromatin properties. Whereas each point centromere of budding yeast is occupied by a single precisely positioned tetrameric nucleosome with one cenH3 molecule, the "regional" centromeres of fission yeast contain unphased presumably octameric nucleosomes with two cenH3s. In Caenorhabditis elegans, kinetochores assemble all along the chromosome at sites of cenH3 nucleosomes that resemble budding yeast point centromeres, whereas holocentric insects lack cenH3 entirely. The "satellite" centromeres of most animals and plants consist of cenH3-containing particles that are precisely positioned over homogeneous tandem repeats, but in humans, different α-satellite subfamilies are occupied by CENP-A nucleosomes with very different conformations. We suggest that this extraordinary evolutionary diversity of centromeric chromatin architectures can be understood in terms of the simplicity of the task of equal chromosome segregation that is continually subverted by selfish DNA sequences.

ing proteins. These considerations led us to identify and characterize cenH3s from diverse model organisms, including Caenorhabditis elegans (Buchwitz et al. 1999), Drosophila melanogaster (Henikoff et al. 2000), Arabidopsis thaliana (Talbert et al. 2002), rice, and maize (Jin et al. 2004). Indeed, we found that evolutionary divergence is a general feature of cenH3s, despite the fact that H3 and its general replacement variant, H3.3, are among the most highly conserved proteins known. We referred to the remarkable divergence of both centromeric satellite DNA and its dedicated histone as the centromere paradox: rapid evolution with a conserved function that is essential through every cell division.
Harmit Malik in the laboratory made the key observation that Drosophila cenH3 (Cid) is evolving adaptively using standard population genetic measures, suggesting an arms race. Reasoning that the arms race would be between centromeric satellites and cenH3, we posited that centromeres compete for inclusion into the egg at female meiosis I. Our "centromere drive" hypothesis combined the concept of female meiotic drive described by Rhoades for the selfish segregation behavior of distal heterochromatic knobs in maize (Rhoades 1952) with that of centromere "strength" for the inferred female meiotic orientation of heterochromatin-rich centromeres by Novitski (1955). Centromere drive has since become the standard explanation for the centromere paradox, based on evidence from humans (Daniel 2002), monkey flowers (Fishman and Willis 2005), and mice (Chmátal et al. 2014), and it is an attractive general mechanism for postzygotic reproductive isolation as species diverge (Henikoff and Malik 2002; Burt and Trivers 2006).
The realizations that satellite centromeres are selfish and that cenH3s and other CCAN proteins may act to suppress drive raise the question as to how these CCAN proteins interact with their DNA substrates. We started by studying S. cerevisiae centromeres, with simple point centromeres that provide a basis for comparison to the more complex regional centromeres of Schizosaccharomyces pombe, the holocentromeres of C. elegans and holocentric insects, and the satellite centromeres of humans. In each case we have applied high-resolution mapping technologies to address the basic problem of centromere specificity. We find that the evolutionary divergence of centromeric chromatin is reflected in the plasticity of CCAN complex composition between different organisms and sometimes of different centromeres of the same individual.

BUDDING YEAST CENTROMERES ARE OCCUPIED BY HEMISOMES
Based on a comparison of centromeres from different budding yeast chromosomes, Clarke and Carbon (1985) identified consensus motifs that they termed centromere-determining element (CDE) I, II, and III. Subsequent work identified the ∼120-bp segment spanning CDEI-II-III as the sequence-specific core of each of the 16 S. cerevisiae centromeres. The 8-bp CDEI consensus is the binding site for the Cbf1 sequence-specific transcription factor, and the 26-bp CDEIII consensus is the binding site for the centromere-specific multisubunit CBF3 complex. These elements flank the 82 ± 4-bp CDEII, which lacks a specific consensus, but consists of ≥90% AT-rich DNA. Micrococcal nuclease (MNase) mapped a single Cse4 nucleosome to the functional centromere (Furuyama and Biggins 2007). This observation is difficult to reconcile with the availability of only ∼80 bp of DNA for wrapping, suggesting the presence of a particle with fewer than the eight subunits in an H3 nucleosome, which wraps 147 bp of DNA.
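The sequence constraints on CDEII described above can be made concrete with a small check. The following is a minimal Python sketch and is not taken from the original work: the function names `at_fraction` and `looks_like_cdeii` and the example sequence are hypothetical, chosen only to illustrate how one might test whether a candidate segment falls within 82 ± 4 bp and meets the ≥90% AT criterion.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A/T bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


def looks_like_cdeii(seq: str, min_at: float = 0.90,
                     length_range: tuple = (78, 86)) -> bool:
    """Hypothetical filter: length within 82 +/- 4 bp and AT content >= 90%."""
    lo, hi = length_range
    return lo <= len(seq) <= hi and at_fraction(seq) >= min_at


# Made-up 82-bp sequence for illustration only (not a real centromere).
example = "ATTTAAATATTTTAAAATAT" * 4 + "GC"
print(len(example), round(at_fraction(example), 3), looks_like_cdeii(example))
```

Such a filter captures only length and base composition; it says nothing about the flanking CDEI and CDEIII consensus sites, which a real centromere also requires.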
To address this conundrum, Takehito Furuyama determined the chirality of the DNA superhelical writhe around the Cse4-containing core, using superhelical density mapping of circular chromosomes. This analysis revealed that the budding yeast centromere induces positive DNA supercoils, indicative of a right-handed writhe around the particle, opposite to that of left-handed H3 nucleosomes (Furuyama and Henikoff 2009). Our supercoiling measurements were later confirmed and extended in a study suggesting that the right-handed configuration is enforced by the formation of a DNA loop held together by the CBF3 complex, excluding CDEI (Díaz-Ingelmo et al. 2015). A possible rationale for the right-handed superhelical wrap is that DNA overwinds when stretched (Gore et al. 2006), so that pulling on the centromere at anaphase would be expected to cause the (right-handed) DNA to tighten around the particle.
Using a new high-resolution native chromatin immunoprecipitation (ChIP) method with V-plot analysis (Henikoff et al. 2011), Krassovsky et al. (2012) succeeded in mapping the Cse4-containing particle directly over CDEII. Meanwhile, some cytological studies detected more fluorescent Cse4-GFP particles over centromeres than could plausibly fit within CDEII (Coffman et al. 2011; Lawrimore et al. 2011), challenging the "point" interpretation of yeast centromeres (Furuyama and Biggins 2007). However, our quantitative ChIP-seq data could exclude the possibility that additional molecules are incorporated outside of the genetically defined centromere, rather suggesting that unincorporated Cse4 molecules are closely associated with functional centromeres. Thus, only two plausible models for the budding yeast centromere remained: a (Cse4/H4)₂ "tetrasome" or a (Cse4/H4/H2A/H2B) "hemisome," either of which could wrap the ∼80-bp CDEII DNA with right-handed chirality.
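A V-plot displays each sequenced fragment as a point whose x-coordinate is the fragment midpoint relative to a feature and whose y-coordinate is the fragment length. The sketch below illustrates that tabulation step in Python under stated assumptions: the function name `vplot_counts`, the half-open coordinate convention, and the toy fragments are all hypothetical and are not the authors' pipeline.

```python
from collections import Counter


def vplot_counts(fragments, center, window=100):
    """Count fragments by (midpoint offset from center, fragment length).

    fragments: iterable of (start, end) half-open coordinates on one chromosome.
    center: coordinate of the feature of interest (e.g., a CDEII midpoint).
    window: keep fragments whose midpoint lies within +/- window bp of center.
    """
    counts = Counter()
    for start, end in fragments:
        length = end - start
        if length <= 0:
            continue  # skip malformed intervals
        midpoint = (start + end) // 2
        offset = midpoint - center
        if abs(offset) <= window:
            counts[(offset, length)] += 1
    return counts


# Toy fragments for illustration only; the last one lies outside the window.
fragments = [(960, 1040), (965, 1045), (900, 1047), (2000, 2150)]
counts = vplot_counts(fragments, center=1000)
for (offset, length), n in sorted(counts.items()):
    print(offset, length, n)
```

Plotting the resulting (offset, length) pairs as a scatter or heat map gives the V-plot; a small particle centered on the feature appears as a cluster of short fragments near offset zero.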
To determine whether hemisomes can be assembled on short segments of DNA, Takehito Furuyama adopted a classical salt-assembly method (Tatchell and Van Holde 1979), demonstrating that Cse4 but not H3 hemisomes that are stable in 4 M urea can be readily produced on CDEII DNA (Furuyama et al. 2013). Our finding suggested that the extreme AT richness of CDEII is an adaptation for exclusion of conventional nucleosomes from the functional centromere, as poly(dA:dT) tracts in the yeast genome are known to exclude nucleosomes (Struhl and Segal 2013).
To definitively determine the composition and conformation of the budding yeast centromeric nucleosome in vivo, we applied the Widom chemical cleavage mapping method, in which histone H4 is derivatized such that it cleaves DNA next to the dyad (Brogaard et al. 2012). By mapping the sites of cleavage, we could determine the precise location of H4 with base-pair resolution and also determine whether there is one H4 or two H4s. Results were unequivocal: Mapping of H4 cleavages at all 16 yeast centromeres showed only a single H4 with cleavage sites within CDEII consistent with a hemisome and only with a hemisome among all proposed structures (Fig. 1, middle panel). More recently, we have used our novel CUT&RUN chromatin profiling method (Skene and Henikoff 2017) to show the presence of histone H2A in the CDEII particle (Fig. 2), definitively excluding the (Cse4/H4)₂ tetrasome model (Mizuguchi et al. 2007; Wisniewski et al. 2014).

FISSION YEAST CENTROMERES LACK POSITIONING AND ARE POPULATED BY OCTAMERIC NUCLEOSOMES
Examination of the fungal phylogeny argues that point centromeres, which are exclusive to the Saccharomycetes, have evolved from "regional" centromeres that lack obvious sequence specificity (Malik and Henikoff 2009). For example, centromeres of the parasitic budding yeast Candida albicans show no sequence similarity to those of its close relative Candida dubliniensis (Padmanabhan et al. 2008), and deletion or replacement results in efficient formation of cenH3-enriched neocentromeres nearby (Baum et al. 2006; Ketel et al. 2009; Thakur and Sanyal 2013). Centromeres of the fission yeast S. pombe consist of a series of heterochromatic outer repeats flanking a cenH3-rich core that lacks obvious sequence-specific features. Thus it would appear that budding yeast point centromeres have emerged from a fungal ancestor that lacks sequence-specific centromeres, perhaps the result of colonization by the 2-µm parasitic element that segregates autonomously within the Saccharomycetes (Malik and Henikoff 2009). Consistent with an independent origin of point centromeres from regional centromeres, we have found that the cenH3 nucleosomes at the centromeric core of S. pombe regional centromeres are completely different from those of S. cerevisiae. Fission yeast cenH3 nucleosomes show no detectable positioning within the central core, and H4 chemical cleavage mapping reveals that there are two H4s per particle (Fig. 1, bottom panel), indicative of octameric nucleosomes. Thus, the extreme diversity of centromere architecture is reflected in the diversity of centromeric chromatin within the fungal lineage, where point centromeres are bound by well-phased hemisomes and regional centromeres are bound by unphased octasomes.

C. ELEGANS HOLOCENTROMERES ARE POLYCENTRIC
The most fundamental distinction between centromeres is the distinction between familiar monocentromeres and holocentromeres, in which microtubule attachments occur throughout the length of the chromosome. Our cloning and cytological characterization of cenH3 from the nematode worm C. elegans revealed that it occupies the leading edge of each sister chromosome as it segregates to the pole at mitosis, and absence of cenH3 within the bulk of chromatin implied that holocentromeres are discontinuous (Buchwitz et al. 1999). Standard ChIP-seq uncovered a large domain structure of low-density cenH3 (Gassmann et al. 2012), and Florian Steiner's high-resolution native ChIP revealed the presence of approximately 100 well-positioned cenH3 loci dispersed throughout each chromosome (Steiner and Henikoff 2014). Remarkably, these centromeric loci resembled yeast point centromeres with particle sizes of ∼80 bp flanked by well-phased H3 nucleosomes (Fig. 3). Motif analysis showed that holocentromeres correspond to previously described GA-rich transcription factor hotspots (Gerstein et al. 2010), leading us to speculate that low-affinity binding of transcription factors at these sites during interphase prevents encroachment by flanking nucleosomes, thus maintaining holocentromere sites accessible at mitosis throughout development (Steiner and Henikoff 2014).
The mapping of dispersed point centromeres at transcription factor hotspots throughout C. elegans chromosomes addressed a long-standing question in chromosome biology by showing that holocentromeres are polycentric as opposed to being diffuse (Schrader 1935). More recently, native ChIP-seq of centromeres of the parasitic nematode Ascaris suum suggested that these centromeres are also polycentric, but they do not appear to be point centromeres in that cenH3-rich regions consist of high-density arrays of 1-15 kb, with particle sizes of ∼140 bp, consistent with conventional octasomes (Kang et al. 2016).

INSECT HOLOCENTROMERES LACK cenH3
Holocentricity has evolved in multiple lineages of animals and plants (Melters et al. 2012). For example, holocentromeres have evolved independently in insects at least four times (Drinnenberg et al. 2014). When Ines Drinnenberg was a postdoctoral fellow jointly with the Malik laboratory, she discovered that cenH3 and other proteins of the CCAN are absent from holocentric Lepidopteran lineages. Remarkably, loss of these CCAN proteins coincided with all four known transitions from monocentricity to holocentricity in insects (Fig. 4). In contrast, proteins of the outer kinetochore are nearly universally conserved, including in the kinetoplastid Trypanosoma, which also lacks canonical CCAN proteins (Akiyoshi and Gull 2014;Senaratne and Drinnenberg 2017). Identifying the chromatin counterparts of the CCAN in these lineages and determining whether insect holocentromeres are polycentric or diffuse are issues that remain to be addressed (Drinnenberg et al. 2016).

PRECISE POSITIONING OF cenH3 NUCLEOSOMES AT SATELLITE CENTROMERES
The identification of CENP-A nucleosomes as marking functional mammalian centromeres (Palmer and Margolis 1985) was an important advance in centromere biology. However, it was not until 1997 that the equivalent of the Clarke and Carbon construction of a functional centromere was accomplished for a satellite centromere (Harrington et al. 1997). Most human artificial centromeres require hundreds of kilobases of an ∼170-bp α-satellite repeat array organized as higher-order repeats (HORs), including ∼17-bp consensus binding sites for CENP-B protein (Hayden et al. 2013). Several studies have used ChIP-seq to show that CENP-A nucleosomes occupy α-satellite sequences, although the exact composition and conformation of the particles continue to be debated (Bui et al. 2012; Hasson et al. 2013; Lacoste et al. 2014; Athwal et al. 2015; Henikoff et al. 2015; Thakur and Henikoff 2016; Nechemia-Arbely et al. 2017). We have found that much of the disagreement stems from the difficulty of mapping ChIP-seq reads to tandemly repeated DNA sequences, which have proven to be intractable using standard tools for sequence assembly. To circumvent this problem, we clustered CENP-A ChIP-seq reads de novo, identifying two families of homogeneous dimeric repeats with CENP-B boxes that dominate human centromeres. This extended earlier work of Alexandrov et al. (2001), who originally identified these two "suprachromosomal" α-satellite families, SF1 and SF2, as homogeneous dimeric arrays present on 20 of the 24 different human centromeres. Indeed, we identified a preponderance of precisely positioned 100-bp MNase protected particles for these two dominant families, but we also identified larger particles for other HORs, such as the DXZ1 SF3 subfamily (Fig. 5), that are less homogeneous in CENP-A ChIP-seq data.
Our findings suggested that particle conformation detected by native MNase ChIP-seq can differ greatly between different human α-satellite families, with precise positioning a characteristic of the most homogeneous repeats. Precise rotational positioning of cenH3 nucleosomes is a feature of homogeneous satellite centromeres in rice, as determined in a collaborative ChIP-seq study from the Jiming Jiang laboratory (Zhang et al. 2013).

A COHERENT INNER KINETOCHORE COMPLEX OCCUPIES YOUNG HUMAN CENTROMERIC REPEATS
Whereas we and others had observed 100-bp α-satellite particles using native MNase-based ChIP-seq (Hasson et al. 2013; Henikoff et al. 2015), we also found that formaldehyde cross-linking resulted in protection of particles that were larger than nucleosome size (Thakur and Henikoff 2016). This observation suggested that MNase cleavage under the conditions we used for native ChIP-seq was disrupting particle integrity, but that cross-linking held together a particle consisting of additional components. In our ChIP-seq study of fission yeast centromeres, we had identified CENP-T as an integral component of CENP-A and CENP-C enriched chromatin. However, we initially failed to identify human CENP-T by native MNase-based ChIP-seq, consistent with the prevailing view that in mammals CENP-T makes connections with H3 but not CENP-A nucleosomes (McKinley and Cheeseman 2016). As connections to the outer kinetochore are made independently by both CENP-C and CENP-T, the concept of separate anchors for the kinetochore on the DNA was a central issue in understanding centromere biology. The possibility of independent connections between centromeric DNA and the outer kinetochore seemed plausible, insofar as CENP-T and its three partner proteins in the CENP-TWSX complex should be sufficient to directly anchor the outer kinetochore: All four are histone-fold proteins that can be assembled in vitro into nucleosome-like particles that stably wrap DNA in a right-handed orientation. But we wondered: Could it be that the CENP-TWSX particle accounts for the difference in protection by CENP-A/CENP-C particles using native versus formaldehyde cross-linking MNase-ChIP?
To directly address this possibility, we reasoned that the conditions that are used for MNase ChIP-seq might leave behind more condensed particles. Differential salt solubility is a feature of conventional nucleosomes, which can be fractionated into classical "active" (low-NaCl), histone H1-rich (high-NaCl), and "nuclear matrix" (insoluble) components (Sanders 1978;Henikoff et al. 2009). Indeed, using cross-linking and light sonication after MNase digestion, we found that CENP-T containing α-satellite particles could be recovered in robust amounts, and CENP-T particles precisely co-mapped with CENP-A and CENP-C (Thakur and Henikoff 2016). It is possible that estimated particle size differences reported in studies using native MNase ChIP-seq (Hasson et al. 2013;Lacoste et al. 2014;Henikoff et al. 2015;Nechemia-Arbely et al. 2017) can be explained by the lability of CENP-TWSX-containing particles when subjected to MNase treatment.
To confirm that a coherent CCAN particle containing CENP-A, CENP-C, and CENP-T occupies functional human centromeres, we performed tandem ChIP-seq on FLAG-CENP-A particles subjected to cross-linking and heavy digestion with MNase to produce single coherent particles. After first immunoprecipitating with an anti-FLAG antibody and eluting with FLAG peptide, we immunoprecipitated with CENP-A, CENP-B, CENP-C, and CENP-T antibodies, obtaining strong enrichment for each component (Fig. 6). Enrichment was seen for consensus sequences representing SF1, SF2, and SF3 subfamilies and for the Y centromere, which indicates that at all human centromeres a single CCAN complex contains four of these centromere-specific DNA-binding components.
Figure 4. Insect holocentromeres lack cenH3. Four separate transitions from monocentric to holocentric kinetochores have occurred within insect evolution (red H). In each case, the transition involves loss of the inner kinetochore proteins cenH3 and CENP-C, except in Odonata, where CENP-C is retained but no longer has the CENPC motif that recognizes cenH3. Outer kinetochore components, notably Ndc80, which attaches directly to microtubules, are generally preserved. (Adapted from Drinnenberg et al. 2014.)
Figure 5. Precise positioning of cenH3 nucleosomes at human α-satellite centromeres. Young α-satellite dimers precisely position ∼100-bp CENP-A nucleosome particles. (A-C) Size distributions of fragments mapping to an SF1 consensus dimer (A) and an SF2 consensus dimer (B) and to the most proximal 6-kb region of DXZ1 (C), which belongs to SF3. Graphs on the right are expansions of graphs on the left (indicated by brackets). The y-axis scale is for input normalized counts, and the areas under the other curves were equalized to that for input. (Reprinted from Henikoff et al. 2015.)
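The Figure 5 caption notes that the areas under the ChIP fragment-size curves were equalized to that of the input. One plausible reading of that normalization is sketched below in Python; the function name `equalize_area`, the assumption of equal-width length bins (so that area equals the sum of counts), and the toy numbers are hypothetical and are not drawn from the published analysis.

```python
def equalize_area(profile, reference):
    """Rescale `profile` so its summed counts equal those of `reference`.

    Both arguments are sequences of counts over the same equal-width
    fragment-length bins, so the sum stands in for the area under the curve.
    """
    total = sum(profile)
    if total == 0:
        raise ValueError("profile has zero total counts")
    scale = sum(reference) / total
    return [value * scale for value in profile]


# Toy size distributions for illustration only.
input_counts = [1, 2, 3, 4]
chip_counts = [0, 5, 10, 5]
scaled = equalize_area(chip_counts, input_counts)
print(scaled)
```

After rescaling, curves from samples with different sequencing depths can be overlaid on one axis, so that differences in shape rather than in total read count stand out.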

UNDERSTANDING THE EVOLUTIONARY PLASTICITY OF CENTROMERIC CHROMATIN
These studies of centromeric chromatin in a wide variety of organisms have uncovered an astonishing diversity that is difficult to reconcile with the conserved function of centromeric chromatin, which is to anchor the outer kinetochore complex (Steiner and Henikoff 2015). What accounts for such remarkable centromeric chromatin plasticity?
Evolutionary plasticity of centromeric chromatin might arise from selfish processes. Centromere drive is a cyclical process, in which satellite sequences expand and compete for the egg pole at meiosis I but are suppressed by mutations in host CCAN and other proteins. Successful drive of a satellite centromere may eventually result in fixation of that satellite sequence in the species evolving into a genetically defined centromere, whereas successful suppression of drive over evolutionary time tends to reduce centromere sequence specificity evolving into an epigenetically defined centromere (Dawe and Henikoff 2006). Interestingly, the cenH3s of the holocentric plant genus Luzula show no evidence of centromere drive, in contrast to the cenH3s of monocentric plants (Zedek and Bureš 2016), which suggests that elimination of a fixed position on the chromosome that can be colonized by selfish DNA may be an effective defense against centromere drive (Fig. 7; Talbert et al. 2008). In contrast, the holocentromeres in nematode oocyte meiosis are axially oriented so that the chromosome end functionally resembles a monocentromere (Albertson and Thomson 1993). This telokinetic form of chromosome disjunction may be less effective at avoiding centromere drive than the holokinetic disjunction of equatorially orientated Luzula chromosomes (Heckmann et al. 2014). Although the role of cenH3 in the cup-like kinetochore of Caenorhabditis meiosis is disputed (Chan et al. 2004; Monen et al. 2005), it differs between oocyte and spermatocyte meiosis (Shakes et al. 2009), and cenH3 shows evidence of positive selection (Zedek and Bureš 2012), suggesting that centromere drive may affect telokinetic holocentromeres. This may explain the massive accumulation of heterochromatic sequences at the ends of the single chromosome in Parascaris univalens (Talbert et al. 2008).
Whereas native human centromeres are genetically defined in that they are dominated by specific α-satellite arrays on each centromere, rare cases of epigenetically defined human neocentromeres suggest that centromeric chromatin plasticity is an inevitable consequence of an ever-changing DNA sequence substrate for CCAN assembly. Other selfish processes may be responsible for transitions between different fungal centromere types, such as the proposed colonization of Saccharomycetes centromeres by its 2-µm plasmid (Malik and Henikoff 2009).
Figure 6. A coherent inner kinetochore complex occupies young human centromeric repeats. Sequential ChIP-seq profiles of CCAN components are nearly identical. A single α-satellite dimer from each array is shown, and the relative scale is the area of the indicated profile divided by the area of the D5Z1 profile, in which the numbers reflect the product of the total sequence abundance and enrichment. Because DXZ1, DYZ3, D19Z1, Xmono, and D5Z1 are not dimeric units, we chose pairs of tandem monomers as representatives. (Based on data from Thakur and Henikoff 2016.)
One common feature of genetically defined centromeres, whether the point centromeres of Saccharomycetes or the satellite centromeres of animals and plants, is that they are well-positioned. In contrast, the central core of fission yeast regional centromeres lacks obvious sequence specificity and shows random positioning of nucleosomes. Precise positioning of nucleosomes, which is especially evident in the satellite centromeres of rice (Zhang et al. 2013), is a sequence-specific adaptation of satellite DNA that has been exploited for the first successful crystallization of H3 nucleosomes, which were produced using a human α-satellite DNA derivative (Harp et al. 1996; Luger et al. 1997). Precise phasing, together with the action of satellite-specific DNA-binding proteins, such as CENP-B, may represent the means whereby selfish centromeres perpetuate themselves. CENP-B is derived from the transposase encoded by the pogo DNA transposon, and it is attractive to imagine that its original domestication as a centromere protein was via recruitment by a driving centromere.
The domestication of selfish elements appears to be responsible for the most remarkable example of centromere evolution, the partial or complete replacement of the satellite centromeres of the wild ancestor of maize by a selfish element in maize inbred lines (Schneider et al. 2016). During domestication of maize from teosinte over the past ∼9000 years, selection for agronomic traits was accompanied by dense insertion of CR2 centromere-specific retrotransposons and frequent loss by small inversions and deletions of the ancestral 155-bp CentC satellite in all 10 maize centromeres, 57 times independently in 26 inbred lines. Domestication and inbreeding would thus appear to be a powerful driver of centromere plasticity and could account for the near-complete loss of satellite arrays at several rice and potato (Gong et al. 2012) centromeres and the complete loss of repetitive sequences from horse centromere 11 (Purgato et al. 2015). Thus, whether the direct result of centromere competition, the colonization by selfish elements, or the indirect result of domestication, strong selective forces appear to be drivers of the remarkable plasticity of centromeric chromatin.

CONCLUSION
Our investigations into the molecular basis for centromere function have uncovered an unexpected plasticity in chromatin architecture, with evidence for histone-containing particles that are unlike any nucleosomes observed on chromosome arms. These studies have led to evidence for centromeres with right-handed writhes, hemisomes, and coherent CCAN particles containing multiple centromere-specific histone-fold and nonhistone proteins. Such structural diversity might seem surprising considering the simplicity of the conserved function of centromeric chromatin to connect to the outer kinetochore. However, the centromere's control of its chromosome's fate at every cell division has unleashed powerful selfish and selective forces over evolutionary time and even during human domestication that have resulted in such interesting molecular complexity.
Figure 7. Holokinetic meiosis suppresses centromere drive. In female meiosis, three meiotic products (−) degenerate and one (+) survives. With random segregation each chromatid in the meiotic tetrad has an equal opportunity to be oriented toward the egg pole. However, selfish DNA that influences centromere "strength" through greater assembly of kinetochore components may be preferentially transmitted (centromere drive), while simultaneously creating deleterious centromere imbalance in males. Adaptation of kinetochore proteins to restore centromere parity may suppress drive, but cycles of drive and suppression will repeat. In the holokinetic chromosomes of Luzula, the equatorial orientation of many broadly distributed kinetochore sites makes it difficult for any localized selfish element to gain a transmission advantage, effectively ending cycles of drive and suppression (Zedek and Bureš 2016).

ACKNOWLEDGMENTS
We thank Mitchell Smith, who approximately 20 years ago pointed out that yeast Cse4 and human CENP-A are no more similar to one another than to H3, a curious fact that led to our subsequent focus on histone variants and centromere evolution. We also thank former members of our laboratory whose work and insights over the years have contributed to this narrative.