Small RNA Function in Plants: From Chromatin to the Next Generation

Small RNA molecules can target a particular virus, gene, or transposable element (TE) with a high degree of specificity. Their ability to move from cell to cell and recognize targets in trans also allows building networks capable of regulating a large number of related targets at once. In the case of epigenetic silencing, small RNA may use the widespread distribution of TEs in eukaryotic genomes to coordinate many loci across developmental and generational time. Here, we discuss the intriguing role of plant small RNA in targeting transposons and repeats in pollen and seeds. Epigenetic reprogramming in the germline and early seed development provides a mechanism to control genome dosage, imprinted gene expression, and incompatible hybridizations via the "triploid block."

Epigenetic phenomena have long been recognized and even harnessed in plants. Examples include early studies from Barbara McClintock on how transposons act as controlling elements through the production of a "repressor substance" (McClintock 1961), imprinting and the nonreciprocal fate of hyperploid seeds (Blakeslee et al. 1920; Kermicle 1970), transgenerational paramutation (Coe 1966; Chandler 2010), and somaclonal variation (Ong-Abdullah et al. 2015). It is now widely accepted that small RNA molecules (sRNAs) (21 to 24 nt long in plants) actively participate in all these phenomena, but how these molecules themselves influence cell fate in subsequent generations remains mysterious (Heard and Martienssen 2014). Exciting recent studies in plants and in other eukaryotes now point to new roles for these small RNAs that demand further investigation.
From developmental signaling to virus resistance, sRNAs have many important functions in plant cells, yet the majority of them target repeats and transposable elements (TEs) previously considered largely inert (Borges and Martienssen 2015). Active transposons present the cell with an important challenge: their propensity to copy themselves and "move" within the genome, and to cause chromosomal instability, necessitates tight control. What makes sRNAs particularly well suited to silence these elements is their ability to recognize sequence homology in trans. Indeed, host defense against invasive species appears to be a conserved function of sRNAs and associated factors going all the way back to prokaryotes (Swarts et al. 2014; Chuong et al. 2017). Moreover, the massive expansion of specific transposon families in different eukaryotic genomes provides abundant regulatory opportunities for sRNA-mediated control of gene expression at a genome-wide scale. Here, we summarize recent findings on how plant sRNAs regulate cell fate and transgenerational inheritance of epigenetic states.

GENERATING SMALL RNA MOLECULES
sRNAs in plants have different sizes owing to a variety of DICER-LIKE (DCL) proteins, each specializing in a given size class (Borges and Martienssen 2015). In flowering plants, repeated regions and TEs typically produce small interfering RNA molecules (siRNAs) that are 23 to 24 nt long, although abundant 21-nt and 22-nt siRNAs are also produced in some species or specific developmental contexts and are known as epigenetically activated siRNA (easiRNA) (Nuthikattu et al. 2013; Creasey et al. 2014). easiRNA precursors can be transcribed by RNA polymerase II (Pol II) or the plant-specific RNA polymerase IV (Pol IV), which shares many subunits with Pol II and is generally required for TE siRNA production. Pol IV generates short transcripts (Blevins et al. 2015; Zhai et al. 2015) that are immediately converted into double-stranded RNA by RNA-DEPENDENT RNA POLYMERASE2 (RDR2), which forms a complex with Pol IV and is required for polymerase activity (Singh et al. 2019).
Importantly, the emergence of Pol IV allowed transcription of heterochromatic regions that are refractory to Pol II, thus allowing abundant production of siRNAs from transcriptionally silent transposons and repeats. One important factor required for Pol IV recruitment to a subset of heterochromatic loci is the SAWADEE HOMEODOMAIN HOMOLOG1 (SHH1) protein, which binds to dimethylated lysine 9 on histone H3 (H3K9me2) and resembles homeobox transcription factors (Law et al. 2013). Targeting of transcription to heterochromatin also occurs at Piwi-interacting RNA (piRNA) clusters in Drosophila, via the methylated H3K9 binding protein Rhino, presumably for much the same reason (Mohn et al. 2014; Yu et al. 2018). More recently, it was shown that the different members of the SWI2/SNF2-type chromatin remodeling protein family CLASSY (CLSY) also assist in Pol IV recruitment at specific targets (Zhou et al. 2018). Although CLSY1 and 2 appear to act together with SHH1, the recruiting partner(s) of CLSY3 and 4 are still unknown.
Paradoxically, RNA polymerase is required for expression, whereas sRNAs are associated with silencing. Perhaps for this reason, there is a constant need for surveillance of Pol II transcripts to intercept potentially problematic transcripts derived from transposons. In Arabidopsis thaliana, Pol II-dependent TE transcripts are rapidly targeted by a variety of endogenous microRNAs (miRNAs) that trigger production of 21- to 22-nt siRNAs via RDR6, DCL2, and DCL4 (Creasey et al. 2014). Interestingly, certain miRNA families in plants evolved to target long terminal repeat (LTR) retrotransposons specifically. Most notably, the miR845 family targets retrotransposons at the conserved primer-binding site (PBS) where transfer RNAs (tRNAs) bind to initiate reverse transcription (Borges et al. 2018). Targeting the PBS with small RNA is a common mechanism for transposon control in both mammals and plants, via 3′ fragments of mature tRNA in mammals, and via miRNA derived from rearranged tRNA in plants (Šurbanovski et al. 2016; Schorn et al. 2017). At least in mammals, 3′CCA-tRF are potent inhibitors of retrotransposition and might provide a uniquely sensitive means to monitor transposon activity in eukaryotic genomes.
In plants, miRNAs are produced by the enzyme DCL1, whereas secondary siRNAs can be the product of either DCL2, DCL3, or DCL4 (or any combination of the three) working in combination with RDR6 (Borges and Martienssen 2015). Although most sRNAs from TEs are 24 nt in length, there are a few repeated loci in Arabidopsis that generate 21- and 22-nt secondary siRNAs. These siRNAs resemble easiRNAs, which arise in specific genetic, cellular, and/or temporal contexts and depend on DCL2 and DCL4. For instance, the massive activation of transposons in DEFECTIVE IN DNA METHYLATION1 (DDM1) mutants causes easiRNA accumulation in Arabidopsis, maize, and tomato (Slotkin et al. 2009; Creasey et al. 2014; Corem et al. 2018; Fu et al. 2018). easiRNAs also arise in support cells within gametophytes and in the seed, where TEs are epigenetically reactivated during reprogramming (Slotkin et al. 2009; Calarco et al. 2012; Ibarra et al. 2012). Intriguingly, easiRNAs in pollen depend on a noncanonical pathway involving components of both the siRNA and secondary siRNA pathways in plants, including RNA Pol IV, DCL2, and DCL4 (Borges et al. 2018; Martinez et al. 2018). This is especially interesting as heterochromatin is lost from the vegetative nucleus (VN) (the "nurse cell" in pollen) as well as from the microspore. In seeds, the same pathway is required for the biogenesis of DCL4 isoform-dependent siRNAs (disiRNAs). These 21-nt small RNAs depend on a DCL4 isoform found in Arabidopsis and potentially in other Brassicaceae like Capsella rubella, which includes a nuclear localization signal (NLS) and depends on loss of DNA methylation for expression (Pumplin et al. 2016). Loss of DNA methylation and expression of this isoform is found in pollen and in endosperm, consistent with these observations (Pumplin et al. 2016; our unpublished results). A recent study also showed that sRNA molecules of 21, 22, and 24 nt are dependent on Pol IV in the microspores of Arabidopsis and C. rubella (Fig. 1A) (Wang et al. 2020), indicating that biogenesis of gametophytic easiRNA is conserved at least in Brassicaceae.
Importantly, easiRNAs in pollen accumulate in sperm cells, but arise in the microspore and the VN, suggesting that they might move from cell to cell (Slotkin et al. 2009) (Fig. 1B), together with other small RNAs. This was recently shown in Arabidopsis pollen, as transgene small RNA products made by DCL2 and DCL4 moved from the VN to the sperm cells to silence target transcripts (Martínez et al. 2016). One question that remains is whether these molecules diffuse passively or whether they are actively transported by a protein factor, such as a member of the ARGONAUTE (AGO) family. Regardless, there is significant potential for sRNAs loaded in sperm cells to be delivered to the egg and central cells and contribute to early embryo and endosperm development (Fig. 1B,C), as has been shown for at least one miRNA (Zhao et al. 2018). One idea is that easiRNAs mediate the transition from cytoplasmic posttranscriptional gene silencing (PTGS) to nuclear transcriptional gene silencing (TGS) during epigenetic reprogramming, to establish or reinforce transgenerational silencing at imprinted loci (Teixeira and Colot 2010; Borges and Martienssen 2015).
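The size classes discussed in this section are often the first thing inspected in a small RNA sequencing library. The following is a minimal sketch, assuming reads have already been adapter-trimmed and are available as plain strings; the function name, the class labels, and the example reads are illustrative and are not taken from any of the cited studies. Read length alone cannot distinguish a miRNA from an easiRNA, so the 21- to 22-nt bin is deliberately left as a single class.

```python
from collections import Counter

# Illustrative size bins based on the lengths described in the text.
# The labels are shorthand for this sketch, not an established nomenclature.
SIZE_CLASSES = {
    21: "21-22 nt",  # miRNA and easiRNA-like molecules
    22: "21-22 nt",
    23: "23-24 nt",  # typical Pol IV-dependent siRNAs
    24: "23-24 nt",
}


def size_class_profile(reads):
    """Return the fraction of reads falling in each size bin."""
    counts = Counter(SIZE_CLASSES.get(len(read), "other") for read in reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical reads, used only to show the call.
    example = ["A" * 21, "A" * 22, "A" * 24, "A" * 30]
    print(size_class_profile(example))
```

A shift in such a profile toward the 21- to 22-nt bin would be compatible with easiRNA accumulation, but confirming it would require mapping reads to TEs and testing the genetic dependencies described above.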

REPROGRAMMING CHROMATIN WITH SMALL RNA
Epigenetic reprogramming refers to the erasure and resetting of epigenetic marks acquired during the life of the parent, and in mammals it occurs in the gametes and in the embryo (Heard and Martienssen 2014). The extent of reprogramming in plants is much less pronounced, and epigenetic inheritance far more common, as the plant germline maintains high levels of DNA methylation during the sporophytic (somatic) to gametophytic (germline) transition (Calarco et al. 2012; Ibarra et al. 2012; Ingouff et al. 2017; Walker et al. 2018). In plants, sRNAs can trigger cytosine methylation (mC) in all sequence contexts (CG, CHG, and CHH, where H = A, C, or T) via RNA-directed DNA methylation (RdDM) (Matzke and Mosher 2014; Borges and Martienssen 2015; Matzke et al. 2015). CG and CHG methylation are symmetric on both strands and can be maintained during DNA replication by chromatin remodeling and histone modification, but mCHH is asymmetric and requires RdDM (Law and Jacobsen 2010; Borges and Martienssen 2015). Although high levels of mCG are found in male meiocytes, microspores, and differentiated sperm nuclei in the pollen, mCHH is almost completely erased in the male germline (Calarco et al. 2012; Ibarra et al. 2012; Walker et al. 2018), only to be restored to high levels in the mature embryo (Bouyer et al. 2017; Kawakatsu et al. 2017). The role of easiRNA during this form of reprogramming is still poorly understood, but it might contribute to the restoration of CHH methylation during embryogenesis. In C. rubella, it was shown that Pol IV is essential for pollen development (Wang et al. 2020). The lack of a severe phenotype in Arabidopsis Pol IV mutants may be due to a lower content of active transposons, although fertile Pol IV mutants in maize suggest that other mechanisms may be involved. In maize, Pol IV is required for paramutation, a classical epigenetic phenomenon in which epigenetic marks can be acquired from another allele and maintained in following generations (Hollick 2017). How those epialleles are formed and how they acquire the ability to be propagated through cell divisions remains a tantalizing mystery, as DNA methylation seems to play only a minor role. It has, however, been shown in Arabidopsis that some components of small RNA pathways are both required and sufficient to initiate de novo silencing of naive alleles (Fultz and Slotkin 2017; Gallego-Bartolomé et al. 2019).
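The three cytosine sequence contexts defined above (CG, CHG, and CHH, where H = A, C, or T) can be made concrete with a short sketch. It assumes a forward-strand DNA string and classifies each cytosine from the two bases that follow it; the function name and the example sequence are hypothetical, and cytosines too close to the end of the sequence or followed by ambiguous bases are reported as unclassified.

```python
def cytosine_contexts(seq):
    """Return (position, context) for every forward-strand cytosine.

    Context is "CG", "CHG", "CHH", or None when it cannot be determined
    (too close to the 3' end, or followed by a non-ACGT base).
    """
    seq = seq.upper()
    h_bases = "ACT"  # H = A, C, or T
    results = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        following = seq[i + 1:i + 3]
        if len(following) >= 1 and following[0] == "G":
            context = "CG"
        elif len(following) == 2 and following[0] in h_bases and following[1] == "G":
            context = "CHG"
        elif len(following) == 2 and following[0] in h_bases and following[1] in h_bases:
            context = "CHH"
        else:
            context = None
        results.append((i, context))
    return results


if __name__ == "__main__":
    # Hypothetical sequence chosen to contain one cytosine of each context.
    for position, context in cytosine_contexts("CGCAGCTTC"):
        print(position, context)
```

The symmetry mentioned in the text follows directly from these definitions: the reverse complement of CG is CG and that of CHG is CDG (D = A, G, or T), so both strands carry a cytosine at the site, whereas a CHH site has no cytosine in the matching position on the opposite strand.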
Small RNA-loaded AGO4 (and likely its homologs AGO6 and AGO9 in the case of A. thaliana) is recruited to chromatin via noncoding RNA scaffolds produced by another plant-specific DNA-dependent RNA polymerase, Pol V. Target recognition triggers the slicing of the RNA molecule (Liu et al. 2018) and, subsequently, recruitment of factors such as INVOLVED IN DE NOVO METHYLATION2 (IDN2) (Böhmdorfer et al. 2014). IDN2 binds to long (10- to 11-nt) 5′ overhangs, presumably reflecting the cleaved RNA molecule still bound to the small RNA, and in turn recruits a variety of chromatin remodeling proteins responsible for silencing (Zhu et al. 2013; Liu et al. 2016), including DOMAINS REARRANGED METHYLTRANSFERASE2 (DRM2), which methylates the corresponding DNA in all sequence contexts (Böhmdorfer et al. 2014). Such methylation is thought to be recognized by members of the SU(VAR)3-9 HOMOLOG (SUVH) family through their SRA methylated DNA-binding domains, which can then establish H3K9 methylation in the case of SUVH4, 5, and 6 (Stroud et al. 2014) or propagate DNA methylation by recruiting additional Pol V in the case of SUVH2 and 9. In doing so, the RdDM pathway perpetuates an environment that is refractory to Pol II transcription, thereby keeping TEs silent.
Pol IV mutations lead to genome-wide depletion of 24-nt sRNAs (Zhou et al. 2018), but CHH methylation is only lost at a subset of those targets. The difference between the small RNA-dependent and -independent effects on asymmetric DNA methylation levels appears to be related to the presence or absence of Pol V at the locus (Wierzbicki et al. 2012). Indeed, the precise extent of the regions silenced by sRNAs appears to be dictated solely by the activity of Pol V (Böhmdorfer et al. 2016; Liu et al. 2018). It has been proposed that Pol V recruitment is guided by epigenetic marks like mC and H3K9me2, but the correlation is far from perfect (Böhmdorfer et al. 2016). Importantly, if it simply relies on preexisting marks, Pol V recruitment could not account for de novo establishment of silencing. Pol V recruitment also depends on the DDR complex, composed of DRD1, DMS3, and RDM1 (Zhong et al. 2012), whose partial structure has recently come to light (Wongpalee et al. 2019). Interestingly, the targeting of either DMS3 or RDM1 to a naive locus with a zinc finger is sufficient to trigger strong de novo silencing, but only in the presence of Pol V (Gallego-Bartolomé et al. 2019), consistent with an early recruiting role. Moreover, RDM1 has the capacity to bind to DNA (Gao et al. 2010), and DRD1 is the only protein associated with Pol V that has ATPase activity and chromatin remodeling potential for transcription (Wongpalee et al. 2019), as do other Rad54 homologs (Amitani et al. 2006). All in all, the activity of the DDR complex appears to be key to small RNA-guided epigenetic silencing.

FUNCTIONAL ROLE OF SMALL RNA IN THE GERMLINE
In Arabidopsis, the RdDM pathway targets mostly small repeats, the long terminal repeats of retrotransposons, and certain families of DNA transposons (Matzke et al. 2015). Interestingly, these targets are distributed not only in pericentromeric regions where these elements are most abundant, but also in intergenic regions along the chromosome arms in Arabidopsis, tomato, and maize (Gouil and Baulcombe 2016). The proximity between these targeted regions and neighboring genes, therefore, creates the opportunity for sRNAs to influence their transcriptional activity. This type of control is observed in the seed, where small RNA-directed epigenetic silencing activity is more pronounced (Bouyer et al. 2017; Kawakatsu et al. 2017). Imprinted genes, or alleles differentially expressed in the endosperm depending on the parent of origin, are a striking example of such control. The maternally expressed imprinted genes FWA, SDC, MOP9.5, DOG1-LIKE 4 (DOGL4), and ALLANTOINASE (ALN) were all shown to be methylated in the endosperm on the paternal side, and this was dependent on sRNA targeting a repeated element found in the promoter (Fig. 1D) (Vu et al. 2013; Zhu et al. 2018; Iwasaki et al. 2019). Interestingly, DOGL4 and ALN are involved in seed dormancy, thereby linking sRNA pathways to this agriculturally important trait. Similar examples of small RNAs regulating imprinted gene expression on the maternal side have also been shown to influence seed development and ultimately seed size (Kirkbride et al. 2019).
It is clear that sRNAs target transcriptionally active regions to silence them, but several questions remain as to how this mechanism is scaled up to hundreds or thousands of genes that must be expressed or silenced in a coordinated fashion (Chuong et al. 2017). This invokes the genetic concept of "balance" to explain phenotypic variation associated with genome dosage (Birchler and Veitia 2007). Early studies of distinctive phenotypes dependent on the dosage of parental chromosomes in plants and flies trace back to the 1920s (Blakeslee et al. 1920; Belling and Blakeslee 1923; Bridges 1925). A few years later in maize, McClintock also reported the spontaneous appearance of "a triploid individual notably more vigorous than its diploid sibs" (McClintock 1929). Albert Blakeslee, working at Cold Spring Harbor Laboratory, first reported the inability to hybridize closely related species or parents with different ploidy, which is often associated with failure in endosperm function and seed collapse. For example, interploidy hybridizations with paternal excess result in endosperm overproliferation and seed abortion, sometimes known as the "triploid block" (Köhler et al. 2010). sRNAs and TEs were recently implicated in the triploid block response in Arabidopsis, as the loss of miR845 and Pol IV-dependent easiRNA in diploid pollen restored viability of triploid seeds almost completely (Erdmann et al. 2017; Borges et al. 2018; Martinez et al. 2018). Strikingly, this is highly reminiscent of hybrid dysgenesis in Drosophila, although in this case piRNAs protect the hybrid (Martienssen 2010; Malone et al. 2015). Thus, plants and animals use similar small RNA guides to control transposon activity and dosage in hybrid genomes.
The triploid block response led to the endosperm balance number (EBN) hypothesis, which was developed in the early 1980s in potato and then extended to many other crop species, to explain the ratio between maternal and paternal chromosomes via genetic factors required for development of a normal seed (Johnston and Hanneman 1982; Ehlenfeldt and Hanneman 1984; Carputo et al. 1997, 1999). We can now propose that such genetic factors may include small RNA loci, imprinted genes, and TEs that were found up-regulated in abortive interploid and interspecific hybrid seeds (Josefsson et al. 2006; Lu et al. 2012; Stoute et al. 2012; Kirkbride et al. 2015; Florez-Rueda et al. 2016; Roth et al. 2018). This includes maternally expressed genes (MEGs) that encode important components of the Polycomb repressor complex (PRC2), such as MEDEA and FERTILIZATION-INDEPENDENT SEED2 (FIS2), that silence the paternal alleles via deposition of H3K27me3 (Lafon-Placette and Köhler 2015; Satyaki and Gehring 2017). Small RNAs and the RdDM pathway are also required for genomic imprinting (Vu et al. 2013; Erdmann et al. 2017; Zhu et al. 2018; Iwasaki et al. 2019; Kirkbride et al. 2019), whereas mutations in PRC2 result in endosperm defects closely resembling the triploid block. These observations lend support to the old idea that endosperm failure in interploidy crosses results from disruption of genomic imprinting (Köhler et al. 2010). Indeed, previous studies in Arabidopsis have shown that loss-of-function mutations in paternally expressed genes (PEGs) are able to suppress the triploid block, allowing the formation of viable triploid seeds (Kradolfer et al. 2013; Wolff et al. 2015). Taken together, current models propose that paternally inherited easiRNAs mediate "endosperm balance" by targeting TEs flanking imprinted genes, thereby regulating their expression. For instance, if MEGs are direct targets of easiRNA regulation, this could result in PRC2 depletion and up-regulation of PEGs. However, there is still no clear evidence that PRC2 activity is impaired in abortive triploid seeds. On the other hand, targeting TEs flanking PEGs could result in their direct up-regulation in a dosage-dependent manner (Martinez et al. 2018), though target loci involved in this response have not been identified either.
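The genome dosage arithmetic behind interploidy crosses can be worked through with a small sketch. It relies on standard background that is not spelled out in the text and should be read as an assumption of the example: gametes are fully reduced (each carries half the parental genome copies), the embryo arises from egg plus sperm, and the endosperm arises from a central cell carrying two maternal gamete complements plus one sperm. The function name and the returned keys are hypothetical.

```python
from fractions import Fraction


def seed_dosage(maternal_ploidy, paternal_ploidy):
    """Genome dosage in embryo and endosperm for a given cross.

    Assumes fully reduced gametes and a central cell contributing
    two maternal gamete complements (an assumption of this sketch).
    """
    egg = Fraction(maternal_ploidy, 2)
    sperm = Fraction(paternal_ploidy, 2)
    central_cell = 2 * egg
    return {
        "embryo_ploidy": egg + sperm,
        "endosperm_ploidy": central_cell + sperm,
        "endosperm_maternal_to_paternal": central_cell / sperm,
    }


if __name__ == "__main__":
    crosses = {
        "2x mother x 2x father": (2, 2),
        "2x mother x 4x father (paternal excess)": (2, 4),
        "4x mother x 2x father (maternal excess)": (4, 2),
    }
    for label, (mother, father) in crosses.items():
        print(label, seed_dosage(mother, father))
```

Under these assumptions a balanced diploid cross gives a 2:1 maternal-to-paternal ratio in the endosperm, whereas a diploid mother pollinated by a tetraploid father gives a triploid embryo with a 1:1 endosperm ratio, the paternal-excess situation associated with the triploid block. The sketch counts whole genomes only; it does not model the easiRNA, imprinted gene, or TE dosage effects proposed in the text.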
Although defects in the RdDM pathway typically have a minor impact on somatic phenotypes in Arabidopsis (Sasaki et al. 2012), mutants losing both 24-nt siRNA and maintenance of DNA and histone methylation show various developmental phenotypes because of the misregulation of imprinted genes (Zemach et al. 2013). For example, SDC, an imprinted gene normally expressed maternally in endosperm cells, is up-regulated during the somatic developmental stage in that context, resulting in curly leaf and short stature phenotypes (Henderson and Jacobsen 2008; Zemach et al. 2013). In addition, reduced fertility and floral defects are also observed in the mutant defective in DDM1 and RDR6-dependent 21-nt and 22-nt sRNAs (Creasey et al. 2014), suggesting an important role of epigenetically activated 21-nt and 22-nt sRNAs. These defects are more severe when 24-nt siRNA is further depleted (Sasaki et al. 2012), and triple mutants suffer from chromosome missegregation (A Shimada and RA Martienssen, unpubl. data), reminiscent of the regulation of chromosome segregation by RNA interference in Schizosaccharomyces pombe (Gutbrod and Martienssen 2020). Thus, in somatic tissues, sRNA molecules work cooperatively as a backup system for DNA methylation to maintain plant fitness and chromosome integrity.
It is revealing that the RdDM pathway is only found intact in the angiosperms (flowering plants) (Lee et al. 2011; Huang et al. 2015). For example, key factors appear to be absent or incomplete in the gymnosperm Norway spruce, Picea abies (Matzke et al. 2015; Ausin et al. 2016; Pei et al. 2019). It is tempting to speculate that RdDM evolved with the appearance of double fertilization and genomic imprinting in the endosperm of flowering plants, a tissue that has no equivalent in gymnosperms. It has been speculated that this pathway evolved as a facilitator of polyploidization, which has had a crucial role in the evolution of angiosperms (Matzke et al. 2015), but is relatively rare in gymnosperms (Ickert-Bond et al. 2020). This is also consistent with a key role for sRNA in mediating genome dosage responses during interploidy hybridization barriers in the endosperm (Erdmann et al. 2017; Borges et al. 2018; Martinez et al. 2018; Moreno-Romero et al. 2019; Satyaki and Gehring 2019). sRNA molecules may indeed target all the different copies of a given TE and mark them with DNA methylation, ensuring their transcriptional control and potentially accelerating their decay and elimination. In line with this hypothesis, it appears that concerted changes in sRNA abundance and epigenetic changes occur mostly at transposons in autotetraploid rice and in rapeseed following whole genome duplication (Cheng et al. 2016). There is also evidence of a transient increase in sRNA levels in newly synthesized allopolyploid Brassica napus (Martinez Palacios et al. 2019), whereas CHH methylation levels at TEs flanking genes are anticorrelated with their expression and may, therefore, be related to the establishment of subgenome dominance in interspecies crosses (Edger et al. 2017). All this points to an active role for sRNAs in unexpected "genomic shocks" that occur when distantly related genomes meet (McClintock 1984).

CONCLUSION
Since their discovery, sRNAs have been implicated in a plethora of biological processes ranging from gene regulation to defense against invaders. Among these important functions, their ability to be inherited by the next generation and influence it remains one of the most intriguing. Flowering plants present a fantastic opportunity to study such phenomena because of their propensity to transmit epigenetic information to their offspring. The knowledge gained is already being applied to address emerging agricultural and environmental challenges.