Analysis Name | Aegilops tauschii Aet5.0 Assembly & Annotation |
Sequencing technology | PacBio |
Assembly method | FALCON v. 2 |
Release Date | 2021-09-30 |
Wang L, Zhu T, Rodriguez JC, Deal KR, Dubcovsky J, McGuire PE, Lux T, Spannagl M, Mayer KFX, Baldrich P, Meyers BC, Huo N, Gu YQ, Zhou H, Devos KM, Bennetzen JL, Unver T, Budak H, Gulick PJ, Galiba G, Kalapos B, Nelson DR, Li P, You FM, Luo MC, Dvorak J. Aegilops tauschii genome assembly Aet v5.0 features greater sequence contiguity and improved annotation. G3 (Bethesda). 2021 Dec 8;11(12):jkab325. doi: 10.1093/g3journal/jkab325.
Abstract
Aegilops tauschii is the donor of the D subgenome of hexaploid wheat and an important genetic resource. The reference-quality genome sequence Aet v4.0 for Ae. tauschii acc. AL8/78 was therefore an important milestone for wheat biology and breeding. Further advances in sequencing acc. AL8/78 and release of the Aet v5.0 sequence assembly are reported here. Two new optical maps were constructed and used in the revision of pseudomolecules. Gaps were closed with Pacific Biosciences long-read contigs, decreasing the gap number by 38,899. Transposable elements and protein-coding genes were reannotated. The number of annotated high-confidence genes was reduced from 39,635 in Aet v4.0 to 32,885 in Aet v5.0. A total of 2245 biologically important genes, including those affecting plant phenology, grain quality, and tolerance of abiotic stresses in wheat, was manually annotated and disease-resistance genes were annotated by a dedicated pipeline. Disease-resistance genes encoding nucleotide-binding site domains, receptor-like protein kinases, and receptor-like proteins were preferentially located in distal chromosome regions, whereas those encoding transmembrane coiled-coil proteins were dispersed more evenly along the chromosomes. Discovery, annotation, and expression analyses of microRNA (miRNA) precursors, mature miRNAs, and phasiRNAs are reported, including miRNA target genes. Other small RNAs, such as hc-siRNAs and tRFs, were characterized. These advances enhance the utility of the Ae. tauschii genome sequence for wheat genetics, biotechnology, and breeding.
Assembly statistics
Genome size | 4.2 Gb |
Number of chromosomes | 7 |
Number of scaffolds | 109,196 |
Scaffold N50 | 576.2 Mb |
Scaffold L50 | 4 |
Number of contigs | 176,978 |
Contig N50 | 211.2 kb |
Contig L50 | 5,744 |
Assembly level | Chromosome |
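For readers unfamiliar with the N50 and L50 values in the table above, the following is a minimal Python sketch of how these statistics are conventionally computed from a list of sequence lengths. The function name `n50_l50` and the toy lengths are illustrative assumptions, not part of the Aet v5.0 release, and the sketch is not necessarily the exact procedure used to produce the figures reported here.

```python
def n50_l50(lengths):
    """Return (N50, L50) for an iterable of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total
    length; L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("no sequence lengths given")


# Toy example (hypothetical lengths, not taken from the assembly).
toy_lengths = [100, 80, 60, 40, 20]
print(n50_l50(toy_lengths))  # (80, 2)
```

Applied to the scaffold lengths of an assembly, the first element of the result corresponds to the Scaffold N50 row and the second to the Scaffold L50 row; the same holds for contigs.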
The Aegilops tauschii Aet5.0 Assembly file is available in FASTA format.
Downloads
Chromosomes (FASTA file) | GCF_002575655.2_Aet_v5.0_genomic.fna.gz |
Gene prediction files for the Aegilops tauschii Aet5.0 genome are available in GFF3 and FASTA formats.
Downloads
Genes (GFF3 file) | GCF_002575655.2_Aet_v5.0_genomic.gff.gz |
Translated CDS sequences (FASTA file) | GCF_002575655.2_Aet_v5.0_translated_cds.faa.gz |
Protein sequences (FASTA file) | GCF_002575655.2_Aet_v5.0_protein.faa.gz |
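As a quick check after downloading the GFF3 file listed above, the following Python sketch tallies feature types (the third GFF3 column) using only the standard library. The function names are hypothetical, the toy records are invented for illustration, and the sketch assumes the file follows the standard nine-column, tab-separated GFF3 layout.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 feature types (column 3) from an iterable of text lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section, if present, ends the annotation
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 9:  # only well-formed nine-column records
            counts[fields[2]] += 1
    return counts


def count_feature_types_in_file(path):
    """Same tally, reading directly from a gzip-compressed GFF3 file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return count_feature_types(handle)


# Toy records (hypothetical, not taken from the Aet v5.0 annotation).
toy_gff = [
    "##gff-version 3\n",
    "chrToy\tsrc\tgene\t1\t900\t.\t+\t.\tID=gene1\n",
    "chrToy\tsrc\tmRNA\t1\t900\t.\t+\t.\tID=rna1;Parent=gene1\n",
    "chrToy\tsrc\tCDS\t1\t300\t.\t+\t0\tID=cds1;Parent=rna1\n",
    "chrToy\tsrc\tCDS\t601\t900\t.\t+\t0\tID=cds1;Parent=rna1\n",
]
print(dict(count_feature_types(toy_gff)))  # {'gene': 1, 'mRNA': 1, 'CDS': 2}
```

For the real annotation, `count_feature_types_in_file("GCF_002575655.2_Aet_v5.0_genomic.gff.gz")` would return the same kind of tally, assuming the file has been downloaded to the working directory.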
Functional annotation for the Aegilops tauschii Aet5.0 assembly is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).
Downloads
Domain from InterProScan | Aegilops_tauschii_Aet5.0.Pfam.tsv.gz |
Summary
Query | Chromosome | Chromosome size (bp) | Coordinates | tBLASTn Hit | tBLASTn %ID | Domain |
DUF247I-SΨ | NC_053035.2 | 501967303 | 88829131-88830318 | LpSDUF247-I_chromosome1 | 74 | DUF247 |
DUF247II-S | NC_053035.2 | 501967303 | 88546346-88547986 | LpSDUF247-II_chromosome1 | 76 | DUF247 |
HPS10-S | NC_053035.2 | 501967303 | 88827072-88827192,88827278-88827411 | LpsS_contig11029 | 57 | - |
DUF247I-Z | NC_053036.2 | 650458083 | 605344109-605345734 | LpZDUF247-I_chromosome2 | 61 | DUF247 |
HPS10-Z | NC_053036.2 | 650458083 | 605242037-605242203,605242283-605242433 | Bhybridum_HPS10-Z | 46 | - |
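The Coordinates column above mixes single intervals with comma-separated multi-segment hits. The following Python sketch shows one way to parse such strings and total the covered bases. It assumes the coordinates are 1-based and inclusive, which the table does not state explicitly; the function names are illustrative only.

```python
def parse_coordinates(text):
    """Parse 'start-end[,start-end...]' into a list of (start, end) tuples."""
    intervals = []
    for part in text.split(","):
        start, end = (int(value) for value in part.split("-"))
        intervals.append((start, end))
    return intervals


def covered_length(text):
    """Total number of bases covered, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in parse_coordinates(text))


# Coordinates copied from the HPS10-S and DUF247I-SΨ rows of the table.
print(covered_length("88827072-88827192,88827278-88827411"))  # 255
print(covered_length("88829131-88830318"))                    # 1188
```

Under the inclusive assumption, both totals are multiples of three (255 = 85 × 3 and 1188 = 396 × 3), which is consistent with hits against protein queries.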