Aegilops tauschii Aet5.0 Assembly & Annotation

Overview

Analysis Name: Aegilops tauschii Aet5.0 Assembly & Annotation
Sequencing technology: PacBio
Assembly method: FALCON v. 2
Release Date: 2021-09-30
Reference Publication(s):

Wang L, Zhu T, Rodriguez JC, Deal KR, Dubcovsky J, McGuire PE, Lux T, Spannagl M, Mayer KFX, Baldrich P, Meyers BC, Huo N, Gu YQ, Zhou H, Devos KM, Bennetzen JL, Unver T, Budak H, Gulick PJ, Galiba G, Kalapos B, Nelson DR, Li P, You FM, Luo MC, Dvorak J. Aegilops tauschii genome assembly Aet v5.0 features greater sequence contiguity and improved annotation. G3 (Bethesda). 2021 Dec 8;11(12):jkab325. doi: 10.1093/g3journal/jkab325.

Abstract

Aegilops tauschii is the donor of the D subgenome of hexaploid wheat and an important genetic resource. The reference-quality genome sequence Aet v4.0 for Ae. tauschii acc. AL8/78 was therefore an important milestone for wheat biology and breeding. Further advances in sequencing acc. AL8/78 and release of the Aet v5.0 sequence assembly are reported here. Two new optical maps were constructed and used in the revision of pseudomolecules. Gaps were closed with Pacific Biosciences long-read contigs, decreasing the gap number by 38,899. Transposable elements and protein-coding genes were reannotated. The number of annotated high-confidence genes was reduced from 39,635 in Aet v4.0 to 32,885 in Aet v5.0. A total of 2245 biologically important genes, including those affecting plant phenology, grain quality, and tolerance of abiotic stresses in wheat, was manually annotated and disease-resistance genes were annotated by a dedicated pipeline. Disease-resistance genes encoding nucleotide-binding site domains, receptor-like protein kinases, and receptor-like proteins were preferentially located in distal chromosome regions, whereas those encoding transmembrane coiled-coil proteins were dispersed more evenly along the chromosomes. Discovery, annotation, and expression analyses of microRNA (miRNA) precursors, mature miRNAs, and phasiRNAs are reported, including miRNA target genes. Other small RNAs, such as hc-siRNAs and tRFs, were characterized. These advances enhance the utility of the Ae. tauschii genome sequence for wheat genetics, biotechnology, and breeding.

Assembly statistics

Genome size: 4.2 Gb
Number of chromosomes: 7
Number of scaffolds: 109,196
Scaffold N50: 576.2 Mb
Scaffold L50: 4
Number of contigs: 176,978
Contig N50: 211.2 kb
Contig L50: 5,744
Assembly level: Chromosome
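
For readers unfamiliar with the N50 and L50 figures above, the following is a minimal sketch of how such values are conventionally computed from a list of sequence lengths. The function name `n50_l50` is illustrative only; it is not taken from the Aet v5.0 pipeline, and the actual statistics may have been produced by a different tool.

```python
def n50_l50(lengths):
    """Return (N50, L50) for an iterable of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total
    assembly length; L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no sequence lengths given")
```

Applied to scaffold lengths this would give the scaffold N50 and L50; applied to contig lengths, the contig values.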

Assembly

The Aegilops tauschii Aet5.0 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file): GCF_002575655.2_Aet_v5.0_genomic.fna.gz
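
As a hedged illustration of working with the compressed FASTA file, the sketch below streams a gzipped FASTA file and reports the length of each sequence without loading the whole genome into memory. It assumes the file has already been downloaded to a local path; the function name `fasta_lengths` is hypothetical.

```python
import gzip


def fasta_lengths(path):
    """Yield (sequence_id, length) for each record in a gzipped FASTA file."""
    name, length = None, 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if name is not None:
                    yield name, length
                fields = line[1:].split()
                name, length = (fields[0] if fields else ""), 0
            elif line:
                length += len(line)
    if name is not None:
        yield name, length


# Example call (assumes the file sits in the working directory):
# for seq_id, size in fasta_lengths("GCF_002575655.2_Aet_v5.0_genomic.fna.gz"):
#     print(seq_id, size)
```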

Gene Predictions

Gene prediction files for the Aegilops tauschii Aet5.0 genome are available in GFF3 and FASTA formats.

Downloads

Genes (GFF3 file): GCF_002575655.2_Aet_v5.0_genomic.gff.gz
Translated CDS sequences (FASTA file): GCF_002575655.2_Aet_v5.0_translated_cds.faa.gz
Protein sequences (FASTA file): GCF_002575655.2_Aet_v5.0_protein.faa.gz
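
The GFF3 file can be summarized with a few lines of code. The sketch below counts features of a given type per sequence, relying only on the standard nine-column, tab-separated GFF3 layout (seqid, source, type, start, end, score, strand, phase, attributes). The function name `count_features` is hypothetical, and the feature type `gene` is an assumption about how genes are labelled in this particular file.

```python
import gzip
from collections import Counter


def count_features(path, feature_type="gene"):
    """Count GFF3 features of one type per sequence ID in a gzipped file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip directives and comments
            cols = line.rstrip("\n").split("\t")
            # Valid feature lines have exactly nine columns; type is column 3.
            if len(cols) == 9 and cols[2] == feature_type:
                counts[cols[0]] += 1
    return counts


# Example call (assumes the file sits in the working directory):
# print(count_features("GCF_002575655.2_Aet_v5.0_genomic.gff.gz"))
```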

Functional Analysis

Functional annotation for the Aegilops tauschii Aet5.0 assembly is available for download below. The proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan: Aegilops_tauschii_Aet5.0.Pfam.tsv.gz
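
The column layout of the Pfam table is not documented on this page. The sketch below assumes it follows the standard InterProScan TSV output, in which column 1 is the protein accession, column 4 the analysis name (for example `Pfam`), and column 5 the signature accession. If the file deviates from that layout, the column indices need adjusting. The function name `proteins_by_signature` is hypothetical.

```python
import csv
import gzip
from collections import defaultdict


def proteins_by_signature(path, analysis="Pfam"):
    """Map each signature accession to the set of proteins carrying it.

    Assumes standard InterProScan TSV columns (an assumption, not
    confirmed for this file): 0 = protein, 3 = analysis, 4 = signature.
    """
    hits = defaultdict(set)
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) >= 5 and row[3] == analysis:
                hits[row[4]].add(row[0])
    return hits


# Example call (assumes the file sits in the working directory):
# hits = proteins_by_signature("Aegilops_tauschii_Aet5.0.Pfam.tsv.gz")
```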

S genes

Summary

Query        Chromosome    Size (bp)   Coordinates                                tBLASTn Hit                tBLASTn %ID   Domain
DUF247I-SΨ   NC_053035.2   501967303   88829131-88830318                          LpSDUF247-I_chromosome1    74            DUF247
DUF247II-S   NC_053035.2   501967303   88546346-88547986                          LpSDUF247-II_chromosome1   76            DUF247
HPS10-S      NC_053035.2   501967303   88827072-88827192, 88827278-88827411       LpsS_contig11029           57            -
DUF247I-Z    NC_053036.2   650458083   605344109-605345734                        LpZDUF247-I_chromosome2    61            DUF247
HPS10-Z      NC_053036.2   650458083   605242037-605242203, 605242283-605242433   Bhybridum_HPS10-Z          46            -
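
Some entries in the Coordinates column of the summary list more than one interval, separated by a comma. The sketch below parses such a string into intervals and sums their lengths. It assumes the coordinates are 1-based and inclusive, which this page does not state; the helper names `parse_segments` and `total_length` are hypothetical.

```python
def parse_segments(coords):
    """Parse 'start-end, start-end' into a list of (start, end) tuples."""
    segments = []
    for part in coords.split(","):
        start, end = (int(value) for value in part.strip().split("-"))
        segments.append((start, end))
    return segments


def total_length(segments):
    """Sum interval lengths, treating both endpoints as inclusive (assumed)."""
    return sum(end - start + 1 for start, end in segments)


hps10_s = parse_segments("88827072-88827192, 88827278-88827411")
print(total_length(hps10_s))  # 255 under the inclusive assumption
```

Under the same assumption, the single interval 88829131-88830318 has a length of 1188 bp.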

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences