Echinochloa crus-galli ec_v3 Assembly & Annotation

Overview

Analysis Name: Echinochloa crus-galli ec_v3 Assembly & Annotation
Sequencing technology: PacBio
Assembly method: canu v1.8
Release Date: 2022-01-10
Reference Publication(s)

Wu D, Shen E, Jiang B, Feng Y, Tang W, Lao S, Jia L, Lin HY, Xie L, Weng X, Dong C, Qian Q, Lin F, Xu H, Lu H, Cutti L, Chen H, Deng S, Guo L, Chuah TS, Song BK, Scarabel L, Qiu J, Zhu QH, Yu Q, Timko MP, Yamaguchi H, Merotto A Jr, Qiu Y, Olsen KM, Fan L, Ye CY. Genomic insights into the evolution of Echinochloa species as weed and orphan crop. Nat Commun. 2022 Feb 3;13(1):689. doi: 10.1038/s41467-022-28359-9.

Abstract

As one of the great survivors of the plant kingdom, barnyard grasses (Echinochloa spp.) are the most noxious and common weeds in paddy ecosystems. Meanwhile, at least two Echinochloa species have been domesticated and cultivated as millets. In order to better understand the genomic forces driving the evolution of Echinochloa species toward weed and crop characteristics, we assemble genomes of three Echinochloa species (allohexaploid E. crus-galli and E. colona, and allotetraploid E. oryzicola) and re-sequence 737 accessions of barnyard grasses and millets from 16 rice-producing countries. Phylogenomic and comparative genomic analyses reveal the complex and reticulate evolution in the speciation of Echinochloa polyploids and provide evidence of constrained disease-related gene copy numbers in Echinochloa. A population-level investigation uncovers deep population differentiation for local adaptation, multiple target-site herbicide resistance mutations of barnyard grasses, and limited domestication of barnyard millets. Our results provide genomic insights into the dual roles of Echinochloa species as weeds and crops as well as essential resources for studying plant polyploidization, adaptation, precision weed control and millet improvements.

Assembly statistics

Genome size (bp): 1,339,828,847
GC content: 45.91%
Genome sequence No.: 2,066
Maximum genome sequence length (bp): 66,322,096
Minimum genome sequence length (bp): 1,425
Average genome sequence length (bp): 648,513
Genome sequence N50 (bp): 49,730,318
Genome sequence N90 (bp): 36,391,468
Assembly level: Chromosome
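
The N50 and N90 values above follow the usual definition: sort the sequences from longest to shortest and report the length at which the running total first reaches 50% (or 90%) of the genome size. The sketch below illustrates that definition on toy lengths and cross-checks the reported average against the reported genome size and sequence count. The function names `sequence_lengths` and `nx` are hypothetical helpers written for this illustration, not part of any pipeline used for this assembly.

```python
def sequence_lengths(lines):
    """Collect per-record sequence lengths from an iterable of FASTA lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def nx(lengths, fraction):
    """Length at which the longest-first running total reaches `fraction` of the total."""
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * fraction
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("no sequence lengths given")


# Toy lengths (illustrative only, not taken from the assembly).
toy_lengths = [50, 40, 30, 20, 10]
toy_n50 = nx(toy_lengths, 0.5)
toy_n90 = nx(toy_lengths, 0.9)

# Cross-check of the figures reported above: genome size / number of sequences.
reported_average = 1_339_828_847 / 2_066
```

To apply this to the real assembly, one option is to pass the text handle returned by `gzip.open(path, "rt")` for the downloaded FASTA file to `sequence_lengths` and then call `nx` on the result.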

Assembly

The Echinochloa crus-galli ec_v3 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file): GWHBDNR00000000.genome.fasta.gz

Gene Predictions

The Echinochloa crus-galli ec_v3 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file): GWHBDNR00000000.gff.gz
CDS sequences (FASTA file): GWHBDNR00000000.CDS.fasta.gz
Protein sequences (FASTA file): GWHBDNR00000000.Protein.faa.gz
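
As a quick sanity check on a downloaded GFF3 file, it can help to tally how many features of each type it contains. The sketch below assumes only the standard GFF3 layout (nine tab-separated columns, feature type in the third column, comment lines starting with `#`). The function name `count_feature_types` and the toy records are hypothetical; the feature types actually present in GWHBDNR00000000.gff.gz have not been checked here.

```python
from collections import Counter


def count_feature_types(lines):
    """Tally GFF3 feature types (column 3) from an iterable of lines."""
    counts = Counter()
    for line in lines:
        # An embedded FASTA section, if present, ends the annotation part.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 record
        counts[fields[2]] += 1
    return counts


# Toy records with made-up IDs and coordinates (illustrative only).
toy_gff = [
    "##gff-version 3\n",
    "chrA\tsrc\tgene\t100\t900\t.\t+\t.\tID=g1\n",
    "chrA\tsrc\tmRNA\t100\t900\t.\t+\t.\tID=g1.t1;Parent=g1\n",
    "chrA\tsrc\tCDS\t100\t400\t.\t+\t0\tID=g1.c1;Parent=g1.t1\n",
    "chrA\tsrc\tCDS\t600\t900\t.\t+\t2\tID=g1.c2;Parent=g1.t1\n",
]
toy_counts = count_feature_types(toy_gff)
```

For the real file, the same function can be fed the text handle from `gzip.open(path, "rt")`.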

Functional Analysis

Functional annotation for the Echinochloa crus-galli ec_v3 assembly is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan: Echinochloa_crus-galli.Pfam.tsv.gz
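
A minimal sketch of how the domain table might be summarised per protein is given below. It assumes the file follows the standard InterProScan TSV layout (protein accession in column 1, analysis name in column 4, signature accession in column 5); this has not been verified against Echinochloa_crus-galli.Pfam.tsv.gz, so the column indices may need adjusting. The function name `pfam_domains_by_protein` and all identifiers in the toy rows are hypothetical placeholders.

```python
from collections import defaultdict


def pfam_domains_by_protein(lines):
    """Map each protein accession to the set of Pfam signature accessions it carries."""
    domains = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # too short to be a standard InterProScan row
        protein, analysis, signature = fields[0], fields[3], fields[4]
        # Assumption: the analysis column names the member database.
        if analysis == "Pfam":
            domains[protein].add(signature)
    return dict(domains)


# Toy rows with placeholder identifiers (illustrative only).
toy_rows = [
    "protA\tmd5a\t300\tPfam\tPF_EXAMPLE_A\tdesc\t10\t280\t1e-50\tT\t01-01-2022\n",
    "protA\tmd5a\t300\tPfam\tPF_EXAMPLE_B\tdesc\t5\t90\t1e-10\tT\t01-01-2022\n",
    "protB\tmd5b\t150\tOtherDB\tX_EXAMPLE\tdesc\t1\t100\t1e-5\tT\t01-01-2022\n",
]
toy_domains = pfam_domains_by_protein(toy_rows)
```

Rows from other analyses are ignored, so a protein without any Pfam hit simply does not appear in the result.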

S genes

Summary

Query        Chromosome       Size (bp)  Coordinates                           tBLASTn Hit  tBLASTn %ID  Domain
DUF247I-S    GWHBDNR00000024  37306403   19955457-19957121                     Shybrid      62           DUF247
DUF247II-SΨ  GWHBDNR00000024  37306403   19465551-19465817                     Shybrid      58           DUF247
HPS10-S      GWHBDNR00000024  37306403   19481296-19481399, 19481549-19481687  ShybridS1    63           -
DUF247I-Z1Ψ  GWHBDNR00000009  38788670   35955639-35957087                     Shybrid      61           DUF247
DUF247I-Z2   GWHBDNR00000018  36391468   33666360-33667985                     Shybrid      60           DUF247
DUF247I-Z3   GWHBDNR00000027  39086771   35924516-35926126                     Shybrid      54           DUF247
DUF247II-Z   GWHBDNR00000027  39086771   35931885-35933543                     Shybrid      55           DUF247
HPS10-Z1     GWHBDNR00000018  36391468   33671417-33671573, 33671654-33671760  ShybridZ5    76           -
HPS10-Z2     GWHBDNR00000027  39086771   35929892-35929995, 35930107-35930233  Orufipogon   35           -
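
The coordinate column lists one or more start-end ranges per hit, separated by commas when a hit spans two segments. The sketch below parses such a string and sums the covered bases, assuming the coordinates are 1-based and inclusive (the page does not state the convention, so this is an assumption). The helper names `parse_intervals`, `covered_length`, and `fits_on_chromosome` are hypothetical.

```python
def parse_intervals(text):
    """Turn 'start-end, start-end' into a list of (start, end) integer pairs."""
    intervals = []
    for part in text.split(","):
        start, end = part.strip().split("-")
        intervals.append((int(start), int(end)))
    return intervals


def covered_length(intervals):
    """Total bases covered, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in intervals)


def fits_on_chromosome(intervals, chromosome_size):
    """Check that every interval lies within 1..chromosome_size and is ordered."""
    return all(1 <= start <= end <= chromosome_size for start, end in intervals)


# Two rows from the summary table, both on GWHBDNR00000024 (37306403 bp).
duf247_i_s = parse_intervals("19955457-19957121")
hps10_s = parse_intervals("19481296-19481399, 19481549-19481687")

duf247_i_s_length = covered_length(duf247_i_s)
hps10_s_length = covered_length(hps10_s)
```

Under the inclusive convention, DUF247I-S covers 1,665 bp and the two HPS10-S segments together cover 243 bp; a half-open convention would give one base less per segment.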

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences