Analysis Name | Oryza coarctata Oco Assembly & Annotation |
Sequencing technology | PacBio |
Assembly method | hifiasm v0.12 |
Release Date | 2023-08-15 |
Zhao H, Wang W, Yang Y, Wang Z, Sun J, Yuan K, Rabbi SMHA, Khanam M, Kabir MS, Seraj ZI, Rahman MS, Zhang Z. A high-quality chromosome-level wild rice genome of Oryza coarctata. Sci Data. 2023 Oct 14;10(1):701. doi: 10.1038/s41597-023-02594-1.
Abstract
Oryza coarctata (2n = 4X = 48, KKLL) is an allotetraploid, undomesticated relative of rice and the only species in the genus Oryza with tolerance to high salinity and submergence. Therefore, it contains important stress and tolerance genes/factors for rice. The initial draft genome published was limited by data and technical restrictions, leading to an incomplete and highly fragmented assembly. This study reports a new, highly contiguous chromosome-level genome assembly and annotation of O. coarctata. PacBio high-quality HiFi reads generated 460 contigs with a total length of 573.4 Mb and an N50 of 23.1 Mb, which were assembled into scaffolds with Hi-C data, anchoring 96.99% of the assembly onto 24 chromosomes. The genome assembly comprises 45,571 genes, and repetitive content contributes 25.5% of the genome. This study provides the novel identification of the KK and LL genome types of the genus Oryza, leading to valuable insights into rice genome evolution. The chromosome-level genome assembly of O. coarctata is a valuable resource for rice research and molecular breeding.
Assembly statistics
Genome size (bp) | 573,362,877 |
GC content | 42.06% |
Chromosomes sequence No. | 24 |
Genome sequence No. | 450 |
Maximum genome sequence length (bp) | 37,520,647 |
Minimum genome sequence length (bp) | 16,801 |
Average genome sequence length (bp) | 1,274,140 |
Genome sequence N50 (bp) | 23,112,565 |
Genome sequence N90 (bp) | 16,161,634 |
Assembly level | Chromosome |
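The N50, N90 and average values above follow the usual definitions: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least 50% of the assembly. The sketch below illustrates that definition on a made-up list of lengths (`toy_lengths` and the helper `nx` are hypothetical names, not part of the release) and checks that the reported average is consistent with genome size divided by the number of sequences. It is a minimal illustration, not the pipeline used to produce the table.

```python
def nx(lengths, percent):
    """Return the Nx length: the shortest sequence in the smallest set of
    longest sequences whose summed length reaches `percent` % of the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point edge cases at the threshold.
        if running * 100 >= total * percent:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy lengths (total = 300), not taken from the assembly.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = nx(toy_lengths, 50)
toy_n90 = nx(toy_lengths, 90)

# Consistency check using the figures in the table above.
genome_size_bp = 573_362_877
sequence_count = 450
average_length_bp = round(genome_size_bp / sequence_count)
```

Applied to the real per-sequence lengths of the assembly, `nx(lengths, 50)` and `nx(lengths, 90)` should correspond to the N50 and N90 rows, assuming the table was computed with the same definition.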
The Oryza coarctata Oco Assembly file is available in FASTA format.
Downloads
Chromosomes (FASTA file) | GWHCBHR00000000.genome.fasta.gz |
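For readers who want to recompute sequence lengths or GC content from the downloaded file, the following is a minimal sketch using only the Python standard library. The helper names `fasta_stats` and `summarize_gz` are hypothetical, and the GC calculation assumes that every base (including any N) counts toward the denominator, which may differ from how the GC content in the statistics table was derived.

```python
import gzip


def fasta_stats(lines):
    """Yield (name, length, gc_count) for each record in an iterable of FASTA lines."""
    name, length, gc = None, 0, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length, gc
            # Keep only the identifier before the first whitespace.
            name = (line[1:].split() or [""])[0]
            length, gc = 0, 0
        else:
            seq = line.upper()
            length += len(seq)
            gc += seq.count("G") + seq.count("C")
    if name is not None:
        yield name, length, gc


def summarize_gz(path):
    """Return (sequence count, total length, GC percent) for a gzipped FASTA file."""
    with gzip.open(path, "rt") as handle:
        records = list(fasta_stats(handle))
    total = sum(length for _, length, _ in records)
    gc = sum(g for _, _, g in records)
    return len(records), total, 100.0 * gc / total


# Usage, assuming the file has been downloaded to the working directory:
# count, total_bp, gc_percent = summarize_gz("GWHCBHR00000000.genome.fasta.gz")
```

Reading the file line by line keeps memory use low even for chromosome-sized records, since no full sequence is held in memory.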
The Oryza coarctata Oco genome gene prediction files are available in GFF3 and FASTA format.
Downloads
Genes (GFF3 file) | GWHCBHR00000000.gff.gz |
CDS sequences (FASTA file) | GWHCBHR00000000.RNA.fasta.gz |
Protein sequences (FASTA file) | GWHCBHR00000000.Protein.faa.gz |
Functional annotation for the Oryza coarctata Oco genome is available for download below. The predicted proteins were analyzed with InterProScan to assign InterPro domains (Pfam).
Downloads
Domain from InterProScan | Oryza_coarctata.Pfam.tsv.gz |
Summary
Query | Chromosome | Chromosome size (bp) | Coordinates | tBLASTn hit | tBLASTn % identity | Domain |
DUF247I-S1Ψ | GWHCBHR00000009 | 23765581 | 4727597-4728520 | Olongistaminata | 80 | DUF247 |
DUF247I-S2Ψ | GWHCBHR00000010 | 23414180 | 5599897-5599947 | Olongistaminata | 64 | DUF247 |
HPS10-S1 | GWHCBHR00000009 | 23765581 | 4723566-4723707,4723782-4723888 | LpsS_chromosome1 | 42 | - |
HPS10-S2 | GWHCBHR00000010 | 23414180 | 5602105-5602229,5602322-5602394 | Olongistaminata | 47 | - |
DUF247I-Z | GWHCBHR00000007 | 23112565 | 20674591-20676168 | LpZDUF247-I_chromosome2 | 61 | DUF247 |
DUF247II-Z | GWHCBHR00000007 | 23112565 | 20688482-20690182 | Scereale | 67 | DUF247 |
HPS10-Z1 | GWHCBHR00000007 | 23112565 | 20681595-20681757,20681830-20681882 | Telongatum | 38 | - |
HPS10-Z2 | GWHCBHR00000008 | 22654099 | 20165556-20165665,20165760-20165922 | LpsZ_chromosome2 | 40 | - |
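The Coordinates column lists one or more comma-separated `start-end` segments per hit. The sketch below parses such a field and sums the covered length. The helper names are hypothetical, and the page does not state the coordinate convention: the code assumes 1-based inclusive coordinates. Under that assumption the DUF247I-Z, DUF247II-Z and HPS10-S1 hits span 1578, 1701 and 249 bp, all multiples of three, which is at least consistent with the assumption.

```python
def parse_coordinates(field):
    """Split 'start-end,start-end' into a list of (start, end) integer tuples."""
    segments = []
    for part in field.split(","):
        start, end = part.strip().split("-")
        segments.append((int(start), int(end)))
    return segments


def covered_length(segments):
    """Total bases covered, assuming 1-based inclusive coordinates
    (an assumption; the table does not state the convention)."""
    return sum(end - start + 1 for start, end in segments)


# Values copied from the Summary table.
hps10_s1 = parse_coordinates("4723566-4723707,4723782-4723888")
duf247i_z = parse_coordinates("20674591-20676168")

hps10_s1_bp = covered_length(hps10_s1)
duf247i_z_bp = covered_length(duf247i_z)
```

The parsed tuples can be used directly to slice the corresponding chromosome sequence, after converting to the 0-based half-open indexing that Python uses (`seq[start - 1:end]`).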