Solanum neorickii 'ZY58 (cultivar)' ASM2770486v1 Assembly & Annotation

Overview

Analysis Name Solanum neorickii 'ZY58 (cultivar)' ASM2770486v1 Assembly & Annotation
Sequencing technology PacBio
Assembly method Canu v. 1.5
Release Date 2023-01-11
Reference Publication(s)

Li N, He Q, Wang J, Wang B, Zhao J, Huang S, Yang T, Tang Y, Yang S, Aisimutuola P, Xu R, Hu J, Jia C, Ma K, Li Z, Jiang F, Gao J, Lan H, Zhou Y, Zhang X, Huang S, Fei Z, Wang H, Li H, Yu Q. Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species. Nat Genet. 2023 May;55(5):852-860. doi: 10.1038/s41588-023-01340-y.

Abstract

Effective utilization of wild relatives is key to overcoming challenges in genetic improvement of cultivated tomato, which has a narrow genetic basis; however, current efforts to decipher high-quality genomes for tomato wild species are insufficient. Here, we report chromosome-scale tomato genomes from nine wild species and two cultivated accessions, representative of Solanum section Lycopersicon, the tomato clade. Together with two previously released genomes, we elucidate the phylogeny of Lycopersicon and construct a section-wide gene repertoire. We reveal the landscape of structural variants and provide entry to the genomic diversity among tomato wild relatives, enabling the discovery of a wild tomato gene with the potential to increase yields of modern cultivated tomatoes. Construction of a graph-based genome enables structural-variant-based genome-wide association studies, identifying numerous signals associated with tomato flavor-related traits and fruit metabolites. The tomato super-pangenome resources will expedite biological studies and breeding of this globally important crop.

Assembly statistics

Genome size 777.9 Mb
Number of scaffolds 374
Scaffold N50 60.3 Mb
Scaffold L50 6
Number of contigs 815
Contig N50 2.1 Mb
Contig L50 110
Assembly level Scaffold
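
For readers unfamiliar with the N50 and L50 figures above, the following is a minimal sketch of how such values are conventionally derived from a list of sequence lengths. The function name `n50_l50` and the toy lengths are illustrative assumptions; this is not the pipeline used to produce the statistics for this assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths.

    N50 is the length of the sequence at which the running total of
    lengths, sorted from longest to shortest, first reaches half of the
    total. L50 is the number of sequences needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy lengths for illustration only; not taken from this assembly.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50_l50(toy_lengths))  # (70, 2)
```

Read this way, a scaffold L50 of 6 means that the six longest scaffolds together cover at least half of the assembled sequence.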

Assembly

The Solanum neorickii 'ZY58 (cultivar)' ASM2770486v1 assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) S.neorickii.genomic.fa.gz
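
Once downloaded, the gzip-compressed FASTA file can be read with the Python standard library alone. The sketch below collects the length of each sequence; the helper name `fasta_lengths` is hypothetical, and the code assumes a conventional FASTA layout (header lines starting with `>`, sequence on the following lines), which has not been verified against this particular file.

```python
import gzip


def fasta_lengths(path):
    """Map each sequence ID in a gzip-compressed FASTA file to its length."""
    lengths = {}
    current = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the sequence ID.
                current = line[1:].split()[0]
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths


# Example call, assuming the file has been downloaded to the working directory:
# lengths = fasta_lengths("S.neorickii.genomic.fa.gz")
```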

Gene Predictions

Gene prediction files for the Solanum neorickii 'ZY58 (cultivar)' ASM2770486v1 assembly are available in FASTA format.

Downloads

Genes (GFF3 file) -
CDS sequences (FASTA file) S.neorickii.cds.fa.gz
Protein sequences (FASTA file) S.neorickii.pep.fa.gz

Functional Analysis

Functional annotation for the Solanum neorickii 'ZY58 (cultivar)' ASM2770486v1 assembly is available for download below. The proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Solanum_neorickii_ZY58_ASM2770486v1.Pfam.tsv.gz

S genes

Summary

Query | Chr | Size (bp) | Coordinates | BLASTn Hit | BLASTn %ID | Domain
SLF15 | Sly01 | 96012830 | 1630168-1628915 | Solanum lycopersicum SL2.31, SLF15 | 98.1 | F-box domain
SLF16 | Sly01 | 96012830 | 1798582-1797401 | Solanum lycopersicum SL2.31, SLF16 | 99.3 | F-box domain
SLF22 | Sly01 | 96012830 | 44727825-44726692 | Solanum peruvianum KU987616.1, SLF22 | 99.6 | F-box domain
S-RNase | Sly01 | 96012830 | 47155048-47155281, 47155362-47155772 | Solanum neorickii MG266264.1, SRNase | 100.0 | Ribonuclease T2 family
SLF21 | Sly01 | 96012830 | 51214535-51215836 | Solanum peruvianum KU960914.1, SLF21 | 98.9 | F-box domain
SLF20Ψ | Sly01 | 96012830 | 51847016-51848180 | Solanum peruvianum KU960913.1, SLF20 | 98.2 | -
SLF7Ψ | Sly01 | 96012830 | 51936597-51937737 | Solanum lycopersicum SL2.31, SLF7 | 88.9 | -
SLF7-2 | Sly01 | 96012830 | 51955146-51956318 | Solanum peruvianum KJ814851.1, SLF7 | 98.9 | F-box domain
S-RNase-2 | Sly01 | 96012830 | 52855155-52855364, 52855480-52855905 | Solanum peruvianum Z26583.1, SpS6-RNase | 99.1 | Ribonuclease T2 family
SLF23 | Sly01 | 96012830 | 53838360-53839517 | Solanum neorickii MG266242.1, SLF23 | 99.7 | F-box domain
SLF17Ψ | Sly01 | 96012830 | 53947452-53948633 | Solanum peruvianum KU987615.1, SLF17 | 96.8 | -
SLF9Ψ | Sly01 | 96012830 | 54939128-54937986 | Solanum pimpinellifolium KJ814875.1, SLF9 | 98.3 | -
SLF12 | Sly01 | 96012830 | 58171405-58170242 | Solanum pimpinellifolium KJ814878.1, SLF12 | 99.3 | F-box domain
SLF13 | Sly01 | 96012830 | 59070480-59069278 | Solanum pimpinellifolium KJ814879.1, SLF13 | 98.9 | F-box domain
SLF14Ψ | Sly01 | 96012830 | 62222481-62221312 | Solanum lycopersicum KJ814903.1, SLF14 | 98.0 | -
SLF18 | Sly01 | 96012830 | 72785417-72786532 | Solanum lycopersicum SL2.31, SLF18 | 98.9 | F-box domain
SLF19 | Sly01 | 96012830 | 72804295-72803186 | Solanum lycopersicum SL2.31, SLF19 | 98.9 | F-box domain
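
The Coordinates values in the summary can be turned into segment lists and lengths with a few lines of Python. The sketch below is illustrative only: the helper name `parse_coordinates` is hypothetical, and reading a start greater than its end as a reverse-strand feature is an assumption, since the page does not state its strand convention. Coordinates are treated as 1-based and inclusive, which is also an assumption.

```python
def parse_coordinates(text):
    """Parse 'a-b' or 'a-b, c-d' into (segments, strand, total_length).

    Assumptions (not stated by the source page): positions are 1-based and
    inclusive, and start > end marks a feature on the reverse strand.
    """
    segments = []
    for part in text.split(","):
        start, end = (int(value) for value in part.strip().split("-"))
        segments.append((start, end))
    # Strand is inferred from the first segment only.
    strand = "-" if segments[0][0] > segments[0][1] else "+"
    total = sum(abs(end - start) + 1 for start, end in segments)
    return segments, strand, total


# SLF15: a single segment listed with start > end.
print(parse_coordinates("1630168-1628915"))
# ([(1630168, 1628915)], '-', 1254)

# S-RNase: two segments on the same chromosome.
print(parse_coordinates("47155048-47155281, 47155362-47155772"))
# ([(47155048, 47155281), (47155362, 47155772)], '+', 645)
```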

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences