Solanum lycopersicum 'M82 (cultivar)' ASM2770488v1 Assembly & Annotation

Overview

Analysis Name: Solanum lycopersicum 'M82 (cultivar)' ASM2770488v1 Assembly & Annotation
Sequencing technology: PacBio
Assembly method: Canu v. 1.5
Release Date: 2023-01-11
Reference Publication(s):

Li N, He Q, Wang J, Wang B, Zhao J, Huang S, Yang T, Tang Y, Yang S, Aisimutuola P, Xu R, Hu J, Jia C, Ma K, Li Z, Jiang F, Gao J, Lan H, Zhou Y, Zhang X, Huang S, Fei Z, Wang H, Li H, Yu Q. Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species. Nat Genet. 2023 May;55(5):852-860. doi: 10.1038/s41588-023-01340-y.

Abstract

Effective utilization of wild relatives is key to overcoming challenges in genetic improvement of cultivated tomato, which has a narrow genetic basis; however, current efforts to decipher high-quality genomes for tomato wild species are insufficient. Here, we report chromosome-scale tomato genomes from nine wild species and two cultivated accessions, representative of Solanum section Lycopersicon, the tomato clade. Together with two previously released genomes, we elucidate the phylogeny of Lycopersicon and construct a section-wide gene repertoire. We reveal the landscape of structural variants and provide entry to the genomic diversity among tomato wild relatives, enabling the discovery of a wild tomato gene with the potential to increase yields of modern cultivated tomatoes. Construction of a graph-based genome enables structural-variant-based genome-wide association studies, identifying numerous signals associated with tomato flavor-related traits and fruit metabolites. The tomato super-pangenome resources will expedite biological studies and breeding of this globally important crop.

Assembly statistics

Genome size: 880.3 Mb
Number of scaffolds: 5,156
Scaffold N50: 54.6 Mb
Scaffold L50: 8
Number of contigs: 6,409
Contig N50: 600 kb
Contig L50: 402
Assembly level: Scaffold
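
The N50 and L50 figures above can be reproduced from a list of sequence lengths. The sketch below uses the common definition (N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length; L50 is the number of sequences in that set). The function name `n50_l50` and the toy lengths are illustrative assumptions; the page does not state which tool produced its statistics.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths."""
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half = sum(ordered) / 2                  # half of the total assembly length
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is the N50; 'count' sequences were needed, so it is the L50
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy lengths (not from this assembly): total 300, half 150
toy = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy))  # (70, 2)
```

Applied to scaffold lengths this would give the scaffold N50/L50; applied to contig lengths, the contig values.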

Assembly

The Solanum lycopersicum 'M82 (cultivar)' ASM2770488v1 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file): GCA_027704885.1_ASM2770488v1_genomic.fna.gz
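
Once the assembly FASTA has been downloaded, per-sequence lengths can be read with the Python standard library alone. This is a minimal sketch, assuming the file sits in the working directory under the name listed above; the helper name `fasta_lengths` is hypothetical.

```python
import gzip


def fasta_lengths(path):
    """Map each record ID in a FASTA file (plain or .gz) to its sequence length."""
    opener = gzip.open if str(path).endswith(".gz") else open
    lengths = {}
    current = None
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # Record ID is the first whitespace-delimited token of the header
                tokens = line[1:].split()
                current = tokens[0] if tokens else ""
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths
```

For example, `fasta_lengths("GCA_027704885.1_ASM2770488v1_genomic.fna.gz")` would return a dictionary keyed by sequence accession, and `sum(...)` over its values would give the total assembly length.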

Gene Predictions

The Solanum lycopersicum 'M82 (cultivar)' ASM2770488v1 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file): -
CDS sequences (FASTA file): S.lycopersicum.M82.cds.fa.gz
Protein sequences (FASTA file): S.lycopersicum.M82.pep.fa.gz

Functional Analysis

Functional annotation for the Solanum lycopersicum 'M82 (cultivar)' ASM2770488v1 assembly is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan: -
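
For readers who run InterProScan themselves on the protein FASTA, the Pfam assignments can be pulled out of its tab-separated output. The sketch below assumes the standard InterProScan TSV column order (protein accession, MD5, length, analysis, signature accession, signature description, start, stop, ...). The function name `pfam_domains`, the protein ID `prot1`, and the sample rows are made up for illustration and are not taken from this annotation.

```python
import csv
import io
from collections import defaultdict


def pfam_domains(tsv_handle):
    """Collect Pfam hits per protein from InterProScan TSV output."""
    hits = defaultdict(list)
    # QUOTE_NONE: treat quote characters in descriptions as ordinary text
    reader = csv.reader(tsv_handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Column 4 (index 3) names the analysis; keep only Pfam rows
        if len(row) < 8 or row[3] != "Pfam":
            continue
        protein, accession, description = row[0], row[4], row[5]
        start, stop = int(row[6]), int(row[7])
        hits[protein].append((accession, description, start, stop))
    return dict(hits)


# Illustrative rows only (hypothetical protein and coordinates)
sample = (
    "prot1\tmd5\t390\tPfam\tPF00646\tF-box domain\t12\t55\t1.2E-8\tT\t01-01-2023\tIPR001810\tF-box domain\n"
    "prot1\tmd5\t390\tGene3D\tG3DSA:1.20.1280.50\t-\t5\t60\t3.0E-5\tT\t01-01-2023\t-\t-\n"
)
print(pfam_domains(io.StringIO(sample)))
```

Only the Pfam row is kept; rows from other member databases are skipped.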

S genes

Summary

Query | Chr | Size (bp) | Coordinates | BLASTn Hit | BLASTn %ID | Domain
SLF15 | JALGYV010005145.1 | 81465427 | 2239093-2237834 | SL2.31ch01:2198500-2196501_SLF15 | 100 | F-box domain
SLF16 | JALGYV010005145.1 | 81465427 | 2763360-2762179 | SL2.31ch01:2723400-2721301_SLF16 | 100 | F-box domain
SLF17Ψ | JALGYV010005145.1 | 81465427 | 35184227-35183142 | SL2.31ch01:40853100-40851001_SLF17Ψ | 100 | -
SLF1 | JALGYV010005145.1 | 81465427 | 37706188-37707357 | NM_001301439.2, SLF1 | 100 | F-box domain
S-RNase | JALGYV010005145.1 | 81465427 | 38318180-38317941, 38317843-38317418 | XM_004229015.1, Ribonuclease S-3 | 100 | Ribonuclease T2 family
SLF2Ψ | JALGYV010005145.1 | 81465427 | 38787133-38785952 | KJ814870.1, SLF2 | 100 | -
SLF12Ψ | JALGYV010005145.1 | 81465427 | 38843710-38844841 | SL2.31ch01:45516501-45518600_SLF12Ψ | 100 | -
SLF4Ψ | JALGYV010005145.1 | 81465427 | 38911008-38909842 | KJ814943.1, SLF4 | 100 | -
SLF5Ψ | JALGYV010005145.1 | 81465427 | 38991724-38990556 | KJ814872.1, SLF5 | 100 | -
SLF6Ψ | JALGYV010005145.1 | 81465427 | 39009259-39008114 | KJ814944.1, SLF6 | 100 | -
SLF8Ψ | JALGYV010005145.1 | 81465427 | 39566604-39565436 | SL2.31ch01:46243000-46240701_SLF8Ψ | 100 | -
SLF7Ψ | JALGYV010005145.1 | 81465427 | 39591376-39590279 | SL2.31ch01:46267800-46265701_SLF7Ψ | 100 | -
SLF9 | JALGYV010005145.1 | 81465427 | 41756801-41755737 | NM_001329461.2, SLF9 | 100 | F-box domain
SLF10Ψ | JALGYV010005145.1 | 81465427 | 42199796-42201027 | KJ814899.1, SLF10 | 100 | -
SLF11 | JALGYV010005145.1 | 81465427 | 44144229-44145401 | KJ814877.1, SLF11 | 100 | F-box associated
SLF12 | JALGYV010005145.1 | 81465427 | 45897944-45896781 | NM_001301441.1, SLF12 | 100 | F-box associated
SLF13 | JALGYV010005145.1 | 81465427 | 46621357-46620164 | NM_001301435.1, SLF13 | 100 | F-box associated
SLF14Ψ | JALGYV010005145.1 | 81465427 | 49503017-49501847 | KJ814903.1, SLF14 | 100 | -
SLF18 | JALGYV010005145.1 | 81465427 | 59494816-59495931 | SL2.31ch01:67739501-67741500_SLF18 | 100 | F-box domain
SLF19 | JALGYV010005145.1 | 81465427 | 59513833-59512724 | SL2.31ch01:67757501-67759600_SLF19 | 100 | F-box domain
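
The Coordinates values in the summary can be turned into intervals programmatically. The sketch below rests on two assumptions that the page does not state: coordinates are 1-based and inclusive, and a first value larger than the second means the feature lies on the reverse strand. Comma-separated entries (as for S-RNase) are treated as separate segments. The function name `parse_segments` is hypothetical.

```python
def parse_segments(coord_field):
    """Split a Coordinates value into (start, end, strand, length) tuples."""
    segments = []
    for part in coord_field.split(","):
        first, second = (int(x) for x in part.strip().split("-"))
        # Assumption: descending coordinates indicate the reverse strand
        strand = "+" if first <= second else "-"
        start, end = min(first, second), max(first, second)
        # Assumption: 1-based inclusive coordinates, hence the +1
        segments.append((start, end, strand, end - start + 1))
    return segments


print(parse_segments("2239093-2237834"))
# [(2237834, 2239093, '-', 1260)]
print(parse_segments("38318180-38317941, 38317843-38317418"))
# [(38317941, 38318180, '-', 240), (38317418, 38317843, '-', 426)]
```

Under these assumptions, the SLF15 entry spans 1,260 bp and the two S-RNase segments total 666 bp.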

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences