Prunus armeniaca Genome_v1.0 Assembly & Annotation

Overview

Analysis Name: Prunus armeniaca Genome_v1.0 Assembly & Annotation
Sequencing technology: PacBio
Assembly method: Canu
Release Date: 2019-11-20
Reference Publication(s):

Jiang F, Zhang J, Wang S, Yang L, Luo Y, Gao S, Zhang M, Wu S, Hu S, Sun H, Wang Y. The apricot (Prunus armeniaca L.) genome elucidates Rosaceae evolution and beta-carotenoid synthesis. Hortic Res. 2019 Nov 18;6:128. doi: 10.1038/s41438-019-0215-6.

Abstract

Apricots, scientifically known as Prunus armeniaca L., are drupes that resemble and are closely related to peaches or plums. As one of the top consumed fruits, apricots are widely grown worldwide except in Antarctica. A high-quality reference genome for apricot is still unavailable, which has become a handicap that has dramatically limited the elucidation of the associations of phenotypes with the genetic background, evolutionary diversity, and population diversity in apricot. DNA from P. armeniaca was used to generate a standard, size-selected library with an average DNA fragment size of ~20 kb. The library was run on Sequel SMRT Cells, generating a total of 16.54 Gb of PacBio subreads (N50 = 13.55 kb). The high-quality P. armeniaca reference genome presented here was assembled using long-read single-molecule sequencing at approximately 70× coverage and 171× Illumina reads (40.46 Gb), combined with a genetic map for chromosome scaffolding. The assembled genome size was 221.9 Mb, with a contig NG50 size of 1.02 Mb. Scaffolds covering 92.88% of the assembled genome were anchored on eight chromosomes. Benchmarking Universal Single-Copy Orthologs analysis showed 98.0% complete genes. We predicted 30,436 protein-coding genes, and 38.28% of the genome was predicted to be repetitive. We found 981 contracted gene families, 1324 expanded gene families and 2300 apricot-specific genes. The differentially expressed gene (DEG) analysis indicated that a change in the expression of the 9-cis-epoxycarotenoid dioxygenase (NCED) gene but not lycopene beta-cyclase (LcyB) gene results in a low β-carotenoid content in the white cultivar "Dabaixing". This complete and highly contiguous P. armeniaca reference genome will be of help for future studies of resistance to plum pox virus (PPV) and the identification and characterization of important agronomic genes and breeding strategies in apricot.

Assembly statistics

Statistic            Assembly       Pseudomolecules
Size (bp)            221,901,797    206,096,285
Number               444            8
NG50 (bp)            1,020,063      25,125,992
N50 (bp)             1,018,044      25,125,992
GC content (%)       37.6           37.42
Maximum size (bp)    5,999,228      42,984,470
Minimum size (bp)    1,159          18,857,615
Mean size (bp)       499,724        25,762,035
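
For readers who want to recompute contiguity figures such as N50 and NG50, the following is a minimal sketch using the usual definitions: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly, and NG50 applies the same rule against an estimated genome size instead of the assembly total. The function name `nx50` and the toy lengths are illustrative assumptions, and the genome size estimate behind the NG50 values reported here is not stated on this page. The last lines check that the pseudomolecule size is about 92.88% of the assembly size, in line with the anchored fraction given in the abstract.

```python
def nx50(lengths, total=None):
    """Return the N50-style length for `lengths`.

    With total=None the denominator is the sum of the lengths (N50).
    Passing an estimated genome size as `total` gives NG50 instead.
    Returns None if the sequences cover less than half of `total`.
    """
    if total is None:
        total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return None


# Toy example (hypothetical contig lengths, not taken from the assembly).
toy_lengths = [50, 40, 30, 20, 10]
print("N50:", nx50(toy_lengths))
print("NG50 for an assumed genome size of 200:", nx50(toy_lengths, total=200))

# Anchored fraction from the two "Size (bp)" values reported for this assembly.
assembly_bp = 221_901_797
pseudomolecule_bp = 206_096_285
anchored_percent = round(100 * pseudomolecule_bp / assembly_bp, 2)
print("Anchored (%):", anchored_percent)
```

To apply this to the real assembly, the list of sequence lengths would have to be extracted from the FASTA file first.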

Assembly

The Prunus armeniaca Genome_v1.0 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) apricot.genome.fa.gz
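
After downloading, the gzip-compressed FASTA file can be summarized without any third-party library. The sketch below streams the file and reports the number of sequences, the total length, and the GC percentage. The function names `fasta_stats` and `summarize` are illustrative, the local path is assumed to match the file name listed above, and the GC percentage here is computed over all characters (including any N), which may differ from how the values in the statistics table were derived.

```python
import gzip


def fasta_stats(lines):
    """Yield (name, length, gc_count) for each record in an iterable of FASTA lines."""
    name, length, gc = None, 0, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length, gc
            # Keep only the first whitespace-delimited token of the header.
            name = (line[1:].split() or [""])[0]
            length, gc = 0, 0
        else:
            seq = line.upper()
            length += len(seq)
            gc += seq.count("G") + seq.count("C")
    if name is not None:
        yield name, length, gc


def summarize(path="apricot.genome.fa.gz"):
    """Summarize a gzip-compressed FASTA file (path is an assumed local copy)."""
    with gzip.open(path, "rt") as handle:
        records = list(fasta_stats(handle))
    total = sum(length for _, length, _ in records)
    gc = sum(count for _, _, count in records)
    return {
        "sequences": len(records),
        "total_bp": total,
        "gc_percent": 100 * gc / total if total else 0.0,
    }
```

Calling `summarize()` in the directory that holds the downloaded file returns a small dictionary that can be compared against the statistics table.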

Gene Predictions

The Prunus armeniaca Genome_v1.0 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) apricot.gff3.gz
CDS sequences (FASTA file) apricot.cds.fa.gz
Protein sequences (FASTA file) apricot.prot.fa.gz
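
A quick sanity check on the annotation is to count features by type in the GFF3 file. The sketch below assumes the standard nine-column, tab-separated GFF3 layout with the feature type in the third column. Whether this particular file labels its gene models as `gene` is an assumption, and the function names and the local path are illustrative.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 features by type (third column) from an iterable of lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section, if present, ends the features
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # ignore lines that do not follow the nine-column layout
        counts[fields[2]] += 1
    return counts


def count_from_file(path="apricot.gff3.gz"):
    """Count feature types in a gzip-compressed GFF3 file (assumed local copy)."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)
```

If the file uses the `gene` type, `count_from_file()["gene"]` can be compared with the number of predicted protein-coding genes reported in the abstract.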

S genes

Summary

Query        Chr   Size (bp)   Coordinates                                              Domain
ParSLF7.2    LG1   24250751    20865952-20867421                                        F-box; F_box_assoc
ParSLF7.1    LG1   24250751    21200783-21202279                                        F-box
ParSLF4      LG1   24250751    21646081-21644714                                        F-box; F_box_assoc
ParSLF9ψ     LG1   24250751    21707975-21709463                                        -
ParSLF2      LG1   24250751    21719117-21720445                                        F-box; F_box_assoc
ParSFB       LG1   24250751    21748849-21749970                                        F-box; F_box_assoc
ParSLF1      LG1   24250751    21763367-21762138                                        F-box; F_box_assoc
ParSLF3      LG1   24250751    21782779-21784002                                        F-box; F_box_assoc
ParSLF5      LG1   24250751    21827408-21826230                                        F-box; F_box_assoc
ParSLF8ψ     LG1   24250751    22043268-22042381                                        -
ParSLF6      LG1   24250751    23196566-23195367                                        F-box; F_box_assoc
ParSLF10ψ    LG1   24250751    23684343-23685960                                        -
S-RNase      LG1   24250751    21754247-21754172,21754003-21753819,21753650-21753216    RNase_T2
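
The coordinate strings in the summary can be turned into intervals programmatically. The sketch below rests on two assumptions that this page does not state: that a pair written with the larger number first (for example 21646081-21644714) marks a feature on the reverse strand, and that coordinates are 1-based and inclusive. Comma-separated pairs, as in the S-RNase row, are treated as separate segments. The function names are illustrative.

```python
def parse_coordinates(text):
    """Parse 'a-b' or 'a-b,c-d,...' into (strand, [(start, end), ...]).

    Each returned segment has start <= end. The strand is '+' when pairs are
    written low-high and '-' when written high-low (an assumed convention).
    """
    segments = []
    orientations = set()
    for part in text.split(","):
        first, second = (int(value) for value in part.strip().split("-"))
        orientations.add("+" if first <= second else "-")
        segments.append((min(first, second), max(first, second)))
    if len(orientations) != 1:
        raise ValueError(f"mixed orientations in {text!r}")
    return orientations.pop(), segments


def spanned_length(segments):
    """Total length of all segments, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in segments)


# Values copied from the S-RNase and ParSLF7.2 rows of the summary.
s_rnase_strand, s_rnase_segments = parse_coordinates(
    "21754247-21754172,21754003-21753819,21753650-21753216"
)
slf_strand, slf_segments = parse_coordinates("20865952-20867421")
print(s_rnase_strand, spanned_length(s_rnase_segments))
print(slf_strand, spanned_length(slf_segments))
```

Under these assumptions the three S-RNase segments sum to 696 bp on the reverse strand, and ParSLF7.2 spans 1470 bp on the forward strand.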

Prunus armeniaca Genome_v1.0 S genes Nucleotide

Prunus armeniaca Genome_v1.0 S genes Protein

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences