Info

NCBI taxon id: 1594315
Order: Lepidoptera
Family: Tortricidae
NCBI lineage: Eukaryota;Metazoa;Ecdysozoa;Arthropoda;Hexapoda;Insecta;Pterygota;Neoptera;Endopterygota;Lepidoptera;Glossata;Ditrysia;Tortricoidea;Tortricidae;Olethreutinae;Eucosmini;Notocelia;
GoaT genome size (M): 600 (ancestor)
GoaT asm span (M): 794 (direct)
GoaT chr no.: 28 (ancestor)
ToLID prefix: ilNotUddm

Specimens

Below is information about specimens collected for this species retrieved from the Golden Record manifest.

public_name specimen_id gal collector_affiliation date_of_collection sex organism_part biosample biospecimen lifestage symbiont family order_or_group genus taxon_id scientific_name common_name tube_id

Spectra estimates

Below are estimates of genome size, repeat content, and heterozygosity based on k-mer spectra analysis with GenomeScope.

source specimen k-mer k-cov haploid size repeat (%) heterozygosity (%) model fit (%) model error (%) histogram

Sequence data


PacBio run data

Below are statistics for each PacBio sequencing run collected for this species.

pipeline specimen sample date instrument run id movie well movie length tag tag sequence library load name reads yield N50 A (%) C (%) G (%) T (%) sample accession run accession exp accession study accession species barcode

Illumina run data

Below are statistics for each Illumina sequencing run collected for this species.

pipeline source specimen date run id read pairs yield avg qual avg length sample accession run accession exp accession study accession sample tag sequence tag2 sequence run status npg status species barcode

Cobionts

Below are results from a screen of the PacBio data using Mash screen against RefSeq assemblies. Only results with identity over 90% are displayed.

identity shared-hashes median-multiplicity p-value query info
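
The 90% identity cut-off described above can be sketched as a small filter over Mash screen output. This is a minimal illustration, not the pipeline actually used for this page: it assumes tab-separated rows in the column order shown above, with the last two columns being the query ID and its comment. The function name `filter_mash_screen` and the example rows are hypothetical.

```python
import csv

# Column order assumed to match the table header above.
FIELDS = ["identity", "shared_hashes", "median_multiplicity",
          "p_value", "query", "info"]


def filter_mash_screen(lines, min_identity=0.90):
    """Return rows with identity above min_identity, best hit first."""
    hits = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        # Pad a missing comment column so every record has the same keys.
        row = row + [""] * (len(FIELDS) - len(row))
        record = dict(zip(FIELDS, row))
        record["identity"] = float(record["identity"])
        if record["identity"] > min_identity:
            hits.append(record)
    hits.sort(key=lambda r: r["identity"], reverse=True)
    return hits


# Hypothetical example rows (not real results for this species).
example = [
    "0.95\t400/1000\t3\t1e-50\texample_B.fna.gz\tExample organism B",
    "0.99\t900/1000\t12\t0\texample_A.fna.gz\tExample organism A",
    "0.85\t50/1000\t1\t1e-5\texample_C.fna.gz\tExample organism C",
]
for hit in filter_mash_screen(example):
    print(hit["identity"], hit["query"])
```

Sorting by identity puts the most likely cobionts at the top, which is usually the order a reader wants when scanning such a table.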

Species composition by small subunit (SSU) presence in the assembly.

specimen contig SSU length attributed taxonomy by SSU cluster

Re-assembly of reads classified under each identified SSU Marker family.

specimen family classified reads original assembly re-assembly
count (%) BUSCO BUSCO contigs contig length reads BUSCO contigs contig length additional reads circos

Visualisation of a classification of the PacBio reads using a variational autoencoder on the k-mer counts.

specimen visualisation

Canonical tetranucleotide counts for each contig or scaffold reduced to two dimensions with UMAP to allow visualisation.
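
To make "canonical tetranucleotide counts" concrete, here is a minimal sketch of how such a count vector could be built for one sequence. It assumes that "canonical" means each 4-mer is merged with its reverse complement and represented by the lexicographically smaller of the two, which yields 136 features. The function names are hypothetical, and the actual tool used for this page may differ.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


# All distinct canonical 4-mers, in a fixed order (136 of them).
CANONICAL_4MERS = sorted(
    {canonical("".join(p)) for p in product("ACGT", repeat=4)}
)


def tetranucleotide_counts(seq):
    """Count canonical 4-mers in seq, skipping windows with non-ACGT bases."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    return [counts[k] for k in CANONICAL_4MERS]


vector = tetranucleotide_counts("ACGTACGT")
print(len(vector), sum(vector))
```

One such vector per contig or scaffold, typically normalised by its total, forms the matrix that a dimensionality-reduction method such as UMAP would then project to two dimensions.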

Features (colours represent quantile bins):

  • Hexamer: Estimated coding density (expected to be higher in microbes than in animals).
  • FastK: The median number of times each 60-mer in the sequence occurs across the whole assembly (illustrates repetitiveness).
  • Unique_15mers: Number of unique 15-mers per base pair (illustrates sequence diversity)
  • Is_Connected: Presence of at least one Hi-C connection to another scaffold (absence of connections can indicate contamination)
  • Connections_Base: Number of Hi-C connections per base pair
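
As an illustration of the Unique_15mers feature, the sketch below computes a k-mer diversity value for a single sequence. It assumes that "unique" means distinct k-mers and that the count is divided by the sequence length in base pairs; the page does not specify either detail, and the function name `unique_kmers_per_bp` is hypothetical.

```python
def unique_kmers_per_bp(seq, k=15):
    """Number of distinct k-mers in seq divided by its length in bp."""
    seq = seq.upper()
    if len(seq) < k:
        return 0.0  # too short to contain any k-mer
    distinct = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return len(distinct) / len(seq)


# A homopolymer run contains one distinct 15-mer, so diversity is low.
print(unique_kmers_per_bp("A" * 30))
```

Highly repetitive sequences score close to zero under this definition, while sequences with little internal repetition approach one.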


Assemblies

In-progress assembly QC.

specimen asm date contig N50 contigs scaffold N50 scaffolds length BUSCO BUSCO lineage merqury

Organelles

In-progress organelle results from MitoHiFi2.

specimen asm date length genes frameshifts is circular reference