Cory's shearwater
NCBI taxon id: | 1323832 NCBI; ENA; GoaT |
---|---|
Order: | Procellariiformes |
Family: | Procellariidae |
NCBI lineage: | Eukaryota;Metazoa;Chordata;Craniata;Vertebrata;Euteleostomi;Archelosauria;Archosauria;Dinosauria;Saurischia;Theropoda;Coelurosauria;Aves;Neognathae;Neoaves;Aequornithes;Procellariiformes;Procellariidae;Calonectris; |
GoaT genome size (M): | 1,481 (ancestor) |
GoaT asm span (M): | 1,211 (direct) |
GoaT chr no.: | 80 (ancestor) |
GoaT haploid no.: | 40 (ancestor) |
GoaT ploidy: | 2 (ancestor) |
ToLID prefix: | bCalBor |
Below is information about specimens collected for this species retrieved from the Sample Tracking System (STS).
tolid | specimen_id | gal | sex | organism_part | biosample | biospecimen |
---|---|---|---|---|---|---|
bCalBor1 | SAN0001297 | SANGER INSTITUTE | FEMALE | BLOOD | SAMEA8228677 | SAMEA8228664 |
bCalBor1 | SAN0001297 | SANGER INSTITUTE | FEMALE | BLOOD | SAMEA8228678 | SAMEA8228664 |
bCalBor10 | SAN00002920 | SANGER INSTITUTE | FEMALE | BLOOD | SAMEA114294385 | SAMEA114294359 |
bCalBor11 | SAN00002921 | SANGER INSTITUTE | NOT_COLLECTED | BLOOD | SAMEA114294386 | SAMEA114294360 |
bCalBor11 | SAN00002921 | SANGER INSTITUTE | NOT_COLLECTED | BLOOD | SAMEA114294387 | SAMEA114294360 |
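The ToLIDs in the table above are the prefix bCalBor followed by an individual number, so plain string sorting would place bCalBor10 before bCalBor7. The following is a minimal Python sketch of a numeric sort, assuming every ID is a run of letters followed by a decimal number; the helper name `split_tolid` is hypothetical and not part of STS.

```python
import re

# Assumption: a ToLID is a letter-only prefix followed by a decimal individual number.
TOLID_RE = re.compile(r"^(?P<prefix>[A-Za-z]+)(?P<individual>\d+)$")


def split_tolid(tolid):
    """Split e.g. 'bCalBor10' into ('bCalBor', 10)."""
    match = TOLID_RE.match(tolid)
    if match is None:
        raise ValueError(f"not a recognised ToLID: {tolid!r}")
    return match.group("prefix"), int(match.group("individual"))


# IDs that appear on this page, deliberately out of order.
tolids = ["bCalBor10", "bCalBor1", "bCalBor11", "bCalBor7"]

# Sorting on (prefix, number) orders individuals numerically within a species.
ordered = sorted(tolids, key=split_tolid)
print(ordered)
```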
Below are estimates of genome size, repeat content, and heterozygosity based on k-mer spectra analysis with GenomeScope2.
source | specimen | k-mer | k-cov | haploid size | repeat (%) | heterozygosity (%) | model fit (%) | model error (%) | histogram |
---|---|---|---|---|---|---|---|---|---|
pacbio | bCalBor6 | 31 | 35.88 | 2,256,001 | 87.71 | 4.16 | 84.44 | 8.50 | histogram.txt |
hic-arima2 | bCalBor6 | 31 | 19.07 | 2,420,299,336 | 0.00 | 100.00 | 15.36 | 0.23 | histogram.txt |
illumina | bCalBor12 | 31 | 10.67 | 1,231,675,593 | 10.30 | 0.50 | 99.70 | 0.14 | histogram.txt |
illumina | bCalBor8 | 31 | 8.363 | 1,248,208,393 | 9.90 | 0.35 | 99.85 | 0.14 | histogram.txt |
atac-seq | bCalBor13 | 31 | 21.89 | 1,434,276,370 | 74.14 | 5.58 | 95.61 | 0.55 | histogram.txt |
pacbio | bCalBor7 | 31 | 33.05 | 1,228,509,624 | 9.63 | 0.45 | 96.88 | 0.10 | histogram.txt |
hic-arima2 | bCalBor7 | 31 | 24.75 | 953,894,144 | 42.90 | 19.77 | 97.99 | 0.46 | histogram.txt |
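GenomeScope2 fits a mixture model to the k-mer histogram, and the sketch below is not that model. It only illustrates the rough relation behind the "k-cov" and "haploid size" columns: genome size is approximately the number of genomic k-mers divided by the k-mer coverage at the main peak. The function name, the error cutoff, and the toy histogram are assumptions made for illustration; the layout of the linked histogram.txt files is not shown on this page.

```python
def naive_genome_size(histogram, min_multiplicity=5):
    """Rough genome size from a k-mer histogram.

    histogram: iterable of (multiplicity, count) pairs.
    min_multiplicity: k-mers seen fewer times than this are treated as
    sequencing errors and ignored (an assumed cutoff, not a GenomeScope2 value).
    Returns (estimated_size, peak_multiplicity).
    """
    kept = [(m, c) for m, c in histogram if m >= min_multiplicity]
    if not kept:
        raise ValueError("no histogram bins above the error cutoff")
    # The most frequent multiplicity approximates the k-mer coverage.
    peak_multiplicity = max(kept, key=lambda bin_: bin_[1])[0]
    # Total k-mer occurrences divided by coverage approximates genome length.
    total_kmers = sum(m * c for m, c in kept)
    return total_kmers / peak_multiplicity, peak_multiplicity


# Toy histogram: an error spike at low multiplicity and a peak at 10.
toy = [(1, 1000), (2, 200), (9, 50), (10, 100), (11, 50)]
size, peak = naive_genome_size(toy)
print(size, peak)
```

A single-peak estimate like this ignores heterozygosity and repeats, which is one reason rows with a low model fit in the table should be read with caution.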
Below are stats for each PacBio sequencing run collected for this species.
pipeline | specimen | date | run id | movie | well | tag | yield | N50 | sample accession | run accession | barcode |
---|---|---|---|---|---|---|---|---|---|---|---|
PacBio - HiFi | bCalBor6 | 2021-05-07 | 82086 | m64174e_210507_153628 | A01 | 1021 | 2,551,456,256 | 10,234 | SAMEA8228683 | ERR13071485 | |
PacBio - HiFi | bCalBor7 | 2024-02-19 | TRACTION-RUN-1130 | m84098_240219_121126_s3 | C01 | 2077 | 83,840,993,996 | 13,757 | SAMEA114294382 | ERR13071484 |
Below are stats for each ONT sequencing run collected for this species.
pipeline | specimen | date | run id | flowcell | type | yield | N50 | sample accession | report |
---|---|---|---|---|---|---|---|---|---|
ONT_PromethIon | bCalBor7 | 2025-01-07 | ONTRUN-243 | PBA77664 | bam_pass | 66,412,969,479 | 27,988 | SAMEA114294382 | |
ONT_PromethIon | bCalBor7 | 2025-01-07 | ONTRUN-243 | PBA77664 | fastq_dorado_7.2.13_sup_simplex_normal_pass | 71,124,804,686 | 28,066 | SAMEA114294382 |
Below are stats for each Illumina run collected for this species.
pipeline | specimen | date | run id | read pairs | yield | sample accession | run accession | run status | barcode |
---|---|---|---|---|---|---|---|---|---|
Hi-C - Arima v2 | bCalBor6 | 2021-08-31 | 40520_4#2 | 817,905,092 | 123,503,668,892 | SAMEA8228683 | ERR13093660 | qc complete | Calonectris diomedea (1.00) |
Standard | bCalBor12 | 2024-06-24 | 49053_8#1 | 226,981,458 | 34,274,200,158 | SAMEA114294389 | | qc complete | |
Standard | bCalBor8 | 2024-06-24 | 49053_8#2 | 180,286,058 | 27,223,194,758 | SAMEA114294383 | | qc complete | |
Custom | bCalBor13 | 2024-10-17 | 49689_2#4 | 109,549,706 | 16,542,005,606 | SAMEA114498660 | ERR13866162 | qc complete | Calonectris diomedea (1.00) |
Custom | bCalBor13 | 2024-10-17 | 49689_2#6 | 134,573,100 | 20,320,538,100 | SAMEA114498672 | ERR13866163 | qc complete | Calonectris diomedea (1.00) |
Custom | bCalBor13 | 2024-10-14 | 49682_2#1 | 124,338,232 | 18,775,073,032 | SAMEA114498669 | ERR13866159 | qc complete | Calonectris diomedea (1.00) |
Custom | bCalBor13 | 2024-10-14 | 49682_2#6 | 137,679,616 | 20,789,622,016 | SAMEA114498681 | ERR13866160 | qc complete | Calonectris diomedea (1.00) |
Custom | bCalBor13 | 2024-10-17 | 49689_2#2 | 109,083,234 | 16,471,568,334 | SAMEA114498659 | ERR13866161 | qc complete | Calonectris diomedea (1.00) |
Hi-C - Arima v2 | bCalBor7 | 2024-03-01 | 48526_3-4#4 | 451,211,178 | 68,132,887,878 | SAMEA114294382 | ERR13093661 | qc complete | Calonectris diomedea (1.00) |
RNA PolyA | bCalBor13 | 2025-03-04 | 50171_1#80 | 61,623,964 | 9,305,218,564 | SAMEA114498659 | ERR14792832 | qc complete | Calonectris diomedea (1.00) |
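As a quick sanity check on the table above, the yield can be divided by the "read pairs" count and by an assumed genome size to get a rough depth of coverage. The sketch below copies three rows from the table and uses the GenomeScope2 haploid size for bCalBor12 (1,231,675,593) as the genome size; that choice and the variable names are assumptions, not part of the original pipeline.

```python
# (read pairs, yield in bases) copied from the Illumina table, keyed by run id.
illumina_runs = {
    "40520_4#2": (817_905_092, 123_503_668_892),
    "49053_8#1": (226_981_458, 34_274_200_158),
    "50171_1#80": (61_623_964, 9_305_218_564),
}

# Assumption: GenomeScope2 haploid size estimated from the bCalBor12 Illumina data.
ASSUMED_GENOME_SIZE = 1_231_675_593

# Bases of yield per entry in the "read pairs" column.
bases_per_entry = {
    run: total_bases / pairs for run, (pairs, total_bases) in illumina_runs.items()
}

# Rough depth of coverage: total bases divided by genome size.
coverage = {
    run: total_bases / ASSUMED_GENOME_SIZE
    for run, (_, total_bases) in illumina_runs.items()
}

for run in illumina_runs:
    print(f"{run}: {bases_per_entry[run]:.1f} bases per entry, "
          f"{coverage[run]:.1f}x coverage")
```

For these three rows the ratio is exactly 151 bases per listed entry. The page does not say whether the column counts pairs or single reads, so the coverage figures should be treated as approximate.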
Below are results from a screen of the PacBio data using Mash screen against RefSeq assemblies. Only results with identity over 90% are displayed.
identity | info |
---|---|
0.970654 | [66785 seqs] NW_010509587.1 Phaethon lepturus isolate BGI_N335 unplaced genomic scaffold, ASM68728v1 scaffold910, whole genome shotgun sequence [...] |
0.961137 | [57389 seqs] NW_009185582.1 Fulmarus glacialis isolate BGI_N327 unplaced genomic scaffold, ASM69083v1 Scaffold120, whole genome shotgun sequence [...] |
0.918304 | [124 seqs] NZ_ACNO01000124.1 Rhodococcus erythropolis SK121 contig00004, whole genome shotgun sequence [...] |
0.915049 | NC_018417.1 Candidatus Carsonella ruddii HT isolate Thao2000, complete genome |
0.907083 | NC_001866.1 Avian myelocytomatosis virus, complete genome |
Species composition by small subunit (SSU) presence in the assembly with MarkerScan.
specimen | contig | SSU length | attributed taxonomy by SSU |
---|---|---|---|
bCalBor7 | atg001696l | 1827 | |
MarkerScan cobiont assembly by read separation, based on the families observed above. The separated reads are both aligned to the assembly and independently re-assembled. The quality of these assemblies is assessed by their BUSCO completeness, their span, and the number of reads they encompass.
specimen | family | classified reads count | classified reads (%) | original assembly BUSCO | original assembly BUSCO | original assembly contigs | original assembly contig length | original assembly number of reads | re-assembly BUSCO | re-assembly contigs | re-assembly contig length | re-assembly number of reads |
---|---|---|---|---|---|---|---|---|---|---|---|---|
bCalBor7 | Columbidae | 216 | 0 | C:0.0%[S:0.0%,D:0.0%],F:0.0%,M:100.0%,n:8338 | - | 0.00Mb | C:0.0%[S:0.0%,D:0.0%],F:0.0%,M:100.0%,n:8338 | 3 | 0.08Mb | 790 |
Canonical tetranucleotide counts for each contig or scaffold reduced to two dimensions with UMAP to allow visualisation.
In-progress assembly QC.
specimen | asm | date | contig N50 | contigs | scaffold N50 | scaffolds | length | BUSCO | merqury |
---|---|---|---|---|---|---|---|---|---|
bCalBor7 | hifiasm.purging | 2024-02-22 | 5,339,729 | 501 | | | 1,336,276,138 | C:97.4%[S:96.4%,D:1.0%],F:0.6%,M:2.0%,n:8338 | Q65.5-C99.7(HiFi) |
bCalBor7 | hifiasm | 2024-02-22 | 5,326,180 | 625 | | | 1,359,910,343 | C:97.5%[S:96.4%,D:1.1%],F:0.6%,M:1.9%,n:8338 | Q64.6-C99.8(HiFi) |
bCalBor7 | hifiasm.scaffolding.yahs | 2024-02-22 | 4,864,066 | 644 | 67,375,374 | 239 | 1,336,357,138 | C:97.4%[S:96.5%,D:0.9%],F:0.5%,M:2.1%,n:8338 | Q65.5-C99.7(HiFi) |
bCalBor7 | hifiasm-hic.purging | 2024-02-22 | 4,655,358 | 576 | | | 1,289,655,409 | C:97.4%[S:97.0%,D:0.4%],F:0.6%,M:2.0%,n:8338 | Q66.9-C99.8(HiFi) |
bCalBor7 | hifiasm-hic.scaffolding_hap1.yahs | 2024-02-22 | 4,224,317 | 846 | 86,023,716 | 380 | 1,311,168,604 | C:97.4%[S:97.0%,D:0.4%],F:0.5%,M:2.1%,n:8338 | Q66.4-C99.8(HiFi) |
bCalBor7 | hifiasm-hic.scaffolding_hap2.yahs | 2024-02-22 | 4,103,308 | 784 | 52,332,842 | 318 | 1,266,568,577 | C:94.5%[S:94.1%,D:0.4%],F:0.7%,M:4.8%,n:8338 | Q66.6-C99.8(HiFi) |
bCalBor7 | hifiasm-hic.hap2 | 2024-02-22 | 4,398,455 | 651 | | | 1,266,475,377 | C:94.5%[S:94.1%,D:0.4%],F:0.8%,M:4.7%,n:8338 | Q66.6-C99.8(HiFi) |
bCalBor7 | hifiasm-hic.hap1 | 2024-02-22 | 4,425,772 | 706 | | | 1,311,075,404 | C:97.3%[S:96.9%,D:0.4%],F:0.6%,M:2.1%,n:8338 | Q66.4-C99.8(HiFi) |
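The BUSCO and merqury columns above pack several numbers into one string. The following minimal Python sketch unpacks them, assuming the BUSCO field follows the usual short-summary notation and that the merqury field reads as Q (Phred-scaled consensus quality), C (k-mer completeness in percent), and the read set in parentheses. That reading of the merqury field, and the function names, are assumptions rather than something stated on this page.

```python
import re

BUSCO_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)
MERQURY_RE = re.compile(r"Q(?P<qv>[\d.]+)-C(?P<completeness>[\d.]+)\((?P<reads>[^)]+)\)")


def parse_busco(text):
    """Return BUSCO percentages (C, S, D, F, M) as floats and n as an int."""
    match = BUSCO_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unexpected BUSCO summary: {text!r}")
    fields = {key: float(value) for key, value in match.groupdict().items()}
    fields["n"] = int(match.group("n"))
    return fields


def parse_merqury(text):
    """Return (qv, completeness_percent, read_set) from e.g. 'Q65.5-C99.7(HiFi)'."""
    match = MERQURY_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unexpected merqury summary: {text!r}")
    return float(match.group("qv")), float(match.group("completeness")), match.group("reads")


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


busco = parse_busco("C:97.4%[S:96.4%,D:1.0%],F:0.6%,M:2.0%,n:8338")
qv, completeness, reads = parse_merqury("Q65.5-C99.7(HiFi)")
print(busco["C"], busco["n"], qv, completeness, reads, qv_to_error_rate(qv))
```

Because the quality scale is logarithmic, the difference between Q64.6 and Q66.9 in the table is larger than it looks: each additional 10 units corresponds to a tenfold lower error rate.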
In-progress organelle results from MitoHiFi or Oatk.
specimen | asm | organelle | date | length | genes | frameshifts | is circular | seqs | reference |
---|---|---|---|---|---|---|---|---|---|
bCalBor7 | mitohifi.reads | mito | 2024-02-22 | 17,605 | 34 | None | False | 1 | NC_057528.1; 16,434 bp; 37 genes |
bCalBor7 | mitohifi.hifiasm | mito | 2024-02-22 | 19,945 | 40 | None | True | 1 | NC_057528.1; 16,434 bp; 37 genes |
bCalBor7 | mitohifi.hifiasm-hic | mito | 2024-02-22 | 16,483 | 33 | None | False | 1 | NC_085213.1; 17,288 bp; 37 genes |