------ ConvertNCBI ./data/seq/cabbN ------
./data/seq/cabbN/sequence exists - remove existing .fna and .fa files
./data/seq/cabbN/annotation exists - remove existing .gff and .gff3 files
Log file to ./data/seq/cabbN/xConvertNCBI.log
Parameters:
Project directory: ./data/seq/cabbN/
Include any sequence with the 'NT_' or 'NW_' prefix
Uses prefixes Chr 'C' and Scaffold 's'
Include protein-id in attributes
Verbose
Processing ./data/seq/cabbN/ncbi_dataset/data/GCF_000695525.1/GCF_000695525.1_BOL_genomic.fna
C1 NC_027748.1 43,764,888
C2 NC_027749.1 52,886,895
C3 NC_027750.1 64,984,695
C4 NC_027751.1 53,719,093
C5 NC_027752.1 46,902,585
C6 NC_027753.1 39,822,476
C7 NC_027754.1 48,366,697
C8 NC_027755.1 41,758,685
C9 NC_027756.1 54,679,868
s01 NW_013617415.1 550,871
s02 NW_013617416.1 360,705
s03 NW_013617417.1 343,593
s04 NW_013617418.1 324,463
s05 NW_013617419.1 246,008
s06 NW_013617420.1 215,938
s07 NW_013617421.1 215,108
s08 NW_013617422.1 214,022
s09 NW_013617423.1 213,535
s10 NW_013617424.1 213,381
s11 NW_013617425.1 193,719
s12 NW_013617426.1 192,988
s13 NW_013617427.1 192,287
s14 NW_013617428.1 187,934
s15 NW_013617429.1 187,313
s16 NW_013617430.1 176,243
s17 NW_013617431.1 174,647
s18 NW_013617432.1 159,200
s19 NW_013617433.1 156,667
s20 NW_013617434.1 154,937
Suppressing further scaffold outputs
Unk NC_016118.1 360,271 *
Sequences not output: 1 (*)
Finish writing ./data/seq/cabbN/sequence/genomic.fna
A 137,528,552 a 4,845,085
T 137,527,795 t 4,834,468
C 78,230,546 c 2,192,767
G 78,237,014 g 2,192,880
N 43,004,782 n 0
Gaps >= 30000: 39
Finish writing ./data/seq/cabbN/annotation/gap.gff
Processing ./data/seq/cabbN/ncbi_dataset/data/GCF_000695525.1/genomic.gff
Use Gene 44,305 from 49,563
Use mRNA 44,305 from 56,687
Use Exon 227,091 from 398,922
Finish writing ./data/seq/cabbN/annotation/anno.gff
>>Sequence
10 Output 9 Chromosomes 446,885,882
32,876 Output 32,876 Scaffolds 41,708,007 (32,284 < 10,000bp)
>>All Types (col 3) (+ are processed keywords)
CDS 305,897 +
cDNA_match 5,906
direct_repeat 4
exon 398,922 +
gene 49,563 +
inverted_repeat 2
lnc_RNA 7,942
mRNA 56,687 +
pseudogene 4,495
rRNA 3
region 32,886
sequence_feature 76
tRNA 995
transcript 2,089
>>All Gene Source (col 2)
Gnomon 48,484
RefSeq 102
tRNAscan-SE 977
>>All gene_biotype= (col 8)
lncRNA 4,179
protein_coding 44,386 +
rRNA 3
tRNA 995
>>Chromosome gene count 42,414
C1 NC_027748.1 4,198
C2 NC_027749.1 4,326
C3 NC_027750.1 6,668
C4 NC_027751.1 4,890
C5 NC_027752.1 4,656
C6 NC_027753.1 3,608
C7 NC_027754.1 4,523
C8 NC_027755.1 4,430
C9 NC_027756.1 5,115
>>Scaffold gene count 1,891 (list scaffolds with #genes>3)
s01 NW_013617415.1 37
s02 NW_013617416.1 20
s03 NW_013617417.1 5
s04 NW_013617418.1 5
s05 NW_013617419.1 13
s06 NW_013617420.1 11
s07 NW_013617421.1 24
s08 NW_013617422.1 15
s10 NW_013617424.1 10
s11 NW_013617425.1 8
s12 NW_013617426.1 27
s14 NW_013617428.1 6
s15 NW_013617429.1 24
s16 NW_013617430.1 7
s17 NW_013617431.1 5
s18 NW_013617432.1 12
s19 NW_013617433.1 6
s20 NW_013617434.1 4
s21 NW_013617435.1 6
s22 NW_013617436.1 10
s24 NW_013617438.1 6
s25 NW_013617439.1 8
s26 NW_013617440.1 8
s27 NW_013617441.1 5
s30 NW_013617444.1 16
s33 NW_013617447.1 7
s34 NW_013617448.1 9
s35 NW_013617449.1 10
s36 NW_013617450.1 7
s40 NW_013617454.1 5
s41 NW_013617455.1 4
s42 NW_013617456.1 7
s43 NW_013617457.1 4
s44 NW_013617458.1 9
s45 NW_013617459.1 4
s47 NW_013617461.1 8
s52 NW_013617466.1 5
s53 NW_013617467.1 5
s54 NW_013617468.1 8
s56 NW_013617470.1 12
s57 NW_013617471.1 8
s58 NW_013617472.1 4
s62 NW_013617476.1 4
s63 NW_013617477.1 5
s65 NW_013617479.1 12
s66 NW_013617480.1 7
s67 NW_013617481.1 7
s68 NW_013617482.1 6
s69 NW_013617483.1 4
s72 NW_013617486.1 4
s73 NW_013617487.1 5
s76 NW_013617490.1 4
s77 NW_013617491.1 7
s79 NW_013617493.1 5
s81 NW_013617495.1 9
s84 NW_013617498.1 4
s89 NW_013617503.1 6
s90 NW_013617504.1 5
s91 NW_013617505.1 4
s93 NW_013617507.1 8
s96 NW_013617510.1 8
s100 NW_013617514.1 6
s103 NW_013617517.1 7
s106 NW_013617520.1 5
s109 NW_013617523.1 5
s110 NW_013617524.1 5
s111 NW_013617525.1 7
s118 NW_013617532.1 4
s119 NW_013617533.1 6
s121 NW_013617535.1 5
s124 NW_013617538.1 4
s125 NW_013617539.1 5
s126 NW_013617540.1 5
s129 NW_013617543.1 4
s132 NW_013617546.1 7
s133 NW_013617547.1 4
s135 NW_013617549.1 5
s136 NW_013617550.1 7
s140 NW_013617554.1 4
s144 NW_013617558.1 5
s154 NW_013617568.1 8
s163 NW_013617577.1 4
s183 NW_013617597.1 5
s188 NW_013617602.1 5
s191 NW_013617605.1 4
s201 NW_013617615.1 4
s213 NW_013617627.1 4
s227 NW_013617641.1 4
s228 NW_013617642.1 5
s235 NW_013617649.1 4
s239 NW_013617653.1 5
s241 NW_013617655.1 4
s249 NW_013617663.1 4
s267 NW_013617681.1 4
s285 NW_013617699.1 7
s300 NW_013617714.1 5
s313 NW_013617727.1 4
s320 NW_013617734.1 6
s323 NW_013617737.1 5
s347 NW_013617761.1 5
s374 NW_013617788.1 4
s406 NW_013617820.1 4
s449 NW_013617863.1 5
Scaffolds with <=3 gene (not listed) 32,773
Genes not included 81
Suggestion: There are 32,885 sequences. Set SyMAP project parameter 'Minimum length' to reduce number loaded.
------ Finish ConvertNCBI ./data/seq/cabbN/ ------
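
The totals reported in this log can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of ConvertNCBI itself: the numbers are copied by hand from the log lines above, and every variable name is an illustrative assumption rather than anything the tool defines.

```python
# Cross-check of totals reported in the ConvertNCBI log above.
# All values are copied by hand from the log; names are hypothetical.

# Lengths of the nine output chromosomes (lines "C1" .. "C9").
chromosome_lengths = {
    "C1": 43_764_888, "C2": 52_886_895, "C3": 64_984_695,
    "C4": 53_719_093, "C5": 46_902_585, "C6": 39_822_476,
    "C7": 48_366_697, "C8": 41_758_685, "C9": 54_679_868,
}

# Base counts written after genomic.fna (upper- and lower-case, plus N).
base_counts = {
    "A": 137_528_552, "a": 4_845_085,
    "T": 137_527_795, "t": 4_834_468,
    "C": 78_230_546, "c": 2_192_767,
    "G": 78_237_014, "g": 2_192_880,
    "N": 43_004_782, "n": 0,
}

# Per-chromosome gene counts (section ">>Chromosome gene count").
chromosome_genes = [4_198, 4_326, 6_668, 4_890, 4_656, 3_608, 4_523, 4_430, 5_115]

# Totals as reported by the log.
reported_chromosome_bases = 446_885_882   # ">>Sequence", Chromosomes
reported_scaffold_bases = 41_708_007      # ">>Sequence", Scaffolds
reported_chromosome_genes = 42_414        # ">>Chromosome gene count"
reported_scaffold_genes = 1_891           # ">>Scaffold gene count"
reported_genes_used = 44_305              # "Use Gene 44,305 from 49,563"
reported_protein_coding = 44_386          # gene_biotype=protein_coding
reported_genes_not_included = 81          # "Genes not included"

chromosome_total = sum(chromosome_lengths.values())
base_total = sum(base_counts.values())
gene_total = sum(chromosome_genes)

checks = {
    "chromosome bases": chromosome_total == reported_chromosome_bases,
    "all bases": base_total == reported_chromosome_bases + reported_scaffold_bases,
    "chromosome genes": gene_total == reported_chromosome_genes,
    "genes used": gene_total + reported_scaffold_genes == reported_genes_used,
    "genes not included": (
        reported_protein_coding - reported_genes_used == reported_genes_not_included
    ),
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```

If any check reports a mismatch after rerunning the conversion with different parameters, the corresponding section of the log is the place to look first.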