Functional Analysis and Binning of Genomic Data
This presentation covers the binning (MaxBin) and functional analysis of assembled contigs from sample Fv7. It reports completeness, genome size, and GC content for each of 13 genomic bins, and lists the bins again sorted by size. The best bin (002) is assessed for quality and is tentatively assigned to Anabaena on the basis of its Kraken classification. A Quast report gives contig statistics for that bin.
Presentation Transcript
Functional analysis of sample Fv7
Binning of assembled contigs (MaxBin)

Bin name                  Completeness  Genome size (bp)  GC content (%)
sample_maxbin.001.fasta   62.5%         1999021           44.0
sample_maxbin.002.fasta   80.0%         2359160           48.7
sample_maxbin.003.fasta   47.5%         3442721           51.7
sample_maxbin.004.fasta   25.0%         4810780           53.8
sample_maxbin.005.fasta   55.0%         4630709           39.3
sample_maxbin.006.fasta   45.0%         2319323           51.2
sample_maxbin.007.fasta   42.5%         1587779           43.7
sample_maxbin.008.fasta   57.5%         3246338           48.3
sample_maxbin.009.fasta   60.0%         3139766           44.0
sample_maxbin.010.fasta   55.0%         1465910           56.3
sample_maxbin.011.fasta   55.0%         2406435           62.1
sample_maxbin.012.fasta   22.5%         716843            56.4
sample_maxbin.013.fasta   45.0%         815330            56.8
Sorted by size

Bin name                  Completeness  Genome size (bp)  GC content (%)
sample_maxbin.012.fasta   22.5%         716843            56.4
sample_maxbin.013.fasta   45.0%         815330            56.8
sample_maxbin.010.fasta   55.0%         1465910           56.3
sample_maxbin.007.fasta   42.5%         1587779           43.7
sample_maxbin.001.fasta   62.5%         1999021           44.0
sample_maxbin.006.fasta   45.0%         2319323           51.2
sample_maxbin.002.fasta   80.0%         2359160           48.7
sample_maxbin.011.fasta   55.0%         2406435           62.1
sample_maxbin.009.fasta   60.0%         3139766           44.0
sample_maxbin.008.fasta   57.5%         3246338           48.3
sample_maxbin.003.fasta   47.5%         3442721           51.7
sample_maxbin.005.fasta   55.0%         4630709           39.3
sample_maxbin.004.fasta   25.0%         4810780           53.8
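The ordering by size and the choice of the "best" bin can be reproduced from the table values with a few lines of Python. This is only a sketch: the tuple layout and the function names `sort_by_size` and `best_bin` are assumptions made for illustration, and "best" is taken here to mean the highest MaxBin completeness, which matches the bin (002) examined in the following slides.

```python
# Bin statistics transcribed from the MaxBin table:
# (bin name, completeness in %, genome size in bp, GC content in %)
bins = [
    ("sample_maxbin.001.fasta", 62.5, 1999021, 44.0),
    ("sample_maxbin.002.fasta", 80.0, 2359160, 48.7),
    ("sample_maxbin.003.fasta", 47.5, 3442721, 51.7),
    ("sample_maxbin.004.fasta", 25.0, 4810780, 53.8),
    ("sample_maxbin.005.fasta", 55.0, 4630709, 39.3),
    ("sample_maxbin.006.fasta", 45.0, 2319323, 51.2),
    ("sample_maxbin.007.fasta", 42.5, 1587779, 43.7),
    ("sample_maxbin.008.fasta", 57.5, 3246338, 48.3),
    ("sample_maxbin.009.fasta", 60.0, 3139766, 44.0),
    ("sample_maxbin.010.fasta", 55.0, 1465910, 56.3),
    ("sample_maxbin.011.fasta", 55.0, 2406435, 62.1),
    ("sample_maxbin.012.fasta", 22.5, 716843, 56.4),
    ("sample_maxbin.013.fasta", 45.0, 815330, 56.8),
]


def sort_by_size(rows):
    """Return the rows ordered by genome size, smallest first."""
    return sorted(rows, key=lambda row: row[2])


def best_bin(rows):
    """Return the row with the highest completeness (assumed criterion)."""
    return max(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, completeness, size, gc in sort_by_size(bins):
        print(f"{name}\t{completeness:.1f}%\t{size}\t{gc:.1f}")
    print("Most complete bin:", best_bin(bins)[0])
```

Completeness alone does not account for contamination, so a real selection would normally consider both.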
Quality of the genomic bin
Number of contigs in bin 002: 1481
The genomic bin probably belongs to Anabaena.
Quast report of the best bin
All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

Assembly                      sample_maxbin.002
# contigs (>= 0 bp)           1481
# contigs (>= 1000 bp)        1481
# contigs (>= 5000 bp)        6
# contigs (>= 10000 bp)       0
# contigs (>= 25000 bp)       0
# contigs (>= 50000 bp)       0
Total length (>= 0 bp)        2359160
Total length (>= 1000 bp)     2359160
Total length (>= 5000 bp)     33367
Total length (>= 10000 bp)    0
Total length (>= 25000 bp)    0
Total length (>= 50000 bp)    0
# contigs                     1481
Largest contig                6086
Total length                  2359160
GC (%)                        48.67
N50                           1580
N75                           1217
L50                           516
L75                           944
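For readers unfamiliar with the N50/L50 and N75/L75 rows: N50 is the contig length at which contigs of that length or longer together cover at least half of the total assembly length, and L50 is the number of contigs needed to reach that point (N75/L75 use 75% instead). The sketch below shows that calculation. The function name `nx_lx` and the toy contig lengths are hypothetical, because the individual contig lengths of bin 002 are not given in the slides.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the contig at which the running total, taken from
    the longest contig downwards, first reaches `fraction` of the assembly.
    Lx is how many contigs were needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("no contig lengths given")


if __name__ == "__main__":
    # Hypothetical toy assembly, not data from the report.
    toy = [100, 80, 60, 40, 20]
    print("N50, L50:", nx_lx(toy, 0.5))
    print("N75, L75:", nx_lx(toy, 0.75))
    # Mean contig length of bin 002, from the two totals in the report.
    print("mean contig length:", round(2359160 / 1481))
```

With an N50 of 1580 bp and a largest contig of 6086 bp, the report describes a bin made up of many short contigs.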
KRAKEN classification of the best bin

% reads  Clade reads  Taxon reads  Rank  TaxID    Name
93.72    1388         1388         U     0        unclassified
6.28     93           1            -     1        root
6.21     92           0            -     131567   cellular organisms
6.21     92           6            D     2        Bacteria
5.27     78           1            -     1783272  Terrabacteria group
5.06     75           0            -     1798711  Cyanobacteria/Melainabacteria group
5.06     75           14           P     1117     Cyanobacteria
1.55     23           1            O     1161     Nostocales
0.68     10           5            F     1162     Nostocaceae
0.14     2            0            G     1163     Anabaena
0.14     2            0            S     1165     Anabaena cylindrica
0.14     2            2            -     272123   Anabaena cylindrica PCC 7122
0.14     2            0            G     1177     Nostoc
0.07     1            0            S     92942    Nostoc linckia
0.07     1            1            -     1091006  Nostoc linckia NIES-25
0.07     1            0            S     272131   Nostoc punctiforme
0.07     1            1            -     63737    Nostoc punctiforme PCC 73102
0.07     1            0            G     264688   Trichormus
0.07     1            0            S     1164     Trichormus azollae
0.07     1            1            -     551115   'Nostoc azollae' 0708
0.41     6            0            F     1185     Rivulariaceae
0.27     4            0            G     1186     Calothrix
0.07     1            0            S     938406   Calothrix brevissima
Bin taxonomic classification (Kaiju and Kraken)

==> sample_kaiju_maxbin2_order.summary <==
        %    reads  order
-------------------------------------------
 1.553005       23  Nostocales
 1.282917       19  Synechococcales
 1.080351       16  Oscillatoriales
-------------------------------------------
 0.000000        0  Viruses
 1.485483       22  cannot be assigned to a order
 0.877785       13  belong to a order with less than 1% of all reads
-------------------------------------------
93.720460     1388  unclassified

==> sample_kaiju_maxbin2_family.summary <==
        %    reads  family
-------------------------------------------
-------------------------------------------
 0.000000        0  Viruses
 1.823092       27  cannot be assigned to a family
 4.456448       66  belong to a family with less than 1% of all reads
-------------------------------------------
93.720460     1388  unclassified

==> sample_kaiju_maxbin2_genus.summary <==
        %    reads  genus
-------------------------------------------
-------------------------------------------
 0.000000        0  Viruses
 2.295746       34  cannot be assigned to a genus
 3.983795       59  belong to a genus with less than 1% of all reads
-------------------------------------------
93.720460     1388  unclassified

==> sample_kaiju_maxbin2_species.summary <==
        %    reads  species
-------------------------------------------
-------------------------------------------
 0.000000        0  Viruses
 2.430790       36  cannot be assigned to a species
 3.848751       57  belong to a species with less than 1% of all reads
-------------------------------------------
93.720460     1388  unclassified
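The "% reads" figures in the Kraken and Kaiju outputs appear to be taken over the 1481 contigs of bin 002 rather than over sequencing reads, since the unclassified and classified counts (1388 and 93) add up to that number. The sketch below recomputes a few of the percentages under that assumption. The helper name `percent` and the choice of entries are illustrative only.

```python
TOTAL_CONTIGS = 1481  # number of sequences in bin 002 (from the Quast report)


def percent(count, total=TOTAL_CONTIGS):
    """Return `count` as a percentage of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Counts copied from the Kaiju order-level summary.
counts = {
    "unclassified": 1388,
    "Nostocales": 23,
    "Synechococcales": 19,
    "Oscillatoriales": 16,
}

if __name__ == "__main__":
    for label, count in counts.items():
        print(f"{label}: {percent(count):.2f}%")
    classified = TOTAL_CONTIGS - counts["unclassified"]
    print(f"classified: {classified} contigs ({percent(classified):.2f}%)")
```

Under this reading, both tools leave roughly 94% of the contigs unclassified, so the taxonomic assignment rests on a small fraction of the bin.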
==> sample_SendSketch _maxbin2_.summary <==
Query: sample_maxbin.002.fasta  DB: RefSeq  SketchLen: 8825  Seqs: 1481  Bases: 2359160  gSize: 2314730

WKID   KID    ANI     Complt   Contam  Matches  Unique  noHit  TaxID    gSize    gSeqs  taxName
0.47%  0.03%  82.94%  100.00%  1.30%   3        0       8707   1896769  168450   1      Ceramium sungminbooi
0.32%  0.03%  81.76%  100.00%  1.30%   3        0       8707   28024    253441   1      Porphyridium sordidum
0.14%  0.04%  79.34%  29.73%   1.22%   11       6       7861   1574623  7688703  298    Lyngbya confervoides BDU141951
0.11%  0.04%  78.81%  34.41%   1.23%   10       0       8672   118166   6674022  2      Nodosilinea nodulosa PCC 7104
0.10%  0.04%  78.51%  39.47%   1.24%   9        0       8707   549789   5811350  44     Phormidium tenue NIES-30
0.10%  0.04%  78.51%  37.28%   1.24%   9        3       8707   1229172  6144835  13     Leptolyngbya sp. KIOST-1
0.09%  0.04%  78.18%  43.89%   1.25%   8        7       8707   317619   5218282  13     Prochlorothrix hollandica PCC 9006 = CALU 1027
0.08%  0.05%  77.82%  56.23%   1.26%   7        0       8707   1342301  4085736  8      Sulfitobacter noctilucicola
0.09%  0.03%  78.22%  33.89%   1.26%   8        0       8591   2107702  6780200  548    filamentous cyanobacterium CCT1
0.08%  0.04%  77.82%  43.59%   1.26%   7        1       8707   2107701  5262637  907    filamentous cyanobacterium CCP5
0.07%  0.04%  77.07%  57.00%   1.27%   6        1       8707   1443111  3976552  4      Sulfitobacter guttiformis KCTC 32187
0.08%  0.03%  77.82%  35.37%   1.26%   7        0       8707   2107699  6480604  730    filamentous cyanobacterium CCP4
0.06%  0.03%  76.57%  58.73%   1.28%   5        0       8707   1342302  3896817  14     Sulfitobacter noctilucae
0.06%  0.03%  76.57%  53.80%   1.28%   5        0       8707   1247867  4241449  39     Roseovarius albus
0.06%  0.03%  76.57%  52.59%   1.28%   5        0       8707   467661   4368039  6      Rhodobacteraceae bacterium KLH11
0.06%  0.03%  76.57%  51.59%   1.28%   5        0       8707   391589   4477141  9      Roseobacter sp. GAI101
0.06%  0.03%  76.57%  51.16%   1.28%   5        0       8707   1446476  4498565  33     Ruegeria meonggei
0.06%  0.03%  76.57%  51.14%   1.28%   5        0       8707   1225660  4461480  42     Ruegeria sp. 6PALISEP08
0.06%  0.03%  76.57%  49.85%   1.28%   5        0       8707   1715692  4581022  42     Ruegeria denitrificans
0.05%  0.03%  75.96%  74.77%   1.29%   4        0       8707   2175247  3074953  2      Pelagicola sp. LXJ1103
Functional annotation of the best bin (Prokka)
Number of protein coding genes: 2364

Antimicrobial resistance genes
Using nucl database ncbi: 4579 sequences - 2018-Oct-20
#FILE SEQUENCE START END GENE COVERAGE COVERAGE_MAP GAPS %COVERAGE %IDENTITY DATABASE ACCESSION PRODUCT
Processing: contigs_bin2_above500.fasta
Found 0 genes in contigs_bin2_above500.fasta

abricate -db=vfdb contigs_bin2_above500.fasta
Using nucl database vfdb: 2597 sequences - 2018-Oct-20
#FILE SEQUENCE START END GENE COVERAGE COVERAGE_MAP GAPS %COVERAGE %IDENTITY DATABASE ACCESSION PRODUCT
Processing: contigs_bin2_above500.fasta
Found 0 genes in contigs_bin2_above500.fasta
Analysis of the metagenome
Processing: final.contigs.fa (abricate)
Found 9 genes in final.contigs.fa

#FILE             SEQUENCE     START  END  GENE    COVERAGE        COVERAGE_MAP     GAPS  %COVERAGE  %IDENTITY  DATABASE  ACCESSION    PRODUCT
final.contigs.fa  k141_160377  19     160  aacA32  1-142/555       ====...........  0/0   25.59      85.21      ncbi      A7J11_04797  aminoglycoside 6'-N-acetyltransferase
final.contigs.fa  k141_22497   125    269  mupA    1525-1669/3075  ......==/......  3/4   4.65       76.19      ncbi      A7J11_05394  mupirocin-resistant isoleucine--tRNA ligase MupA
final.contigs.fa  k141_283299  100    250  mupA    1523-1667/3075  ......==/......  2/6   4.72       76.82      ncbi      A7J11_05394  mupirocin-resistant isoleucine--tRNA ligase MupA
final.contigs.fa  k141_330467  93     237  mupA    1523-1661/3075  ......==/......  2/6   4.52       76.55      ncbi      A7J11_05394  mupirocin-resistant isoleucine--tRNA ligase MupA
final.contigs.fa  k141_368758  81     217  mupA    1525-1661/3075  .......==......  0/0   4.46       78.10      ncbi      A7J11_05394  mupirocin-resistant isoleucine--tRNA ligase MupA
final.contigs.fa  k141_393406  142    257  otr(A)  209-324/1992    .==............  0/0   5.82       80.17      ncbi      A7J11_00007  tetracycline resistance ribosomal protection protein
final.contigs.fa  k141_395106  15     227  otr(A)  114-326/1992    ===...../......  2/2   10.64      75.70      ncbi      A7J11_00007  tetracycline resistance ribosomal protection protein
final.contigs.fa  k141_443937  33     171  mupA    1525-1663/3075  ......==/......  3/4   4.46       78.01      ncbi      A7J11_05394  mupirocin-resistant isoleucine--tRNA ligase MupA
final.contigs.fa  k141_60737   67     424  aacA32  198-555/555     .....==========  0/0   64.50      81.56      ncbi      A7J11_04797  aminoglycoside 6'-N-acetyltransferase
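The COVERAGE field in the abricate output has the form start-end/gene length, so the share of the reference gene covered by a hit can be estimated directly from it. The sketch below does this for hits without gaps. The function name `span_coverage` is hypothetical, and the calculation is a simple span ratio rather than abricate's own implementation: for gapped hits the reported %COVERAGE is slightly lower than the raw span (for example 10.64 is reported for 114-326/1992, whose raw span gives about 10.69).

```python
import re


def span_coverage(field):
    """Return the percentage of the gene spanned by a 'start-end/length' field."""
    match = re.fullmatch(r"(\d+)-(\d+)/(\d+)", field)
    if match is None:
        raise ValueError(f"unexpected COVERAGE field: {field!r}")
    start, end, gene_length = (int(group) for group in match.groups())
    if gene_length <= 0 or end < start:
        raise ValueError(f"inconsistent COVERAGE field: {field!r}")
    # Positions are inclusive, hence the +1.
    return 100.0 * (end - start + 1) / gene_length


if __name__ == "__main__":
    # Ungapped hits taken from the abricate output.
    for field in ("1-142/555", "209-324/1992", "198-555/555"):
        print(f"{field}: {span_coverage(field):.2f}%")
```

All nine hits cover only part of the reference gene (at most about 65%) at 75 to 85% identity, which is worth keeping in mind when reading them as resistance genes.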