
Microbiome Data Analysis with Phyloseq: Advanced Techniques
Explore advanced techniques in processing microbiome data using Phyloseq, including taxonomic summary, abundance transformation, grouping by categories, donor removal, and facet grid plotting for in-depth analysis.
Presentation Transcript
Lecture #12: Processing Microbiome Data: Secondary Data Analysis with Phyloseq (Part 2)

Plot Taxa Summary
Let's take a look at what the taxonomic summary looks like at the Class level.
> plot_bar(ps1, fill="Class")
This plot shows the absolute abundance (read counts) for each individual sample at the Class level.

Abundance Transformation
Typically we want to use relative abundance for samples in order to account for differences in absolute read counts. We will define a transformation function which will be applied to each sample.
> ps1ra = transform_sample_counts(ps1, function(x){x / sum(x)})
> plot_bar(ps1ra, fill="Class")
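
The transformation itself is easy to check outside of phyloseq. The following is a minimal base R sketch that applies the same `x / sum(x)` function to a toy count matrix; the taxa names, sample names, and counts are made up for illustration and do not come from the lecture data.

```r
# Toy count matrix (hypothetical values): rows are taxa, columns are samples.
# matrix() fills column by column, so S1 = 10, 30, 60 and S2 = 200, 100, 700.
counts <- matrix(
  c(10, 30, 60,
    200, 100, 700),
  nrow = 3,
  dimnames = list(c("ClassA", "ClassB", "ClassC"), c("S1", "S2"))
)

# The same function that the slide passes to transform_sample_counts().
to_relative <- function(x) x / sum(x)

# Apply it to each sample, i.e. to each column of the matrix.
rel <- apply(counts, 2, to_relative)

print(rel)
print(colSums(rel))  # each sample now sums to 1
```

Although S2 has ten times as many reads as S1, both samples end up on the same 0 to 1 scale, which is what makes the bar plots comparable across samples.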

Taxa Groupings
As with QIIME, we can also plot taxonomic summaries by group using a category. Here let's plot by the Mouse category.
> plot_bar(ps1ra, fill="Class", x="Mouse")

Removing Donors
Note that the donor samples do not fit well into the relative abundance plot by Mouse. Donor samples are pooled from several mice, and there is one sample per donor versus two per mouse. Let's create a new phyloseq object with the donors removed.
> ps1ranodonors = subset_samples(ps1ra, SampleType %in% c("Fecal", "Cecal"))
> plot_bar(ps1ranodonors, fill="Class", x="Mouse")
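
The `%in%` filter used by `subset_samples` works the same way on an ordinary data frame. Here is a small sketch on a made-up sample table; the sample IDs and the "Donor" label are assumptions for illustration, since the slide only shows the "Fecal" and "Cecal" values.

```r
# Hypothetical sample metadata; only "Fecal" and "Cecal" appear in the slide.
meta <- data.frame(
  SampleID   = c("s1", "s2", "s3", "s4", "s5"),
  SampleType = c("Fecal", "Cecal", "Donor", "Fecal", "Donor"),
  stringsAsFactors = FALSE
)

# TRUE for rows whose SampleType is one of the values we want to keep.
keep <- meta$SampleType %in% c("Fecal", "Cecal")

# Keep only those rows; every other sample type is dropped.
meta_nodonors <- meta[keep, ]
print(meta_nodonors)
```

Listing the sample types to keep, rather than the one to drop, means any other unexpected sample type is excluded as well.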

Facet Grid Plotting
We can also partition the plot by using the facet_grid argument. In this example we partition the plot by Phylum. Within each Phylum panel, we color by Order and group the samples along the x-axis by Receive.
> plot_bar(ps1ranodonors, fill="Order", x="Receive", facet_grid=~Phylum)

Ordination Plots
The ordination plot in phyloseq is analogous to the PCoA plots we produced with QIIME.
> out.uuf <- ordinate(ps1ra, method="MDS", distance="uunifrac")
> evalsuuf <- out.uuf$values$Eigenvalues
> plot_ordination(ps1ra, out.uuf, color="Receive") + labs(col="Receive") + coord_fixed(sqrt(evalsuuf[2]/evalsuuf[1]))
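
To see where the eigenvalues and the `coord_fixed` ratio come from, here is a minimal sketch using base R's `cmdscale` as a stand-in for the MDS/PCoA step. It is not the routine phyloseq calls internally, and the sample coordinates are hypothetical, used only to generate a valid distance matrix.

```r
# Hypothetical sample positions, used only to build a valid distance matrix.
pts <- rbind(S1 = c(0, 0), S2 = c(1, 0), S3 = c(4, 1), S4 = c(5, 2))
d <- dist(pts)

# Classical MDS (PCoA); eig = TRUE also returns the eigenvalues.
pcoa <- cmdscale(d, k = 2, eig = TRUE)
evals <- pcoa$eig

# Same ratio as in the slide: sqrt(second eigenvalue / first eigenvalue).
aspect <- sqrt(evals[2] / evals[1])

print(pcoa$points)  # sample coordinates on the first two axes
print(aspect)
```

Each eigenvalue is proportional to the variance captured by its axis, so scaling the plot by this ratio keeps a unit of distance visually the same on both axes.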

Bray-Curtis Distance
The Bray-Curtis distance is similar to unweighted UniFrac, but is not phylogenetically aware. That is, the evolutionary similarity of microbes is not taken into account.
> out.bc <- ordinate(ps1ra, method="MDS", distance="bray")
> evalsbc <- out.bc$values$Eigenvalues
> plot_ordination(ps1ra, out.bc, color="Receive") + labs(col="Receive") + coord_fixed(sqrt(evalsbc[2]/evalsbc[1]))
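
For two samples x and y, the Bray-Curtis dissimilarity is sum(|x - y|) / sum(x + y). The sketch below computes it by hand in base R; the function name `bray_curtis` and the two relative abundance profiles are made up for illustration.

```r
# Bray-Curtis dissimilarity between two abundance vectors over the same taxa.
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Hypothetical relative abundance profiles for two samples.
a <- c(ClassA = 0.1, ClassB = 0.3, ClassC = 0.6)
b <- c(ClassA = 0.2, ClassB = 0.1, ClassC = 0.7)

print(bray_curtis(a, b))  # (0.1 + 0.2 + 0.1) / 2 = 0.2
print(bray_curtis(a, a))  # identical samples give 0
```

The calculation only compares abundances taxon by taxon. Nothing in it depends on how closely related ClassA, ClassB, and ClassC are, which is what "not phylogenetically aware" means here.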

Weighted UniFrac
The ordination plot in phyloseq is analogous to the PCoA plots we produced with QIIME.
> out.wuf <- ordinate(ps1ra, method="MDS", distance="wunifrac")
> evalswuf <- out.wuf$values$Eigenvalues
> plot_ordination(ps1ra, out.wuf, color="Receive") + labs(col="Receive") + coord_fixed(sqrt(evalswuf[2]/evalswuf[1]))

Weighted UniFrac Ordination

Metagenomic Methods
To this point, we have been discussing 16S rRNA amplification to take a census of the bacteria present in a community. This single gene is amplified and searched against a database to determine which bacteria are present and in what relative quantities. Full metagenomic sequencing would entail sequencing all of the DNA found in an environment instead of amplifying 16S rRNA.

Full Metagenomic Sequencing
A much larger sequencing effort is required for full metagenomic sequencing. With 16S rRNA, the sequencing breadth is restricted to one ~200-500 bp amplicon, selectively amplified with universal primers. This restricted breadth allows sufficient depth to be achieved with relatively little sequencing. Full metagenomic sequencing does not restrict the breadth of sequencing. Instead, you are sequencing all of the DNA present.

Full Metagenomic Sequencing
The advantage of this method is that you survey the genetic potential of an environment. That is, you find out what genes are present. However, it is very difficult and computationally challenging to assemble metagenomes. The DNA that is sequenced comes from a variety of organisms, so it is easy to accidentally misassemble parts of one bacterium with parts of another.

Metatranscriptomic Sequencing
Metatranscriptomics is the "meta" counterpart of transcriptomic sequencing: RNA-Seq on a community. It assays the gene expression of a community, whereas metagenomics looks at the genetic potential of a community. Both metagenomics and metatranscriptomics are expensive because they require a lot of sequencing.

PICRUSt
Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt). This approach uses 16S rDNA sequencing data along with databases of metagenomic information about bacteria to infer the metagenomics of an environment. 16S tells you who is there. The metagenomic database tells you what they typically do.
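
The core idea of combining "who is there" with "what they typically do" can be sketched as a matrix product. This is a conceptual toy only, not the PICRUSt implementation: the taxa, gene families, and numbers are invented, and the real tool includes further steps, such as reconstructing gene content for taxa without sequenced genomes and normalizing by 16S copy number, that are omitted here.

```r
# Hypothetical 16S-derived counts for three taxa in one sample ("who is there").
taxa_counts <- c(TaxonA = 50, TaxonB = 30, TaxonC = 20)

# Hypothetical reference table: copies of each gene family per genome
# ("what they typically do"). Filled column by column:
# GeneX = 1, 0, 2 and GeneY = 0, 3, 1.
gene_content <- matrix(
  c(1, 0, 2,
    0, 3, 1),
  nrow = 3,
  dimnames = list(names(taxa_counts), c("GeneX", "GeneY"))
)

# Predicted gene family abundance: each taxon contributes count * copies.
predicted <- drop(taxa_counts %*% gene_content)
print(predicted)
```

A taxon missing from the reference table would contribute nothing to the prediction in this toy version, which hints at why poorly characterized environments are a problem for this kind of inference.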

PICRUSt
PICRUSt is essentially a cheap alternative to full metagenomic sequencing. The authors show that the inferred metagenomics of communities very accurately estimates actual metagenomic sequencing done on the same communities. Obviously, the target environments must be well characterized in advance. Unstudied environments with a variety of unknown bacteria would not do well with PICRUSt.