
Exploring Data Analysis Techniques in Genetics Research
This presentation covers data analysis techniques in genetics research: data description, PCA, clustering, linear regression, and correlation. PCA is illustrated on leukemia data.
Presentation Transcript
HW 2: Partial draft of paper (due Friday 4/30).
Data description: Describe the data, including the number of data points and the dimension. Consider using statistics, graphical representations (e.g., PCA), and clustering to describe the data.
Appendix (or part of the main paper, depending on results): Analyze your data using some of the techniques covered in class/labs. Include a description of the results of at least one paragraph. Not all techniques are good for all data; if a technique is not good for your data, explain why in your paper.
HW 3 (due Friday May 7): Create an artificial data set to illustrate something covered in class that I can use in a review.
http://clt.astate.edu/crbrown/statchapter12a.ppt
PCA on all genes. Leukemia data, precursor B and T. Plot of 34 patients, dimension of 8973 genes reduced to 2. www.cse.buffalo.edu/faculty/azhang/data-mining/pca.ppt
PCA on 100 top significant genes. Leukemia data, precursor B and T. Plot of 34 patients, dimension of 100 genes reduced to 2. www.cse.buffalo.edu/faculty/azhang/data-mining/pca.ppt
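The two PCA slides can be sketched in code. The example below is only an illustration under stated assumptions: it uses synthetic random data in place of the leukemia data, a hypothetical 17/17 split between the two groups, and a Welch t-statistic as the measure of "significance" (the slides do not say how the top 100 genes were chosen). The helper names `pca_scores` and `top_genes_by_t` are made up for this sketch.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return PCA scores and explained-variance ratios for the rows of X."""
    Xc = X - X.mean(axis=0)                      # center each gene (column)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    ratios = S**2 / np.sum(S**2)                 # share of variance per component
    return scores, ratios[:n_components]

def top_genes_by_t(X, labels, k=100):
    """Indices of the k genes with the largest absolute Welch t-statistic."""
    a, b = X[labels == 0], X[labels == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    t = (a.mean(axis=0) - b.mean(axis=0)) / se
    return np.argsort(-np.abs(t))[:k]

# Synthetic stand-in for the data: 34 patients x 8973 genes (NOT real data).
# Assumption: two groups of 17, with the first 100 genes shifted in group 1.
rng = np.random.default_rng(0)
labels = np.array([0] * 17 + [1] * 17)
X = rng.normal(size=(34, 8973))
X[labels == 1, :100] += 3.0

# PCA on all genes: 8973 dimensions reduced to 2.
scores_all, ratio_all = pca_scores(X)

# PCA on the 100 top-ranked genes: 100 dimensions reduced to 2.
top = top_genes_by_t(X, labels, k=100)
scores_top, ratio_top = pca_scores(X[:, top])

print("all genes: PC1 explains", round(float(ratio_all[0]), 3))
print("top 100  : PC1 explains", round(float(ratio_top[0]), 3))
```

Each row of `scores_all` or `scores_top` is one patient's position in the 2-D plot. On this synthetic data, restricting to the top-ranked genes concentrates much more of the variance in the first component, which is the kind of effect the second slide illustrates.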
Linear regression https://en.wikipedia.org/wiki/Linear_regression
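As a minimal sketch of simple linear regression, the example below fits a line by ordinary least squares with NumPy. The data are synthetic, and the true intercept (2.0), slope (0.5), and noise level are arbitrary values chosen for illustration; `fit_line` is a hypothetical helper name.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    A = np.column_stack([np.ones_like(x), x])   # design matrix: [1, x]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    intercept, slope = coef
    return intercept, slope

# Synthetic data (assumption): a known line plus Gaussian noise.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)

intercept, slope = fit_line(x, y)
residuals = y - (intercept + slope * x)
print("intercept:", round(float(intercept), 2), "slope:", round(float(slope), 2))
```

With an intercept column in the design matrix, the residuals of the least-squares fit sum to zero, which is a quick sanity check on any implementation.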
Linear Correlations. Linear relations, or linear correlations, have points that cluster around a line. Linear relations can be either positive (the points slant upwards to the right) or negative (the points slant downwards to the right). https://emunix.emich.edu/~schu/MATH_170/Fall09_170/Evening%20session/Unit%203.ppt
Positive Correlation Coefficients. Examples of positive correlation: strong positive, r = .8; moderate positive, r = .5; very weak, r = .1. In general, if the correlation is visible to the eye, then it is likely to be strong. https://emunix.emich.edu/~schu/MATH_170/Fall09_170/Evening%20session/Unit%203.ppt
Negative Correlation Coefficients. Examples of negative correlation: strong negative, r = -.8; moderate negative, r = -.5; very weak, r = -.1. In general, if the correlation is visible to the eye, then it is likely to be strong. https://emunix.emich.edu/~schu/MATH_170/Fall09_170/Evening%20session/Unit%203.ppt
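The sign of the correlation coefficient can be checked numerically. The sketch below computes Pearson's r from its definition on synthetic data; the sample size and noise level are arbitrary assumptions, and `pearson_r` is a hypothetical helper name.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Synthetic data (assumption): the same x with upward- and downward-sloping y.
rng = np.random.default_rng(2)
x = rng.normal(size=200)
noise = rng.normal(scale=0.5, size=200)
y_up = x + noise        # points slant upwards to the right
y_down = -x + noise     # points slant downwards to the right

r_pos = pearson_r(x, y_up)
r_neg = pearson_r(x, y_down)
print("positive relation: r =", round(r_pos, 2))
print("negative relation: r =", round(r_neg, 2))
```

Flipping the sign of one variable flips the sign of r without changing its magnitude, so a strong negative relation has r close to -1, not close to +1.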
Nonlinear versus No Correlation. Nonlinear relation and no relation: both sets of variables have r = 0.1, but the difference is that the nonlinear relation shows a clear pattern. https://emunix.emich.edu/~schu/MATH_170/Fall09_170/Evening%20session/Unit%203.ppt
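The difference between a nonlinear relation and no relation can be reproduced with a small artificial data set. This is a sketch under assumptions: the parabola and the noise are chosen for illustration, so r comes out near 0 here rather than the 0.1 quoted on the slide.

```python
import numpy as np

# Artificial data (assumption): x is symmetric around zero.
rng = np.random.default_rng(3)
x = np.linspace(-1.0, 1.0, 201)
y_nonlinear = x**2                    # clear pattern, but not a line
y_none = rng.normal(size=x.size)      # no pattern at all

r_nonlinear = float(np.corrcoef(x, y_nonlinear)[0, 1])
r_none = float(np.corrcoef(x, y_none)[0, 1])

# The pattern becomes visible to r once x is transformed to match it.
r_after_transform = float(np.corrcoef(x**2, y_nonlinear)[0, 1])

print("nonlinear relation: r =", round(r_nonlinear, 3))
print("no relation       : r =", round(r_none, 3))
print("after squaring x  : r =", round(r_after_transform, 3))
```

Both data sets have r close to zero, yet only the first has structure. Pearson's r measures linear association only, so a plot of the data is still needed before concluding that two variables are unrelated.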