
Statistical Data Analysis Techniques and Types of Analysis Explained
Explore the world of statistical data analysis with a focus on choosing appropriate techniques, qualitative and quantitative analysis, types of qualitative analysis processes, and statistical analysis methods like explorative and confirmative data analysis. Understand the role of descriptive and inductive statistics in summarizing and generalizing data effectively.
Statistical Data Analysis Zulkarnain Lubis
Choosing the Appropriate Statistical Technique
Choosing the correct statistical technique requires considering the type of question to be answered, the number of variables involved, and the level (scale) of measurement.
Data Analysis
Qualitative analysis
Quantitative analysis: statistical analysis; analysis besides statistics
Types of Qualitative Analysis Process
Main types: summarising (condensation) of meanings; categorising (grouping) of meanings; structuring (ordering) of meanings using narrative.
Qualitative Data Analysis
Qualitative data result from the collection of non-standardised data that require classification and are analysed through the use of conceptualisation. Qualitative analysis can involve summarising, categorising, and structuring data. The processes of data analysis and data collection are necessarily interactive.
STATISTICAL ANALYSIS
Explorative Data Analysis: searching for and disclosing the structure and pattern of existing data, checking the form and pattern of the distribution of the data, and revealing the presence of irregularities. Uses simple arithmetic and graphs.
Confirmative Data Analysis: finding information about a population based on a sample, that is, performing inference or generalization from sample to population. Requires consideration of strict assumptions.
STATISTICAL ANALYSIS
Descriptive Statistics: the part of statistics used specifically to describe data, both visually and by measurement.
Inductive Statistics: the part of statistics used to draw formal conclusions and generalize to a population based on sample data; classified into Parametric Statistics and Non-Parametric Statistics.
Descriptive Statistics
Visually
Table: Cross Tabulation, Frequency Table, etc.
Figure/Picture/Chart/Graph: Histogram, Bar Chart, Plot Diagram, Box-Plot Diagram, Pie Chart, Run Chart, Control Chart, Time Series Graph, Stem and Leaf Diagram
By measurement
Measures of central tendency (measures of location): mean, median, mode, midrange, midhinge
Measures of dispersion: range, variance, standard deviation, absolute deviation, inter-quartile range
Other measures: proportion, percentage, ratio
To identify the pattern of data spread by using tables and figures: Frequency Table, Histogram, Stem and Leaf Diagram, Box-Plot Diagram
To find out the relationship among variables using graphs and tables: Cross Tabulation, Plot Diagram
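A cross tabulation simply counts how often each combination of two categorical variables occurs. The following is a minimal sketch in Python; the records and the helper name `cross_tab` are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical survey records: (group, preferred product). Illustrative only.
records = [
    ("F", "A"), ("F", "B"), ("M", "A"), ("M", "A"),
    ("F", "A"), ("M", "B"), ("F", "B"), ("M", "A"),
]

def cross_tab(pairs):
    """Count how often each (row, column) combination occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

table = cross_tab(records)
for row, cells in table.items():
    print(row, cells)
```

Each row of the result holds the counts for one category of the first variable, broken down by the categories of the second.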
To forecast, to identify problems, and to observe a process by using graphs: Run Chart, Control Chart, Time Series Graph
To describe the distribution of data in the nominal scale of measurement: Pie Chart, Bar Chart
To describe data by using measurement: mean, median, mode, midrange, midhinge, range, variance, standard deviation, inter-quartile range, covariance, proportion, ratio, percentage
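Most of the measures listed above can be computed with Python's standard `statistics` module. The sketch below uses a hypothetical sample and assumes the "inclusive" quartile convention; other quartile conventions give slightly different values for the midhinge and the inter-quartile range.

```python
import statistics as st

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical sample

def describe(values):
    """Return common measures of location and dispersion for a sample."""
    xs = sorted(values)
    # Quartiles under the "inclusive" convention (an assumption of this sketch).
    q1, _, q3 = st.quantiles(xs, n=4, method="inclusive")
    return {
        "mean": st.mean(xs),
        "median": st.median(xs),
        "mode": st.mode(xs),
        "midrange": (xs[0] + xs[-1]) / 2,   # average of minimum and maximum
        "midhinge": (q1 + q3) / 2,          # average of first and third quartiles
        "range": xs[-1] - xs[0],
        "variance": st.variance(xs),        # sample variance (divides by n - 1)
        "std_dev": st.stdev(xs),            # sample standard deviation
        "iqr": q3 - q1,
    }

summary = describe(data)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The variance and standard deviation here are the sample versions; `statistics.pvariance` and `statistics.pstdev` give the population versions.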
Inductive Statistics
Parametric Statistics: based on strict assumptions about the characteristics of the population from which the data were obtained, such as normal distribution, independence, and homogeneous variance. Usually used with interval and ratio scales of measurement. Suitable for the natural sciences.
Non-Parametric Statistics: the assumptions are less strict; usually only symmetry is required. Can be used with ordinal, interval, and ratio scales of measurement. Suitable for the social sciences, where the data are sometimes difficult to quantify.
Parametric versus Nonparametric Tests
Parametric statistics involve numbers with known, continuous distributions. They are appropriate when the data are interval or ratio scaled and the sample size is large.
Nonparametric statistics are appropriate when the variables being analyzed do not conform to any known or continuous distribution.
The Pair of Data Analysis Tools of Parametric and Non-Parametric Statistics
In general, parametric and non-parametric statistics have equivalent analytical tools that can be used for the same purpose.
Hypothesis | Parametric | Non-Parametric
One sample or paired samples | Z-test or t-test | Sign test or Wilcoxon signed-rank test
Two independent samples | Z-test or t-test | Mann-Whitney (Wilcoxon) test
Many independent samples | F-test (ANOVA) | Kruskal-Wallis test or Friedman test
Location or dispersion parameters of two independent samples | F-test | Siegel-Tukey test
Association or correlation analysis | Pearson correlation, χ² test, or F-test | Spearman correlation or Kendall's tau correlation
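To illustrate one such pair, the sketch below computes the Pearson correlation and its non-parametric counterpart, the Spearman correlation, which is the Pearson correlation applied to ranks. The data and function names are hypothetical; the implementation uses only the standard library and assigns average ranks to ties.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # monotone but non-linear (y = x**3), hypothetical
print("Pearson :", pearson(x, y))
print("Spearman:", spearman(x, y))
```

Because the relationship is perfectly monotone but not linear, the Spearman coefficient reaches 1 while the Pearson coefficient stays below 1.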
Confidence Interval
Determining the confidence interval of a population mean using the Z statistic
Determining the confidence interval of a population mean using the t statistic
Determining the confidence interval of the difference of two population means using the Z statistic
Determining the confidence interval of the difference of two population means using the t statistic
Determining the confidence interval of a population variance using the χ² statistic
Determining the confidence interval of the ratio of two population variances using the F statistic
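As a minimal sketch of the first case, the code below computes a Z-based confidence interval for a population mean. It assumes the population standard deviation `sigma` is known; the sample values and the function name are hypothetical.

```python
import math
from statistics import NormalDist, mean

def z_confidence_interval(sample, sigma, confidence=0.95):
    """Confidence interval for the mean, assuming a known population sigma."""
    n = len(sample)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    m = mean(sample)
    return m - margin, m + margin

sample = [48, 52, 50, 49, 51, 50, 53, 47, 50]  # hypothetical measurements
low, high = z_confidence_interval(sample, sigma=3)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

When sigma is unknown and estimated from the sample, the t statistic with n - 1 degrees of freedom takes the place of Z.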
Hypothesis Test
Testing the magnitude of a population mean using the Z-test
Testing the magnitude of a population mean using the t-test
Testing the magnitude of the difference of two population means using the Z-test
Testing the magnitude of the difference of two population means using the t-test
Testing the magnitude of a population variance using the χ² test
Testing the magnitude of the ratio of two population variances using the F-test
Testing the differences of several population means using the F-test (Analysis of Variance)
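The first test in this list can be sketched as follows: a two-sided one-sample Z-test of a population mean, assuming the population standard deviation is known. The sample, the hypothesized mean `mu0`, and the function name are hypothetical.

```python
import math
from statistics import NormalDist, mean

def z_test(sample, mu0, sigma):
    """Two-sided one-sample Z-test of H0: population mean equals mu0."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    # Probability of a statistic at least this extreme under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

sample = [48, 52, 50, 49, 51, 50, 53, 47, 50]  # hypothetical measurements
z, p = z_test(sample, mu0=48, sigma=3)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A p-value below the chosen significance level (for example 0.05) leads to rejecting the null hypothesis that the population mean equals `mu0`.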
ESTIMATING RELATIONSHIP AMONG VARIABLES
Simple correlation
Simple linear regression
Multiple linear regression
Non-linear regression
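For the simplest of these, simple linear regression, the least-squares slope and intercept have closed forms. The sketch below uses hypothetical data and a hypothetical helper name, and only the standard library.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]  # hypothetical observations
intercept, slope = fit_line(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} x")
```

Multiple linear regression generalizes the same least-squares idea to several predictors and is usually solved with matrix routines rather than by hand.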
Classical Assumptions for Regression Analysis
Normality
Homoscedasticity
No multicollinearity
No autocorrelation
MORE ON ESTIMATING RELATIONSHIP AMONG VARIABLES
Structural Equation Modeling
Path Analysis
Partial Least Square
Logistic Regression
Structural Equation Modeling
Structural equation modeling (SEM) is a very general, chiefly linear, chiefly cross-sectional statistical modeling technique. Factor analysis, path analysis, and regression are all special cases of SEM. SEM is a largely confirmatory rather than exploratory technique: a researcher is more likely to use SEM to determine whether a certain model is valid than to "find" a suitable model, although SEM analyses often involve a certain exploratory element.
A structural equation model implies a structure for the covariance matrix of the measures; hence an alternative name for this field, "analysis of covariance structures".
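Path Analysis
Path analysis is a technique for analyzing the causal relationships that occur in multiple regression when the independent variables affect the dependent variable not only directly but also indirectly (Robert D. Retherford, 1993). Path analysis is an extension of multiple regression analysis.
Path diagram: http://4.bp.blogspot.com/-BXqdGM903wE/UVjWC8ScWAI/AAAAAAAAAV0/vebPeAPKNrY/s400/1.png
$D = \rho_{DA} A + \rho_{DB} B + \rho_{DC} C + \varepsilon_1$
$E = \rho_{EA} A + \rho_{EC} C + \rho_{ED} D + \varepsilon_2$
Here $\rho_{ij}$ denotes the path coefficient from variable $j$ to variable $i$, and $\varepsilon_1$, $\varepsilon_2$ are error terms.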
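The distinction between direct and indirect effects can be made concrete with a short derivation. It assumes the structure suggested by the two equations on this slide (D depends on A, B, and C; E depends on A, C, and D) and writes $\rho_{ij}$ for the path coefficient from $j$ to $i$; this notation is an assumption, since the symbols on the original slide were lost.

```latex
\begin{aligned}
D &= \rho_{DA} A + \rho_{DB} B + \rho_{DC} C + \varepsilon_1 \\
E &= \rho_{EA} A + \rho_{EC} C + \rho_{ED} D + \varepsilon_2 \\
  &= (\rho_{EA} + \rho_{ED}\rho_{DA})\, A + \rho_{ED}\rho_{DB}\, B
     + (\rho_{EC} + \rho_{ED}\rho_{DC})\, C + \rho_{ED}\,\varepsilon_1 + \varepsilon_2 \\[4pt]
\text{total effect of } A \text{ on } E
  &= \underbrace{\rho_{EA}}_{\text{direct}}
   + \underbrace{\rho_{ED}\,\rho_{DA}}_{\text{indirect via } D}
\end{aligned}
```

Under this structure, B affects E only indirectly (through D), while A and C each have both a direct and an indirect path.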
Partial Least Square (PLS)
PLS is an alternative method for solving complex multilevel models that does not require a large sample size. PLS regression is particularly useful when we need to predict a set of dependent variables from a (very) large set of independent variables (predictors). A further advantage is that PLS is oriented toward optimal prediction accuracy. PLS is a powerful method of analysis because it does not assume a particular scale of measurement for the data, and it can also be used to confirm theory.
PLS regression is a recent technique that generalizes and combines features from principal component analysis and multiple regression. Its goal is to predict or analyze a set of dependent variables from a set of independent variables or predictors. This prediction is achieved by extracting from the predictors a set of orthogonal factors, called latent variables, which have the best predictive power. Programs designed to carry out PLS include SmartPLS, PLSGraph, VPLS, and PLS-GUI.
Logistic Regression
In logistic regression, the dependent variable (Y) is categorical (non-metric): binary (binary logistic regression), multinomial with unordered categories (multinomial logistic regression), or ordered (ordinal logistic regression). Logistic regression relies on the concept of the odds ratio, which is related to the concept of probability. Binary logistic regression is the form of regression analysis used when the dependent (response) variable is dichotomous. A dichotomous variable takes only two values, usually coded 0 and 1, which represent the absence or occurrence of an event.
Unlike ordinary linear regression, logistic regression does not assume that the relationship between the independent variables and the dependent variable itself is linear. Logistic regression is a non-linear regression model in which the predicted probability follows an S-shaped (sigmoid) curve.
The model used in logistic regression is:
$\log\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$
where $p$ is the probability that $Y = 1$, $X_1, X_2, \dots, X_k$ are the independent variables, and the $\beta$s are the regression coefficients.
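Solving the model for p gives the logistic (sigmoid) function, and exponentiating a coefficient gives the odds ratio for a one-unit increase in the corresponding variable. The sketch below illustrates both; the coefficient values and function names are hypothetical, not estimates from any data.

```python
import math

def predict_probability(intercept, coefs, xs):
    """P(Y = 1) from the logit model: invert log(p / (1 - p)) = linear predictor."""
    log_odds = intercept + sum(b * x for b, x in zip(coefs, xs))
    return 1 / (1 + math.exp(-log_odds))

def odds_ratio(coef):
    """Multiplicative change in the odds for a one-unit increase in X."""
    return math.exp(coef)

# Hypothetical coefficients: beta_0 = -1.5, beta_1 = 0.8, evaluated at X_1 = 2.
p = predict_probability(-1.5, [0.8], [2.0])
print(f"P(Y = 1) = {p:.4f}")
print(f"log-odds = {math.log(p / (1 - p)):.4f}")
print(f"odds ratio for X_1 = {odds_ratio(0.8):.4f}")
```

Taking the log-odds of the returned probability recovers the linear predictor, which is the sense in which the model is linear in the logit even though it is non-linear in p.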