Data Mining for Business Intelligence and Decision Making

Explore the evolving technologies of data mining and its applications in gathering, analyzing, and interpreting large datasets. Understand the iterative process of data mining, the development of inductive learning models, and the prediction and interpretation of patterns to make informed decisions. Learn about classification trees, association rules, regression, Bayesian classifiers, and more in the realm of data mining and machine learning.

  • Data Mining
  • Business Intelligence
  • Machine Learning
  • Data Analysis
  • Predictive Modeling

Presentation Transcript


  1. Data Mining
     Chapter 5 of Business Intelligence, Carlo Vercellis

  2. Data Mining
     • Evolving technologies of information gathering and storage; large data sets; complex analysis
     • A set of related activities referred to as data mining, knowledge discovery, pattern recognition or machine learning
     • An iterative process
     • Experts on the application domain cooperate with data analysts, who use mathematical models.

  3. Data Mining
     • Refers to the overall process consisting of data gathering and analysis, development of inductive learning models, and adoption of practical decisions and consequent actions.
     • Is the process of exploring and analyzing a dataset to find regular patterns and to extract relevant knowledge.
     • Characteristics of data mining:
         • Data are collected and inserted into a database.
         • The data mining process is based on inductive learning methods, whose main purpose is to derive general rules from a set of available examples, consisting of past observations recorded in one or more databases.
     • Models and patterns: linear equations, sets of rules in if-then-else form, clusters, charts and trees.

  4. Data Mining Activities
     INTERPRETATION
       • Identify regular patterns in the data.
       • Express the patterns through rules and criteria.
       • Rules generated must be original and non-trivial.
     PREDICTION
       • Anticipate the value that a random variable will assume in the future.
       • Estimate the likelihood of future events.

  5. Models and methods for data mining
     • Classification trees and association rules: from machine learning / knowledge discovery
     • Regression and Bayesian classifiers: from probability / optimization theory
     • The linear regression model is the best-known learning and predictive method.
     • Linear regression relates a dependent response variable Y to an independent predictor X in the form Y = aX + b, where a and b are parameters to be determined from past observations.
     • Steps:
         • Select a class of models for representing patterns in the data.
         • Define a metric for evaluating the effectiveness and accuracy of the models.
         • Design a computational algorithm that optimizes the evaluation metric.
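
The three steps on this slide can be sketched for the linear model Y = aX + b. This is a minimal illustration, not the book's procedure: it assumes the sum of squared errors as the evaluation metric and uses the closed-form least-squares solution as the optimizing algorithm. The function names and the observations are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return least-squares estimates (a, b) for the model Y = aX + b."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X must take at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx            # slope
    b = y_bar - a * x_bar    # intercept
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Evaluation metric (assumed here): total squared prediction error."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# Hypothetical past observations, chosen to lie exactly on Y = 2X + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

a, b = fit_line(xs, ys)
sse = sum_squared_errors(xs, ys, a, b)
prediction = a * 5.0 + b  # anticipated Y for a new X = 5
```

Because the sample points lie on a line, the fitted parameters recover that line and the metric is zero. With noisy observations the same code returns the line that minimizes the squared error.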

  6. Data mining, classical statistics and OLAP
     (OLAP: Online Analytical Processing)

  7. Applications of data mining
     • Relational marketing: marketing campaigns such as cross-selling, up-selling and market basket analysis
     • Fraud detection: illegal activities in insurance and banking
     • Risk evaluation: estimating the risk connected with future decisions, such as granting a loan based on the characteristics of the customer
     • Text mining: web search engines, filters for email messages
     • Image recognition: recognizing written characters, identifying human faces, detecting suspicious behaviours
     • Web mining: analysis of clickstreams (sequences of pages visited) for e-commerce or e-learning sites
     • Medical diagnosis: early detection of diseases, image analysis

  8. Representation of input data
     • A data set is a two-dimensional table.
         • Rows: instances, observations, records
         • Columns: attributes, variables, characteristics
     • Categorical attributes, e.g. whether a customer is using a pre-paid or post-paid plan
         • Counts: a property that is true or false; a bank's customer may or may not hold a credit card.
         • Nominal: categorical attributes without a natural ordering, e.g. province of residence
         • Ordinal: categorical attributes with a natural ordering, e.g. education level
     • Numerical attributes, e.g. the number of outgoing phone calls during a month
         • Discrete: numerical attributes that take a finite (or countably infinite) number of values
         • Continuous: numerical attributes that take an uncountably infinite number of values
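
One way to picture this representation is a list of records (rows) plus a schema that tags each column with its attribute type. The sketch below is only illustrative: the column names, the `Kind` enum, the helper functions and the two sample customers are all hypothetical, and treating the pre-paid/post-paid attribute as nominal is an assumption.

```python
from enum import Enum


class Kind(Enum):
    COUNT = "count"            # categorical: a property that is true or false
    NOMINAL = "nominal"        # categorical, no natural ordering
    ORDINAL = "ordinal"        # categorical, natural ordering
    DISCRETE = "discrete"      # numerical, finite or countable values
    CONTINUOUS = "continuous"  # numerical, uncountably many values


CATEGORICAL = {Kind.COUNT, Kind.NOMINAL, Kind.ORDINAL}

# Hypothetical schema: one entry per column (attribute).
schema = {
    "plan": Kind.NOMINAL,
    "holds_credit_card": Kind.COUNT,
    "province": Kind.NOMINAL,
    "education": Kind.ORDINAL,
    "outgoing_calls": Kind.DISCRETE,
    "call_minutes": Kind.CONTINUOUS,
}

# Hypothetical data set: one dict per row (observation).
dataset = [
    {"plan": "pre-paid", "holds_credit_card": True, "province": "Milano",
     "education": "high school", "outgoing_calls": 42, "call_minutes": 130.5},
    {"plan": "post-paid", "holds_credit_card": False, "province": "Torino",
     "education": "university", "outgoing_calls": 17, "call_minutes": 58.2},
]


def is_categorical(column_name):
    """True if the column holds a categorical attribute, False if numerical."""
    return schema[column_name] in CATEGORICAL


def column(column_name):
    """Return all values of one attribute, in row order."""
    return [row[column_name] for row in dataset]
```

Keeping the type information next to the data matters in practice because many methods accept only some attribute types, so the schema tells an analyst which columns need encoding or discretization first.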

  9. Data Mining Process

  10. Analysis Methodologies
      • Supervised and unsupervised learning processes
          • Supervised: the training data include ground-truth values of the target to learn from.
          • Unsupervised: the training data have no ground-truth values.
      • Seven data mining tasks (S = supervised, U = unsupervised):
          • Characterization and discrimination (S)
          • Classification (S)
          • Regression (S)
          • Time series analysis (S)
          • Association rules (U)
          • Clustering (U)
          • Description and visualization (U)

  11. Example data set

      age     income   student  credit_rating  buys_computer
      <=30    high     no       fair           no
      <=30    high     no       excellent      no
      31-40   high     no       fair           yes
      >40     medium   no       fair           yes
      >40     low      yes      fair           yes
      >40     low      yes      excellent      no
      31-40   low      yes      excellent      yes
      <=30    medium   no       fair           no
      <=30    low      yes      fair           yes
      >40     medium   yes      fair           yes
      <=30    medium   yes      excellent      yes
      31-40   medium   no       excellent      yes
      31-40   high     yes      fair           yes
      >40     medium   no       excellent      no
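
As a small illustration of finding regular patterns in such a table, the sketch below tabulates how the target `buys_computer` is distributed overall and within each value of one attribute. The rows are the slide's 14 records as reconstructed from the transcript; the `31-40` separator and the helper name `class_counts_by` are assumptions.

```python
from collections import Counter, defaultdict

COLUMNS = ("age", "income", "student", "credit_rating", "buys_computer")

# The 14 records from the slide (age range written as "31-40" by assumption).
ROWS = [
    ("<=30", "high", "no", "fair", "no"),
    ("<=30", "high", "no", "excellent", "no"),
    ("31-40", "high", "no", "fair", "yes"),
    (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),
    (">40", "low", "yes", "excellent", "no"),
    ("31-40", "low", "yes", "excellent", "yes"),
    ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),
    (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"),
    ("31-40", "medium", "no", "excellent", "yes"),
    ("31-40", "high", "yes", "fair", "yes"),
    (">40", "medium", "no", "excellent", "no"),
]


def class_counts_by(attribute):
    """Count buys_computer outcomes separately for each value of `attribute`."""
    attr_idx = COLUMNS.index(attribute)
    target_idx = COLUMNS.index("buys_computer")
    counts = defaultdict(Counter)
    for row in ROWS:
        counts[row[attr_idx]][row[target_idx]] += 1
    return counts


overall = Counter(row[-1] for row in ROWS)  # 9 "yes", 5 "no"
by_age = class_counts_by("age")

for age, outcome in by_age.items():
    print(age, dict(outcome))
```

In this table every customer aged 31-40 bought a computer, while the other two age groups are mixed. A split like that is the kind of regularity a classification tree would pick up first.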

  12. CLASSIFICATION (Supervised)
        • Training set (labelled images)
        • Test set (not part of the training set, but used for evaluation)
        • Yes/no questions
      REGRESSION (Supervised)
        • How many customers will visit the website?
        • What will be the income from the clicks on the websites?
        • How many human figures are in this picture?
      CLUSTERING (Unsupervised)
        • Put the objects into groups
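
The separation between training set and test set can be sketched as a holdout split. The example is a minimal sketch under stated assumptions: the labelled examples are hypothetical placeholders rather than images, and the "classifier" is a majority-class baseline chosen only to keep the code short, not a method taken from the slides.

```python
import random
from collections import Counter


def holdout_split(examples, test_fraction=0.3, seed=0):
    """Shuffle labelled examples and split them into (training, test) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_majority(training_set):
    """Baseline 'model': the most frequent label seen in the training set."""
    labels = Counter(label for _, label in training_set)
    return labels.most_common(1)[0][0]


def accuracy(predicted_label, test_set):
    """Share of test examples whose true label equals the predicted label."""
    hits = sum(1 for _, label in test_set if label == predicted_label)
    return hits / len(test_set)


# Hypothetical labelled examples: (example id, yes/no label).
examples = [(i, "yes" if i % 3 else "no") for i in range(12)]

train, test = holdout_split(examples)
model = train_majority(train)   # learned only from the training set
score = accuracy(model, test)   # evaluated only on unseen test examples
print(f"predict '{model}' for everything; test accuracy = {score:.2f}")
```

The point of the design is that `test` never influences `train_majority`, so `score` estimates how the model behaves on examples it has not seen.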

  13. Actors and roles in the data mining process
