Exploring Data Analytics: Introduction, Terminology, Challenges, Platforms, Tools, Applications

Delve into the world of data analytics through this comprehensive guide covering topics such as the definition of data, big data, analytics vs analysis, the importance of data analytics, real-world applications, and more. Explore the classification of data, the 3Vs of big data, and how data analytics has transformed industries like healthcare and retail. Discover the power of predictive analytics in shaping decisions and driving innovation.


Uploaded on Apr 16, 2024



Presentation Transcript


  1. Table of Contents: Introduction; Why Data Analytics; Data Analytics Terminology; Predictive Analytics; Data Analytics Challenges; Data Analytics Platforms; Data Analytics Tools; Hadoop; Data Analytics Applications; Recommendations

  2. Introduction: What is data? What is big data? Analysis vs. Analytics

  3. WHAT IS DATA? A collection of facts and statistics.

  4. WHAT IS DATA? (contd.) CLASSIFICATION OF DATA. Structured: high degree of organization, such as a relational database. Unstructured: information that is difficult to organize using traditional mechanisms, e.g. Facebook, WhatsApp, Gmail.

  5. WHAT IS BIG DATA? Complex and dynamic data characterized by the 3Vs (volume, velocity, variety). "90% of the world's data was produced in the last two years" (IBM).

  6. ANALYTICS vs. ANALYSIS. Analytics: extensive use of mathematics and statistics, and of descriptive techniques and predictive models, to gain valuable knowledge. Analysis asks: why did something happen? Analytics asks: what is likely to happen?

  7. WHY DATA ANALYTICS? From a reactive strategy to a proactive strategy. Helped in determining the President of America.

  8. DATA ANALYTICS IN THE REAL WORLD: WALMART. Uses predictive analytics to better identify customer preferences on a regional basis and stock its branch locations accordingly.

  9. REAL WORLD APPLICATIONS (contd.) A medical diagnostics company developed the first non-intrusive test for predicting coronary artery disease. Researchers analyzed over 100 million gene samples and identified the 23 primary predictive genes for coronary artery disease. The resulting test, known as the Corus CAD Test, was recognized as one of the Top Ten Medical Breakthroughs of 2010 by TIME Magazine.

  10. Data Analytics terminology: Data mining; Data Warehousing; OLAP; Big Data Analytics; Business Analytics; Descriptive Analytics; Predictive Analytics

  11. PREDICTIVE ANALYTICS: Extracting information from existing data sets in order to determine patterns and predict future outcomes and trends. Predictive analytics is an enabler of big data. Faster, cheaper computers and easier-to-use software.
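
One way to picture "determine patterns and predict future outcomes" is a straight-line trend fitted to past observations and extended one step ahead. The following is only a minimal sketch under stated assumptions: the monthly sales figures and the helper name `fit_line` are hypothetical and do not come from the slides, and real predictive models are usually far richer than a single line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-variation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: sales recorded for months 1..5 (illustration only).
months = [1, 2, 3, 4, 5]
sales = [110.0, 120.0, 130.0, 140.0, 150.0]

slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept  # extend the fitted trend to month 6
print(f"trend: {slope:.1f} per month, forecast for month 6: {forecast:.1f}")
```

The existing data supplies the pattern (the slope), and the prediction is that pattern applied to a point not yet observed.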

  12. PREDICTIVE ANALYTICS (contd.)

  13. What Is Machine Learning? A type of artificial intelligence that provides computers with the ability to learn without being explicitly programmed. Some applications of ML: spam filtering; topic spotting; weather prediction; medical diagnosis; fraud detection

  14. Types Of Machine Learning: Supervised learning

  15. Types Of Machine Learning: Unsupervised learning

  16. Some Algorithms Used For ML: Linear Regression; Decision Tree; Naïve Bayes; K-means Algorithm
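
Of the algorithms listed on this slide, K-means is the unsupervised one: it groups unlabeled values around centroids that it refines iteratively. The sketch below runs the standard assign-then-update loop on one-dimensional data. The function name `kmeans_1d`, the sample values, and the starting centroids are all hypothetical and chosen purely for illustration.

```python
def kmeans_1d(values, centroids, iterations=10):
    """K-means on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical unlabeled measurements (illustration only).
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(values, centroids=[0.0, 5.0])
print(centroids, clusters)
```

No labels are supplied at any point; the two groups emerge from the distances alone, which is what distinguishes this from the supervised algorithms on the same slide.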

  17. SOME DATA ANALYTICS TOOLS

  18. R: R is a programming language; open-source environment; high availability; an interpreted language; good data-handling capability; most advanced graphical capability; R supports procedural and object-oriented programming; get better results faster

  19. SAS: SAS is commercial software developed by SAS Institute; it is expensive; easy to learn; good data-handling capability; SAS releases updates in a controlled environment; SAS provides dedicated customer support

  20. DATA ANALYTICS IN CANADIAN RAILWAY

  21. IBM PURE DATA ANALYTICS TOOLS: Fast and easy setup; petascale user data capacity; better access to information; customized analytics; integrated third-party software; 3x faster scan rate; 128 GB/sec scan rate per rack; 50% greater data capacity per rack

  22. DATA ANALYTICS PLATFORM

  23. DATA ANALYTICS PLATFORMS (contd.) Cloudera: Cloudera Inc. was founded in 2008 by big data experts from Facebook, Google, Oracle and Yahoo. It was the first company to develop and distribute Apache Hadoop-based software. The Cloudera management suite automates the installation process. It uses the HDFS component for file system access and has a centralized metadata architecture.

  24. DATA ANALYTICS PLATFORMS (contd.) Hortonworks: Hortonworks, founded in 2011, has quickly emerged as one of the leading vendors of Hadoop. It is a completely open-source platform based on Apache Hadoop for analysing, storing and managing big data. It goes beyond MapReduce in that it enables the inclusion of more data processing frameworks. It uses the HDFS component for file system access and has a centralized metadata architecture.

  25. HADOOP: Apache Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware.

  26. HDFS: A specially designed file system for storing huge data sets on a cluster of commodity hardware, with a streaming access pattern.

  27. MAP REDUCE: Apache Hadoop MapReduce is a framework for processing large data sets in parallel across a Hadoop cluster. Data analysis uses a two-step map and reduce process. MapReduce is a programming model that Google has used successfully in processing its big data sets (~20 petabytes per day). Users specify the computation in terms of a map and a reduce function.
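
The two-step map and reduce process can be imitated in a few lines of plain Python. This is a single-machine sketch of the programming model only, not Hadoop's API: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count`, and the sample documents, are hypothetical. In a real cluster the framework distributes the map and reduce calls across nodes and performs the grouping step itself.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key (done by the framework in practice)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values emitted for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map over every document, group by key, then reduce each group."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input records (illustration only).
docs = ["big data on rails", "Big Data analytics"]
counts = word_count(docs)
print(counts)
```

The user writes only `map_phase` and `reduce_phase`. Because each map call sees one record and each reduce call sees one key, both steps can run in parallel on different machines.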

  28. EXISTING CHALLENGES IN INDIAN RAIL SYSTEM: Delays; signaling problems; broken-down trains; congestion; QoS. One solution to these problems can be analysis of big data through predictive maintenance. Big data in the rail industry can be used in predictive analysis to predict faults before they happen, thus improving services.

  29. PREDICTIVE MAINTENANCE: BIG DATA ON RAILS

  30. PREDICTIVE MAINTENANCE (contd.) Choose the right system or subsystem for prediction (the prediction possibility zone; the prediction effectiveness zone). Identify the required data sets as early as possible. Identify the value-add of PM for maintenance strategies. Complement your data science team with rail expertise. Look for the right skills when hiring data scientists.

  31. CHOOSING THE RIGHT SYSTEM OR SUBSYSTEM FOR PREDICTION: The prediction possibility zone; the prediction effectiveness zone

  32. APPLICATION OF DATA ANALYTICS IN INDIAN RAILWAYS

  33. Automatic vehicle location

  34. PASSENGER INFORMATION SYSTEM

  35. AUTOMATED FARE COLLECTION: Using ticket vending machines. Using a smart card that provides access to all types of transit services across multiple operating agencies. AFC analytics provides details of how passengers are using the system, identifies trends and helps improve services.

  36. AUTOMATED PASSENGER COUNTING: Number of passengers boarding and de-boarding each vehicle at a particular station. The rate of increase of passengers over the years can be predicted using the recorded data. Peak hours in a day and peak months in a year can be identified. These data can be used to provide better services and project evolving ridership trends.
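
Identifying peak hours from passenger counts amounts to summing boardings per hour and ranking the totals. The sketch below assumes each record is a hypothetical `(hour_of_day, boardings)` pair. The record layout, the sample numbers, and the function name `peak_hours` are illustrative and not taken from any real counting system.

```python
from collections import Counter


def peak_hours(records, top_n=2):
    """Return the top_n hours of the day with the most boardings."""
    totals = Counter()
    for hour, boardings in records:
        totals[hour] += boardings  # accumulate counts across vehicles and days
    return [hour for hour, _ in totals.most_common(top_n)]


# Hypothetical (hour_of_day, boardings) records from one station.
records = [(8, 120), (9, 95), (13, 40), (18, 150), (8, 130), (18, 110)]
print(peak_hours(records))
```

The same aggregation keyed by month instead of hour would give peak months, and keyed by year it would give the series from which a rate of increase can be estimated.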
