Teaching Students About Realities of Data at Cornell College


Explore the innovative approach taken by Cornell College in teaching students about data through hands-on courses covering data cleaning, visualization, and big data concepts. The curriculum includes a blend of statistics and computer science, providing a comprehensive understanding of handling and analyzing data effectively.

  • Data Education
  • Cornell College
  • Data Visualization
  • Big Data
  • Statistics


Presentation Transcript


  1. Messy Data: Teaching Students Early on About the Realities of Data

  2. Cornell College: small liberal arts college (1,100 students); Mathematics and Statistics Department with 4.5 tenure-track lines; teaches on the block plan

  3. Statistics History at Cornell: Intro stat; Probability/Math Stat; Stat 2; New Frontiers; Epidemiology; Dealing with Data: Data Manipulation, Data Visualization, and Big Data

  4. Data Course: team-taught with a computer scientist; prerequisite of either intro stat or CS 1; focused on hands-on work; morning was two hours of lecture; afternoon was two hours of computer lab

  5. Data Course - Plan: 1/3 of the course on each topic (data cleaning, data visualization, big data); relevant computer science fundamentals addressed in a just-in-time fashion; R as the software tool

  6. Data Course - Reality: 1/3 data cleaning; 1/2 data visualization; 1/6 big data

  7. Daily Structure: Morning (2 hours): Monday to Thursday, lecture (1 hour stat, 1 hour CS); Friday, student presentations. Afternoon (2 hours): computer lab.

  8. Data Cleaning, simple issues: clearly wrong entries; potentially wrong entries; functions of a variable
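The simple cleaning issues on this slide can be illustrated with a minimal sketch. The course used R; this example is in Python purely for illustration. The record layout, the age thresholds, and the helper names (`parse_time_to_minutes`, `clean_records`) are all assumptions made for the example and do not come from the slides.

```python
def parse_time_to_minutes(text):
    """Convert 'h:mm:ss' or 'mm:ss' to minutes; return None if malformed."""
    parts = text.strip().split(":")
    if len(parts) not in (2, 3) or not all(p.isdigit() for p in parts):
        return None
    nums = [int(p) for p in parts]
    if len(nums) == 2:
        nums = [0] + nums  # no hours field present
    hours, minutes, seconds = nums
    return hours * 60 + minutes + seconds / 60


def clean_records(records, min_age=5, max_age=100, suspicious_age=90):
    """Drop clearly wrong entries and flag potentially wrong ones.

    The thresholds are hypothetical defaults chosen for this sketch.
    """
    cleaned = []
    for rec in records:
        age = rec.get("age")
        minutes = parse_time_to_minutes(rec.get("time") or "")
        # Clearly wrong: missing or impossible age, unparseable time.
        if age is None or not (min_age <= age <= max_age) or minutes is None:
            continue
        # Potentially wrong: plausible but unusual, so keep it and flag it.
        cleaned.append({"age": age, "minutes": minutes,
                        "flag": age >= suspicious_age})
    return cleaned


# Made-up records for illustration only.
runners = [
    {"age": 34, "time": "1:05:30"},
    {"age": -2, "time": "58:10"},
    {"age": 93, "time": "2:10:00"},
    {"age": 41, "time": "??"},
]
cleaned = clean_records(runners)
```

Here the negative age and the unparseable time are dropped, the 93-year-old is kept but flagged for review, and the derived `minutes` field is one example of a function of a variable.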

  9. Data Cleaning, more complex issues: combining data sets; linking variable issues; making sure data sets are combined properly; different variable formatting in different data sets
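A small sketch of the combining issues listed on this slide, again in Python for illustration rather than the R used in the course. It normalizes the linking variable before joining and reports rows that failed to match on either side, which is one way to check that two data sets were combined properly. The function names and the sample rows are hypothetical.

```python
def normalize_key(value):
    """Normalize a linking value: trim, collapse whitespace, lowercase."""
    return " ".join(str(value).split()).lower()


def merge_on_key(left, right, key):
    """Inner-join two lists of dicts on `key`, returning unmatched rows too."""
    right_index = {}
    for row in right:
        k = normalize_key(row[key])
        if k in right_index:
            raise ValueError(f"duplicate key in right table: {k!r}")
        right_index[k] = row

    merged, unmatched_left, seen = [], [], set()
    for row in left:
        k = normalize_key(row[key])
        match = right_index.get(k)
        if match is None:
            unmatched_left.append(row)
        else:
            seen.add(k)
            merged.append({**match, **row, key: k})

    unmatched_right = [r for k, r in right_index.items() if k not in seen]
    return merged, unmatched_left, unmatched_right


# Made-up rows: the same states written with different formatting.
tweets_by_state = [
    {"state": "Iowa ", "tweets": 12},
    {"state": "OHIO", "tweets": 30},
    {"state": "Guam", "tweets": 1},
]
areas = [
    {"state": "iowa", "region": "Midwest"},
    {"state": "Ohio", "region": "Midwest"},
    {"state": "Utah", "region": "West"},
]
merged, only_left, only_right = merge_on_key(tweets_by_state, areas, "state")
```

Inspecting `only_left` and `only_right` after every merge makes formatting mismatches in the linking variable visible instead of letting rows disappear silently.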

  10. Data Visualizations: look at published visualizations; discuss ways to improve published visualizations; specific visualizations created: stream graphs, tree graphs, maps

  11. Big Data: described big data (volume, velocity, variety); discussed computer science issues (MapReduce, Hadoop)
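The MapReduce idea named on this slide can be shown with a single-process word count. This is only a sketch of the map, shuffle, and reduce phases in plain Python; it does not use Hadoop or any distributed framework, and the function names are chosen for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


counts = word_count(["big data is big", "Data is messy"])
```

In a real cluster the map and reduce calls would run on different machines and the shuffle would move data across the network; the structure of the computation is the same.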

  12. Projects: 3 projects: Chapter 2 of Data Science in R: A Case Studies Approach to Computational Reasoning and Problem Solving by Deborah Nolan and Duncan Temple Lang; Twitter project; group project

  13. Project 1: introduce students to R; 10 years of data from the Cherry Blossom Road Race in DC; lots of data cleaning; introduced some visualization issues with larger data sets; introduced the idea of smoothing

  14. Project 1: done in pairs, deliberately formed with one stat and one CS student; in-class work following the steps given for the men's data; written report due for the women's data, including both code and statistical report
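The idea of smoothing introduced in Project 1 can be illustrated with a centered moving average. The slides do not say which smoother the course used, so this plain moving average is an assumption chosen only for brevity, written in Python rather than R, with made-up values.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at both ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up finishing times in minutes, with one unusually slow value.
times = [60, 66, 63, 90, 66]
smooth = moving_average(times)
```

The slow value of 90 is pulled toward its neighbors, which is the effect a smoother is meant to have when a scatterplot of many runners is too noisy to read directly.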

  15. Project 2: download public tweets; filter for a query term; assign a sentiment score; aggregate tweets by state; produce a geographic visualization of the data

  16. Project 2: again done in CS/stat pairs; final report required an extension of the basic lab and both code and statistical report
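The middle steps of the Project 2 pipeline (filter for a query term, assign a sentiment score, aggregate by state) can be sketched as follows. This is a Python illustration, not the course's R code. The download and mapping steps are omitted, the tweets are made up, and the tiny word lists are a hypothetical stand-in for whatever sentiment lexicon the course actually used.

```python
from collections import defaultdict

# Hypothetical mini-lexicon for illustration only.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def sentiment_score(text):
    """Score = number of positive words minus number of negative words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def mean_sentiment_by_state(tweets, query):
    """Keep tweets containing `query`, then average their scores per state."""
    totals = defaultdict(lambda: [0, 0])  # state -> [score sum, tweet count]
    for tweet in tweets:
        if query.lower() not in tweet["text"].lower():
            continue
        entry = totals[tweet["state"]]
        entry[0] += sentiment_score(tweet["text"])
        entry[1] += 1
    return {state: total / n for state, (total, n) in totals.items()}


# Made-up tweets, already tagged with a state.
tweets = [
    {"state": "IA", "text": "I love coffee, great stuff!"},
    {"state": "IA", "text": "Bad coffee today"},
    {"state": "OH", "text": "Coffee is awful"},
    {"state": "OH", "text": "Nice weather"},
]
by_state = mean_sentiment_by_state(tweets, "coffee")
```

The resulting per-state averages are the kind of table that would then feed the geographic visualization.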

  17. Project 3: term-long 4-person group project. First week: individual brainstorming about topics; Friday morning elevator pitches. Second week: teams find data and refine goals; Friday morning check-in report from all teams, with class feedback provided.

  18. Project 3, third week: lab time devoted to project; finish data cleaning and do much of the analysis; Friday morning check-in report from all teams, with class feedback provided

  19. Project 3, last 3 days of class: finishing touches on the analysis; create project website; final presentation to both class and other visitors

  20. Examples of group projects

  21. Examples of group projects

  22. Examples of group projects

  23. Lessons Learned: slower introduction to R; small individual assignments as we go; more faculty input for statistical analysis of group projects

  24. For more information: Ann Cannon, Department of Math and Stat, Cornell College, 600 First St SW, Mt. Vernon, IA 52314, (319) 895-4461, acannon@cornellcollege.edu
