R Scripting Basics for Data Analysis

Explore the fundamentals of R scripting, from basic syntax to key packages and applications, in this comprehensive guide. Learn about vectors, data frames, statistical testing, and the Tidyverse syntax for streamlined data analysis in R programming.

  • Data Analysis
  • R Scripting
  • Tidyverse
  • Data Frames
  • Statistical Testing

Presentation Transcript


  1. R SCRIPTING BASICS
       Ross Wickham, Senior Hydraulic Engineer
       NWW, Hydrology Branch
       Date: 17 March 2022

  2. OUTLINE
       • R Background
       • Basic Syntax and Scripting
       • Key Packages
       • Example Applications
       • Demo
       • Summary
       • Resources

  3. R BACKGROUND
       • Object oriented, like Python, C++, Java
       • Interpreted language (you don't need to pre-compile anything)
       • Free, open-source
       • Implementation of S, a statistical programming language
       • Created for statistics, data mining, and data analysis

  4. BASIC SYNTAX
       • Assignment operators are bidirectional (<- and ->) and can be chained
       • Resulting values in the slide example: a = 3, b = 4, c = 4, d = 5, e = 6
       • Create sequences or specified values; c() is the default concatenate function
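
The code image from this slide is not part of the transcript. The sketch below is one plausible reconstruction that reproduces the listed values; the chained form `b <- c <- 4` and the sequence examples are assumptions, not the author's original code.

```r
# Assumed reconstruction: the original slide code was not captured in the transcript.
a <- 3          # leftward assignment
b <- c <- 4     # chained assignment: c receives 4, then b receives the same value
5 -> d          # rightward assignment
e = 6           # "=" also assigns at the top level

# Sequences and specified values
s1 <- 1:5                   # integer sequence from 1 to 5
s2 <- seq(0, 10, by = 2)    # sequence from 0 to 10 in steps of 2
s3 <- c(1, 5, 9)            # c() still resolves to the function even though c holds 4

print(c(a = a, b = b, c = c, d = d, e = e))
```

Naming a variable `c` works because R skips non-function objects when it looks up a function call, but it is usually clearer to pick a different name.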

  5. BASIC SYNTAX
       • Vectors are a key data type, like Python lists
       • Standard (or base) R syntax is to wrap objects with a function for evaluation
       • Subset vectors using brackets
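
A minimal base R sketch of the three points on this slide. The vector name `stages` and its values are made up for illustration.

```r
# Hypothetical river stage readings; values are illustrative only.
stages <- c(2.1, 3.4, 5.0, 4.2)

# Base R style: wrap the object in a function to evaluate it
n_obs      <- length(stages)
mean_stage <- mean(stages)

# Subset with brackets (R indexing starts at 1)
first    <- stages[1]            # single element
middle   <- stages[2:3]          # range of positions
high     <- stages[stages > 4]   # logical condition
no_first <- stages[-1]           # negative index drops that position

print(high)
```

Unlike a Python list, an R vector holds a single type, so mixing numbers and strings in `c()` coerces everything to character.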

  6. BASIC SYNTAX
       • data.frames are an essential data storage type for any tabular/paired data (e.g., time series), like a pandas DataFrame in Python
       • data.frames are easily plotted
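
A short sketch of building, subsetting, and plotting a data.frame in base R. The `flows` table, its column names, and its values are hypothetical and stand in for the slide's missing example.

```r
# Hypothetical daily flow series; the values are made up for illustration.
flows <- data.frame(
  date     = seq(as.Date("2022-03-01"), by = "day", length.out = 5),
  flow_cfs = c(120, 135, 150, 142, 130)
)

str(flows)                              # inspect column types
high <- flows[flows$flow_cfs > 130, ]   # keep rows where flow exceeds 130 cfs

# Line plot of the time series
plot(flows$date, flows$flow_cfs, type = "l",
     xlab = "Date", ylab = "Flow (cfs)")
```

When run as a script rather than in RStudio, the plot is written to the default graphics device (typically a file named Rplots.pdf).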

  7. BASIC SYNTAX
       • Code is streamlined for statistical testing
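
As an illustration of how compact a statistical test can be, the sketch below runs a two-sample t-test on `sleep`, a dataset that ships with R. The choice of test and dataset is an assumption; the slide's own example is not in the transcript.

```r
# Welch two-sample t-test comparing the two groups in the built-in sleep data
res <- t.test(extra ~ group, data = sleep)

print(res)              # full test summary
p_val <- res$p.value    # individual results are available as list elements
```

Other tests in the stats package (for example `wilcox.test()` or `cor.test()`) follow the same call-and-inspect pattern.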

  8. INSTALLING AND HELP
       • Installing and loading packages is very simple: install.packages() once, then library() in each session
       • Add a ? in front of any function name to see its help page in RStudio
       • Examples and additional documentation can be found in package vignettes
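
A minimal sketch of the install, load, and help workflow. It uses the `stats` and `grid` packages only because they ship with R, so nothing is downloaded; substitute any CRAN package name (for example "lubridate") in practice.

```r
# Package name chosen for illustration; "stats" ships with R so no install is triggered.
pkg <- "stats"
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)   # one-time download from CRAN
}
library(stats)            # load the package for this session

# Help: at the console, type ?median; help("median") is the equivalent function call
h <- help("median")

# Vignettes: list what a package provides, then open one by name
v <- vignette(package = "grid")
vignette_names <- v$results[, "Item"]
# vignette("grid", package = "grid")   # opens a single vignette in a viewer
```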

  9. TIDYVERSE SYNTAX
       • An alternative to base R syntax that has quickly gained popularity
       • Tidyverse style has a core philosophy for data structure and analysis ("tidy data"):
           • Every column is a variable
           • Every row is an observation
           • Every cell is a single value
       • Rich library of functions to streamline data analysis using pipelines
       • The %>% operator passes the object on the left to the function on the right as its first argument. Pipes can be chained, passing the object between multiple functions
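
A small pipeline sketch, assuming the dplyr package is installed. It uses the built-in `mtcars` dataset; the particular filter and summary are illustrative choices, not taken from the slides.

```r
library(dplyr)   # assumes dplyr is installed; it provides %>%, filter(), group_by(), summarise()

# Each step receives the result of the previous one as its first argument
result <- mtcars %>%
  filter(cyl == 4) %>%               # keep four-cylinder cars
  group_by(gear) %>%                 # one group per gear count
  summarise(mean_mpg = mean(mpg))    # one summary row per group

print(result)
```

The same steps in base R would need nested calls or intermediate variables, which is the readability gain the pipe is meant to provide.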

  10. SCRIPTING IN R
       • My preferred method: use RStudio, an Integrated Development Environment (IDE)
           • See your code
           • Test evaluations
           • View errors
           • See plots
           • Get help
           • See current objects being used (your "environment")
           • See history
           • Customize the user interface
           • Configure R version
       • Available on App Portal

  11. SCRIPTING WITH RSTUDIO
       • Navigate through script tabs
       • Code Outline
       • Current Objects
       • Source Code
       • Help, Plots, Packages, and File management
       • Console for testing code and viewing output

  12. SCRIPTING WITH RSTUDIO: CONFIGURING R
       • Able to control which version of R is being used: Tools > Global Options
       • Use 64-bit unless you have a good reason to use 32-bit

  13. R VS PYTHON
       • Use what you know and are comfortable with; both are great
       • R is more portable
       • Python is generally considered easier to learn
       • R is better for statistics
       • R is considered easier for plotting
       • Python is typically faster and better for machine learning

  14. KEY PACKAGES (FOR H&H)
       • tidyverse: set of packages designed to work together for data analysis under a core philosophy
           • ggplot2: plotting library
           • dplyr: data manipulation
           • purrr: create complex data pipelines
           • readr: fast, user-friendly way to read rectangular data
           • tidyr: consistently organize tabular data
           • and others
       • reshape2: data.frame manipulation
       • lubridate: parse and manipulate dates
       • rgdal, sp, raster: read, write, and manipulate geospatial objects
       • plotly: interactive plots
       • dataRetrieval: USGS data web retrieval
       • Specific to the Corps:
           • cwms_read: read publicly available NWD CWMS data, Jeff Tilton (NWD)
           • dssrip: read, manipulate, write DSS data, Evan Heisman (HEC)
       • Advanced Users:
           • shiny, shinydashboard: develop web app user interfaces; dynamically interact with data
           • leaflet: interactive maps, with interactive content

  15. QUESTIONS?

  16. RESOURCES
       • Simple R cheat sheet
       • Cheat sheets for multiple packages
       • Portable version of R and RStudio with example code: <HEC share drive>
       • The R Manuals (written by the R Development Core Team)
       • The Little Book of R for Time Series
       • Codecademy tutorials
       • Translation between Python pandas and R data.frame
       • Stack Overflow: help forum
       • dssrip package: read and write DSS in R
       • cwms_read package: read NWW CWMS data in Python and R
