
Working with SAS Data Sets and Data Steps
Learn how SAS data sets are divided into descriptor and data portions, how to access and manipulate data using data steps, and the essential concepts of the Program Data Vector (PDV) in SAS programming.
Presentation Transcript
The Data Step in SAS
SAS files (tables, data sets) are created or changed using a data step. SAS files consist of observations (records, rows), each containing information on a single object. Each row holds the values of the variables pertaining to that object.
SAS data sets are divided into two parts. The descriptor portion provides information about the data set and is viewed with PROC CONTENTS. The data portion contains the actual data and is viewed with PROC PRINT.
Accessing the descriptor portion.
The location of the data set is specified with a libname statement.

    libname s5238 "d:\dropbox\sas\sasdata\5238";
    proc contents data=s5238.chd2018;
    run;
Accessing the data portion.

    proc print data=s5238.chd2018(obs=100);
        var chd gender sbp1 chol;
    run;

The var statement is not required. If it is not given, all variables are printed. The obs= option, in parentheses after the data set name, is a data set option.
The data step is most often used to create a new data set:
By reading one or more existing SAS data sets (with or without subsetting).
Through programming (e.g. simulation).
By putting multiple data sets together (concatenating, appending, merging).
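The last item can be sketched as follows. The data sets a and b, and their variables id, x and y, are hypothetical and created only for this illustration; they are not part of the course data.

```sas
/* Hypothetical input data sets, created only for illustration */
data a;
    input id x;
    datalines;
1 10
2 20
;
run;

data b;
    input id y;
    datalines;
1 100
2 200
;
run;

/* Concatenation: the observations of b are stacked below those of a */
data stacked;
    set a b;
run;

/* Merging: observations are matched on id.
   Both inputs must be sorted by id, which they already are here. */
data merged;
    merge a b;
    by id;
run;
```

With these inputs, stacked has four observations (x is missing on the rows from b, y on the rows from a) and merged has two, each carrying id, x and y.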
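How the data step works.
A data step processes one observation at a time and, within each observation, one statement at a time. The observation currently being processed is held in the Program Data Vector (PDV).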
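One way to watch the PDV, as a minimal sketch: write its contents to the log on every pass through the data step. The sample table sashelp.class is an assumption here (it ships with most SAS installations); any data set could be substituted.

```sas
data _null_;                     /* _null_: run the step without creating a data set */
    set sashelp.class(obs=3);    /* assumed sample table; read only the first 3 observations */
    put _all_;                   /* write every PDV variable, including _N_ and _ERROR_, to the log */
run;
```

The log should show one line per observation, with the automatic variable _N_ counting the iterations of the data step.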
    data analysis;
        set s5238.chd2018(keep=sbp1 chol);
    run;
    proc print data=work.analysis(obs=10);
    run;

Program Data Vector (PDV)

    sbp1 (num 8)    chol (num 8)
    .               .
The where statement

    data analysis(drop=gender);
        set s5238.chd2018(keep=gender sbp1 chol);
        where gender="Male";
    run;
    proc print data=analysis(obs=7);
    run;

Program Data Vector (PDV)

    gender (char 8)    sbp1 (num 8)    chol (num 8)
                       .               .
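For comparison, here is a sketch of two other ways to subset to the same observations. The output names analysis2 and analysis3 are hypothetical. The where= data set option, like the where statement, filters observations before they are read into the PDV, while a subsetting if is evaluated after the observation has been read.

```sas
/* where= data set option: the filter is applied as the data set is read */
data analysis2(drop=gender);
    set s5238.chd2018(keep=gender sbp1 chol where=(gender="Male"));
run;

/* Subsetting if: every observation enters the PDV, then non-matching ones are discarded */
data analysis3(drop=gender);
    set s5238.chd2018(keep=gender sbp1 chol);
    if gender="Male";
run;
```

Both steps should keep the same observations as the where statement version. The subsetting if can also test variables created within the data step, which a where condition cannot.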
Assignment Statements

    data arith;
        input htin wtlb chol;
        htm = .0254*htin;           /* 1 inch = 0.0254 meters */
        wtkg = .45359237*wtlb;      /* 1 pound = 0.45359237 kilograms */
        bmi = wtkg/htm**2;
        cholmmol = chol*0.02586;    /* 1 mg/dl = 0.02586 mmol/l */
        datalines;
    56.50 98 234
    62.25 145 172
    62.50 128 248
    64.75 119 215
    68.75 144 145
    ;
    run;
    proc print data=arith;
    run;
Loops

    data normals;
        call streaminit(1735171);    /* set random number seed */
        do i = 1 to 1000;
            z = rand("Normal");      /* z ~ N(0,1) */
            output;
        end;
    run;
    ods select histogram qqplot;
    proc univariate data=normals;
        var z;
        histogram z / normal;
        qqplot z;
    run;