
Writing and Reading XML Files with SAS for Efficient Data Analysis
Learn how SAS (Statistical Analysis System), the analytics software from SAS Institute in North Carolina, provides tools for writing and reading XML files. The SAS XML LIBNAME engine converts SAS datasets to XML files and XML files back to SAS datasets. A step-by-step guide shows how to create an XML output file, read it back into SAS, and write an XML schema alongside the data.
Presentation Transcript
Writing and Reading XML Files with SAS (Statistical Analysis System)

What is SAS?
SAS Institute (or SAS, pronounced "sass") is an American developer of analytics software based in Cary, North Carolina. SAS develops and markets a suite of analytics software (also called SAS), which helps access, manage, analyze and report on data to aid in decision-making.

How to write and read XML files with SAS?
Use the XML LIBNAME engine. It works in both directions:
- Writing: the engine translates a SAS dataset into an XML file (input is a SAS dataset, output is an XML file).
- Reading: the engine translates an XML file into a SAS dataset (input is an XML file, output is a SAS dataset).

Writing XML files with the SAS XML LIBNAME engine
Let's start by writing an XML file with SAS and examining the XML output file. First, create a small dataset:

    /* Three-patient test dataset; subj002 has missing height and weight */
    data work.study_abc;
      length patientid $10 patientheight 8 patientweight 8;
      infile cards;
      input patientid $ patientheight patientweight;
      cards;
    subj001 181 78
    subj002 . .
    subj003 173 66
    ;
    run;

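Writing XML files with the SAS XML LIBNAME engine (cont'd)
Next, assign a libref that uses the XML engine and copy the dataset into it:

    /* The libref points at a single XML file rather than a directory */
    libname phuse xml 'c:\temp\study_abc.xml';

    /* Writing to the libref creates the XML file */
    data phuse.study_abc;
      set work.study_abc;
    run;
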
XML output file

    <?xml version="1.0" encoding="windows-1252" ?>
    <TABLE>
      <STUDY_ABC>
        <patientid> subj001 </patientid>
        <patientheight> 181 </patientheight>
        <patientweight> 78 </patientweight>
      </STUDY_ABC>
      <STUDY_ABC>
        <patientid> subj002 </patientid>
        <patientheight Missing="." />
        <patientweight Missing="." />
      </STUDY_ABC>

XML output file (cont'd)

      <STUDY_ABC>
        <patientid> subj003 </patientid>
        <patientheight> 173 </patientheight>
        <patientweight> 66 </patientweight>
      </STUDY_ABC>
    </TABLE>

Comment on the XML output file
- SAS names the root element of the XML file TABLE.
- The elements that define the observations carry the name of the SAS dataset, STUDY_ABC.
- The elements nested within each observation are named after the variables in the dataset.
- The value for each observation and variable is stored within the corresponding variable element.
- Missing values are represented with the Missing attribute.

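The structure described above can be checked by echoing the generated file to the SAS log. The following is a minimal sketch, assuming the file was written to c:\temp\study_abc.xml and that the libref phuse is still assigned; clearing the libref first is only a precaution so that the file is not held open.

```sas
/* Release the XML libref before reading the file as plain text */
libname phuse clear;

/* Echo each line of the XML file to the SAS log */
data _null_;
  infile 'c:\temp\study_abc.xml';
  input;          /* read the next line into the _INFILE_ buffer */
  put _infile_;   /* write that line to the log unchanged */
run;
```

In the log, the root element, the per-observation elements and the empty elements for subj002 should be visible in that order.
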
Reading XML files with the SAS XML LIBNAME engine
When reading XML files with the SAS XML LIBNAME engine, SAS expects the XML file to be in the following format:
- One root element (as is mandatory in well-formed XML).
- An element bundling the variable information per observation. The name of this element is used to name the dataset.
- One or more elements within the observation element. These become the SAS variables.

Reading XML files with the SAS XML LIBNAME engine (cont'd)
The following code reads the XML file into a SAS dataset:

    libname phuse xml 'c:\temp\study_abc.xml';

    data work.study_abc;
      set phuse.study_abc;
    run;

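One way to confirm that the round trip preserved the data is to read the XML into a separate dataset and compare it with the original. The sketch below assumes work.study_abc still holds the original data; the name work.study_abc_in is hypothetical and chosen only to avoid overwriting it.

```sas
libname phuse xml 'c:\temp\study_abc.xml';

/* Read the XML into a new dataset (hypothetical name) */
data work.study_abc_in;
  set phuse.study_abc;
run;

libname phuse clear;

/* Show the variable names, types and lengths SAS derived from the XML */
proc contents data=work.study_abc_in;
run;

/* Compare values and attributes against the original dataset */
proc compare base=work.study_abc compare=work.study_abc_in;
run;
```

Because a plain XML file carries no schema, attributes such as character lengths may be derived from the data, so PROC COMPARE can report attribute differences even when all values match.
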
Writing XML files and schemas
The custom structure of an XML document is described via an XSD (XML Schema Definition), which SAS refers to as the XML schema. The SAS XML LIBNAME statement can write the schema as well as the data. To use this functionality, specify the XMLMETA= option, which accepts three values:
- SCHEMA writes just the schema.
- DATA writes just the XML data.
- SCHEMADATA writes both the schema and the XML data.

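As a rough illustration of two of these values, the sketch below writes the same dataset once with XMLMETA=SCHEMA and once with XMLMETA=DATA. The librefs xsdonly and dataonly and both file names are hypothetical, and work.study_abc is assumed to exist.

```sas
/* Schema only: the file describes the structure but holds no observations */
libname xsdonly xml 'c:\temp\study_abc_schema.xml' xmlmeta=schema;

data xsdonly.study_abc;
  set work.study_abc;
run;

libname xsdonly clear;

/* Data only: the file holds the observations without a schema */
libname dataonly xml 'c:\temp\study_abc_data.xml' xmlmeta=data;

data dataonly.study_abc;
  set work.study_abc;
run;

libname dataonly clear;
```

Writing to separate files keeps the two results side by side for comparison.
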
SAS code that writes both schema and data
The following example shows the SAS code to write both the schema and the data:

    libname phuse xml 'c:\temp\study_abc.xml' xmlmeta=schemadata;

    data phuse.study_abc;
      set work.study_abc;
    run;

The output is the 40 lines of XML that follow.

The resulting XML (both schema and data combined): first 10 lines

     1. <?xml version="1.0" encoding="windows-1252" ?>
     2. <TABLE>
     3.   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
     4.              xmlns:od="urn:schemas-microsoft-com:officedata">
     5.     <xs:element name="TABLE">
     6.       <xs:complexType> <xs:sequence>
     7.         <xs:element ref="STUDY_ABC" minOccurs="0" maxOccurs="unbounded" />
     8.       </xs:sequence> </xs:complexType>
     9.     </xs:element>
    10.     <xs:element name="STUDY_ABC">

Notice the depth of indentation for lines 5 and 9.

The resulting XML (both schema and data combined), cont'd: second 10 lines

    11.       <xs:complexType> <xs:sequence>
    12.         <xs:element name="patientid" minOccurs="0" od:jetType="text"
    13.                     od:sqlSType="nvarchar">
    14.           <xs:simpleType><xs:restriction base="xs:string">
    15.             <xs:maxLength value="10" />
    16.           </xs:restriction></xs:simpleType>
    17.         </xs:element>
    18.         <xs:element name="patientheight" minOccurs="0" od:jetType="double"
    19.                     od:sqlSType="double" type="xs:double" />
    20.         <xs:element name="patientweight" minOccurs="0" od:jetType="double"

Notice the depth of indentation for lines 12 and 17.

The resulting XML (both schema and data combined), cont'd: third 10 lines

    21.                     od:sqlSType="double" type="xs:double" />
    22.       </xs:sequence> </xs:complexType>
    23.     </xs:element>
    24.   </xs:schema>
    25.   <STUDY_ABC>
    26.     <patientid>subj001</patientid>
    27.     <patientheight>181</patientheight>
    28.     <patientweight>78</patientweight>
    29.   </STUDY_ABC>
    30.   <STUDY_ABC>

Notice the nested elements and their values in lines 26, 27 and 28.

The resulting XML (both schema and data combined), cont'd: last 10 lines

    31.     <patientid>subj002</patientid>
    32.     <patientheight/>
    33.     <patientweight/>
    34.   </STUDY_ABC>
    35.   <STUDY_ABC>
    36.     <patientid>subj003</patientid>
    37.     <patientheight>173</patientheight>
    38.     <patientweight>66</patientweight>
    39.   </STUDY_ABC>
    40. </TABLE>

Notice that the second subject (patient) has missing height and weight values, so the corresponding elements in the generated XML are empty (lines 32 and 33).
Notice that SAS names the root element of the XML file TABLE (lines 2 and 40).