
Introduction to CKAN Data Portals and Workflows for Data Retrieval
Discover the world of CKAN data portals and learn how to access data directly in R with helpful workflows. Explore the possibilities and resources available to get you started on your data journey.
Presentation Transcript
ckanr
  ckan + R: workflows for data retrieval and archiving
  by Wiebke Toussaint | @SaintlyVi
  Data Scientist, UCT Energy Research Centre
Overview
  Objective of talk: Introduce you to some CKAN data portals & enable you to use data from those portals directly in R
  Talk structure:
  1. Intro to CKAN & open data portals
  2. Demonstration of what's possible
  3. Getting started with ckanr
  4. Typical workflow for a project accessing data from an open data portal
  5. Resources to get you started
Yes, we can CKAN for all!
  CKAN, the world's leading Open Source data portal platform
  What makes CKAN nice?
    Available on rOpenSci :)
    Free
    Well documented
    Large user community
    Customisable to needs of individual entities
    Different permission levels and access control
  An online place for datasets
  Access, manage and retrieve data with an API
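The "retrieve data with an API" point refers to CKAN's Action API, where each operation is an HTTP endpoint under /api/3/action/. As a rough sketch of what a search request looks like, the URL could be assembled in base R as follows. The helper name ckan_action_url is made up for this illustration and is not part of ckanr.

```r
# Build the URL for a CKAN Action API call (API version 3).
# `ckan_action_url` is a hypothetical helper, not a ckanr function.
ckan_action_url <- function(base_url, action, params = list()) {
  base_url <- sub("/+$", "", base_url)  # drop any trailing slashes
  url <- paste0(base_url, "/api/3/action/", action)
  if (length(params) > 0) {
    # Percent-encode each value so spaces and symbols survive the request
    values <- vapply(
      params,
      function(v) utils::URLencode(as.character(v), reserved = TRUE),
      character(1)
    )
    pairs <- paste0(names(params), "=", values)
    url <- paste0(url, "?", paste(pairs, collapse = "&"))
  }
  url
}

ckan_action_url("http://energydata.uct.ac.za/", "package_search",
                list(q = "ghg inventory", rows = 5))
# "http://energydata.uct.ac.za/api/3/action/package_search?q=ghg%20inventory&rows=5"
```

A request to a URL of this shape returns JSON with a success flag and a result field. ckanr wraps calls of this kind, so in practice there is no need to build them by hand.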
African Data Portals built on CKAN
  SA Online Budget Data: https://data.vulekamali.gov.za/
  SA Energy Research Data Portal: http://energydata.uct.ac.za/
  openAfrica (volunteer-driven dataset): https://africaopendata.org/
or simply use ckanr
ckanr in GHGviz
  library(ckanr)
  path <- getwd()
  projhome <- dirname(dirname(path))
  # CKAN variables
  ckanr_setup(url = 'http://energydata.uct.ac.za') # set the url to the CKAN data portal
  ghgdata <- package_search('ghg inventory')$results # get a list of all projects with model outputs
  # Get GHG data tables
  resA <- resource_show(id = "c3a6dea5-e4c1-43cc-82ce-4c2d531ae19b", as = "table")
  tabA <- fetch(resA$url)
Getting ready to Rumble: ckanr_setup()
  install.packages("ckanr")
  library(ckanr)
  ckanr_setup("http://energydata.uct.ac.za/", key = "") # API key left empty on the slide
  > ckan_info()
  $ckan_version
  [1] "2.6.0"
  $site_url
  [1] "http://energydata.uct.ac.za"
  $site_title
  [1] "Energy Research Data Portal for South Africa"
Data discoveRy
  Groups
  Users
  Organisations
  Packages
  Resources
Discovering packages: ckanr::package_search()
  # General query
  > package_search(q = "carbon")
  # Filter query
  > for (p in package_search(fq = "mini grids results")$results) {
      print(p$name)
    }
  # Specific query
  > package_search(q = "title:climate change")
Discovering & fetching resources: resource_show(res$id)
  # Get package ids
  > p <- package_search(q = "carbon")
  > results <- p$results
  > for (r in results) {
      for (i in r$resources) {
        rdata <- resource_show(i$id)
        d <- fetch(rdata$url)
      }
    }
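The nested loop above overwrites d on every pass, and a single resource that cannot be fetched aborts the whole loop. A minimal sketch of one way around both issues follows, assuming each search result carries a name and a resources list with id and url fields, as the loop implies. The helper names resource_index and fetch_all are hypothetical. fetch_fn defaults to the fetch() used in the talk. Newer ckanr releases appear to call it ckan_fetch(), so pass whichever function your installed version provides.

```r
# Flatten search results into one row per resource.
# `packages` should have the shape of package_search(...)$results.
resource_index <- function(packages) {
  rows <- list()
  for (p in packages) {
    for (r in p$resources) {
      rows[[length(rows) + 1]] <- data.frame(
        package = p$name, resource_id = r$id, url = r$url,
        stringsAsFactors = FALSE
      )
    }
  }
  if (length(rows) == 0) {
    return(data.frame(package = character(), resource_id = character(),
                      url = character(), stringsAsFactors = FALSE))
  }
  do.call(rbind, rows)
}

# Fetch every URL in the index and keep the results in a list named by
# resource id. A failed fetch is recorded as NULL instead of stopping the run.
fetch_all <- function(index, fetch_fn = ckanr::fetch) {
  out <- lapply(index$url, function(u) {
    tryCatch(fetch_fn(u), error = function(e) {
      message("Skipping ", u, ": ", conditionMessage(e))
      NULL
    })
  })
  names(out) <- index$resource_id
  out
}

# Usage against a live portal (needs network access and ckanr installed):
#   ckanr::ckanr_setup(url = "http://energydata.uct.ac.za")
#   idx  <- resource_index(ckanr::package_search(q = "carbon")$results)
#   data <- fetch_all(idx)
```

Building the index first also makes it easy to inspect or filter the list of resources before anything is downloaded.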
Overview of typical data workflow
  1. Create new package on CKAN instance (likely requires permission) [ckan]
  2. Upload resources (i.e. datasets) to package [ckan]
  3. Create a new R project that includes a jupyter notebook, shiny app, etc. [R]
  4. Add project to version control manager (e.g. git) [git]
  5. Use ckanr to retrieve data from CKAN instance in project [ckanr]
  6. Create funky visualisations
  7. Share with colleagues & friends
  8. Update and be happy [R]
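For the "use ckanr to retrieve data" step, one pattern that sits well with a version-controlled project is to download each resource once into a local folder and reuse the file afterwards. The sketch below is one possible take and not the workflow from the talk. The name cached_resource, its arguments and the data folder are all assumptions, and the file extension is guessed from the resource URL.

```r
# Return the path to a local copy of a CKAN resource, downloading it only
# when no file for that resource id exists in `cache_dir` yet.
# `cached_resource` is a hypothetical helper, not part of ckanr.
cached_resource <- function(id, cache_dir = "data",
                            url_for = function(id) {
                              ckanr::resource_show(id = id, as = "table")$url
                            },
                            download = utils::download.file) {
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

  # Reuse an earlier download if a file named <id>.<ext> is already there
  hits <- list.files(cache_dir, pattern = paste0("^", id, "\\."))
  if (length(hits) > 0) {
    return(file.path(cache_dir, hits[1]))
  }

  # Otherwise look up the resource URL on the portal and download the file
  url <- url_for(id)
  ext <- tools::file_ext(url)
  if (ext == "") ext <- "dat"  # fallback when the URL has no extension
  dest <- file.path(cache_dir, paste0(id, ".", ext))
  download(url, dest, mode = "wb")
  dest
}

# Usage against a live portal (needs network access and ckanr installed):
#   ckanr::ckanr_setup(url = "http://energydata.uct.ac.za")
#   path <- cached_resource("c3a6dea5-e4c1-43cc-82ce-4c2d531ae19b")
```

Listing the cache folder in .gitignore keeps bulky data files out of the repository while the code that retrieves them stays under version control.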
How to spot a CKAN data portal
  Friendly tab header, often with CKAN logo
  Reference to CKAN in site footer
  Dead giveaway: grey blocks
Resource Links
  ckanr on github: https://github.com/ropensci/ckanr
  ckan API docs: http://docs.ckan.org/en/latest/api/
  git repo with starter code from this talk: https://github.com/SaintlyVi/ckanr_starter_code
  git repo with code for batch operations: https://github.com/ERC-data/batch_ckan
  figshare: https://github.com/ropensci/rfigshare
Keep Rocking!
  Chat to me @SaintlyVi & join @RLadiesCapeTown for your monthly dose of Rrraarrr