
Pollen Forecasting and Machine Learning Insights
This presentation explores the application of machine learning to predicting pollen activity. It covers the challenges posed by pollen data, a methodology based on climate variables, a comparison with an ARIMA model, why machine learning matters, random forest vs. RNN, and a Red Maple case study evaluated with K-fold cross-validation, and closes with open questions about predicting pollen release.
Presentation Transcript
Machine Learning and Pollen Forecasting. Sydney Filler and Kathy Gerst. www.usanpn.org
Background: Goals for predicting pollen. Unique challenges posed by pollen data. Why machine learning is a unique solution.
Methodology & Data: 15 Daymet climate variables from the USA-NPN website; used the Status and Intensity dataset. Looked at activity across 3-day periods and used the proportion of "yes" days as a proxy for activity. Used the climate variables on the first day of each period. Changes to methodology: weighting status observations by intensity and filling in data gaps.
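The windowing step described on this slide can be sketched roughly as follows. This is a minimal illustration, not the presenters' code: the function names, the `(date, status)` record shape, and the `climate_by_day` lookup are all hypothetical, and it assumes at most one yes/no status record per day.

```python
from datetime import date, timedelta


def three_day_activity(observations, window_days=3):
    """Group daily yes/no records into consecutive windows.

    observations: list of (date, bool) pairs (hypothetical record shape).
    Returns a list of (window_start, proportion_yes, n_observations).
    """
    if not observations:
        return []
    ordered = sorted(observations)
    first_day = ordered[0][0]
    counts = {}  # window index -> (yes count, total count)
    for day, status in ordered:
        index = (day - first_day).days // window_days
        yes, total = counts.get(index, (0, 0))
        counts[index] = (yes + (1 if status else 0), total + 1)
    windows = []
    for index in sorted(counts):
        yes, total = counts[index]
        start = first_day + timedelta(days=index * window_days)
        windows.append((start, yes / total, total))
    return windows


def build_training_rows(observations, climate_by_day, window_days=3):
    """Pair each window's activity proxy with the climate variables
    recorded on the first day of that window.

    climate_by_day: dict mapping date -> feature dict (hypothetical).
    Windows with no climate record for their first day are skipped.
    """
    rows = []
    for start, proportion, _ in three_day_activity(observations, window_days):
        features = climate_by_day.get(start)
        if features is not None:
            rows.append((features, proportion))
    return rows
```

Windows with missing days are scored on the observations that exist, which is one simple way to live with gaps; the slide's mention of "filling in data gaps" suggests the presenters later handled this differently.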
Traditional ARIMA Model: ARIMA models are univariate forecasting models that are well suited to data with seasonality and/or long-term trends. Discovered that there is no clear long-term trend in this pollen data; troughs and peaks could be explained by masting or by the data. The ARIMA model failed to forecast accurately.
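To make "univariate" concrete, here is a small sketch of an AR(1) model, which is the ARIMA(1,0,0) special case, fitted by ordinary least squares. It is a deliberately simplified stand-in, not the ARIMA configuration used in the presentation (which the slides do not specify), and the function names are hypothetical. The point it illustrates is that the forecast depends only on the series' own past values and never sees a climate variable.

```python
def fit_ar1(series):
    """Fit y[t+1] = c + phi * y[t] by ordinary least squares.

    Returns (c, phi). Needs at least three points and a non-constant series.
    """
    if len(series) < 3:
        raise ValueError("need at least three observations")
    previous = series[:-1]
    following = series[1:]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_next = sum(following) / n
    spread = sum((p - mean_prev) ** 2 for p in previous)
    if spread == 0:
        raise ValueError("series is constant; phi is not identifiable")
    covariance = sum(
        (p - mean_prev) * (f - mean_next) for p, f in zip(previous, following)
    )
    phi = covariance / spread
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recurrence forward from the last observed value."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts
```

When |phi| < 1, repeated forecasts from such a model settle toward the constant c / (1 - phi), so on its own it cannot reproduce irregular peaks and troughs further out.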
What is machine learning? Machine learning programs are given a set of inputs and outputs and attempt to fit a model that maps one to the other. Why is it important? How is it different from standard modeling?
Random Forest vs. RNN
Red Maple: A Case Study [Figure not recovered; only axis tick labels from 0.2 to 1.0 survive.]
Red Maple: A Case Study. K-fold cross-validation was used to evaluate the model. Mean Average Percent Error: 2.6%
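The evaluation procedure named on this slide can be sketched as below. This is an illustration under stated assumptions, not the presenters' pipeline: it assumes the reported metric is the usual mean absolute percentage error, it splits the data into contiguous folds without shuffling, and the `fit` and `predict` callables are hypothetical placeholders for whatever model is being evaluated.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds (no shuffling).

    Returns a list of (train_indices, test_indices) pairs.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Pairs whose actual value is zero are skipped, because the
    percentage error is undefined there.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to score")
    return 100 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def cross_validate(features, targets, k, fit, predict):
    """Average the per-fold MAPE of a model over k folds.

    fit(train_features, train_targets) -> model      (hypothetical callable)
    predict(model, one_feature_row) -> prediction    (hypothetical callable)
    """
    scores = []
    for train, test in k_fold_indices(len(targets), k):
        model = fit([features[i] for i in train], [targets[i] for i in train])
        predictions = [predict(model, features[i]) for i in test]
        scores.append(mape([targets[i] for i in test], predictions))
    return sum(scores) / len(scores)
```

Because an activity proxy based on proportions can be exactly zero, how zero targets are treated can change a percentage-error figure noticeably; skipping them, as done here, is only one possible choice.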
What next? Can we use the limited pollen release data we have as a model variable? How can we improve model accuracy across species with minimal data? How can we integrate long-term trends into models? How can we predict pollen release across a region?