
Environmental Data Analysis with MATLAB or Python 3rd Edition Lecture 14 Overview
Dive into Lecture 14 of "Environmental Data Analysis with MATLAB or Python" where concepts like linear filters, predictions, and filters for error estimation are explored. Discover how past values impact present and future predictions using specialized algorithms.
Presentation Transcript
Environmental Data Analysis with MATLAB or Python, 3rd Edition, Lecture 14
SYLLABUS
Lecture 01  Intro; Using MATLAB or Python
Lecture 02  Looking At Data
Lecture 03  Probability and Measurement Error
Lecture 04  Multivariate Distributions
Lecture 05  Linear Models
Lecture 06  The Principle of Least Squares
Lecture 07  Prior Information
Lecture 08  Solving Generalized Least Squares Problems
Lecture 09  Fourier Series
Lecture 10  Complex Fourier Series
Lecture 11  Lessons Learned from the Fourier Transform
Lecture 12  Power Spectra
Lecture 13  Filter Theory
Lecture 14  Applications of Filters
Lecture 15  Factor Analysis and Cluster Analysis
Lecture 16  Empirical Orthogonal Functions and Clusters
Lecture 17  Covariance and Autocorrelation
Lecture 18  Cross-correlation
Lecture 19  Smoothing, Correlation and Spectra
Lecture 20  Coherence; Tapering and Spectral Analysis
Lecture 21  Interpolation and Gaussian Process Regression
Lecture 22  Linear Approximations and Non-Linear Least Squares
Lecture 23  Adaptable Approximations with Neural Networks
Lecture 24  Hypothesis Testing
Lecture 25  Hypothesis Testing continued; F-Tests
Lecture 26  Confidence Limits of Spectra, Bootstraps
Goals of the lecture: further develop the idea of the Linear Filter and its applications.
From last lecture: the present value of the output depends on past and present values of the input; output = filter * input, a convolution, not a multiplication.
Part 1: Predicting the Present. Prediction is a convolution, or very close to one.
The prediction error filter, p, converts its input, the data d, into an output, the prediction error e.
strategy for predicting the future:
1. take all the data, d, that you have up to today
2. use it to estimate the prediction error filter, p (use generalized least squares to solve p*d = 0)
3. use the filter, p, and all the data, d, to predict d(tomorrow)
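The three steps above can be sketched in Python. This is an illustrative least-squares construction, not code from the lecture (the function names, the filter length M, and the synthetic test series are assumptions): with the first coefficient fixed at p1 = 1, minimizing the prediction error e = p*d is an ordinary least-squares problem for the remaining coefficients.

```python
import numpy as np

def estimate_pef(d, M):
    # prediction error filter p = [1, p2, ..., p(M+1)]: fix p[0] = 1 and
    # choose the rest so that e = p*d is as close to zero as possible.
    # e[t] = d[t] + sum_{k=1..M} p[k] d[t-k], for t = M .. N-1
    N = len(d)
    G = np.column_stack([d[M - k : N - k] for k in range(1, M + 1)])
    rhs = -d[M:]                  # move the p[0]*d[t] term to the right-hand side
    coeffs, *_ = np.linalg.lstsq(G, rhs, rcond=None)
    return np.concatenate(([1.0], coeffs))

def predict_next(d, p):
    # tomorrow's value: d_pred = -(p[1] d[today] + p[2] d[yesterday] + ...)
    M = len(p) - 1
    return -np.dot(p[1:], d[: -M - 1 : -1])
```

On a series that really is predictable from its own past (for example d[t] = 0.9 d[t-1]), the estimated filter recovers the recursion exactly; on real data such as a hydrograph it recovers it only approximately, and the residual is the prediction error.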
application to the Neuse River hydrograph. [Figure: discharge (cfs) vs. time (days) over roughly 4000 days, and the power spectral density, (cfs)² per cycle/day, vs. frequency (cycles per day).]
Here's the best-fit filter, p. [Figure: prediction error filter p(t) vs. time t (days), computed with a standard solver (top) and with the biconjugate gradient solver, bicg (bottom).] In this case, only the first few coefficients are large.
importance of the prediction error: since one is using least squares, the equation 0 = p*d is not solved exactly. The prediction error, e = p*d, tells you what aspects of the data cannot be predicted on the basis of past behavior.
[Figure: (A) discharge d(t) and (B) prediction error e(t) vs. time t (days).] The error is small but spiky, and many spikes are at times when the discharge increases.
Part 2: Inverse Filters. Can a convolution be undone? If θ = g * h, is there another filter, ginv, for which h = ginv * θ?
convolution c = a*b by hand, step 1: for simplicity, suppose a and b are of length 3. Write a backward in time and b forward in time, overlap the ends by one, and multiply. That gives c1.
step 2: slide a right one place, multiply and add. That gives c2.
step 3: slide a right another place, multiply and add. That gives c3.
keep going in the same way through c4; the last step, with a single overlapping element, gives c5.
an important observation: this is the same pattern that we obtain when we multiply polynomials.
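That observation is easy to check numerically. In this small illustrative comparison (the specific filters a and b are arbitrary), convolving two filters gives the same coefficients as multiplying the polynomials built from them:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# convolution of the two filters
c = np.convolve(a, b)                      # length 3 + 3 - 1 = 5

# product of the corresponding polynomials a(z) and b(z);
# np.poly1d stores the highest power first, hence the reversals
c_poly = (np.poly1d(a[::-1]) * np.poly1d(b[::-1])).coefficients[::-1]

print(c)   # [ 4. 13. 28. 27. 18.]  -- same coefficients either way
```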
z-transform: turn a filter into a polynomial. g = [g1, g2, g3, …, gN]^T → g(z) = g1 + g2 z + g3 z² + … + gN z^(N−1)
inverse z-transform: turn a polynomial into a filter. g(z) = g1 + g2 z + g3 z² + … + gN z^(N−1) → g = [g1, g2, g3, …, gN]^T
why would we want to do this? because we know a lot about polynomials
the fundamental theorem of algebra: a polynomial of n-th order (largest power zⁿ) has exactly n roots, the solutions of g(z) = 0, and thus can be factored into the product of n factors: g(z) ∝ (z − r1)(z − r2)⋯(z − rn)
in the case of a polynomial constructed from a length-N filter, g: g(z) = gN (z − r1)(z − r2)⋯(z − rN−1), where r1, r2, …, rN−1 are the roots
so, the filter g is equivalent to a cascade of N−1 length-2 filters (together with an overall scale factor)
now let's try to find the inverse of a length-2 filter. What filter undoes convolution by [−ri, 1]^T? Under the z-transform, convolution by [−ri, 1]^T is multiplication by (z − ri), so the function that undoes it is 1/(z − ri).
problem: 1/(z − ri) is not a polynomial. Solution: compute its Taylor series.
The Taylor series contains all powers of z: 1/(z − ri) = −(1/ri)(1 + z/ri + z²/ri² + ⋯), valid for |z| < |ri|.
so the filter that undoes convolution by [−ri, 1]^T is an indefinitely long filter, with coefficients −[1/ri, 1/ri², 1/ri³, …]^T
this filter will only be useful if its coefficients fall off, that is, if the powers of 1/ri decrease; this happens when |ri|⁻¹ < 1, i.e. |ri| > 1: the root, ri, must lie outside the unit circle
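This condition is easy to demonstrate numerically (the roots 2.0 and 0.5 below are arbitrary illustrative values): the Taylor-series inverse of the length-2 filter [−ri, 1]^T has coefficients −1/ri^(n+1), which decay only when the root lies outside the unit circle.

```python
import numpy as np

def length2_inverse(r, M):
    # first M coefficients of the inverse of the filter [-r, 1],
    # read off from the Taylor series of 1/(z - r)
    return -(1.0 / r) ** np.arange(1, M + 1)

inv_out = length2_inverse(2.0, 30)   # root outside the unit circle: decays
inv_in = length2_inverse(0.5, 30)    # root inside the unit circle: blows up

# the decaying inverse really undoes the convolution, up to a tiny
# truncation term at the far end
spike = np.convolve([-2.0, 1.0], inv_out)
```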
the inverse filter for a length-N filter g:
step 1: find the roots of g(z)
step 2: check that the roots lie outside the unit circle
step 3: construct the inverse filter associated with each root
step 4: convolve them all together (and divide out the scale factor)
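The four steps might be implemented like this in Python. This is a sketch, not the lecture's own code (the function name, the length-M truncation of each infinite factor inverse, and the test filter are assumptions):

```python
import numpy as np

def inverse_filter(g, M):
    # approximate length-M inverse of a filter g, built root by root
    g = np.asarray(g, dtype=float)
    r = np.roots(np.flipud(g))              # step 1: roots of g(z)
    assert np.all(np.abs(r) > 1.0), "root inside unit circle: no stable inverse"  # step 2
    ginv = np.zeros(M)
    ginv[0] = 1.0                           # start from a spike (the identity)
    for ri in r:
        # step 3: inverse of the length-2 factor [-ri, 1] is -[1/ri, 1/ri^2, ...]
        factor_inv = -(1.0 / ri) ** np.arange(1, M + 1)
        ginv = np.convolve(ginv, factor_inv)[:M]   # step 4: convolve together
    # divide out the overall scale gN from g(z) = gN (z - r1)...(z - r_{N-1});
    # complex roots come in conjugate pairs, so the result is real
    return np.real(ginv) / g[-1]
```

Convolving g with the returned ginv yields a spike at the first element, up to a truncation error that shrinks as M grows.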
example: construct the inverse filter of a given g. [Figure: the filter g(j) vs. element j; its inverse ginv(j); and their convolution [g*ginv](j), which is a spike.]
the only hard part of the process is finding the roots of a polynomial
MATLAB:
r = roots(flipud(g));
Python:
r = np.roots(np.flipud(g).ravel())
[Figure: the same example, annotated: g(j) is a short time series, its inverse ginv(j) is a long time series, and their convolution [ginv*g](j) is a spike.]
Part 3: Recursive Filters: a way to approximate a long filter with two short ones
in the standard filtering formula, θ = g*h, we compute the outputs θ1, θ2, θ3, … in sequence, but without using our knowledge of θ1 when we compute θ2, or of θ2 when we compute θ3, etc.
suppose we tried to put that information to work, as follows: replace θ = g*h with v*θ = u*h; here we've introduced two new filters, u and v
u is a new but conventional filter that acts on the input, h; v is a filter that acts on already-computed values of the output, θ; together they play the role of the conventional filter, g
now define v1 = 1, so the recursion can be stepped forward; if we can find short filters u and v such that vinv * u ≈ g, then we can speed up the convolution process
an example: suppose g*h is a weighted average of recent values of h, with exponentially decaying weights g = [1, c, c², …]^T, |c| < 1. If g is truncated to N ≈ 10 elements, then each time step takes about 10 multiplications and 10 additions.
try u = [1]^T and v = [1, −c]^T. This works, since the inverse of the length-2 filter v is the geometric series vinv = [1, c, c², …]^T ≈ g.
the convolution then becomes the recursion θk = c θk−1 + hk, which requires only one addition and one multiplication per time step, a savings of a factor of about ten
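A sketch of this speed-up in Python (the value c = 0.8, the truncation length, and the input series are illustrative assumptions): the one-multiply-one-add recursion reproduces the convolution with the long truncated filter g = [1, c, c², …] to within the truncation error.

```python
import numpy as np

c = 0.8
h = np.sin(0.1 * np.arange(200))            # any input series

# direct method: convolve with g truncated to 50 elements (~50 ops per sample)
g = c ** np.arange(50)
theta_direct = np.convolve(g, h)[: len(h)]

# recursive method: theta[k] = c*theta[k-1] + h[k] (one multiply, one add)
theta = np.zeros_like(h)
theta[0] = h[0]
for k in range(1, len(h)):
    theta[k] = c * theta[k - 1] + h[k]
```

The two outputs differ only by the tail of g that was truncated, of order c⁵⁰.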
[Figure: two examples of recursive filtering: (A) input h1(t) and output θ1(t), (B) input h2(t) and output θ2(t), vs. time t.]