Propagation of uncertainties for a function of observables
Alex Read, Department of Physics, University of Oslo, December, 2020
We want to estimate the uncertainty of a function
of a set of n observables
with associated uncertainties
. We assume that the observables
are sampled from normal distributions that are narrow relative to the form of the function and thus we approximate the function with its Taylor-series expansion, keeping only the lowest-order derivatives
,
where
represents the Gaussian-distributed deviation from the true mean, which is unknown, but estimated by the observed value
. In other words
and
.
The uncertainty in f is estimated as the square root of its mean variance
, where
. We replace f with the series approximation above and take advantage of that in this approximation
. In addition, we write the dot product as an explicit sum, taking care to have separate indices for the terms in the square of the dot product, obtaining
. This can be re-written as the sum of 2 terms:
Multiple uncorrelated observables
If there is no correlation between the observables
and the second term above is 0. Since the variance of
is
, the final and familiar result is
. In other words, uncorrelated contributions to the total uncertainty of a function of observables are to be added in quadrature.
A single observable
If there is only a single observable we have
. This can be visualized in the illustration below. The function is approximated by its slope at the point of the observation. The variance on the x-axis is transformed through the slope to the variance on the y-axis, i.e. f.
%plot an illustrative parabola
xwide = linspace(0,5,1000);
ywide = 1*xwide.^2 + 2*xwide + 3;
figure
subplot(2,2,2)
plot(xwide,ywide,'k-')
%draw a tangent to a point (e.g. 3.5)
hold on
xpObs = 3.5;
ypObs = 1*xpObs.^2 + 2*xpObs + 3;
plot(xpObs,ypObs,'k.','MarkerSize',10)
dypObs = 2*xpObs+2;
plot(xwide,dypObs*(xwide-xpObs)+ypObs,'k:','LineWidth',1)
xlabel('x'), ylabel('y')
title('f(x) and tangent at x_0')
ca2 = gca;
grid on
% Plot the pdf of x
sigma_x = 0.3;
subplot(2,2,4)
hx = histogram(normrnd(xpObs,sigma_x,10000,1));
hx.Normalization = 'pdf';
ca = gca;
ca.XLim = ca2.XLim;
xlabel('x'), ylabel('Probability')
title('pdf of x_0')
grid on
%histogram the transformed distribution on the y-axis
subplot(2,2,1)
hy = histogram(normrnd(ypObs,0.3*dypObs,10000,1));
hy.Orientation = 'horizontal';
hy.Normalization = 'pdf';
ca = gca;
ca.YLim = ca2.YLim;
ylabel('y'), xlabel('Probability')
title('pdf of f(x_0)')
grid on
In the upper right plot we see the function
shown as a solid curve and the tangent at a particular point
shown as a dotted line. In the lower right plot we see the pdf of x (a normal distribution) given the observation
and the uncertainty on
,
. In the upper left plot we see the pdf of
(also a normal distribution) given
and
. The ratio of the spread (RMS) of f to that of x is given by the slope of the tangent, i.e.
.
Example with multiple uncorrelated observables: Volume of a box
We measure the height h, width w, and depth d of a box with uncertainties
,
, and
, respectively and want to estimate the volume
and its uncertainty
. Assuming the measurements are uncorrelated we calculate
. By the way, by dividing both sides of this by the volume we obtain an elegant expression for the relative uncertainty on the volume:
.
Let's study a numerical example: Suppose we measure
cm with relative uncertainties 1, 2, and 3%. The volume and its uncertainty are calculated below.
h = 1; w = 2; d = 3;
dhRel = 0.01; dwRel = 0.01; ddRel = 0.03;
V = h * w * d;
dVrel = sqrt(dhRel^2+dwRel^2+ddRel^2);
dV = dVrel * V;
fprintf('V = %.3f +- %.3f cc\n',V,dV)
fprintf('Relative uncertainty on V = %.1f %%\n',dVrel*100)
We see that the relative uncertainty on d dominates the relative uncertainty on V.
Let's create an ensemble of Monte Carlo experiments of the measurement, by generating (uncorrelated!) random values of h, w, and d according to their uncertainties, and calculate the mean and standard deviation of V.
N = 10000; % Number of Monte Carlo experiments
h = normrnd(h,h*dhRel,N,1);
w = normrnd(w,w*dwRel,N,1);
d = normrnd(d,d*ddRel,N,1);
V = h.*w.*d;
figure
hist = histogram(V);
xlabel('Volume (cc)')
ylabel('Measurements/bin')
title('Histogram of Monte Carlo measurements of V')
meanV = mean(V); sigmaV = std(V);
fprintf('Mean and std.dev. of V = %.3f +- %.3f',mean(V),std(V))
x = linspace(hist.BinEdges(1),hist.BinEdges(end),100);
y = normpdf(x,meanV,sigmaV)*N*hist.BinWidth;
hold on
plot(x,y,'r-')
legend('Measurements','Normal distr.')
With 10k Monte Carlo experiments we see that the analytic and numerical propagations of uncertainty are consistent to several significant digits. We also see that the distribution of measured volumes is well described by the normal distribution.
Correlated observables
There are sizeable correlations between observables in many experiments, i.e. the covariances
are not zero but rather
, where the correlation coefficient ρ can take on values anywhere between
. For example, the slope and intercept of a straight line fitted to data will tend to be correlated (unless the x-axis is chosen carefully). In this case we have
.
Example: The uncertainty on y at a given x of a fitted straight line
The equation of a line can be written
. The uncertainty on
, keeping the correlation terms, is
. Here we see that the uncertainty on y at
is
, which is natural since b is the y-intercept of the line. It turns out that the covariance is proportional to
, and that
is minimized at
.
These features we can see in the plot below, where a straight line is fitted to an artificial dataset.
Generate 10 Monte Carlo or "toy" datapoints with the same normal uncertainty at each point.
x = linspace(1,10,10);
mmodel = 0.25; bmodel = 2; sigma_y = 0.5;
y0 = mmodel*x+bmodel; % The modell
y = normrnd(y0,sigma_y); % The fake data generated from the model
Plot the data as points with error bars.
figure(1)
hold off
errorbar(x,y,ones(length(y),1)*sigma_y,'k.','MarkerSize',20)
hold on
xlabel('x'), ylabel('y')
title('A toy dataset')
Fit a line to the data by
minimization and extract the slope
and intercept
.
% Define the model to perform a chi-squared fit of
model = @(p,x) x*p(1)+p(2); %+ x.^2*p(3);
dy = sigma_y*ones(size(y));
% Rough guess of the parameter values to guide minimization
guess = [1,1];
[p,perrors,chisqmin,exitflag,C ] = fitchisq(x,y,dy,model,guess);
m=p(1);
b=p(2);
Extract the variances and covariance from the covariance (or error) matrix C. The diagonal elements of C are the variances of the parameters. Non-zero off-diagonal elements indicate a covariance between the parameters.
dm2 = C(1,1); % Error in the slope
db2 = C(2,2); % Error in the intercept
dmb = C(1,2); % Covariance
fprintf('Covariance of fitted (m,b) = %.3f\n',dmb)
fprintf('Correlation coefficient (m,b) = %.3f\n',dmb/sqrt(dm2*db2))
Evaluate and plot the model and the fitted result.
xwide = linspace(0,12,100);
yfit = polyval(p,xwide);
y0wide = mmodel*xwide+bmodel;
plot(xwide,y0wide,'b--','LineWidth',1)
plot(xwide,yfit,'k','LineWidth',1)
Calculate and plot the error in y as function of x, represented by a green band.
dywide = sqrt(xwide.^2*dm2 + db2 + 2*xwide*dmb);
% Plot error in y as function of x as a transparent green band
f=fill([xwide flip(xwide)],[yfit-dywide flip(yfit+dywide)],'g');
f.FaceAlpha=0.3;
% Put the grid on so we can see the y and x intercepts
grid on
legend('Toy data','Model','Fitted line','\sigma_y(x)','Location','NorthWest')
We see that the fitted line ressembles the model line, the data points are spread around the model line, and the fitted line tends to be within the predicted variance of
(the green band). We also see that the variance band is narrowest in the middle of the data points.
If the error band of the line has any value, then it should be consistent with the root-mean-square (RMS) of the lines fitted to an ensemble of toys!
N = 1000;
yfit = zeros(N,length(y));
for i=1:N
y = normrnd(y0,sigma_y);
[p,perrors,chisqmin,exitflag,C ] = fitchisq(x,y,dy,model,guess);
yfit(i,:) = polyval(p,x);
end
ymean = mean(yfit,1);
ystd = std(yfit,1);
figure
plot(xwide,y0wide,'b-','LineWidth',1)
hold on
errorbar(x,ymean,ystd,'k.','MarkerSize',10)
% Plot error in y as function of x as a transparent green band
f=fill([xwide flip(xwide)],[y0wide-dywide flip(y0wide+dywide)],'g');
f.FaceAlpha=0.3;
legend('Model','Fitted toys','Model variance','Location','NorthWest')
xlabel('x'), ylabel('y'),title('Mean and RMS of many fitted toys')
We see that the mean and RMS of the toys are consistent with the model and the predicted variance in
, including the covariance term. The parabolic shape of the band indicates, corresponding to the negative covariance, that an upward flucutation of the slope tends to give a downward fluctuation of the y-intercept, and vis-versa.
Summary
We have derived formulae for error propagation for functions of observables under the condition that the function is approximately linear over the span of the uncertainties on the observables.
We have seen where the rule-of-thumb to combine uncorrelated uncertainties in quadrature comes from.
We have studied examples of both uncorrelated and correlated observables and reproduced the results of the formulae with Monte Carlo "toy" experiments.