Uncertainty
Activity 1
Load the Capstone software on your computer and set it up to use the motion sensor. Your TA can show you how to use the software. Aim the motion sensor at something which is not moving (like the ceiling). Keep the sample rate reasonably low (below 50 Hz) and then record about two seconds of data of the motionless object.
If you zoom into your data, you will see that your data is fluctuating. How much does your sensor claim that the motionless object is moving? Describe what this experiment indicates is a reasonable limit of the accuracy of the motion sensor. We call this limitation a measurement uncertainty. Why do we use the word uncertainty instead of error?
Change the plot to be velocity-time. What is the uncertainty of the velocity? Similarly, estimate the uncertainty of the acceleration.
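As a rough illustration of how the fluctuation can be turned into a number, the sketch below takes a handful of made-up position readings and reports half of the peak-to-peak spread, both for the position and for a velocity computed from successive differences. The readings, the 20 Hz sample rate and the helper name `half_range` are all assumptions for illustration; they are not Capstone output, and Capstone may compute velocity differently.

```python
# Hypothetical readings (metres) from a motionless target; not real Capstone data.
positions = [1.2031, 1.2034, 1.2029, 1.2033, 1.2030, 1.2035, 1.2032, 1.2030]
sample_rate_hz = 20.0          # assumed sample rate, below the 50 Hz suggested above
dt = 1.0 / sample_rate_hz      # time between samples, in seconds


def half_range(values):
    """Half of the peak-to-peak spread: a simple estimate of the fluctuation size."""
    return (max(values) - min(values)) / 2.0


# Velocity estimated from successive position differences (m/s).
velocities = [(b - a) / dt for a, b in zip(positions, positions[1:])]

position_uncertainty = half_range(positions)
velocity_uncertainty = half_range(velocities)

print(f"position fluctuates by about +/- {position_uncertainty:.4f} m")
print(f"velocity fluctuates by about +/- {velocity_uncertainty:.4f} m/s")
```

Half of the peak-to-peak spread is only one possible choice; the standard deviation introduced in Activity 3 is another.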
Summary: you should now be able to explain what a physicist means when they use the words "measurement uncertainty".
Activity 2
Load Capstone. Configure it to use the force sensor. Use a method similar to Activity 1 to estimate the measurement uncertainty of the force sensor.
You have a series of weights. Use the force sensor to measure the force exerted by several different masses. Make a plot of your data: the horizontal axis is the mass (your independent variable, that is, the quantity you directly controlled); the vertical axis is the force (your dependent variable, that is, the quantity you measured after changing the independent variable).
From your graph, discuss how well the force sensor is calibrated. Your data should lie along a straight line. The slope tells you one thing about the calibration, and the y-intercept tells you another thing about the calibration.
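If you would rather compute the slope and intercept than read them off the graph, a least-squares straight-line fit is one way to do it. The sketch below is a minimal example under assumptions: the mass and force values are invented, and `fit_line` is a hypothetical helper, not part of Capstone.

```python
# Hypothetical (mass in kg, force in N) pairs; replace with your own readings.
masses = [0.050, 0.100, 0.150, 0.200, 0.250]
forces = [0.52, 1.01, 1.50, 1.99, 2.48]


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(masses, forces)
print(f"slope     = {slope:.2f} N/kg")
print(f"intercept = {intercept:.2f} N")
```

Compare the fitted slope and intercept with what you would expect from an ideal sensor.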
The force sensor has a button labelled "Zero" (or "Tare" if it is an older device). Remove all the masses from the force sensor, then press that button. Now re-check two of your mass measurements. How did the button affect your data? Did it change your slope, your intercept, or both?
Summary: you should now be able to explain what a physicist means when they use the words "calibration uncertainty" or "bias".
Activity 3
If you do not know what a standard deviation is, please skip down to Appendix A before doing this exercise.
Time how long it takes a piece of paper to fall 1 m. Repeat 20 times. Find the average value and the standard deviation of your data three times: with the first 5 data points, with the first 10 data points, and with all 20 data points. Without knowing your data, I predict that your standard deviation will not change much given the different data sets. If yours does, check out the results of the other groups nearby and see if they have the same pattern you do.
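The three averages and standard deviations can be computed by hand or with a few lines of code. The sketch below uses Python's standard `statistics` module; the 20 fall times are invented placeholders, so substitute your own measurements.

```python
import statistics

# Hypothetical fall times in seconds for 20 trials; replace with your own data.
times = [1.42, 1.55, 1.38, 1.61, 1.47, 1.50, 1.35, 1.58, 1.44, 1.52,
         1.40, 1.63, 1.49, 1.37, 1.56, 1.45, 1.53, 1.41, 1.59, 1.48]

results = {}
for n in (5, 10, 20):
    subset = times[:n]                   # the first n trials
    mean = statistics.mean(subset)
    std = statistics.stdev(subset)       # sample standard deviation (N - 1 form)
    results[n] = (mean, std)
    print(f"first {n:2d} trials: mean = {mean:.3f} s, std = {std:.3f} s")
```

`statistics.stdev` uses the same N - 1 definition as Appendix A.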
Draw a scatter plot (just dots, do not connect them with lines) of your data with the horizontal axis as the trial number and the vertical axis as the time. Draw a solid horizontal line at the value of your average value of all 20 trials. Draw a dashed horizontal line a distance Y above and below the solid line where Y is the standard deviation of all 20 trials. Explain why it is reasonable to use the standard deviation as a measure of statistical uncertainty in an experiment like this.
What fraction of your data points fall between the dashed lines? My guess, without knowing your data, is that about two thirds of your data points fall within the dashed lines. If you drew dashed lines at X +/- 2Y, where X is the average and Y is the standard deviation, then I would guess that about 95% of your data points lie within that range. These guesses are generically true for most statistical uncertainties. If you have read Appendix A (or know some statistics) you should be able to explain my predictions.
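Counting the points inside each band is easy to automate. The sketch below is one way to do it, again with invented fall times; `fraction_within` is a hypothetical helper name.

```python
import statistics

# Hypothetical fall times in seconds for 20 trials; replace with your own data.
times = [1.42, 1.55, 1.38, 1.61, 1.47, 1.50, 1.35, 1.58, 1.44, 1.52,
         1.40, 1.63, 1.49, 1.37, 1.56, 1.45, 1.53, 1.41, 1.59, 1.48]

mean = statistics.mean(times)
std = statistics.stdev(times)


def fraction_within(values, centre, half_width):
    """Fraction of values lying within centre +/- half_width."""
    inside = [v for v in values if abs(v - centre) <= half_width]
    return len(inside) / len(values)


frac_1 = fraction_within(times, mean, std)        # between the dashed lines
frac_2 = fraction_within(times, mean, 2 * std)    # within twice that distance

print(f"within 1 standard deviation:  {frac_1:.0%}")
print(f"within 2 standard deviations: {frac_2:.0%}")
```

With only 20 points, do not expect the fractions to match the predictions exactly.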
Given that the standard deviation probably did not change much when your data went from 5 to 10 to 20 data points, discuss whether it should be used as a measure of the uncertainty of the average or the uncertainty of any one specific datum.
If you divide the standard deviation by \(\sqrt{N}\), where N is the number of data points used to get that average, you get a new type of uncertainty. Calculate this value for your sets of 5, 10 and 20 data points.
Discuss whether this new value should be used as a measure of the uncertainty of the average or the uncertainty of any one specific datum.
You should now be able to demonstrate quantitatively that taking more data gets a better estimate of the time it takes a paper to fall 1 m. Do so.
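One way to make the comparison quantitative is sketched below. It assumes the common convention of taking the standard deviation in the N - 1 form from Appendix A and dividing it by \(\sqrt{N}\); the fall times are invented placeholders and the variable names are illustrative.

```python
import math
import statistics

# Hypothetical fall times in seconds for 20 trials; replace with your own data.
times = [1.42, 1.55, 1.38, 1.61, 1.47, 1.50, 1.35, 1.58, 1.44, 1.52,
         1.40, 1.63, 1.49, 1.37, 1.56, 1.45, 1.53, 1.41, 1.59, 1.48]

results = {}
for n in (5, 10, 20):
    subset = times[:n]
    std = statistics.stdev(subset)        # spread of the individual measurements
    reduced = std / math.sqrt(n)          # standard deviation divided by sqrt(N)
    results[n] = (std, reduced)
    print(f"N = {n:2d}: std = {std:.3f} s, std / sqrt(N) = {reduced:.3f} s")
```

Watch how each column changes as N grows, and use that in your discussion.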
Summary: you should now be able to explain what a physicist means when they use the words "statistical uncertainty".
Activity 4
Do the previous activity before you do this one.
You have 10 coins. Flip them several times to test whether they are “fair” coins. A mathematician could show that the average number of heads which results from flipping N “fair” coins is N/2, and that the standard deviation is \(\sqrt{N/4}\). Flip your coins a few times and determine whether these results seem reasonable.
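The mathematician's claim can be checked numerically without flipping anything. The sketch below adds up the exact probabilities of each possible number of heads for fair coins; `heads_distribution` is a hypothetical helper name, and the code assumes every coin lands heads with probability 1/2.

```python
import math


def heads_distribution(n_coins):
    """Probability of each possible number of heads (0..n_coins) for fair coins."""
    total = 2 ** n_coins
    return [math.comb(n_coins, k) / total for k in range(n_coins + 1)]


n_coins = 10
probs = heads_distribution(n_coins)

mean = sum(k * p for k, p in enumerate(probs))
variance = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
std = math.sqrt(variance)

print(f"mean heads = {mean:.3f}   (N/2 = {n_coins / 2})")
print(f"std dev    = {std:.3f}   (sqrt(N/4) = {math.sqrt(n_coins / 4):.3f})")
```

Your own handful of flips will scatter around these values rather than reproduce them exactly.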
At some point during the class, your TA will go to each group and record the result of you flipping all 10 coins. Keep this information private! We are going to have a contest. Each group will privately tell the TA its guess for the total number of heads flipped by the entire class. The guess must be in the form X +/- Y. The winner is the group with the smallest value of Y such that the correct answer Z satisfies X - Y < Z < X + Y. In the event of a tie, the group whose X value is closest to Z wins. You should use your knowledge of your own coin toss to improve your prediction!
When the contest is over, justify your choice of X and Y given your data (the 10 coins you flipped should definitely affect your answer) and your understanding of statistical uncertainty gained from the previous activity.
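One possible line of reasoning, offered as a sketch rather than the intended answer: your own result is known exactly, so only the other groups' coins contribute uncertainty. The function name, the example inputs (7 heads, 9 other groups) and the `coverage` factor are all assumptions for illustration.

```python
import math


def class_total_prediction(own_heads, n_other_groups, coins_per_group=10, coverage=1.0):
    """Return (X, Y) for the class total, treating only the other groups as unknown."""
    other_coins = n_other_groups * coins_per_group
    expected_others = other_coins / 2            # N/2 for the unknown coins
    std_others = math.sqrt(other_coins / 4)      # sqrt(N/4) for the unknown coins
    x = own_heads + expected_others
    y = coverage * std_others
    return x, y


# Hypothetical example: we flipped 7 heads and there are 9 other groups.
x, y = class_total_prediction(own_heads=7, n_other_groups=9)
print(f"prediction: {x:.0f} +/- {y:.1f} heads")
```

A larger `coverage` makes it more likely that the interval contains Z but less likely that it is the smallest one in the class.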
Summary: you should now be able to explain why claiming a smaller uncertainty is more impressive, but it increases your chances of being wrong.
Activity 5
Different branches of science use different conventions for uncertainty. In physics, we quote values as X +/- Y where Y is our uncertainty. But we do not claim two measurements differ unless they differ by at least 3 times their uncertainty! And typically we wait until a measurement is 5 times larger than its uncertainty before announcing we have discovered something new. Basically, if X < 5Y then we worry that our measurement is actually zero and the thing "discovered" might not exist.
Medicine, social sciences and political survey polls use 2 standard deviations for most of their measurements and for announcing a new discovery. If you have read the phrase “margin of error is 5% 19 times out of 20” that means that 5% is twice the uncertainty which a physicist would report (two standard deviations). Part of this is because it is very expensive for them to take enough data to use more stringent conditions.
You can now explain the following XKCD joke: https://xkcd.com/882/
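The arithmetic behind the joke can be sketched as follows, assuming 20 independent tests that each have a 5% chance of passing the "19 times out of 20" bar by luck alone. The variable names are illustrative.

```python
false_positive_rate = 0.05   # chance that one test looks significant purely by luck
n_tests = 20                 # assumed number of independent tests

# Probability that at least one of the tests gives a false positive.
p_at_least_one = 1 - (1 - false_positive_rate) ** n_tests

print(f"P(at least one false positive in {n_tests} tests) = {p_at_least_one:.2f}")
```

Under these assumptions, one "significant" result out of 20 tries is more likely than not to appear even when nothing real is there.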
What should engineers use for uncertainty? For the purposes of this course, you will follow the physics convention, but that might be either too risky or too conservative for engineering needs, depending on the context. Discuss this open-ended question.
Appendix A: Basic statistics
If you have N data points which have values \(x_i\), then the average (mean) of the values is
\(\bar{x} = \frac{1}{N}\sum_{i=1}^N x_i\)
The standard deviation of your data is
\(\sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^N(\bar{x}-x_i)^2}\)
What do they represent? If you plot all your data points as a histogram, the average is roughly the middle of the histogram (it is exactly the middle if your data is symmetric, but real data is not always so accommodating). The standard deviation is a measure of the width. The graph on the right shows how much of your data will be within 1, 2 and 3 standard deviations of the average value (on average). These values are only true for Gaussian (or Normal) data distributions. Real data might have different distributions.
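For an ideal Gaussian distribution, the fraction of data within k standard deviations of the average is \(\mathrm{erf}(k/\sqrt{2})\), which can be evaluated with Python's standard `math` module. The helper name `gaussian_fraction_within` below is hypothetical.

```python
import math


def gaussian_fraction_within(k):
    """Fraction of a Gaussian distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {gaussian_fraction_within(k):.4f}")
```

This prints 0.6827, 0.9545 and 0.9973, which is where the "about two thirds" and "95%" predictions in Activity 3 come from.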
Appendix B: Conventions for this course
Round all uncertainties to 1 significant figure.
Round your measurement to the same decimal place (usually 2 or 3 significant figures).
Only keep the largest uncertainty using the same basic rules as significant figures: if you add or subtract numbers, keep the larger uncertainty; if you do anything else (multiply, divide, exponentials, trig functions), find the percentage uncertainties and keep the larger percentage value.
Examples:
5.235346 +/- 0.076923 should be written 5.24 +/- 0.08 (one significant figure for the uncertainty, same decimal place for the measurement).
(5.24 +/- 0.08) + (8.125 +/- 0.006) = 13.37 +/- 0.08
(5.24 +/- 0.08) * (8.125 +/- 0.006) = 42.6 +/- 0.7
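The three examples above can be reproduced with the sketch below. It is a minimal illustration of these course conventions, not a general error-propagation tool: the function names are hypothetical, ties are rounded half up, and the `decimal` module is used so that values such as 13.365 are not disturbed by binary floating-point rounding.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_result(value, uncertainty):
    """Round the uncertainty to 1 significant figure and the value to the same place."""
    v = Decimal(str(value))
    u = Decimal(str(uncertainty))
    place = u.adjusted()                      # power of ten of the leading digit
    u_rounded = u.quantize(Decimal(1).scaleb(place), rounding=ROUND_HALF_UP)
    if u_rounded.adjusted() > place:          # e.g. 0.096 rounds up to 0.1
        place = u_rounded.adjusted()
        u_rounded = u_rounded.quantize(Decimal(1).scaleb(place), rounding=ROUND_HALF_UP)
    v_rounded = v.quantize(Decimal(1).scaleb(place), rounding=ROUND_HALF_UP)
    return v_rounded, u_rounded


def add(a, ua, b, ub):
    """Sum of two measurements, keeping the larger uncertainty."""
    a, ua, b, ub = (Decimal(str(x)) for x in (a, ua, b, ub))
    return round_result(a + b, max(ua, ub))


def multiply(a, ua, b, ub):
    """Product of two measurements, keeping the larger percentage uncertainty."""
    a, ua, b, ub = (Decimal(str(x)) for x in (a, ua, b, ub))
    # (larger of ua/|a| and ub/|b|) * |a*b| equals max(ua*|b|, ub*|a|),
    # which avoids a division and keeps the decimal arithmetic exact.
    return round_result(a * b, max(ua * abs(b), ub * abs(a)))


for v, u in (round_result("5.235346", "0.076923"),
             add("5.24", "0.08", "8.125", "0.006"),
             multiply("5.24", "0.08", "8.125", "0.006")):
    print(f"{v} +/- {u}")
```

Passing the numbers as strings keeps them exact; floats also work but are converted through `str` first.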
Note: the rules for significant figures you learned in high school are simplified versions of these rules for uncertainties.