I’m trying to plot data from NOAA regarding CO2 concentrations. In particular I’m asked to first plot the fit and then print the fit parameters and reduced chi square along with their uncertainties using a linear fit. The plots come out as I expected but there are errors that crop up that I can’t seem to fix. These errors show up when calculating the parameters and their uncertainties, as well as the reduced chi square. It all shows up as infinity. I know the errors are due to dividing by zero but that’s a result of certain measurements in the data having an uncertainty of zero so I can’t do much about that.
Link to dataset: https://gml.noaa.gov/ccgg/trends/ though mine only goes up to September 2023
Any uncertainty that is negative is changed to an uncertainty of 0.1
import numpy as np
import matplotlib.pyplot as plt
from numpy import cos, sqrt, exp, sin, pi, array
from matplotlib import cm
data=np.loadtxt('co2_data.txt', skiprows=2)
from scipy.optimize import curve_fit
def linfit(x, p0, p1):
return (p0*x)+p1
def quadfit(x, p0, p1, p2):
return p0*(x**2)+(p1*x)+p2
x=data[:,2]
t=array([i-x[0] for i in x])
y=data[:,3]
yerr=data[:,7]
for i in range(len(yerr)):
if yerr[i]<0:
yerr[i]=.1
guesses = (1.5, 310)
(p0, p1), cc = curve_fit(linfit, t, y, p0 = guesses, sigma = yerr)
(up0, up1) = np.sqrt(np.diag(cc))
print(f'p0 = {p0:.4} +/- {up0:.4f} ppm/year')
print(f'p1 = {p1:.4f} +/- {up1:.4f} ppm')
xm = np.linspace(t[0], t[-1], 201)
ym1 = linfit(xm, p0, p1)
yfit1=linfit(t, p0, p1)
n = len(t)
p=2
rcsq = sum(((y - yfit1)/yerr)**2)/(n-p)
print(f'Reduced chisquared for linear fit = {rcsq:.5f}')
/usr/local/lib/python3.10/dist-packages/scipy/optimize/_minpack_py.py:968: RuntimeWarning: divide by zero encountered in divide
transform = 1.0 / sigma
<ipython-input-44-c1614db438c8>:9: OptimizeWarning: Covariance of the parameters could not be estimated
(p0, p1), cc = curve_fit(linfit, t, y, p0 = guesses, sigma = yerr)
<ipython-input-44-c1614db438c8>:18: RuntimeWarning: divide by zero encountered in divide
rcsq = sum(((y - yfit1)/yerr)**2)/(n-p)
p0 = 1.5 +/- inf ppm/year
p1 = 310.0000 +/- inf ppm
Reduced chisquared for linear fit = inf
The only thing I could think to do was change my guesses for the fit but considering that a linear fit is pretty bad to begin with, I don’t think it’s possible to fix the errors. But I also don’t want to rule out that the errors are simply errors and can’t be fixed. Are the errors a feature of this fit or a code/fit error?