dtscalibration.variance_stokes.variance_stokes_linear

dtscalibration.variance_stokes.variance_stokes_linear(st, sections, acquisitiontime, nbin=50, through_zero=False, plot_fit=False)

Approximate the variance of the noise in Stokes intensity measurements with a linear function of the intensity, suitable for large setups.

  • variance_stokes_constant() for small setups with small variations in intensity. Variance of the Stokes measurements is assumed to be the same along the entire fiber.

  • variance_stokes_exponential() for small setups with very few time steps. Too many degrees of freedom result in an underestimation of the noise variance. This is almost never the case, but use it when calibrating per time step.

  • variance_stokes_linear() for larger setups with more time steps. Assumes Poisson distributed noise with the following model:

        st_var = a * ds.st + b
    
    
    where `a` and `b` are constants. Requires reference sections at
    both the beginning and the end of the fiber, so that residuals are
    available at both high and low intensities.
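
The linear model above can be illustrated with a minimal sketch. It assumes residuals and their Stokes intensities are already available as arrays; the helper `fit_linear_variance` and the synthetic data are hypothetical and are not part of dtscalibration. The residuals are sorted by intensity, split into `nbin` groups, and a straight line is fitted through the per-bin variances.

```python
import numpy as np


def fit_linear_variance(st, resid, nbin=50):
    """Fit var(resid) = a * st + b from binned residuals (illustrative only)."""
    st = np.asarray(st, dtype=float).ravel()
    resid = np.asarray(resid, dtype=float).ravel()

    # Sort the residuals by the Stokes intensity they belong to.
    order = np.argsort(st)

    # Split the sorted values into nbin groups of (nearly) equal size.
    st_bins = np.array_split(st[order], nbin)
    resid_bins = np.array_split(resid[order], nbin)

    # Mean intensity and residual variance per bin.
    st_mean = np.array([chunk.mean() for chunk in st_bins])
    resid_var = np.array([chunk.var(ddof=1) for chunk in resid_bins])

    # Ordinary least squares for the slope a and the offset b.
    a, b = np.polyfit(st_mean, resid_var, deg=1)
    return a, b, st_mean, resid_var


# Synthetic residuals whose variance grows linearly with the intensity.
rng = np.random.default_rng(0)
st = rng.uniform(500.0, 5000.0, size=100_000)
a_true, b_true = 0.05, 10.0
resid = rng.normal(0.0, np.sqrt(a_true * st + b_true))

a, b, st_mean, resid_var = fit_linear_variance(st, resid, nbin=50)
print(f"a = {a:.4f}, b = {b:.2f}")
```

With 100,000 residuals and 50 bins, each bin holds 2,000 residuals, well above the minimum of 50 per bin recommended for `nbin`.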
    

The Stokes and anti-Stokes intensities are measured with detectors, which inherently introduce noise to the measurements. Knowledge of the distribution of the measurement noise is needed for a calibration with weighted observations (Sections 5 and 6 of [1]) and to project the associated uncertainty to the temperature confidence intervals (Section 7 of [1]).

Two sources dominate the noise in the Stokes and anti-Stokes intensity measurements (Hartog, 2017, p.125). Close to the laser, noise from the conversion of backscatter to electricity dominates the measurement noise. The detecting component, an avalanche photodiode, produces Poisson-distributed noise with a variance that increases linearly with the intensity. The Stokes and anti-Stokes intensities are commonly much larger than the standard deviation of the noise, so that the Poisson distribution can be approximated with a Normal distribution with a mean of zero and a variance that increases linearly with the intensity. At the far end of the fiber, noise from the electrical circuit dominates the measurement noise. It produces Normal-distributed noise with a mean of zero and a variance that is independent of the intensity.
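
The combination of the two noise sources can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the detector is modeled as a scaled Poisson count, and `gain` and `sigma_circuit` are hypothetical values chosen for illustration, not properties of any real instrument. Under this model the total variance is `gain * intensity + sigma_circuit**2`, which is linear in the intensity.

```python
import numpy as np


def simulated_variance(mean_intensity, gain, sigma_circuit, n_samples, rng):
    """Sample variance of a scaled-Poisson signal plus Normal circuit noise."""
    # Photon counts: the variance of a Poisson variable equals its mean.
    counts = rng.poisson(mean_intensity / gain, size=n_samples)
    # Scale the counts to intensity units and add intensity-independent noise.
    measured = gain * counts + rng.normal(0.0, sigma_circuit, size=n_samples)
    return measured.var(ddof=1)


rng = np.random.default_rng(1)
gain = 0.5           # hypothetical intensity units per detected count
sigma_circuit = 3.0  # hypothetical std of the electrical-circuit noise

results = {}
for mean_intensity in (50.0, 500.0, 5000.0):
    observed = simulated_variance(mean_intensity, gain, sigma_circuit, 200_000, rng)
    expected = gain * mean_intensity + sigma_circuit**2
    results[mean_intensity] = (observed, expected)
    print(f"intensity {mean_intensity:7.1f}: observed {observed:8.1f}, expected {expected:8.1f}")
```

At low intensity the constant circuit term is a visible share of the total variance, while at high intensity the linear Poisson term dominates.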

Calculates the variance between the measurements and a best fit at each reference section. This fits a function to the nt * nx measurements with ns * nt + nx parameters, where nt is the number of time steps, ns is the number of reference sections, and nx is the total number of reference locations along all sections. The temperature is constant along the reference sections, so the expression of the Stokes power can be split into a time series per reference section and a constant per observation location.

The idea comes from the discussion on page 127 of Richter, P. H. (1995), Estimating errors in least-squares fitting.

The time series and the constant are, of course, highly correlated (Equations 20 and 21 in [1]), but that is not relevant here as only the product is of interest. The residuals between the fitted product and the Stokes intensity measurements are attributed to the noise from the detector. The variance of the residuals is used as a proxy for the variance of the noise in the Stokes and anti-Stokes intensity measurements. A non-uniform temperature of the reference sections results in an overestimation of the noise variance because all temperature variation is attributed to the noise.
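
The split into a time series and a per-location constant can be sketched for a single reference section. The example below is not the solver used by the library: it substitutes a truncated singular value decomposition, which gives the least-squares rank-one fit of the product, and all names and synthetic values are hypothetical. With one section the product has nt + nx parameters, one of which is redundant because only the product matters.

```python
import numpy as np


def rank_one_residuals(st):
    """Residuals of the least-squares rank-one fit f[t] * g[x] to st[t, x]."""
    u, s, vt = np.linalg.svd(st, full_matrices=False)
    # The leading singular triplet is the best rank-one approximation.
    fit = s[0] * np.outer(u[:, 0], vt[0])
    return st - fit


rng = np.random.default_rng(2)
nt, nx = 200, 60  # time steps and reference locations (synthetic)

f = 1000.0 * (1.0 + 0.05 * np.sin(np.linspace(0.0, 6.0, nt)))  # time series
g = np.exp(-0.002 * np.arange(nx))                              # constant per location
sigma = 4.0                                                     # true noise std
st = np.outer(f, g) + rng.normal(0.0, sigma, size=(nt, nx))

resid = rank_one_residuals(st)

# Degrees of freedom: measurements minus independent fit parameters.
dof = nt * nx - (nt + nx - 1)
var_est = np.sum(resid**2) / dof
print(f"estimated variance {var_est:.2f}, true variance {sigma**2:.2f}")
```

If the temperature along the synthetic section were made non-uniform in time-varying ways, the rank-one fit would no longer absorb the signal and `var_est` would rise above the true noise variance.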

Notes:

  • Because there is a large number of unknowns, time is spent on calculating an initial estimate. Can be turned off by setting to False.

  • It is often not needed to use measurements from all time steps. If your variance estimate does not change when including measurements from more time steps, you have included enough measurements.
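
The check described in the last note can be sketched as follows. The estimator here is a deliberately simple stand-in (it removes the mean of each time step along a uniform section) and is not the method of this function; `toy_variance` and `variance_vs_ntime` are hypothetical helpers. The point is only the pattern: repeat the estimate with an increasing number of time steps and stop adding data once the estimate stabilizes.

```python
import numpy as np


def toy_variance(st):
    """Toy estimator: remove the mean of each time step, then pool residuals."""
    nt, nx = st.shape
    resid = st - st.mean(axis=1, keepdims=True)
    # One mean is estimated per time step, so nt degrees of freedom are lost.
    return np.sum(resid**2) / (nt * (nx - 1))


def variance_vs_ntime(st, estimator, n_steps):
    """Variance estimate using only the first n time steps, for each n."""
    return np.array([estimator(st[:n]) for n in n_steps])


rng = np.random.default_rng(3)
st = 1000.0 + rng.normal(0.0, 5.0, size=(400, 40))  # rows are time steps

n_steps = [25, 50, 100, 200, 400]
estimates = variance_vs_ntime(st, toy_variance, n_steps)

# Relative change between consecutive estimates; small values suggest
# that adding more time steps no longer changes the result.
rel_change = np.abs(np.diff(estimates)) / estimates[1:]
for n, est in zip(n_steps, estimates):
    print(f"{n:4d} time steps: variance estimate {est:.2f}")
```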

References:

Examples:

Parameters:
  • st_label (str) – Key under which the Stokes DataArray is stored. E.g., ‘st’, ‘rst’

  • sections (dict, optional) – Define sections. See documentation

  • nbin (int) – Number of bins to compute the variance for, through which the linear function is fitted. Make sure that there are at least 50 residuals per bin to compute the variance from.

  • through_zero (bool) – If True, the variance is computed as VAR(Stokes) = slope * Stokes. If False, VAR(Stokes) = slope * Stokes + offset. In our initial trials the offset was relatively small, so True seems the better option for setups that lack a reference section with very low Stokes intensities. If data with low Stokes intensities are available, it is better not to fit through zero and to determine the offset from the data instead.

  • plot_fit (bool) – If True, plot the variances for each bin and plot the fitted linear function.
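
The effect of `through_zero` can be illustrated with a small sketch. The helper `fit_variance_line` is hypothetical and only mirrors the two formulas given for the parameter; it is not the library's implementation. The example uses per-bin values that cover high intensities only, which is the situation where the offset is hard to determine from the data.

```python
import numpy as np


def fit_variance_line(st_mean, st_var, through_zero=False):
    """Least-squares line through per-bin variances (illustrative only)."""
    st_mean = np.asarray(st_mean, dtype=float)
    st_var = np.asarray(st_var, dtype=float)
    if through_zero:
        # Closed-form least-squares slope for a line forced through the origin.
        slope = np.sum(st_mean * st_var) / np.sum(st_mean**2)
        return slope, 0.0
    slope, offset = np.polyfit(st_mean, st_var, deg=1)
    return slope, offset


# Noise-free per-bin values following VAR = 0.05 * Stokes + 10.
st_mean = np.array([2000.0, 2500.0, 3000.0, 3500.0, 4000.0])
st_var = 0.05 * st_mean + 10.0

slope_free, offset_free = fit_variance_line(st_mean, st_var, through_zero=False)
slope_zero, offset_zero = fit_variance_line(st_mean, st_var, through_zero=True)
print(slope_free, offset_free)
print(slope_zero, offset_zero)
```

When the fit is forced through zero, the small offset is absorbed into a slightly larger slope, so the two fits stay close over the observed intensity range and differ mainly at low intensities.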