{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "bcd29771",
   "metadata": {},
   "source": [
    "# 17. Uncertainty of the temperature estimated using single-ended calibration\n",
    "After comleting single-ended calibration, you might be interested in inspecting the uncertainty of the estimated temperature.\n",
    "- Decomposing the uncertainty\n",
    "- Monte Carlo estimate of the standard error\n",
    "- Monte Carlo estimate of the confidence intervals\n",
    "\n",
    "First we quickly repeat the single-ended calibration steps:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a45c5207",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "from dtscalibration import read_silixa_files\n",
    "\n",
    "# The following line introduces the .dts accessor for xarray datasets\n",
    "import dtscalibration  # noqa: E401  # noqa: E401\n",
    "from dtscalibration.variance_stokes import variance_stokes_constant\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "%matplotlib inline\n",
    "\n",
    "filepath = os.path.join(\"..\", \"..\", \"tests\", \"data\", \"single_ended\")\n",
    "ds = read_silixa_files(directory=filepath, timezone_netcdf=\"UTC\", file_ext=\"*.xml\")\n",
    "\n",
    "ds = ds.sel(x=slice(-30, 101))  # dismiss parts of the fiber that are not interesting\n",
    "sections = {\n",
    "    \"probe1Temperature\": [slice(20, 25.5)],  # warm bath\n",
    "    \"probe2Temperature\": [slice(5.5, 15.5)],  # cold bath\n",
    "}\n",
    "st_var, resid = variance_stokes_constant(\n",
    "    ds.dts.st, sections, ds.dts.acquisitiontime_fw, reshape_residuals=True\n",
    ")\n",
    "ast_var, _ = variance_stokes_constant(\n",
    "    ds.dts.ast, sections, ds.dts.acquisitiontime_fw, reshape_residuals=False\n",
    ")\n",
    "\n",
    "out = ds.dts.calibrate_single_ended(sections=sections, st_var=st_var, ast_var=ast_var)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eb3389d4",
   "metadata": {},
   "source": [
    "## Decomposing the uncertainty\n",
    "The components of the uncertainty are stored in the `ds.var_fw_da` dataarray. The sum of the different components is equal to `ds.tmpf_var`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "19628bdf",
   "metadata": {},
   "outputs": [],
   "source": [
    "ds1 = out.isel(time=0)\n",
    "\n",
    "# Uncertainty from the noise in (anti-) stokes measurements\n",
    "stast_var = ds1.var_fw_da.sel(comp_fw=[\"dT_dst\", \"dT_dast\"]).sum(dim=\"comp_fw\")\n",
    "\n",
    "# Parameter uncertainty\n",
    "par_var = ds1.tmpf_var - stast_var\n",
    "\n",
    "# Plot\n",
    "plt.figure(figsize=(12, 4))\n",
    "plt.fill_between(ds1.x, stast_var, label=\"Noise in (anti-) stokes measurements\")\n",
    "plt.fill_between(ds1.x, y1=ds1.tmpf_var, y2=stast_var, label=\"Parameter estimation\")\n",
    "plt.suptitle(\"Variance of the estimated temperature\")\n",
    "plt.ylabel(\"$\\sigma^2$ ($^\\circ$C$^2$)\")\n",
    "plt.legend(fontsize=\"small\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7a727c73",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The effects of the parameter uncertainty can be further inspected\n",
    "# Note that the parameter uncertainty is not constant over the fiber and certain covariations can reduce to temperature uncertainty\n",
    "ds1.var_fw_da.plot(hue=\"comp_fw\", figsize=(12, 4));"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "87590d5b",
   "metadata": {},
   "source": [
    "## Monte Carlo estimate of the uncertainty\n",
    "The uncertainty of the calibrated temperature can be computed in two manners:\n",
    "1. The variance of the calibrated temperature can be approximated using linear error propagation\n",
    "    - Very fast computation\n",
    "    - Only the variance is estimated\n",
    "    - Sufficiently accurate approximation for most cases\n",
    "2. The uncertainty distribution of the calibrated temperature can be approximated using Monte Carlo\n",
    "    - Slow computation\n",
    "    - Computes variances and confidence intervals\n",
    "    - Correctly propagates all uncertainties from the calibration\n",
    "    - Requires sufficiently large number of samples to be drawn to be correct, hence the slow computation.\n",
    "    - Only use this method: 1) To check the first method. 2) Specific interest in confidence intervals.\n",
    "    \n",
    "The first approach works very well and is used in the previous examples. **Here we show the second approach**.\n",
    "\n",
    "The uncertainty comes from the noise in the (anti-) Stokes measurements and from the parameter estimation. Both sources are propagated via Monte Carlo sampling to an uncertainty distribution of the estimated temperature. As weigths are correctly passed to the least squares procedure via the `st_var` arguments, the covariance matrix can be used as an estimator for the uncertainty in the parameters. This matrix holds the covariances between all the parameters. A large parameter set is generated from this matrix as part of the Monte Carlo routine, assuming the parameter space is normally distributed with their mean at the best estimate of the least squares procedure.\n",
    "\n",
    "The large parameter set is used to calculate a large set of temperatures. By using `percentiles` or `quantile` the 95% confidence interval of the calibrated temperature between 2.5% and 97.5% are calculated. The confidence intervals differ per time step. If you would like to calculate confidence intervals temporal averages or averages of fiber sections see notebook 16."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dfce0126",
   "metadata": {},
   "outputs": [],
   "source": [
    "out2 = ds.dts.monte_carlo_single_ended(\n",
    "    result=out,\n",
    "    st_var=st_var,\n",
    "    ast_var=ast_var,\n",
    "    conf_ints=[2.5, 97.5],\n",
    "    mc_sample_size=500,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9cf9ec3a",
   "metadata": {},
   "source": [
    "This function computes `ds.tmpf_mc_var` and `ds.tmpf_mc` if the keyword argument `conf_ints` is passed containing the confidence intervals. Increase the `mc_sample_size` for a 'less noisy' approximation.\n",
    "### Monte Carlo estimation of the standard error"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "53039d6b",
   "metadata": {},
   "outputs": [],
   "source": [
    "ds1 = out.isel(time=0)\n",
    "\n",
    "(out2.isel(time=0).tmpf_mc_var ** 0.5).plot(\n",
    "    figsize=(12, 4), label=\"Monte Carlo approx.\"\n",
    ")\n",
    "(out.isel(time=0).tmpf_var ** 0.5).plot(label=\"Linear error approx.\")\n",
    "plt.ylabel(\"$\\sigma$ ($^\\circ$C)\")\n",
    "plt.legend(fontsize=\"small\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac84c1c9",
   "metadata": {},
   "source": [
    "### Monte Carlo estimation of the confidence intervals"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8b55f517",
   "metadata": {},
   "outputs": [],
   "source": [
    "out.isel(time=0).tmpf.plot(linewidth=0.7, figsize=(12, 4))\n",
    "out2.isel(time=0).tmpf_mc.sel(CI=2.5).plot(linewidth=0.7, label=\"CI: 2.5%\")\n",
    "out2.isel(time=0).tmpf_mc.sel(CI=97.5).plot(linewidth=0.7, label=\"CI: 97.5%\")\n",
    "plt.legend(fontsize=\"small\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "55ea2e3f",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}