Analysis of 100 mocks
======================

This page reports the validation of ``sys_mapping`` on 100 independent synthetic
mock galaxy catalogs.  Each mock is generated with known injected contamination
parameters (:math:`a_i^{\rm true}`, :math:`b_i^{\rm true}`), so parameter
recovery, model-selection power, and intrinsic-scatter recovery can be measured
against ground truth across all six implemented methods.

Run configuration
-----------------

.. list-table::
   :widths: 35 65
   :header-rows: 1

   * - Parameter
     - Value
   * - Mode
     - Synthetic (``--synthetic``); no external files required
   * - Number of mocks
     - 100
   * - NSIDE
     - 64  (pixel area ≈ 0.84 deg², 49 152 pixels total, 32 512 unmasked)
   * - Number of systematic templates
     - 3 (``synth_0``, ``synth_1``, ``synth_2``, lognormal random fields)
   * - Injected :math:`a_i^{\rm true}`
     - drawn i.i.d. from :math:`\mathcal{N}(0, 0.10)` per mock
   * - Injected :math:`b_i^{\rm true}`
     - drawn i.i.d. from :math:`\mathcal{N}(0, 0.10)` per mock
   * - Mean galaxies per pixel (:math:`\bar{n}`)
     - ≈ 30  (mean total :math:`N_{\rm gal}` = 972 318 ± 33 218 over 100 mocks)
   * - MCMC walkers / steps / burn-in
     - 80 / 400 / 100
   * - JAX device
     - CPU
   * - Script
     - ``scripts/run_mock_analysis.py``
   * - Output directory
     - ``results/mock_analysis_100/``

All six methods — OLS, ElasticNet, ISD-1, ISD-3, MCMC-add, MCMC-comb — are
applied to every mock realisation.  In addition, the two MCMC methods fit an
intrinsic scatter parameter :math:`\hat{\sigma}`.  A likelihood ratio test (LRT)
then compares the additive model (:math:`b_i = 0`) against the combined model
(free :math:`a_i, b_i`) at the 5 % significance level.

Mock generation
---------------

Each synthetic mock is produced in four steps:

1. **Lognormal galaxy field** — a Gaussian random field :math:`G(\hat{n})` is
   drawn from a power spectrum :math:`C_\ell \propto (\ell+1)^{-2}` (normalised
   to :math:`\sigma_G = 0.5`, i.e.\ variance :math:`\sigma_G^2 = 0.25`).
   The true overdensity is
   :math:`\delta_g^{\rm true} = e^{G - \sigma_G^2/2} - 1`.

2. **Galactic mask** — pixels within :math:`|b| < 20°` are masked, retaining
   ≈ 66 % of the sky (32 512 HEALPix pixels at NSIDE = 64).

3. **Contamination injection** — the observed overdensity is

   .. math::

      \delta_g^{\rm obs}(\hat{n}) =
        \delta_g^{\rm true}(\hat{n})\,\bigl(1 + \textstyle\sum_i b_i^{\rm true}\,t_i(\hat{n})\bigr)
        + \sum_i a_i^{\rm true}\,t_i(\hat{n})

   where :math:`a_i^{\rm true}, b_i^{\rm true} \sim \mathcal{N}(0, 0.10)`
   independently for each mock.

4. **Poisson sampling** — galaxy counts :math:`n_g(p) \sim
   \mathrm{Poisson}(\bar{n}(1+\delta_g^{\rm obs}(p)))` with
   :math:`\bar{n} \approx 30` galaxies per pixel; random counts are
   :math:`n_r(p) = 8\bar{n}` (uniform over unmasked pixels).

Systematic templates
--------------------

Three synthetic HEALPix templates are generated by ``sm.generate_systematic_maps``:

.. list-table::
   :widths: 15 85
   :header-rows: 1

   * - Template
     - Spectral family
   * - ``synth_0``
     - Exponential-linear:  :math:`C_\ell \propto e^{-\ell/10}`
   * - ``synth_1``
     - Exponential-quadratic:  :math:`C_\ell \propto e^{-(\ell/20)^2}`
   * - ``synth_2``
     - Power-law :math:`\ell^{-2}`

Each map is normalised to zero mean and unit standard deviation over the full
sky before use.  The three families span a wide range of angular scales, from
large-scale coherent structure (``synth_0``) to steep power-law (``synth_2``).
As a result, ``synth_2`` consistently exhibits the largest scatter in parameter
recovery across all methods.

Parameter recovery — all six methods
--------------------------------------

For each mock and each method, the estimated additive amplitudes
:math:`\hat{a}_i` are compared against the injected truth.  The bias is
:math:`\hat{a}_i - a_i^{\rm true}`.

.. figure:: _static/results_mock_analysis/mock_parameter_recovery_all_methods.png
   :width: 100%
   :align: center
   :alt: Parameter recovery bias across all six methods, 100 mocks

   **Additive parameter recovery across all six methods (100 realisations, NSIDE = 64).**
   Each panel shows one method; scatter points are individual mock × template
   combinations, box-plots show the inter-quartile range and whiskers extend to
   1.5 × IQR.  Medians cluster near zero for all templates and all methods,
   confirming unbiased recovery of the additive contamination parameters.
   ``synth_2`` (power-law spectrum) shows the largest scatter in every panel.

Summary of additive-parameter recovery over all 100 mocks and 3 templates:

.. list-table::
   :header-rows: 1
   :widths: 20 22 22

   * - Method
     - MAD of :math:`\hat{a}_i - a_i^{\rm true}`
     - Mean :math:`|\hat{a}_i - a_i^{\rm true}|`
   * - OLS
     - 0.0059
     - 0.0122
   * - ElasticNet
     - 0.0073
     - 0.0135
   * - ISD-1
     - 0.0061
     - 0.0126
   * - ISD-3
     - 0.0090
     - 0.0157
   * - MCMC-add
     - 0.0057
     - 0.0122
   * - MCMC-comb
     - 0.0058
     - 0.0116

All methods achieve a median absolute deviation well below the injected scatter
of 0.10.  MCMC-add and MCMC-comb deliver the tightest recovery (MAD ≈ 0.006),
closely followed by OLS and ISD-1.  ISD-3 shows the largest MAD (0.009),
consistent with its higher polynomial complexity introducing some variance.

Likelihood ratio test: additive vs. combined model
---------------------------------------------------

The null hypothesis is the additive model (:math:`b_i = 0\ \forall i`); the
alternative is the combined model.  Under the null, the test statistic
:math:`\lambda_{\rm LR}` follows a :math:`\chi^2(3)` distribution (3 degrees of
freedom, one per systematic template).  The critical value at the 5 % level is
:math:`\chi^2_{3,\,0.95} = 7.815`.

Summary over 100 mocks:

* **Reject fraction**: 100 % (all 100 mocks reject the additive model)
* :math:`\lambda_{\rm LR}` mean ± std: **1 467.7 ± 1 203.9**
* :math:`\lambda_{\rm LR}` median: **1 157.5**
* :math:`\lambda_{\rm LR}` range: 17.5 – 6 150.4

.. figure:: _static/results_mock_analysis/mock_lrt_statistics.png
   :width: 70%
   :align: center
   :alt: Distribution of LRT statistics across 100 mocks (log x-axis)

   **Distribution of** :math:`\lambda_{\rm LR}` **across 100 synthetic mocks
   (logarithmic x-axis).**  The distribution spans nearly two decades (17.5 to
   6 150), with a mode around :math:`10^3`.  Even the smallest value lies far
   above the :math:`\chi^2_3` critical value of 7.8, confirming that the
   combined model is universally preferred whenever both additive and
   multiplicative contamination are present.

The wide range of :math:`\lambda_{\rm LR}` values reflects variability in the
injected :math:`b_i^{\rm true}` amplitudes from mock to mock.  In a dataset
where contamination is purely additive (:math:`b_i^{\rm true} = 0`), the LRT
would fail to reject the additive model for the majority of mocks at the 5 %
level.

Multiplicative parameter recovery (MCMC-comb)
----------------------------------------------

Only MCMC-comb fits the multiplicative amplitudes :math:`b_i` freely.  All
other methods fix :math:`b_i = 0`, so their effective
:math:`\hat{b}_i - b_i^{\rm true} \equiv -b_i^{\rm true}` and are not
informative about :math:`b_i` recovery.

.. figure:: _static/results_mock_analysis/mock_b_parameter_recovery.png
   :width: 65%
   :align: center
   :alt: Multiplicative parameter recovery for MCMC-comb, 100 mocks

   **Multiplicative parameter recovery for MCMC-comb (100 realisations,
   NSIDE = 64).**  Each scatter point is one mock realisation; box-plots
   show the inter-quartile range.  Medians sit on the dashed zero line for
   all three templates, confirming unbiased recovery of the multiplicative
   contamination amplitudes.  ``synth_2`` again shows the largest scatter.

Summary of multiplicative-parameter recovery for MCMC-comb over all 100 mocks:

.. list-table::
   :header-rows: 1
   :widths: 25 30 30

   * - Quantity
     - Value
     - Notes
   * - MAD of :math:`\hat{b}_i - b_i^{\rm true}`
     - 0.0164
     - Median absolute deviation across 300 mock × template pairs
   * - Mean :math:`|\hat{b}_i - b_i^{\rm true}|`
     - 0.0215
     - Mean absolute bias
   * - Injected scatter :math:`\sigma_{b}^{\rm true}`
     - 0.10
     - Standard deviation of the :math:`\mathcal{N}(0,0.10)` prior

The bias is small relative to the injected scatter: MAD / :math:`\sigma_b^{\rm true}` = 0.164.
By comparison, ignoring the multiplicative term entirely (as all non-combined
methods do) incurs a mean absolute b-error equal to the injected scatter itself
(MAD ≈ 0.064).

Scatter parameter recovery
--------------------------

The MCMC likelihood (Gaussian model) treats the observed galaxy overdensity
residuals as normally distributed with standard deviation :math:`\sigma`:

.. math::

   \ln\mathcal{L} = -\tfrac{N_{\rm pix}}{2}\ln(2\pi\sigma^2)
   - \frac{1}{2\sigma^2}\sum_p \bigl(\delta_g^{\rm obs}(p) - \textstyle\sum_i \hat{a}_i\,t_i(p)\bigr)^2

The MCMC parameter :math:`\sigma` is therefore the **standard deviation of the
observed galaxy overdensity field** (after contamination removal), not the
Gaussian-field width :math:`\sigma_G`.  For a lognormal field with Poisson
sampling noise the effective scatter is

.. math::

   \sigma_{\rm eff} = \sqrt{e^{\sigma_G^2} - 1 + \tfrac{1}{\bar{n}}}

With :math:`\sigma_G = 0.5` and :math:`\bar{n} = 30` galaxies per pixel:

.. math::

   \sigma_{\rm eff} = \sqrt{e^{0.25} - 1 + \tfrac{1}{30}}
   = \sqrt{0.284 + 0.033} \approx 0.563

.. figure:: _static/results_mock_analysis/mock_sigma_recovery.png
   :width: 90%
   :align: center
   :alt: Scatter parameter recovery across 100 mocks for MCMC-add and MCMC-comb

   **Scatter parameter** :math:`\hat{\sigma}` **recovery across 100 mocks.**
   Left: MCMC-add; right: MCMC-comb.  The dashed vertical line marks the
   theoretical prediction :math:`\sigma_{\rm eff} = 0.563` (lognormal field
   variance plus Poisson shot noise).  Both methods recover
   :math:`\hat{\sigma}` consistent with this prediction.

Recovered :math:`\hat{\sigma}` statistics:

.. list-table::
   :header-rows: 1
   :widths: 25 25 25 25

   * - Method
     - :math:`\langle\hat{\sigma}\rangle`
     - :math:`\mathrm{std}(\hat{\sigma})`
     - :math:`\sigma_{\rm eff}` (predicted)
   * - MCMC-add
     - 0.560
     - 0.028
     - 0.563
   * - MCMC-comb
     - 0.554
     - 0.027
     - 0.563

Both methods recover :math:`\hat{\sigma}` close to the theoretical prediction
of 0.563.  The small residual offset (~1 %) is consistent with the additional
variance contributed by the injected contamination signal, which is not fully
removed before the MCMC samples :math:`\sigma`.  The recovered distributions
are narrow (std ≈ 0.03 across 100 mocks), confirming that :math:`\sigma` is
well-constrained by the data.  The two MCMC models give statistically
consistent results (:math:`\Delta\langle\hat{\sigma}\rangle = 0.006`).

Output files
------------

Per-mock JSON results and summary plots are written to
``results/mock_analysis_100/``::

    results/mock_analysis_100/
    ├── mock_0000_results.json        # per-mock MAP parameters and LRT
    ├── …
    ├── mock_0099_results.json
    ├── mock_results.csv              # all 100 mocks in tabular form
    ├── mock_parameter_recovery_all_methods.png   # 2×3 bias grid, all methods
    ├── mock_sigma_recovery.png       # σ̂ histograms (MCMC-add, MCMC-comb)
    └── mock_lrt_statistics.png       # histogram of λ_LR

To reproduce these results::

    OMP_NUM_THREADS=16 OPENBLAS_NUM_THREADS=16 MKL_NUM_THREADS=16 \
    python scripts/run_mock_analysis.py \
        --synthetic \
        --n-mocks 100 \
        --n-sys 3 \
        --nside 64 \
        --n-walkers 80 \
        --n-steps 400 \
        --n-burn 100 \
        --output-dir results/mock_analysis_100/

.. warning::

   The full 100-mock run takes approximately 9 hours on a modern CPU workstation
   (two MCMC methods, 80 walkers × 400 steps each mock, all 100 realisations).
   For a quick sanity check, use ``--n-mocks 5 --n-walkers 32 --n-steps 200``.