Bibliography
This page documents the key literature on systematic effects mitigation in
photometric galaxy surveys. For each reference, the main method(s), their
mathematical formulation, pros/cons, and implementation status in
sys_mapping are described.
Ross et al. 2011
Methods: Masking of stellar contamination, dust extinction corrections, sky background removal, and jack-knife covariance estimation for systematic null tests. Systematic corrections are applied by assigning per-pixel weights based on observational condition maps (seeing, airmass, star density, Galactic extinction).
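Key equation: the Landy-Szalay two-point estimator after systematic weighting is
\[ \hat w(\theta) = \frac{DD(\theta) - 2\,DR(\theta) + RR(\theta)}{RR(\theta)} \]
where \(DD\), \(DR\) and \(RR\) are the normalised data-data, data-random and random-random pair counts in the bin at separation \(\theta\), with each galaxy entering the counts with its systematic weight.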
Null test: after correction, the cross-correlation of the weighted galaxy density field with each template should be consistent with zero.
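Jack-knife covariance:
\[ C_{ab} = \frac{N-1}{N} \sum_{i=1}^{N} \left[w_i(\theta_a) - \bar w(\theta_a)\right]\left[w_i(\theta_b) - \bar w(\theta_b)\right] \]
where \(w_i\) is the estimator with patch \(i\) removed, \(\bar w\) is the mean of the \(N\) jack-knife estimates, and \(a\), \(b\) label angular bins.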
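As an illustration of the delete-one jack-knife covariance, here is a minimal standard-library sketch. The function name `jackknife_cov_sketch` and the toy numbers are hypothetical and are not the `bootstrap.jackknife_covariance` API; the input is assumed to be a list of estimator vectors, one per removed patch.

```python
from statistics import fmean


def jackknife_cov_sketch(w_jk):
    """Delete-one jack-knife covariance.

    w_jk[i][a] is the estimator in angular bin a with patch i removed.
    Returns C[a][b] = (N - 1) / N * sum_i (w_i[a] - mean[a]) * (w_i[b] - mean[b]).
    """
    n_patch = len(w_jk)
    n_bin = len(w_jk[0])
    # Mean of the jack-knife estimates in each bin.
    mean = [fmean(row[a] for row in w_jk) for a in range(n_bin)]
    prefactor = (n_patch - 1) / n_patch
    cov = [[0.0] * n_bin for _ in range(n_bin)]
    for a in range(n_bin):
        for b in range(n_bin):
            cov[a][b] = prefactor * sum(
                (row[a] - mean[a]) * (row[b] - mean[b]) for row in w_jk
            )
    return cov


# Toy input: 4 patches, 2 angular bins (made-up values).
w_jk = [
    [0.10, 0.050],
    [0.12, 0.055],
    [0.08, 0.045],
    [0.11, 0.052],
]
cov = jackknife_cov_sketch(w_jk)
```

The prefactor \((N-1)/N\) compensates for the strong overlap between delete-one samples, which is why a small number of patches makes the estimate noisy.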
Pros and cons
Pros: Intuitive per-pixel weighting. Jack-knife uncertainties are conservative and capture spatial correlations. Multiple independent null tests can be run simultaneously.
Cons: Masking reduces effective area. Jack-knife requires many patches for the prefactor to be accurate. Correlations between systematics are not jointly modelled.
Implementation status in sys_mapping
Landy-Szalay estimator: implemented (utils.measure_two_point_function).
Jack-knife covariance: implemented (bootstrap.jackknife_covariance).
Null test cross-correlations: implemented (diagnostics.null_test_cross_correlations).
Ho et al. 2012
Methods: Optimal quadratic estimator (QMV) for the galaxy angular power spectrum with systematic control. The covariance-weighted estimator minimises variance while accounting for mode-coupling from the survey mask. Systematic templates are handled via mode projection in harmonic space.
Key equations — optimal quadratic estimator and Fisher matrix:
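For orientation, the textbook form of the quadratic minimum-variance estimator and its Fisher matrix can be written as below. This is the standard notation for a data vector \(\mathbf{x}\) with total covariance \(\mathbf{C} = \mathbf{S} + \mathbf{N}\); the symbols and any template-projection terms used by Ho et al. 2012 may differ.

```latex
\hat{C}_\ell = \sum_{\ell'} \left(F^{-1}\right)_{\ell\ell'} \left(q_{\ell'} - b_{\ell'}\right),
\qquad
q_\ell = \tfrac{1}{2}\, \mathbf{x}^{\rm T} \mathbf{C}^{-1} \mathbf{P}_\ell\, \mathbf{C}^{-1} \mathbf{x},
\qquad
b_\ell = \tfrac{1}{2}\, \mathrm{Tr}\!\left[\mathbf{C}^{-1} \mathbf{P}_\ell\, \mathbf{C}^{-1} \mathbf{N}\right]

F_{\ell\ell'} = \tfrac{1}{2}\, \mathrm{Tr}\!\left[\mathbf{C}^{-1} \mathbf{P}_\ell\, \mathbf{C}^{-1} \mathbf{P}_{\ell'}\right],
\qquad
\mathbf{P}_\ell = \frac{\partial \mathbf{C}}{\partial C_\ell}
```

Since \(\mathbf{S} = \sum_\ell C_\ell \mathbf{P}_\ell\), the expectation of \(q_\ell\) is \(\sum_{\ell'} F_{\ell\ell'} C_{\ell'} + b_\ell\), so the estimator is unbiased when the assumed covariance is correct.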
Pros and cons
Pros: Statistically optimal (minimum variance). Rigorous treatment of data covariance including noise and mask coupling.
Cons: Requires explicit covariance matrix inversion — scales poorly to large pixel numbers. Systematic templates must be characterised a priori.
Implementation status in sys_mapping
Harmonic-space pseudo-\(C_\ell\) estimator (simpler, mask-based): implemented (power_spectrum.measure_pseudo_cl).
Full QMV optimal estimator: not planned (high computational cost; NaMaster/pymaster covers this use case).
Elsner, Leistedt & Peiris 2016
Methods: Three complementary harmonic-space approaches with closed-form bias expressions:
Template subtraction (TS): subtract \(\hat\alpha_i\,\tilde C_\ell^{t_i}\) from the measured pseudo-\(C_\ell\).
Basic mode projection (BMP): marginalise over template modes via a modified pixel covariance.
Extended mode projection (EMP): threshold-based exclusion of modes dominated by templates, with an analytical bias correction.
Key equations:
Cleaned power spectrum (TS):
Additive bias from template subtraction (\(n\) templates):
EMP bias (Eq. 27 of Elsner+16):
Pros and cons
Pros: Closed-form bias corrections; analytically tractable. Template subtraction is fast. Mode projection does not require fitting contamination amplitudes.
Cons: Template subtraction biases scale as \(n/(2\ell+1)\) — significant at low \(\ell\) with many templates. EMP bias depends on signal-to-noise and threshold choice. Mixing from the survey mask breaks spherical harmonic orthogonality.
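To give a feel for the \(n/(2\ell+1)\) scaling quoted above, the short sketch below tabulates it for a few multipoles. It assumes the fractional bias is exactly \(n/(2\ell+1)\); the precise prefactor in Elsner+16 may differ, and the function name is hypothetical.

```python
def ts_fractional_bias(n_templates, ell):
    """Fractional power bias n / (2 * ell + 1), the scaling quoted in the text."""
    return n_templates / (2 * ell + 1)


# Five templates: order-unity bias at the quadrupole, percent-level by ell ~ 100.
for ell in (2, 10, 100):
    print(ell, round(ts_fractional_bias(5, ell), 3))
```

With five templates the ratio is 1.0 at \(\ell = 2\), about 0.24 at \(\ell = 10\) and about 0.025 at \(\ell = 100\), which is why the low multipoles are the ones most affected.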
Implementation status in sys_mapping
Template subtraction in harmonic space: implemented (power_spectrum.subtract_template_cl).
Harmonic bias formula: implemented (power_spectrum.harmonic_bias).
Basic and extended mode projection: implemented (power_spectrum.mode_projection_bias).
Full correction pipeline: implemented (correction.correct_power_spectrum_harmonic).
Leistedt & Peiris 2014
Methods: Introduces extended mode projection (EMP) as a blind systematic mitigation strategy. Templates are identified from the data themselves by iteratively projecting out the modes most correlated with external maps, without prior knowledge of contamination amplitudes.
Key idea: project a set of templates \(\{f_i\}\) out of the data vector before power spectrum estimation:
This is iterated until the cross-power spectrum of the cleaned field with each template is consistent with zero.
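The projection step can be sketched with a plain least-squares projector, \(d' = d - F (F^{\rm T} F)^{-1} F^{\rm T} d\), where the columns of \(F\) are the templates. This is a generic linear deprojection written under that assumption, not necessarily the exact operator of Leistedt & Peiris 2014; all function names and numbers are hypothetical. Gram-Schmidt orthonormalisation is used so that correlated templates are handled without an explicit matrix inverse.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalise(templates, tol=1e-12):
    """Modified Gram-Schmidt; linearly dependent templates are dropped."""
    basis = []
    for t in templates:
        v = list(t)
        for q in basis:
            c = dot(q, v)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > tol:
            basis.append([vi / norm for vi in v])
    return basis


def project_out(data, templates):
    """Return data with its component in span(templates) removed."""
    cleaned = list(data)
    for q in orthonormalise(templates):
        c = dot(q, cleaned)
        cleaned = [di - c * qi for di, qi in zip(cleaned, q)]
    return cleaned


# Toy example: a 6-pixel "signal" contaminated by two templates (made-up values).
t1 = [1.0, -1.0, 1.0, -1.0, 0.0, 0.0]
t2 = [1.0, 1.0, -1.0, -1.0, 2.0, -2.0]
signal = [0.3, -0.2, 0.1, 0.4, -0.5, -0.1]
data = [s + 0.8 * a - 0.5 * b for s, a, b in zip(signal, t1, t2)]

cleaned = project_out(data, [t1, t2])
signal_projected = project_out(signal, [t1, t2])
```

The cleaned vector is orthogonal to both templates regardless of the contamination amplitudes, but it equals the projected signal rather than the signal itself: the part of the true signal that overlaps with the templates is removed as well, which is the power loss noted under the cons below.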
Pros and cons
Pros: Blind — no prior knowledge of contamination amplitudes needed. Naturally handles correlated templates.
Cons: Each projected template costs one mode per \(\ell\) band — significant variance penalty for large template sets. Power loss must be corrected via a transfer function.
Implementation status in sys_mapping
EMP basis: implemented as part of power_spectrum.mode_projection_bias.
Blind iterative variant: planned (future extension of power_spectrum).
Elsner, Leistedt & Peiris 2017
Methods: Integrates mode projection directly into the pseudo-\(C_\ell\) estimator, deriving exact closed-form expressions for the deprojection bias within the MASTER/PCL framework. Extends the 2016 work to handle arbitrary mask geometry and pixel weights.
Key result: the pseudo-\(C_\ell\) after EMP is related to the true power spectrum by a modified coupling matrix \(M_{\ell\ell'}^{\rm proj}\) that is efficiently computable from the mask.
Pros and cons
Pros: Provides exact bias correction in the pseudo-\(C_\ell\) framework. Suitable for non-Gaussian fields and realistic survey footprints.
Cons: Coupling matrix computation scales as \(O(\ell_{\max}^3)\); large surveys require optimised implementations (e.g. NaMaster).
Implementation status in sys_mapping
Coupling matrix computation: partially implemented via power_spectrum.mode_projection_bias (analytical approximation).
Full coupling matrix: deferred (requires NaMaster as optional backend).
Weaverdyck & Huterer 2021
Methods: Systematic comparison of four decontamination methods applied to the same simulated and real data:
DES-Y1 iterative method: multiplicative weights \(w = (1 + \alpha T)^{-1}\) iterated to convergence.
Template subtraction: subtract the cross-correlation component.
Mode projection: harmonic-space marginalisation.
ElasticNet regression: L1+L2 penalised linear regression.
ElasticNet loss:
Per-pixel weight from regression:
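A minimal sketch of an ElasticNet fit and the resulting weights follows. It assumes the conventional objective \(\frac{1}{2N}\sum_p (\delta_p - \sum_j \alpha_j t_{jp})^2 + \lambda_1 \sum_j |\alpha_j| + \frac{\lambda_2}{2} \sum_j \alpha_j^2\) and, by analogy with the DES-Y1 weights above, per-pixel weights \(w_p = (1 + \sum_j \alpha_j t_{jp})^{-1}\). The normalisation used by Weaverdyck & Huterer 2021 may differ. The function names are hypothetical and are not the `regression.elasticnet_contamination_fit` API.

```python
def soft_threshold(x, lam):
    """Soft-thresholding operator that produces the L1 sparsity."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def elasticnet_fit(templates, delta, lam1, lam2, n_iter=200):
    """Coordinate descent for the ElasticNet objective stated in the lead-in.

    templates[j][p] is the value of template j in pixel p;
    delta[p] is the observed overdensity in pixel p.
    """
    n_pix = len(delta)
    alpha = [0.0] * len(templates)
    resid = list(delta)  # delta - sum_j alpha_j * t_j, kept up to date
    for _ in range(n_iter):
        for j, t in enumerate(templates):
            z = sum(tp * tp for tp in t) / n_pix
            # Correlation of template j with the residual excluding its own term.
            rho = sum(tp * (rp + alpha[j] * tp) for tp, rp in zip(t, resid)) / n_pix
            new = soft_threshold(rho, lam1) / (z + lam2)
            resid = [rp - (new - alpha[j]) * tp for rp, tp in zip(resid, t)]
            alpha[j] = new
    return alpha


def pixel_weights(templates, alpha):
    """Assumed weight model: w_p = 1 / (1 + sum_j alpha_j * t_jp)."""
    n_pix = len(templates[0])
    return [
        1.0 / (1.0 + sum(a * t[p] for a, t in zip(alpha, templates)))
        for p in range(n_pix)
    ]


# Toy data: contamination along t1 only, plus a "signal" orthogonal to both templates.
t1 = [1.0, -1.0, 1.0, -1.0]
t2 = [1.0, 1.0, -1.0, -1.0]
t3 = [1.0, -1.0, -1.0, 1.0]
templates = [t1, t2]
delta = [0.1 * a + 0.02 * c for a, c in zip(t1, t3)]

alpha = elasticnet_fit(templates, delta, lam1=0.01, lam2=0.0)
weights = pixel_weights(templates, alpha)
```

In this toy case the L1 term shrinks the fitted amplitude of `t1` from 0.1 to 0.09 and sets the amplitude of the irrelevant template `t2` exactly to zero, which is the sparsity property referred to in the pros below.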
Pros and cons
Pros: ElasticNet naturally prevents overfitting and finds sparse solutions. The comparison framework reveals method-dependent biases. No iterative convergence issue for ElasticNet.
Cons: Regularisation hyperparameters (\(\lambda_1, \lambda_2\)) require tuning. No single method dominates in all scenarios.
Implementation status in sys_mapping
ElasticNet regression: implemented (regression.elasticnet_contamination_fit).
Iterative OLS: implemented (regression.iterative_systematics_decontamination).
Method comparison framework: implemented (regression.method_comparison).
Rezaie et al. 2020
Methods: Artificial neural networks (ANN) trained to predict the galaxy density field from observational systematic maps (Galactic extinction, seeing, stellar density). The ratio of the predicted to mean density defines per-pixel weights. The method is model-free and captures non-linear systematic dependencies.
Weight definition:
where \(n_g^{\rm ANN}\) is the ANN-predicted galaxy count from systematic templates.
Pros and cons
Pros: Captures non-linear and cross-template systematic dependencies. No functional form assumed for contamination.
Cons: Prone to overfitting when templates are correlated with large-scale structure. Requires careful validation (e.g. cross-validation on disjoint sky regions). Less interpretable than linear methods.
Implementation status in sys_mapping
ANN-based systematic mapping: not planned (out of scope for the current Bayesian likelihood framework; recommended as external pre-processing step).
Alonso et al. 2019 — NaMaster
Methods: NaMaster (pymaster) provides a unified pseudo-\(C_\ell\) estimator with:
Exact mask mode-coupling matrix computation.
E/B-mode purification for spin-2 fields.
Template deprojection in harmonic space.
Support for arbitrary pixelisation schemes (HEALPix and flat-sky).
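MASTER equation:
\[ \langle \tilde C_\ell \rangle = \sum_{\ell'} M_{\ell\ell'}\, C_{\ell'} \]
where \(\tilde C_\ell\) is the pseudo-\(C_\ell\) measured on the masked sky, \(C_\ell\) is the true power spectrum, and \(M_{\ell\ell'}\) is the mode-coupling matrix computed from the mask power spectrum.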
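The role of the coupling matrix can be illustrated by inverting the linear relation \(\langle \tilde C_\ell \rangle = \sum_{\ell'} M_{\ell\ell'} C_{\ell'}\) on a toy problem. The sketch below is not NaMaster's API: the 3x3 matrix is made up, noise bias is ignored, and the small Gaussian-elimination solver stands in for the binned inversion a production code would use.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("coupling matrix is singular; bin the multipoles first")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def couple(matrix, cl):
    """Forward model: pseudo-C_ell = M @ C_ell."""
    return [sum(m * c for m, c in zip(row, cl)) for row in matrix]


# Made-up coupling matrix and band powers, for illustration only.
coupling = [
    [0.60, 0.05, 0.01],
    [0.05, 0.60, 0.05],
    [0.01, 0.05, 0.60],
]
cl_true = [1.0e-4, 5.0e-5, 2.0e-5]

pseudo_cl = couple(coupling, cl_true)
cl_est = solve_linear(coupling, pseudo_cl)
```

On a cut sky the unbinned matrix is often close to singular, so real pipelines bin the multipoles into bandpowers before inverting; the explicit error above is a reminder of that.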
Pros and cons
Pros: Highly optimised coupling matrix computation. Standard reference implementation used by DES, DESI, Euclid.
Cons: External dependency; adds build complexity. Coupling matrix computation can be expensive for large \(\ell_{\max}\).
Implementation status in sys_mapping
measure_pseudo_cl uses healpy.anafast (no NaMaster required).
NaMaster: planned as an optional backend for exact coupling matrix computation in power_spectrum.mode_projection_bias.
Berlfein, Mandelbaum & Schafer 2024
Methods:
The primary reference for sys_mapping. Defines three nested
contamination models and a joint Bayesian MCMC framework for inferring
their parameters.
Contamination model (combined):
Gaussian log-likelihood (Eq. 17):
Two-point function correction (Eq. 15–16):
Noise debiasing (Eq. 21):
Pros and cons
Pros: Joint treatment of additive and multiplicative systematics avoids double-counting. PCA template rotation decorrelates parameters. Noise debiasing removes variance inflation. Skew-normal likelihood handles lognormal galaxy fields.
Cons: MCMC is slower than regression approaches. Assumes Gaussian/skew-normal pixel overdensities. Linear contamination model may miss non-linear systematics.
Implementation status in sys_mapping
All core methods from Berlfein+2024 are fully implemented:
Forward/inverse contamination model: contamination
JAX JIT-compiled likelihoods: likelihood
emcee MCMC inference: inference
PCA template rotation + noise debiasing: correction
Two-point function correction: contamination.compute_two_point_correction
Likelihood ratio model selection: model_selection
Rodríguez-Monroy et al. 2025
Methods: Two complementary strategies applied to DES Y6 galaxy clustering:
Iterative Systematics Decontamination (ISD): OLS regression on polynomial expansions of templates up to 3rd order, iterated to convergence.
Footprint masking: conservative masking of survey regions with high systematic sensitivity, enabling simpler and more robust mitigation.
ISD cleansed overdensity:
where \(f_{\rm add}\) and \(f_{\rm mult}\) are the additive and multiplicative systematic components estimated from polynomial template regression.
Pros and cons
Pros: ISD captures non-linear systematic dependencies via polynomial template expansion. Footprint masking provides conservative, robust improvement.
Cons: Polynomial expansion grows combinatorially with template number and polynomial order. Masking reduces effective survey area.
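The growth of the regressor count can be made concrete with a short calculation. The sketch below counts non-constant monomials of total degree up to `max_order` in `n_templates` variables; whether the published ISD includes cross terms between templates is an assumption here, so both cases are shown. The function name is hypothetical.

```python
from math import comb


def n_polynomial_terms(n_templates, max_order, cross_terms=True):
    """Number of non-constant polynomial regressors.

    With cross terms: all monomials of total degree 1..max_order,
    i.e. C(n + d, d) - 1. Without: n separate powers per order.
    """
    if cross_terms:
        return comb(n_templates + max_order, max_order) - 1
    return n_templates * max_order


for n in (5, 10, 20):
    print(n, n_polynomial_terms(n, 3), n_polynomial_terms(n, 3, cross_terms=False))
```

For ten templates at third order this gives 285 regressors with cross terms against 30 without, and 1770 against 60 for twenty templates.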
Implementation status in sys_mapping
ISD (polynomial OLS iteration): implemented (regression.iterative_systematics_decontamination).
Footprint masking diagnostics: implemented (diagnostics.footprint_mask_diagnostics).
Cornish et al. 2026
Methods: Extends template deprojection to work directly on discrete source catalogues (not pixelised maps), avoiding pixelisation-induced power leakage. Introduces a transfer function calibration loop to correct for power loss from deprojection.
Catalogue-based deprojection:
Transfer function calibration:
Pros and cons
Pros: Avoids pixelisation bias. Transfer function corrects for deprojection power loss. Exact treatment for discrete catalogues.
Cons: Transfer function calibration requires simulations (expensive). Mode-coupling increases in complexity for catalogue-based fields.
Implementation status in sys_mapping
Catalogue-based deprojection: deferred (planned as a deprojection module once power_spectrum is validated).
Transfer function calibration: deferred.
Tanidis et al. 2026
Methods: Quadratic estimator exploiting E/B mode coupling to reconstruct spatially-varying multiplicative bias \(m(\hat n)\) in weak lensing surveys. Three signal-to-noise ratio definitions are introduced for template detection and ranking.
Quadratic estimator:
Three SNR definitions:
Pros and cons
Pros: Detects percent-level multiplicative bias at high significance for Stage IV surveys. Mode-coupling signal is insensitive to additive (c-bias) contamination.
Cons: Primarily developed for weak lensing shear fields; adaptation to galaxy density requires modification. Requires knowledge of the cosmic shear power spectrum.
Implementation status in sys_mapping
SNR-based template ranking (all three definitions adapted for galaxy clustering): implemented (diagnostics.snr_template_ranking).
Full quadratic estimator for shear E/B coupling: not planned (specific to weak lensing; outside the scope of galaxy density maps).