sys_mapping.regression
Regression-based systematic decontamination: ElasticNet and Iterative Systematics Decontamination (ISD).
Inputs: Observed overdensity delta_g_obs (shape (n_pix,)),
template maps delta_t (shape (n_sys, n_pix)).
Outputs: Amplitude vector alpha_hat, per-pixel weight array
weights, and diagnostic metadata.
elasticnet_contamination_fit()— L1+L2 penalised regression; requiresscikit-learn >= 1.3(pip install sys_mapping[regression]).iterative_systematics_decontamination()— polynomial OLS expansion, no extra dependencies.method_comparison()— run multiple methods and return a unified result dict.
Key papers: Weaverdyck & Huterer 2021; Rodríguez-Monroy et al. 2025 — see also Methods Reference.
Regression-based systematic decontamination methods.
- Implements:
ElasticNet regularised regression (Weaverdyck & Huterer 2021)
Iterative Systematics Decontamination via polynomial OLS (Rodríguez-Monroy+2025)
Multi-method comparison framework (Weaverdyck & Huterer 2021)
Unified run_decontamination() interface for all 6 methods
- sys_mapping.regression.elasticnet_contamination_fit(delta_g_obs, delta_t, *, l1_ratio=0.5, alpha_reg=None, cv_folds=5, max_iter=10000, fit_intercept=False)[source]
Fit systematic contamination amplitudes via ElasticNet regression.
Minimises the penalised least-squares loss (Weaverdyck & Huterer 2021, Eq. 9):
\[\mathcal{L} = \frac{1}{2N_{\rm pix}} \left\|\delta_g - \sum_i\alpha_i\,t_i\right\|_2^2 + \lambda_1\|\boldsymbol\alpha\|_1 + \lambda_2\|\boldsymbol\alpha\|_2^2\]where the total regularisation strength and L1 fraction are controlled by
alpha_regandl1_ratio.- Parameters:
delta_g_obs (ndarray) – Observed galaxy overdensity (shape
(n_pix,)).delta_t (ndarray) – Template maps (shape
(n_sys, n_pix)).l1_ratio (float) – ElasticNet mixing parameter: 1.0 = Lasso, 0.0 = Ridge.
alpha_reg (float | None) – Regularisation strength. If
None, tuned automatically viacv_folds-fold cross-validation (ElasticNetCV).cv_folds (int) – Number of cross-validation folds used when
alpha_reg=None.max_iter (int) – Maximum coordinate descent iterations.
fit_intercept (bool) – Whether to fit a constant offset. Typically
Falsebecause the overdensity field has zero mean by construction.
- Returns:
alpha_hat – Fitted contamination amplitudes (shape
(n_sys,)).weights – Per-pixel systematic weights
w(p) = 1 / (1 + alpha_hat · t(p))(shape(n_pix,)).cv_info – Dictionary with keys
alpha_reg(used value),l1_ratio, andcv_scores(array of cross-validation MSE scores, orNoneifalpha_regwas provided explicitly).
- Return type:
Notes
Requires
scikit-learn >= 1.3. Install via:pip install "sys_mapping[regression]"
The per-pixel weight corresponds to the DES-Y1 / Weaverdyck+2021 form \(w(p) = (1 + \hat{\boldsymbol\alpha}\cdot\mathbf{t}(p))^{-1}\).
Examples
>>> import numpy as np >>> from sys_mapping import elasticnet_contamination_fit >>> rng = np.random.default_rng(42) >>> n_pix, n_sys = 5000, 3 >>> delta_t = rng.standard_normal((n_sys, n_pix)) >>> true_alpha = np.array([0.2, -0.1, 0.05]) >>> delta_g = true_alpha @ delta_t + rng.standard_normal(n_pix) * 0.3 >>> alpha_hat, weights, info = elasticnet_contamination_fit( ... delta_g, delta_t, alpha_reg=1e-4) >>> weights.shape (5000,) >>> np.all(weights > 0) True
References
Weaverdyck & Huterer 2021, MNRAS 503, 5061.
- sys_mapping.regression.iterative_systematics_decontamination(delta_g_obs, delta_t, *, poly_order=3, max_iter=20, tol=1e-05, lambda_poly=0.0)[source]
Iterative Systematics Decontamination (ISD) via polynomial OLS.
Expands templates to include cross-products up to
poly_order, then iteratively fits OLS weights until convergence. Implements the approach of Rodríguez-Monroy et al. 2025 (DES Y6), Sec. 3.2.The ISD cleansed overdensity after convergence is:
\[\delta_g^{\rm clean}(p) = \frac{\delta_g(p) - f_{\rm add}(p)}{1 + f_{\rm mult}(p)}\]where \(f_{\rm add}\) and \(f_{\rm mult}\) are the additive and multiplicative systematic components estimated from polynomial template regression (here approximated by a single linear OLS pass per iteration).
- Parameters:
delta_g_obs (ndarray) – Observed galaxy overdensity (shape
(n_pix,)).delta_t (ndarray) – Template maps (shape
(n_sys, n_pix)).poly_order (int) – Maximum polynomial order for template expansion. Order 1 = linear templates only; order 2 adds pairwise products; order 3 adds triple products. Automatically reduced if the expanded template matrix is underdetermined.
max_iter (int) – Maximum number of OLS iterations.
tol (float) – Convergence tolerance on the relative change in weights (
||w_new - w_old|| / ||w_old||).lambda_poly (float) – Ridge penalty applied exclusively to the polynomial (non-linear) expansion columns — the first
n_syslinear columns are never penalised. A small positive value (e.g.1e-3 * np.var(delta_g_obs)) prevents the polynomial terms from absorbing signal that belongs to the linear coefficients, which are the only ones used for the weight update. Defaults to0.0(plain OLS, backward-compatible).
- Returns:
weights – Per-pixel systematic weights (shape
(n_pix,)), all positive.alpha_hat_all – Fitted coefficients for the expanded template matrix at the final iteration (shape
(n_expanded,)).n_iterations – Number of iterations performed (1 if converged on first pass).
- Return type:
Notes
Template expansion uses
itertools.combinations_with_replacement— no new dependencies. Forn_sys=5,poly_order=3the expanded set has \(\binom{n+k}{k} - 1 = 55\) columns. Ifn_pix < 10 × n_expanded,poly_orderis reduced with a warning.Speed: the design matrix
Xis constant across iterations; only the diagonal weight matrix changes. The per-iteration cost is therefore one BLAS-3 callX^T diag(w) Xof shape(n_expanded, n_expanded)plus a smalllstsqsolve — roughly 5–15× faster than SVD of the full(n_pix × n_expanded)matrix that the previous sqrt-weight formulation required.Examples
>>> import numpy as np >>> from sys_mapping import iterative_systematics_decontamination >>> rng = np.random.default_rng(0) >>> n_pix, n_sys = 8000, 3 >>> delta_t = rng.standard_normal((n_sys, n_pix)) >>> true_alpha = np.array([0.15, -0.08, 0.04]) >>> delta_g = true_alpha @ delta_t + rng.standard_normal(n_pix) * 0.4 >>> weights, alpha_all, n_it = iterative_systematics_decontamination( ... delta_g, delta_t, poly_order=2, max_iter=10) >>> weights.shape (8000,) >>> np.all(weights > 0) True
References
Rodríguez-Monroy et al. 2025, arXiv:2509.07943.
- sys_mapping.regression.method_comparison(delta_g_obs, delta_t, good_pixels, nside, methods=('combined_mcmc', 'additive_mcmc', 'elasticnet', 'ols'), seed=42, mcmc_kwargs=None)[source]
Run multiple decontamination methods on the same data and compare.
Implements the multi-method comparison framework of Weaverdyck & Huterer 2021 (Sec. 4). Each method returns coefficients, per-pixel weights, and a goodness-of-fit metric, enabling direct comparison.
- Parameters:
delta_g_obs (ndarray) – Observed overdensity (shape
(n_pix,)).delta_t (ndarray) – Template maps (shape
(n_sys, n_pix)).good_pixels (ndarray) – Boolean mask (shape
(n_full_pix,)). Passed through for spatial context but not used in computation directly (templates already masked).nside (int) – HEALPix NSIDE of the full map.
methods (tuple[str, ...]) – Subset of
{"combined_mcmc", "additive_mcmc", "elasticnet", "ols"}."combined_mcmc"(the default first entry) is the recommended method; it infers both additive and multiplicative amplitudes jointly.seed (int) – Random seed for MCMC initialisation.
mcmc_kwargs (dict | None) – Keyword arguments forwarded to
run_mcmc()for the"additive_mcmc"method.
- Returns:
Dictionary keyed by method name. Each value is a dict with:
alpha_hatora_hat: contamination amplitudesweights: per-pixel weights (shape(n_pix,))chi2_residual: sum-of-squared residuals / n_pix
- Return type:
results
Notes
"additive_mcmc"requires no optional dependencies (uses the built-in MCMC pipeline)."elasticnet"requiresscikit-learn.Examples
>>> import numpy as np >>> from sys_mapping import method_comparison >>> rng = np.random.default_rng(0) >>> n_pix, n_sys = 3000, 2 >>> delta_t = rng.standard_normal((n_sys, n_pix)) >>> delta_g = 0.1 * delta_t[0] + rng.standard_normal(n_pix) * 0.5 >>> good = np.ones(n_pix, dtype=bool) >>> results = method_comparison(delta_g, delta_t, good, nside=16, ... methods=("elasticnet", "ols")) >>> set(results.keys()) == {"elasticnet", "ols"} True >>> "weights" in results["ols"] True
References
Weaverdyck & Huterer 2021, MNRAS 503, 5061.
- sys_mapping.regression.run_decontamination(method, delta_g_obs, delta_t, *, n_walkers=100, n_steps=1200, n_burn=200, seed=42, use_skewed=False, progress=False, cv_folds=5, isd_max_iter=50, isd_lambda_poly=0.0)[source]
Run a single decontamination method and return a standardised result dict.
This is the unified entry point used by both validation scripts and production pipelines. All 6 methods share the same interface so callers can iterate over a list of method names without any method-specific boilerplate.
- Parameters:
method (str) – One of
"OLS","ElasticNet","ISD-1","ISD-3","MCMC-add","MCMC-comb".delta_g_obs (ndarray) – Observed galaxy overdensity at unmasked pixels, shape
(n_pix,).delta_t (ndarray) – Template maps in the original (non-rotated) basis, shape
(n_sys, n_pix). MCMC methods rotate internally.n_walkers (int) – MCMC sampler settings.
n_steps (int) – MCMC sampler settings.
n_burn (int) – MCMC sampler settings.
seed (int) – Random seed for MCMC walker initialisation.
use_skewed (bool) – Use skew-normal likelihood for MCMC-comb. Ignored for all other methods.
progress (bool) – Show emcee progress bar (MCMC methods only).
cv_folds (int) – Cross-validation folds for ElasticNet (when
alpha_regis auto).isd_max_iter (int) – Maximum iterations for ISD methods.
isd_lambda_poly (float) – Ridge penalty on the polynomial expansion columns for ISD-3 (see
iterative_systematics_decontamination()). A value of1e-3 * np.var(delta_g_obs)is recommended when ISD-3 converges to implausible solutions. Has no effect for ISD-1 (poly_order=1 has no polynomial-only columns). Default0.0(plain OLS).
- Returns:
dict with keys
All methods –
a_hat(n_sys,),b_hat(n_sys,),weights(n_pix,),elapsed_s(float).MCMC methods additionally –
flat_chain,sampler,a_rot,b_rot,cov_a_rot,cov_b_rot,cov_a,cov_b,R,acceptance_fraction,sigma_hat.ISD methods additionally –
n_iterations(int).ElasticNet additionally –
cv_info(dict).Non-applicable keys are
None.
- Return type:
Notes
Weight convention
OLS / ElasticNet / ISD-1 / ISD-3 / MCMC-add:
w(p) = 1 / (1 + a_hat @ t(p))— additive weight.MCMC-comb:
w(p) = (1 + δ_g_clean(p)) / (1 + δ_g_obs(p))— exact pixel-level inverse of the contamination weight; cancelsweight_contexactly when (a_hat, b_hat) equal the true parameters.
References
Berlfein et al. 2024, MNRAS 531, 4954. Weaverdyck & Huterer 2021, MNRAS 503, 5061. Rodríguez-Monroy et al. 2025, arXiv:2509.07943.