sys_mapping.model_selection

Likelihood ratio test (LRT) for comparing nested contamination models.

likelihood_ratio_test() computes \(\lambda_{\rm LR} = 2(\ln\mathcal{L}_{\rm alt} - \ln\mathcal{L}_{\rm null})\) and returns the p-value against a \(\chi^2(r)\) distribution where \(r\) is the difference in free parameters.

Typical use: test additive (null) vs. combined (alternative) to decide whether multiplicative contamination is significant.

Key paper: Berlfein et al. 2024, Eq. 19 — see also Methods Reference.

Likelihood ratio test for model selection (Eq. 19, Berlfein et al. 2024).

The likelihood ratio statistic:

λ_LR = 2 [ln L(Θ̂) - ln L(Θ̂_0)]

is asymptotically χ²(r) distributed under the null hypothesis, where r = dim(Θ̂) - dim(Θ̂_0) is the number of additional free parameters.

class sys_mapping.model_selection.LikelihoodRatioResult(lambda_lr: 'float', n_dof: 'int', p_value: 'float', reject_null: 'bool', null_model: 'str', alt_model: 'str')[source]

Bases: object

Parameters:
lambda_lr: float
n_dof: int
p_value: float
reject_null: bool
null_model: str
alt_model: str
sys_mapping.model_selection.likelihood_ratio_test(delta_g_obs, delta_t, theta_null, theta_alt, null_model, alt_model, use_skewed=False, significance=0.05)[source]

Perform a likelihood ratio test comparing null vs. alternative model (Eq. 19).

Computes λ_LR = 2 [ln L(θ̂_alt) ln L(θ̂_null)], which under the null hypothesis is asymptotically χ²(r) where r = dim(alt) dim(null) is the extra degrees of freedom.

Note

Each call creates new JIT-compiled likelihood functions internally. For repeated calls with the same data, precompute the log-likelihoods using make_log_likelihood() to avoid recompilation.

Parameters:
  • delta_g_obs ((n_pix,) observed overdensity)

  • delta_t ((n_sys, n_pix) template values)

  • theta_null (flat parameter vector for the null model (from get_mle_params()))

  • theta_alt (flat parameter vector for the alternative model)

  • null_model (str 'additive' or 'multiplicative' (must be nested in alt_model))

  • alt_model (str 'combined' (must have more free parameters than null_model))

  • use_skewed (bool)

  • significance (float p-value threshold for rejecting the null; default 0.05)

Returns:

  • LikelihoodRatioResultlambda_lr : test statistic (≥ 0 only when both theta are at their MLEs) n_dof : degrees of freedom r = n_free_params(alt) n_free_params(null) p_value : Pr(χ²(r) > λ_LR) under the null reject_null: True when p_value < significance null_model, alt_model: model names

  • Performance

  • ———–

  • Measured on CPU (n_pix=10_000, n_sys=3) (*~481 ms/call (includes JIT*)

  • compilation of two likelihood functions, ~227 ms each on first call).

  • In a pipeline where likelihoods are already compiled, evaluation is ~1 ms.

  • Precision

  • ———

  • p-value is in [0, 1] by construction.

  • lambda_lr = 0 when theta_alt uses b=0 (equivalent to the null).

  • For large n_pix, lambda_lr is χ²-distributed under the null asymptotically.

Return type:

LikelihoodRatioResult

Examples

>>> import numpy as np
>>> from sys_mapping import likelihood_ratio_test, pack_params
>>> rng = np.random.default_rng(0)
>>> n_pix, n_sys = 5000, 3
>>> delta_g = rng.standard_normal(n_pix) * 0.1
>>> delta_t = rng.standard_normal((n_sys, n_pix))
>>> a_hat = np.zeros(n_sys)
>>> b_hat = np.zeros(n_sys)
>>> sigma  = float(delta_g.std())
>>> theta_add  = pack_params(a_hat, None,  sigma, model="additive")
>>> theta_comb = pack_params(a_hat, b_hat, sigma, model="combined")
>>> result = likelihood_ratio_test(delta_g, delta_t,
...                                theta_add, theta_comb,
...                                "additive", "combined")
>>> result.n_dof   # combined has n_sys extra b parameters
3
>>> 0.0 <= result.p_value <= 1.0
True