Running the real-data pipeline

This page describes the input data and how to run sys_mapping on the Legacy Survey DR10 Bright Galaxy Survey (LS10 BGS), following the DESI-BGS selection described by Hahn et al. 2023. The galaxy and random samples are those used in Comparat et al. 2025a and are available on Zenodo record 15111974. For the mathematical background see Methods Reference; for the synthetic-mock tutorial see Quickstart. Results are documented in Results: systematic weights.

Warning

If --template-dir is omitted, the script falls back to 5 synthetic template families. Always pass --template-dir pointing to the directory of real GAIA + LS10 FITS maps to obtain scientifically meaningful results:

--template-dir ~/data/legacysurvey/dr10/systematics

Input data

BGS VLIM samples

Nine volume-limited stellar-mass threshold samples spanning \(0.08 < z < 0.35\). Each is a galaxy + random FITS pair located under --catalog-dir (default ~/data/legacysurvey/dr10/sweep/BGS_VLIM_Mstar).

=======  =====  =========  ==========
log M*   zmax   Ngal       Nrand
=======  =====  =========  ==========
9.0      0.08   523 486    2 617 332
9.5      0.12   1 432 502  7 160 697
10.0     0.18   2 759 238  13 795 884
10.25    0.22   3 308 841  16 544 481
10.5     0.26   3 263 228  16 315 418
10.75    0.31   2 802 710  14 013 316
11.0     0.35   1 619 838  8 097 853
11.25    0.35   541 855    2 708 912
11.5     0.35   120 882    606 304
=======  =====  =========  ==========
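As a quick sanity check on the table above, the random-to-galaxy ratio of each sample can be computed directly from the listed counts. The snippet below is only an arithmetic sketch: the numbers are copied from the table, and the names COUNTS and random_to_galaxy_ratios are illustrative and not part of sys_mapping.

```python
# Galaxy and random counts copied from the table above,
# keyed by the log M* threshold of each sample: (Ngal, Nrand).
COUNTS = {
    9.0: (523_486, 2_617_332),
    9.5: (1_432_502, 7_160_697),
    10.0: (2_759_238, 13_795_884),
    10.25: (3_308_841, 16_544_481),
    10.5: (3_263_228, 16_315_418),
    10.75: (2_802_710, 14_013_316),
    11.0: (1_619_838, 8_097_853),
    11.25: (541_855, 2_708_912),
    11.5: (120_882, 606_304),
}


def random_to_galaxy_ratios(counts):
    """Return Nrand / Ngal for every sample in the counts mapping."""
    return {logm: n_rand / n_gal for logm, (n_gal, n_rand) in counts.items()}


ratios = random_to_galaxy_ratios(COUNTS)
for logm, ratio in ratios.items():
    print(f"log M* threshold {logm:5.2f}: Nrand / Ngal = {ratio:.3f}")
```

With these counts every sample has close to five randoms per galaxy, which is a convenient number to re-check after downloading the catalogues.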

Systematic templates

44 HEALPix maps at NSIDE ∈ {32, 64, 128, 256} from Legacy Survey imaging metadata and GAIA DR3 stellar catalogues. Templates are standardised to zero mean and unit variance over the survey footprint before fitting.

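=============================  ============  ========================================
Template family                Source        Physical quantity
=============================  ============  ========================================
LS10:EBV                       SFD98         Galactic dust extinction \(E(B-V)\)
LS10:GALDEPTH_{G,R,Z}          LS10 imaging  5σ galaxy detection depth (per band)
LS10:PSFSIZE_{G,R,Z}           LS10 imaging  PSF FWHM (per band)
LS10:NOBS_{G,R,Z}              LS10 imaging  Number of exposures (per band)
GAIA:nstar_faint/medium        GAIA DR3      Surface density of faint / medium stars
GAIA:phot_{g,bp,rp}_mean_flux  GAIA DR3      Mean stellar flux (per photometric band)
=============================  ============  ========================================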

Each family appears at all four NSIDE values, giving 44 maps in total. The dominant systematic in LS10 BGS is stellar density (GAIA:nstar_faint), which correlates with galaxy counts at the 5–70 % per-pixel level depending on mass threshold.
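The standardisation to zero mean and unit variance over the survey footprint can be illustrated with a short NumPy sketch. It is not the sys_mapping implementation: the function name standardise_template, the boolean footprint mask, and the choice to set pixels outside the footprint to NaN are all assumptions made for illustration, and the input is a toy random map rather than a real template.

```python
import numpy as np


def standardise_template(template, footprint):
    """Scale a HEALPix template to zero mean and unit variance.

    The mean and standard deviation are computed over footprint pixels only.
    Pixels outside the footprint are set to NaN (an illustrative choice).
    """
    template = np.asarray(template, dtype=float)
    footprint = np.asarray(footprint, dtype=bool)
    inside = template[footprint]
    std = inside.std()
    if std == 0:
        raise ValueError("template is constant over the footprint")
    out = np.full(template.shape, np.nan)
    out[footprint] = (inside - inside.mean()) / std
    return out


# Toy example: a random map at NSIDE = 64 (12 * NSIDE**2 pixels)
# with a random footprint covering roughly 40 % of the sky.
rng = np.random.default_rng(0)
npix = 12 * 64**2
toy_template = rng.normal(loc=24.0, scale=0.3, size=npix)
toy_footprint = rng.random(npix) < 0.4
standardised = standardise_template(toy_template, toy_footprint)
print(np.nanmean(standardised), np.nanstd(standardised))
```

Standardising each map this way puts the fitted amplitudes of different templates on a comparable scale, whatever the physical units of the underlying quantity.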

Representative figures for all nine samples are in Results: systematic weights.


Running the pipeline

Regenerate figures without re-running MCMC

python scripts/run_ls10_analysis.py \
    --catalog-dir ~/data/legacysurvey/dr10/sweep/BGS_VLIM_Mstar \
    --template-dir ~/data/legacysurvey/dr10/systematics \
    --figures-only \
    --output-dir data/sys_weights/

This reloads the saved *_params.json files, redraws all figures, and copies them to docs/_static/results_ls10/.
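To inspect the saved parameter files before regenerating figures, something like the following can be used. This is a minimal sketch: it only assumes that the files match *_params.json and contain valid JSON; the internal schema is not documented on this page, so the helper load_saved_params (a hypothetical name) just returns the parsed content unchanged.

```python
import json
from pathlib import Path


def load_saved_params(output_dir):
    """Map each *_params.json file name in output_dir to its parsed content.

    Returns an empty dict if the directory does not exist.
    """
    root = Path(output_dir).expanduser()
    if not root.is_dir():
        return {}
    results = {}
    for path in sorted(root.glob("*_params.json")):
        with path.open() as handle:
            results[path.name] = json.load(handle)
    return results


# List the files found under the default output directory
# together with their top-level keys (if they are JSON objects).
for name, params in load_saved_params("data/sys_weights").items():
    keys = sorted(params) if isinstance(params, dict) else type(params).__name__
    print(name, keys)
```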

Key command-line options

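================  =================  ==============================================================
Flag              Default            Description
================  =================  ==============================================================
--catalog-dir     (required)         Directory containing *_DATA.fits / *_RAND.fits pairs
--template-dir    (none; synthetic)  Directory of real HEALPix FITS maps; required for real results
--nside           64                 HEALPix resolution
--n-walkers       210                emcee walkers per MCMC run
--n-steps         1500               MCMC steps after burn-in
--n-burn          300                MCMC burn-in steps
--only-methods    all                Restrict to a subset, e.g. OLS ElasticNet
--force           off                Re-run even if output JSON already exists
--figures-only    off                Regenerate figures from saved JSON without MCMC
--output-dir      data/sys_weights/  Root output directory
================  =================  ==============================================================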
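Before launching a run it can help to confirm that the directory passed to --catalog-dir really contains complete galaxy + random pairs. The sketch below assumes that each pair shares a common prefix followed by _DATA.fits and _RAND.fits, as suggested by the option description; the helper find_catalog_pairs is hypothetical and not part of the script.

```python
from pathlib import Path

DATA_SUFFIX = "_DATA.fits"
RAND_SUFFIX = "_RAND.fits"


def find_catalog_pairs(catalog_dir):
    """Return (pairs, missing) for a catalogue directory.

    pairs   -- list of (data_path, rand_path) tuples with a shared prefix
    missing -- list of data paths whose random catalogue was not found
    """
    root = Path(catalog_dir).expanduser()
    pairs, missing = [], []
    for data_path in sorted(root.glob(f"*{DATA_SUFFIX}")):
        prefix = data_path.name[: -len(DATA_SUFFIX)]
        rand_path = data_path.with_name(prefix + RAND_SUFFIX)
        if rand_path.is_file():
            pairs.append((data_path, rand_path))
        else:
            missing.append(data_path)
    return pairs, missing


pairs, missing = find_catalog_pairs("~/data/legacysurvey/dr10/sweep/BGS_VLIM_Mstar")
print(f"{len(pairs)} complete pairs, {len(missing)} galaxy files without randoms")
```

For the LS10 BGS samples described above, nine complete pairs and no unmatched galaxy files would be expected.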


Output files

All results are written to --output-dir (default data/sys_weights/). Figures are copied to docs/_static/results_ls10/.

data/sys_weights/
├── <sample_id>_NSIDE0064_WEIGHTS.fits          # per-galaxy weights, all methods
├── <sample_id>_NSIDE0064_params.json           # MCMC amplitudes, LRT, σ_hat
├── <sample_id>_NSIDE0064_partial_OLS.json      # partial results per method
├── <sample_id>_NSIDE0064_weight_map.png        # 2×3 Mollweide weight maps
├── <sample_id>_NSIDE0064_weight_hist.png       # log-scale weight distributions
├── <sample_id>_NSIDE0064_wtheta.png            # w(θ) before/after correction
└── summary_NSIDE0064.yaml                      # cross-sample YAML summary
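The file names in the listing follow a common stem, <sample_id>_NSIDE0064. The sketch below builds that stem and mimics the documented --force behaviour. Two things are assumptions here: that NSIDE is zero-padded to four digits for every resolution (only NSIDE0064 is shown above), and that the *_params.json file is the output JSON whose presence causes a sample to be skipped. The function names are illustrative.

```python
from pathlib import Path


def output_stem(sample_id, nside):
    """Build the common file-name stem, e.g. 'demo_NSIDE0064' for nside=64."""
    # Assumption: NSIDE is zero-padded to four digits, as in NSIDE0064.
    return f"{sample_id}_NSIDE{nside:04d}"


def needs_run(output_dir, sample_id, nside=64, force=False):
    """Return True if the sample should be (re-)run.

    Assumption: a sample counts as done when its *_params.json file exists,
    unless force is set.
    """
    params_path = Path(output_dir).expanduser() / f"{output_stem(sample_id, nside)}_params.json"
    return force or not params_path.exists()


print(output_stem("demo", 64))
print(needs_run("data/sys_weights", "demo"))
```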

The FITS weight table contains one column per method plus WEIGHT_SYS:

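===========  =================================================
Column       Description
===========  =================================================
WEIGHT_OLS   OLS additive correction weights
WEIGHT_ENET  ElasticNet additive correction weights
WEIGHT_ISD1  ISD (order 1) additive correction weights
WEIGHT_ISD3  ISD (order 3) additive correction weights
WEIGHT_ADD   MCMC-additive correction weights
WEIGHT_COMB  MCMC-combined (additive + multiplicative) weights
WEIGHT_SYS   Alias for WEIGHT_COMB (recommended default)
===========  =================================================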
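A quick way to inspect the weight table is to compare the per-method columns and confirm that WEIGHT_SYS matches WEIGHT_COMB. The sketch below is illustrative only: it assumes the table sits in the first FITS extension and that astropy is installed (imported lazily, so the summary function can be used without it). The helper names are hypothetical, and the demonstration runs on toy arrays rather than real weights.

```python
import numpy as np

WEIGHT_COLUMNS = (
    "WEIGHT_OLS", "WEIGHT_ENET", "WEIGHT_ISD1", "WEIGHT_ISD3",
    "WEIGHT_ADD", "WEIGHT_COMB", "WEIGHT_SYS",
)


def read_weight_columns(path):
    """Read the documented weight columns from a *_WEIGHTS.fits file."""
    from astropy.io import fits  # lazy import: only needed for real files

    # Assumption: the weight table is stored in the first extension.
    data = fits.getdata(path, ext=1)
    return {name: np.asarray(data[name], dtype=float) for name in WEIGHT_COLUMNS}


def summarise_weights(columns):
    """Return ({column: (mean, std)}, alias_ok) for a dict of weight arrays."""
    summary = {
        name: (float(np.mean(values)), float(np.std(values)))
        for name, values in columns.items()
    }
    alias_ok = np.array_equal(columns["WEIGHT_SYS"], columns["WEIGHT_COMB"])
    return summary, alias_ok


# Toy data standing in for a real table (all weights equal to one).
toy_columns = {name: np.ones(4) for name in WEIGHT_COLUMNS}
toy_summary, toy_alias_ok = summarise_weights(toy_columns)
print(toy_summary["WEIGHT_SYS"], toy_alias_ok)
```

For a real file, pass the result of read_weight_columns(...) for the relevant <sample_id>_NSIDE0064_WEIGHTS.fits path to summarise_weights.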

After the run, rebuild the HTML documentation:

make -C docs html

Note

The \(w(\theta)\) figure wtheta_corrected_nside64.png is produced by scripts/plot_ls10_wtheta_corrected.py using the analytical correction (Eq. 15–16) from data/sys_weights/*_wtheta_data.json. It shows all 6 decontamination methods across all 9 samples. To regenerate it after a new run:

python scripts/plot_ls10_wtheta_corrected.py

See also

Results: systematic weights — per-sample results tables, LRT statistics, and fractional systematic uncertainty on \(w(\theta)\).