This repository provides the code necessary to reproduce the nested sampling results on the Green Bank Ammonia Survey (GAS) data on NGC 1333 (Sokolov et al. 2020, ApJL / arXiv). It includes scripts for data import and spectral sampling, methods to parallelize the sampling over a spectral cube, and scripts for collecting and visualizing the results.
Evidence / Bayes factor maps, as well as maximum-likelihood parameter estimates, are available at https://doi.org/10.7910/DVN/PDH6VT.
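For intuition about what is computed at every pixel, here is a minimal pure-Python nested sampling sketch that estimates the evidence Z of a toy 1D Gaussian likelihood under a uniform prior. This is illustrative only: the actual per-pixel sampling is done by MultiNest via `pymultinest`, with a multi-component ammonia spectral model as the likelihood.

```python
import math
import random

def logaddexp(a, b):
    """log(exp(a) + exp(b)) without overflow."""
    if a == -math.inf:
        return b
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def log_likelihood(x):
    # Toy 1D Gaussian likelihood centred at 0.5 with width 0.05
    return -0.5 * ((x - 0.5) / 0.05) ** 2 - math.log(0.05 * math.sqrt(2 * math.pi))

def nested_sampling(nlive=100, niter=700, seed=1):
    """Estimate ln(Z), Z = integral of L(x) over the uniform prior on [0, 1]."""
    rng = random.Random(seed)
    live = [rng.random() for _ in range(nlive)]      # draw live points from the prior
    logl = [log_likelihood(p) for p in live]
    log_z, x_prev = -math.inf, 1.0
    for i in range(1, niter + 1):
        worst = min(range(nlive), key=lambda j: logl[j])
        l_star = logl[worst]
        x_i = math.exp(-i / nlive)                   # expected prior volume remaining
        log_z = logaddexp(log_z, l_star + math.log(x_prev - x_i))
        x_prev = x_i
        while True:                                  # replace the worst live point by a
            p = rng.random()                         # new draw above the L* threshold
            if log_likelihood(p) > l_star:           # (naive rejection sampling)
                live[worst], logl[worst] = p, log_likelihood(p)
                break
    for lj in logl:                                  # leftover live-point contribution
        log_z = logaddexp(log_z, lj + math.log(x_prev / nlive))
    return log_z

# The toy likelihood is a normalized Gaussian well inside the prior range,
# so the evidence should come out close to 1 (i.e., ln Z close to 0).
```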
The `requirements.txt` file contains a complete list of dependencies (`pip install -r requirements.txt` should do the job):

- `pyspecnest` is needed for the sampling and the cube-oriented collection of results.
- `pymultinest` and `pyspeckit` are the base dependencies (`pyspecnest` is essentially a wrapper around the two).
- Optional plotting dependencies used in the imaging scripts:
  - APLpy for image plotting. Version `1.1.1` (slightly older is fine too) is needed.
  - ChainConsumer was used for generating the corner plots.
  - scikit-image was used for small-feature removal when plotting maps.
To avoid playing a lengthy game of whack-an-error, please read the following items carefully. They describe the script hierarchy and a rough order of things to do before the sampler is ready to be used:

1. Install the requirements above. Make sure you can run MultiNest via PyMultiNest - try an example here to make sure everything works.
2. Set up a configuration file by copying a template file, which should be adapted to your local setup and named `config.py`. All folder paths, file names, tunable spectral model parameters, and priors reside in it.
3. Pull the GAS data files, both the FITS cubes and the velocity dispersion files (needed for the paper figures only), and place them into the `data/` folder.
4. Digest the GAS data for lazy loading of the spectra. A cube loader script is provided for this; it should be adapted to load a `pyspeckit` `Cube` or `CubeStack` instance via the `opencube.make_cube` function, and run as follows:
   - Because loading the full FITS cube into RAM is time-consuming, the `opencube.save_datacube` function has to be executed first to prepare the spectra for lazy loading.
   - Make sure the `nested-sampling/cubexarr/` folder exists (or whichever folder is set as the `cube_storage_dir` variable in `config.py`).
   - Run `import opencube; spc = opencube.make_cube(); opencube.save_datacube(spc)`.
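The lazy-loading idea can be illustrated with a toy sketch (this is not the actual `opencube` implementation; the cube shape and file name are made up): the cube is dumped to disk once, and later runs read single spectra through a memory map without loading the whole cube into RAM.

```python
import os
import tempfile
import numpy as np

# Dump a fake (nchan, ny, nx) cube to disk once
# (a stand-in for what opencube.save_datacube prepares)
path = os.path.join(tempfile.gettempdir(), "toy_cube.npy")
cube = np.random.rand(500, 60, 60).astype(np.float32)
np.save(path, cube)

# Later runs memory-map the file: only the requested spectrum hits the disk
lazy = np.load(path, mmap_mode="r")
spectrum = np.asarray(lazy[:, 10, 20])   # one (nchan,) spectrum at pixel y=10, x=20
```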
5. A script that runs the sampling on a single spectrum can be invoked from the command line with the following arguments:
   - `npeaks`: the number of independent velocity components to sample with (defaults to `default_npeaks` in `config.py`);
   - `y`, `x`: the Y/X pixel coordinates of a target spectrum (default to `default_yx` in `config.py`);
   - `plotting`: a Boolean specifying whether to perform automatic plotting after the sampling has finished (defaults to 0: no plotting).
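The argument handling described above can be mimicked in a few lines of standard Python. This is a sketch only: the default values below are hypothetical, and the real ones live in `config.py`.

```python
DEFAULT_NPEAKS = 1        # stand-in for default_npeaks in config.py
DEFAULT_YX = (120, 95)    # stand-in for default_yx (made-up pixel)

def parse_args(argv):
    """Parse positional [npeaks [y x [plotting]]] command-line arguments."""
    npeaks = int(argv[1]) if len(argv) > 1 else DEFAULT_NPEAKS
    y, x = (int(argv[2]), int(argv[3])) if len(argv) > 3 else DEFAULT_YX
    plotting = bool(int(argv[4])) if len(argv) > 4 else False
    return npeaks, y, x, plotting

# e.g., `python innocent_script.py 2 120 95 1` would parse to (2, 120, 95, True)
```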
6. A pool of processes is generated in the scheduling / parallelization script, which can invoke batch sampling on a full spectral cube, for example, as follows:

   ```
   python pool_xy.py $NPEAKS $METHOD $CUT $NCPU
   ```

   with the following arguments:
   - `$NPEAKS`: the number of velocity components; defaults to 1.
   - `$METHOD`: `snr` or `Bfactor`; sets the ordering of the process schedule (e.g., with `snr`, high-SNR regions are processed first); defaults to `snr`.
   - `$CUT`: the stopping condition for the chosen method (e.g., `method='snr'` with `cut=5` finishes when all SNR > 5 pixels have been sampled); defaults to 8.
   - `$NCPU`: the number of CPUs to use; defaults to 7.
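The scheduling logic can be sketched as follows. This is a toy illustration, not the actual `pool_xy.py`: the SNR map here is fake, and `sample_pixel` merely stands in for launching the sampler on one spectrum.

```python
import numpy as np
from multiprocessing import Pool

def sample_pixel(yx):
    """Stand-in for running nested sampling on the spectrum at pixel (y, x)."""
    y, x = yx
    return y, x   # a real worker would invoke the sampling script instead

def schedule(snr_map, cut=8.0):
    """Order the pixels by decreasing SNR and drop everything below the cut."""
    ys, xs = np.where(snr_map > cut)
    order = np.argsort(snr_map[ys, xs])[::-1]   # highest SNR first
    return list(zip(ys[order].tolist(), xs[order].tolist()))

if __name__ == "__main__":
    snr = np.random.default_rng(0).uniform(0, 20, size=(40, 40))
    tasks = schedule(snr, cut=8.0)              # method='snr' ordering
    with Pool(processes=7) as pool:             # $NCPU worker processes
        results = pool.map(sample_pixel, tasks)
```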
7. Finally, to make the evidence and Bayes factor FITS maps, run `make_KZ_cubes.py`; to make the "best-fit", i.e., MLE point-estimate, parameter cubes, run `make_parcubes.py`.
8. Post-processing and fancy plotting scripts reside in `make_presentable_Ks.py` (maps of Bayes factors, spectral multiplicity map) and `plot_spectra.py` (plots MLE parameters over selected spectra).
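Since the Bayes factor between, say, the two- and one-component models is K = Z2/Z1, a Bayes factor map is just the difference of the two ln-evidence maps. A minimal numpy sketch with fake data (in the repo, the ln Z values are collected from the per-pixel MultiNest outputs; all array names here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(42)

# Fake per-pixel ln-evidence maps for the 1- and 2-component models
ln_Z1 = rng.normal(-50.0, 5.0, size=(40, 40))
ln_Z2 = ln_Z1 + rng.normal(1.0, 2.0, size=(40, 40))

ln_K21 = ln_Z2 - ln_Z1        # ln K = ln Z2 - ln Z1
prefer_two = ln_K21 > 5.0     # one common "strong evidence" threshold
```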
The sampling script is fully compatible with Open MPI, meaning that one can run, e.g., seven sampling processes in parallel by simply invoking:

```
mpirun -np 7 python innocent_script.py $NPEAKS $Y $X
```

where `$NPEAKS` is the number of components to sample and `$X` and `$Y` are the pixel coordinates of the sampled spectrum. However, because all spectra are treated independently, sampling the cube is an embarrassingly parallel problem, and the standard Python `multiprocessing` library is used for process spawning in `pool_xy.py`.
```
FileNotFoundError: [Errno 2] No such file or directory: '/path-to-project-folder/nested-sampling/cubexarr/ngc1333-gas-xarr.npy'
```

Make sure you follow the instructions in point (4) above. The scripts are not smart enough (feel free to make a PR for that) to do it automatically on the first run.
```
Fortran runtime error: File already opened in another unit
```

or other errors of the can't-find-chains-files type: unfortunately, it seems that the current MultiNest version (`LEN=100` in the src files here) has a hard limit of 100 characters on its absolute path variables, and will truncate them to a hundred chars in a criminally silent manner. I found it out the hard way, so make sure you don't run over it.
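A cheap way to avoid this failure mode is to check the length of the chains basename before launching the sampler. A sketch (the `MULTINEST_MAX_PATH` constant mirrors the `LEN=100` limit mentioned above; the function name is made up):

```python
import os

MULTINEST_MAX_PATH = 100  # mirrors the LEN=100 limit in MultiNest's Fortran sources

def check_chains_basename(basename):
    """Raise early instead of letting MultiNest truncate the path silently."""
    full = os.path.abspath(basename)
    if len(full) > MULTINEST_MAX_PATH:
        raise ValueError("output path is {} chars, over MultiNest's limit of {}: {}"
                         .format(len(full), MULTINEST_MAX_PATH, full))
    return full
```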
A `too many open files` error routinely happens on some platforms. Follow the instructions here or here to increase the maximum number of files your OS lets you open at the same time.
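On Linux and macOS the limit can be inspected and raised for the current shell session like this (a common quick fix; raising the hard limit or making the change permanent is OS-specific):

```shell
# Show the current soft limit on open file descriptors
ulimit -n

# Raise the soft limit up to the hard limit for this shell session
ulimit -n "$(ulimit -Hn)"
```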
This code is released under an MIT open-source license. Contributions fitting the scope of the project are welcome, and bug reports can be filed in the Issues tab.