-
Notifications
You must be signed in to change notification settings - Fork 0
obreschkow/procorr
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
OVERVIEW ========================================================================================================== ProCorr is a Fortran code to compute several correlation functions of periodic 3D density fields in a cubic box. The code can evaluate the powerspectrum, the 2-point autocorrelation, the bispectrum, as well as the so-called line correlation function, a special three-point phase correlation function, introduced in Obreschkow et al. (ApJ 762, 2013) to measure cosmic filamentarity. The line correlation calculated here uses the modified mode truncation |K|,|Q|,|K+Q|<2*pi/r of Wolstenhulme et al. (ApJ 804, 2015) instead of |K|,|Q|<pi/r in Eq. (D4c) of Obreschkow et al. (2013). The code is written in Fortran 90 for the gfortran compiler and has been tested on gfortran versions 4.4.7 and 5.0.0. It does not require any library, in particular no FFTW. The -O3 -openmp flags are strongly recommended to accelerate the computation on multiple cores. The code comes with some optional visualization routines for the statistical programming language R, contained in the file Rfunctions.R. Copyright Danail Obreschkow (danail.obreschkow@icrar.org), 2013-2018 VERSION HISTORY ========================================================================================================== v0.0 09/2012: Original version v0.1 06/2014: Change to the new definition of the line correlation function with |K|,|Q|,|K+Q|<2*pi/r instead of |K|,|Q|<pi/r in Eq. (D4c) of Obreschkow et al. (2013). v0.2 06/2014: exact computation of 2-point correlation, fixed bug in Edgeworth approximation v0.3 07/2014: optimal trilinear interpolation of particles on grid v0.4 07/2014: robust interpolation with particle translations and top-hat smoothing v0.5 07/2014: change the way r-values are chosen for correlation functions v1.0 11/2016: allow to read density field from a regular grid in addition to particle lists v1.1 01/2017: remove none one-indexed pointers for increased compatibility v1.2 01/2017: fixed screen output when written into file, include -progress option v1.3 02/2017: added input type allowing to read Gadget snapshots split into multiple files v1.4 02/2017: added input type to read ascii data v1.5 02/2017: new input functions to deal with different formats and multiple files consistently, default interpolation method set to nearest neighbor and argument -interpolation has been added v1.6 02/2017: bug fixes, added sidelength as an optional parameter v1.7 03/2017: added interpolation methods n# to suppress shifting the particles v1.8 03/2017: minor bug fixes with output file format v1.9 04/2017: corrected smoothing if the number of particles differs from the number of cells v1.10 04/2017: added developer version of output 'y' for 3-point correlation on a line v1.11 04/2017: small bug fix in the output of line correlation for interpolation type 'n2', improved R-functions v1.12 04/2017: small bug fix when reading large particle files v1.13 04/2017: added input 5, minor bug fix with -test v1.14 02/2018: added arguments -probability and -seed for Poisson subsampling of particles v1.15 02/2018: fixed random seed, optimization ('resolution' argument removed from correlation functions) v1.16 03/2018: added argument -rmin and changed the computation of the scales to be returned v1.17 03/2018: change default input to 4 (Gadget format) v1.18 02/2019: add input format 6 and internal option for redshift-space distortions GETTING STARTED: DOWNLOAD, COMPILE AND RUN THE CODE ========================================================================================================== 1) Install gfortran (tested for versions 4.7, 4.8 and 5.0). 2) In a terminal, download ProCorr > git clone https://github.com/obreschkow/procorr 3) Got to ProCorr directory > cd procorr 4) Compile the code > make In case of compiler issues modify the makefile. 5) Run test code by typing: > ./procorr -test The test should end with a summary of the computation speed. On a modern laptop, it should take less than 10s, equivalent to more than 4 billion term evaluations per second. If running as a local job, it is also nice to track the progress using the optional progress argument: > ./procorr -test -progress y 6) Compute the powerspectrum of the test file 'cdm_redshift0': > ./procorr cdm_redshift0 This creates a file cdm_redshift0_p.txt. Generally run the code via: > ./procorr filename [-option argument] [-option argument] [-option argument] ... For instance, running > ./procorr cdm_redshift0 -output pxdl -ncells 100 converts the particle dat of in the Gadget file 'cdm_redshift0' onto a regular grid of 100^3 cells, stored in the binary file cdm_redshift0_d.bin, and then computes the powerspectrum (cdm_redshift0_p.txt), 2-point correlation (cdm_redshift0_x.txt) and line correlation (cdm_redshift0_l.txt). For particle files, i.e. for input formats other than 1, there can be multiple files with filenames filename.0, filename.1, ... In this case, just call procorr with the base-filename (without the ".#"). Each individual file must have no more than 2^31-1 = 2147483647, i.e. 1290^3, particles. Optional: if the programming language R is installed, you can benefit from the many additional routines in the file Rfunctions.R. Simply call example() to see an example visualization of the density field and some basic correlation functions. OPTIONS ========================================================================================================== -test: run test code, should result in "TEST SUCCESSFUL" and show the number of iterations performed by the deterministic line correlation algorithm. This algorithm is highly optimized. You should expect about a half to one iteration per clock-tic on each core of your machine. For example, a 2 GHz quadro-core machine should evaluate about 4-8 billion terms per second. -version: show version of this code -input (default 4): format of the input file 1: 3D density field, represented on a regular n-by-n-by-n grid, where each cell is specified by a 4 or 8 byte real number (=> total file length is 4*n^3 or 8*n^3 bytes) 2: 3D particle positions, ordered as x1,y1,z1; x2,y2,z2; ...; xn,yn,zn, where each coordinate is specified by a 4-byte real number in a binary file (=> total file size is 4*3*n bytes) 3: same as (2), but using ascii format 4: 3D particle positions in the standard Gadget-2 format. A particle species can be selected using the -species argument (see below) 5: 3D particle positions in surfsuite format 6: 3D particle positions in Gadget-2 stream format, used e.g. by SURFS. m2: same as (2), but includes mass-weights: x1,y1,z1,m1; x2,y2,z2,m2; ...; xn,yn,zn,mn, i.e. the file size is 4*4*n bytes m3: same as (m2), but using ascii format m4: 3D particle positions and masses in Gadget-2 format For particle files, i.e. for input formats other than 1, there can be multiple files with filenames "filename.0", "filename.1", ... In this case, just call procorr with the base-filename (i.e. without the ".#") Each individual file must have no more than 2^31-1 = 2147483647, i.e. 1290^3, particles. -species (default 2): species of Gadget SPH particles to be analyized, only used for input=4,6 -ncells (default=512 or number of particles per dimension, whichever is smaller): only used if input>1 number of cells aside in the regular grid used to discretize the simulation box. Higher values of ncells allow to resolve smaller structures, meaning that real space correlation functions will be computed to smaller values r and powerspectra and bispectra will be computed to larger modes. However, generally it does not make sense to chose ncells larger than the number of SPH particles per dimension, since the SPH simulation cannot reliably resolve scales smaller than the mean interparticle separation. -rmin (default 0): minimum scales to be evaluated in the output correlation functions. If the default of rmin = 0 is used, the smallest scale is sidelength/ncells for the 2PCF and twice that value for the line correlation. -sidelength (default=automatic): side length of cubic simulation box. Mandatory argument for input type 1, optional for all other input types -output (default p): can be any combination of the following four letters: p: compute powerspectrum p(k) = <d(K)d(K)> where <> denotes the spatial average over all generalized rotations of the O(3) group, such that |K|=k, and d=FT(delta). The normalization of p(k) is the same as in CAMB. p(k) is calculated using an exact deterministic algorithm that uses all Fourier modes. x: compute 2-point autocorrelation xi2(r), identical to the Fourier transform of p(k). This correlation is calculated using an exact deterministic algorithm that uses all Fourier modes. y: compute 3-point autocorrelation xi3(r) for three equidistant points on a line with nearest neighbor separation r. This function is calculated using a stochastic algorithm that randomly samples a fraction of the Fourier space. The size of this fraction can be specified by the optional argument '-accuracy'. Caution: unlike the other output options, this output has not been properly optimized and tested. It is mainly for developer use. b: compute bispectrum b(k,q) = <d(K)d(Q)d(-K-Q)>_t, where |K|=k and |Q|=q; the bispectrum is given for isoscales triangles (|K|=|Q|) as a function of the leg length |K|=|Q| and the angle between K and Q. To get, for example, the bispectrum for the case where K, Q, and -K-Q span an equilateral triangle, select the values with an angle(K,Q) of 120 deg in the output file. The bispectrum is calculated using a stochastic algorithm that randomly samples a fraction of the Fourier space. The size of this fraction can be specified by the optional argument '-accuracy'. l: compute line correlation, defined as l(r) = (r/L)^(9/2)*sum_KQ j0(|K-Q|r) e(K) e(Q) e(-K-Q), where e=FT(delta)/|F(delta)| and where the sum_KQ goes over all vectors K and Q in the discretized Fourier space, which satify |K|,|Q|,|K+Q|<2*pi/r. This mode truncation scheme is slightly different from the original choice |K|,|Q|<pi/r of Eq. (D4c) in Obreschkow et al. (2013). The line correlation is calculated either via an extact, deterministic algorithm, or via a stochstic algorithm (see option '-method'). a: compute line correlation l(r) and the approximation l’(r) in the Edgeworth expansion without higher order (N>3) loop terms: l'(r)=(sqrt(pi)/2.0)**3*(r/L)^(9/2)*sum_KQ j0(|K-Q|r)<d(K)d(Q)d(-K-Q)>_t/sqrt(p(K)p(Q)p(-K-Q)), where p(K) = <d(K)d(K)>. This approximation is mainly used by the developers. This function is calculated using a stochastic algorithm that randomly samples a fraction of the Fourier space. The size of this fraction can be specified by the optional argument '-accuracy'. d: save density perturbation field delta(r) in binary format, mainly used for visualization purposes e: save phase field epsilon(r)=IFT(FT(delta(r))/|FT(delta(r))|) in binary format -method (default 2 for ncells<=128, 1 otherwise): only used with output 'l', must be either 1 or 2 and denotes the type of algorithm used to evaluate the line correlation l(r): 1: Stochastic Monte Carlo algorithm. The number of Monte Carlo iterations is proportional to the optional argument -accuracy >0 (default 1). The estimated error of the result, i.e. the expected 1-sigma deviation from the result obtained in a complete sampling of all Fourier modes, is given in the 3rd column of the output file. This error approximately scales as 1/sqrt(accuracy). The accuracy should be increased until this error becomes numerically as small as required. 2: Deterministic algorithm, allowing a controlled sampling of the Fourier space. The fraction of Fourier space sampled is set by the optional argument -accuracy (0..1). If -accuracy is 1 (the default value), this algorithm returns the *exact* line correlation of the given density field. This algorithm is computationally expensive and typically only used up to 200^3 cells, which can be set using -ncell 200. -interpolation (default 1): only used for input formats other than 1, specifies the method used to represent 3D particle positions on the cartesian grid of ncells^3 cells. To avoid beating frequencies, the particles are slightly translated, such that their mean distance from the cell centres is exactly 1/4 of the cell spacing in all three dimensions. To suppress this shifting, add 'n' before the interpolation number, e.g. '-interpolation n1'. 1: "nearest neighbor" method, i.e. each particle is simply put into the grid-cell it lies within. This method conserves small-scale power very well, but can lead to strange artifacts in all correlation functions if the particles positions are strongly correlated with the grid structure, e.g. at the beginning of a simulation, where particles haven't evolved much 2: "top-hat smoothing": each particle is smoothed onto the grid using a cubic top-hat kernel with the width of the grid cells. This method is more robust against potential errors caused by spurious correlations between the particle positions and the grid structure, but it has the disadvantage of suppressing some of the small-scale power. When choosing this method a warning is displayed and included in the output files, stating the k-value above which the power is significantly affected. -accuracy (default 1): used only with outputs 'b', 'y', 'l', 'a', sets the numerical accuracy of different algorithms: + For '-output b' and '-output y' (stochastic algorithm of bispectrum and 3-point correlation), this value must be larger than 0. The higher the value, the more accurate the calculation. + For '-output l -method 1' (stochastic algorithm of line correlation), this value must be larger than 0. The higher the value, the more accurate the calculation. The accuracy should be increased until the error of l(r) in the third column of the output file *_l.txt is as small as desired. + For '-output l -method 2' (deterministic algorithm of line correlation), this value should be between 0 and 1 and denotes the fraction of Fourier space sampled by the algorithm. The higher the value, the more accurate the calculation, but accuracies larger than 1 (more than full sampling) are meaningless and automatically reset to 1. -probability (default 1): only used if input data is provided as particle data (input type other than 1), in which case it specifies the Poisson subsampling of these particles. A probability value smaller than 1 means that each particle will only be considered with a probability equal to this value. Hence, the expected number of particles kept for the computations is probability*(total number of particles). The seed of the random sampling can be specified using the argument -seed. -seed (default 1): seed for random number generator used for the particle subsampling if a probability smaller than 1 is specified. -overwrite (default y): if set to 'y', existing output files are overwriten, if set to 'n', they are only created if they do not already exist -progress (default n): if set to 'y', the live progress is shown on screen, if set to 'n', the live progress is suppressed. The latter (default) should be chosen if the screen output is written to a file. PACKAGED FILES ========================================================================================================== + 'README' = this file + 'procorr.f90' is the main programme to be called to compute correlation functions + 'module_correlation_functions.f90' is the module containing the functions to compute various correlation functions of a density field in the snapshot format of N-body simulations run with Gadget 2.0.7. + 'cdm_redshift0' output file from Gadget-2, containing the 3D-coordinates of 64^3 particles in a periodic box of side-length 100 Mpc/h. The particle positions correspond to an evolved Universe (redshift z=0), with cosmological parameters as indicated in Obreschkow et al. (2013). Simulation courtesy of Chris Power (2012). + 'makefile' contains the compiling instructions for the gfortran compiler. + 'Rfunctions.R' contains some utility and visualization functions for the programming language R.
About
ProCorr cosmological correlations functions including the new line correlation
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published