NNCorrelation: Count-count correlations

class treecorr.NNCorrelation(config=None, *, logger=None, **kwargs)[source]

Bases: Corr2

This class handles the calculation and storage of a 2-point count-count correlation function, i.e. the regular density correlation function.

Objects of this class hold the following attributes:

Attributes:
  • nbins – The number of bins in logr

  • bin_size – The size of the bins in logr

  • min_sep – The minimum separation being considered

  • max_sep – The maximum separation being considered

In addition, the following attributes are numpy arrays of length (nbins):

Attributes:
  • logr – The nominal center of the bin in log(r) (the natural logarithm of r).

  • rnom – The nominal center of the bin converted to regular distance, i.e. r = exp(logr).

  • meanr – The (weighted) mean value of r for the pairs in each bin. If there are no pairs in a bin, then exp(logr) will be used instead.

  • meanlogr – The mean value of log(r) for the pairs in each bin. If there are no pairs in a bin, then logr will be used instead.

  • weight – The total weight in each bin.

  • npairs – The number of pairs going into each bin (including pairs where one or both objects have w=0).

  • tot – The total number of pairs processed, which is used to normalize the randoms if they have a different number of pairs.
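The relation between the binning attributes and the logr and rnom arrays can be sketched as follows. This is a minimal illustration that assumes the bins are evenly spaced in log(r) between min_sep and max_sep; the helper name log_bin_centers is hypothetical and is not part of TreeCorr.

```python
import math

def log_bin_centers(min_sep, max_sep, nbins):
    """Hypothetical helper: nominal bin centers for evenly spaced bins in log(r)."""
    # Width of each bin in natural-log space.
    bin_size = math.log(max_sep / min_sep) / nbins
    # Nominal center of bin i in log(r).
    logr = [math.log(min_sep) + (i + 0.5) * bin_size for i in range(nbins)]
    # The same centers converted back to a regular distance.
    rnom = [math.exp(v) for v in logr]
    return bin_size, logr, rnom

bin_size, logr, rnom = log_bin_centers(min_sep=1.0, max_sep=100.0, nbins=4)
```

Each entry of rnom is the geometric midpoint of its bin, so the centers are not evenly spaced in r itself.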

If calculateXi has been called, then the following will also be available:

Attributes:
  • xi – The correlation function, \(\xi(r)\)

  • varxi – An estimate of the variance of \(\xi\)

  • cov – An estimate of the full covariance matrix.

If sep_units are given (either in the config dict or as a named kwarg) then the distances will all be in these units.

Note

If you separate out the steps of the Corr2.process command and use process_auto and/or Corr2.process_cross, then the units will not be applied to meanr or meanlogr until the finalize function is called.

The typical usage pattern is as follows:

>>> nn = treecorr.NNCorrelation(config)
>>> nn.process(cat)         # For auto-correlation.
>>> nn.process(cat1,cat2)   # For cross-correlation.
>>> rr.process...           # Likewise for random-random correlations
>>> dr.process...           # If desired, also do data-random correlations
>>> rd.process...           # For cross-correlations, also do the reverse.
>>> nn.write(file_name,rr=rr,dr=dr,rd=rd)         # Write out to a file.
>>> xi,varxi = nn.calculateXi(rr=rr,dr=dr,rd=rd)  # Or get correlation function directly.
Parameters:
  • config (dict) – A configuration dict that can be used to pass in kwargs if desired. This dict is allowed to have additional entries besides those listed in Corr2, which are ignored here. (default: None)

  • logger – If desired, a logger object for logging. (default: None, in which case one will be built according to the config dict’s verbose level.)

Keyword Arguments:

**kwargs – See the documentation for Corr2 for the list of allowed keyword arguments, which may be passed either directly or in the config dict.

__iadd__(other)[source]

Add a second Correlation object’s data to this one.

Note

For this to make sense, neither object should have had finalize called yet. After adding them together, you should call finalize on the sum.

__init__(config=None, *, logger=None, **kwargs)[source]

Initialize NNCorrelation. See class doc for details.

calculateNapSq(*, rr, R=None, dr=None, rd=None, m2_uform=None)[source]

Calculate the corollary to the aperture mass statistics for counts.

\[\begin{split}\langle N_{ap}^2 \rangle(R) &= \int_{0}^{rmax} \frac{r dr}{2R^2} \left [ T_+\left(\frac{r}{R}\right) \xi(r) \right] \\\end{split}\]

The m2_uform parameter sets which definition of the aperture mass to use. The default is to use ‘Crittenden’.

If m2_uform is ‘Crittenden’:

\[\begin{split}U(r) &= \frac{1}{2\pi} (1-r^2) \exp(-r^2/2) \\ T_+(s) &= \frac{s^4 - 16s^2 + 32}{128} \exp(-s^2/4) \\ rmax &= \infty\end{split}\]

cf. Crittenden, et al (2002): ApJ, 568, 20

If m2_uform is ‘Schneider’:

\[\begin{split}U(r) &= \frac{9}{\pi} (1-r^2) (1/3-r^2) \\ T_+(s) &= \frac{12}{5\pi} (2-15s^2) \arccos(s/2) \\ &\qquad + \frac{1}{100\pi} s \sqrt{4-s^2} (120 + 2320s^2 - 754s^4 + 132s^6 - 9s^8) \\ rmax &= 2R\end{split}\]

cf. Schneider, et al (2002): A&A, 389, 729

This is used by NGCorrelation.writeNorm. See that function and also GGCorrelation.calculateMapSq for more details.
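As a rough illustration of the integral above for the ‘Crittenden’ form, the sketch below approximates it with a midpoint sum over log-spaced bins, using r dr = r^2 d(log r). The function names and the discretization are assumptions made for this example and are not necessarily how TreeCorr evaluates the integral.

```python
import math

def t_plus_crittenden(s):
    """T_+(s) for the 'Crittenden' form, as given in the formula above."""
    return (s**4 - 16.0 * s**2 + 32.0) / 128.0 * math.exp(-s**2 / 4.0)

def nap_sq(R, logr, xi, bin_size):
    """Hypothetical midpoint-sum approximation of <N_ap^2>(R) on log-spaced bins."""
    total = 0.0
    for lr, x in zip(logr, xi):
        r = math.exp(lr)
        # r dr / (2 R^2) becomes r^2 d(log r) / (2 R^2) on a log-spaced grid.
        total += r * r * bin_size / (2.0 * R * R) * t_plus_crittenden(r / R) * x
    return total

# A flat xi(r) = 1 should give (nearly) zero, since the aperture filter is compensated.
nbins = 2000
min_sep, max_sep = 1.0e-3, 1.0e3
bin_size = math.log(max_sep / min_sep) / nbins
logr = [math.log(min_sep) + (i + 0.5) * bin_size for i in range(nbins)]
flat_xi = [1.0] * nbins
nap_flat = nap_sq(1.0, logr, flat_xi, bin_size)
```

The flat-xi check works because the integral of s T_+(s) over all s vanishes for this form, so a constant correlation contributes nothing. With a finite range of separations, the result is only reliable for R values well inside that range.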

Parameters:
  • rr (NNCorrelation) – The auto-correlation of the random field (RR)

  • R (array) – The R values at which to calculate the aperture mass statistics. (default: None, which means use self.rnom)

  • dr (NNCorrelation) – The cross-correlation of the data with randoms (DR), if desired. (default: None)

  • rd (NNCorrelation) – The cross-correlation of the randoms with data (RD), if desired. (default: None, which means use rd=dr)

  • m2_uform (str) – Which form to use for the aperture mass. (default: ‘Crittenden’; this value can also be given in the constructor in the config dict.)

Returns:

Tuple containing

  • nsq = array of \(\langle N_{ap}^2 \rangle(R)\)

  • varnsq = array of variance estimates of this value

calculateXi(*, rr, dr=None, rd=None)[source]

Calculate the correlation function given another correlation function of random points using the same mask, and possibly cross correlations of the data and random.

The rr value is the NNCorrelation function for random points. For a signal that involves a cross-correlation, there should be two random cross-correlations: data-random and random-data, given as dr and rd.

  • If dr is None, the simple correlation function \(\xi = (DD/RR - 1)\) is used.

  • If dr is given and rd is None, then \(\xi = (DD - 2DR + RR)/RR\) is used.

  • If dr and rd are both given, then \(\xi = (DD - DR - RD + RR)/RR\) is used.

where DD is the data NN correlation function, which is the current object.
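The three cases can be sketched with plain lists of binned pair weights. This is an illustrative outline only: the function name estimate_xi is hypothetical, and rescaling each random term by the ratio of the tot values is an assumption based on the description of the tot attribute.

```python
def estimate_xi(dd, dd_tot, rr, rr_tot, dr=None, dr_tot=None, rd=None, rd_tot=None):
    """Hypothetical sketch of the estimators listed above, one value per bin."""
    def rescale(counts, tot):
        # Put the random counts on the same total-pair footing as DD.
        return [c * dd_tot / tot for c in counts]

    rr_s = rescale(rr, rr_tot)
    if dr is None:
        # Simple estimator: xi = DD/RR - 1
        return [d / r - 1.0 for d, r in zip(dd, rr_s)]
    dr_s = rescale(dr, dr_tot)
    # rd is only consulted when dr is given; rd=None means rd=dr.
    rd_s = dr_s if rd is None else rescale(rd, rd_tot)
    # xi = (DD - DR - RD + RR)/RR, which reduces to (DD - 2DR + RR)/RR when rd=dr.
    return [(d - x - y + r) / r for d, x, y, r in zip(dd, dr_s, rd_s, rr_s)]

dd, dd_tot = [120.0, 90.0], 1000.0
rr, rr_tot = [400.0, 400.0], 4000.0
dr, dr_tot = [210.0, 190.0], 2000.0

xi_simple = estimate_xi(dd, dd_tot, rr, rr_tot)
xi_ls = estimate_xi(dd, dd_tot, rr, rr_tot, dr=dr, dr_tot=dr_tot)
```

In this toy input the rescaled RR is 100 in both bins, so the simple estimator gives 0.2 and -0.1, while including DR gives 0.1 and 0.0.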

Note

The default method for estimating the variance is ‘shot’, which only includes the shot noise propagated into the final correlation. This does not include sample variance, so it is always an underestimate of the actual variance. To get better estimates, you need to set var_method to something else and use patches in the input catalog(s). cf. Covariance Estimates.

After calling this method, you can use the Corr2.estimate_cov method or use this correlation object in the estimate_multi_cov function. Also, the calculated xi and varxi returned from this function will be available as attributes.

Parameters:
  • rr (NNCorrelation) – The auto-correlation of the random field (RR)

  • dr (NNCorrelation) – The cross-correlation of the data with randoms (DR), if desired, in which case the Landy-Szalay estimator will be calculated. (default: None)

  • rd (NNCorrelation) – The cross-correlation of the randoms with data (RD), if desired. (default: None, which means use rd=dr)

Returns:

Tuple containing

  • xi = array of \(\xi(r)\)

  • varxi = an estimate of the variance of \(\xi(r)\)

copy()[source]

Make a copy

finalize()[source]

Finalize the calculation of the correlation function.

The process_auto and Corr2.process_cross commands accumulate values in each bin, so they can be called multiple times if appropriate. Afterwards, this command finishes the calculation of meanr and meanlogr by dividing by the total weight.
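The accumulate-then-finalize pattern, and the reason partial results should be summed before finalize is called, can be illustrated with a toy single-bin accumulator. The class below is a hypothetical stand-in written for this example; it is not a TreeCorr class and does not reflect TreeCorr's internal storage.

```python
import math

class BinAccumulator:
    """Hypothetical stand-in for one separation bin; not part of TreeCorr."""

    def __init__(self, logr_nominal):
        self.logr_nominal = logr_nominal
        self.weight = 0.0
        self.sum_wr = 0.0      # running sum of w * r
        self.sum_wlogr = 0.0   # running sum of w * log(r)
        self.meanr = None
        self.meanlogr = None

    def add_pair(self, r, w=1.0):
        self.weight += w
        self.sum_wr += w * r
        self.sum_wlogr += w * math.log(r)

    def __iadd__(self, other):
        # Only raw sums are combined, so both sides must still be un-finalized.
        self.weight += other.weight
        self.sum_wr += other.sum_wr
        self.sum_wlogr += other.sum_wlogr
        return self

    def finalize(self):
        if self.weight == 0.0:
            # Empty bin: fall back to the nominal center.
            self.meanr = math.exp(self.logr_nominal)
            self.meanlogr = self.logr_nominal
        else:
            self.meanr = self.sum_wr / self.weight
            self.meanlogr = self.sum_wlogr / self.weight

a = BinAccumulator(logr_nominal=math.log(2.0))
b = BinAccumulator(logr_nominal=math.log(2.0))
a.add_pair(1.8)
a.add_pair(2.0)
b.add_pair(2.5, w=2.0)
a += b          # combine the raw sums first
a.finalize()    # then divide by the total weight once

empty = BinAccumulator(logr_nominal=math.log(2.0))
empty.finalize()
```

Dividing only once, after all sums are combined, gives the correctly weighted mean; averaging two already-finalized means would ignore how much weight each part carried.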

classmethod from_file(file_name, *, file_type=None, logger=None, rng=None)[source]

Create an NNCorrelation instance from an output file.

This should be a file that was written by TreeCorr.

Parameters:
  • file_name (str) – The name of the file to read in.

  • file_type (str) – The type of file (‘ASCII’, ‘FITS’, or ‘HDF’). (default: determine the type automatically from the extension of file_name.)

  • logger (Logger) – If desired, a logger object to use for logging. (default: None)

  • rng (RandomState) – If desired, a numpy.random.RandomState instance to use for bootstrap random number generation. (default: None)

Returns:

An NNCorrelation object, constructed from the information in the file.

getStat()[source]

The standard statistic for the current correlation object as a 1-d array.

This raises a RuntimeError if calculateXi has not been run yet.

getWeight()[source]

The weight array for the current correlation object as a 1-d array.

This is the weight array corresponding to getStat. In this case, it is the denominator RR from the calculation done by calculateXi().

process_auto(cat, *, metric=None, num_threads=None)[source]

Process a single catalog, accumulating the auto-correlation.

This accumulates the auto-correlation for the given catalog. After calling this function as often as desired, the finalize command will finish the calculation of meanr and meanlogr.

Parameters:
  • cat (Catalog) – The catalog to process

  • metric (str) – Which metric to use. See Metrics for details. (default: ‘Euclidean’; this value can also be given in the constructor in the config dict.)

  • num_threads (int) – How many OpenMP threads to use during the calculation. (default: use the number of cpu cores; this value can also be given in the constructor in the config dict.)

process_cross(cat1, cat2, *, metric=None, num_threads=None)[source]

Process a single pair of catalogs, accumulating the cross-correlation.

This accumulates the cross-correlation for the given catalogs. After calling this function as often as desired, the finalize command will finish the calculation of meanr and meanlogr.

Parameters:
  • cat1 (Catalog) – The first catalog to process

  • cat2 (Catalog) – The second catalog to process

  • metric (str) – Which metric to use. See Metrics for details. (default: ‘Euclidean’; this value can also be given in the constructor in the config dict.)

  • num_threads (int) – How many OpenMP threads to use during the calculation. (default: use the number of cpu cores; this value can also be given in the constructor in the config dict.)

read(file_name, *, file_type=None)[source]

Read in values from a file.

This should be a file that was written by TreeCorr, preferably a FITS or HDF5 file, so there is no loss of information.

Warning

The NNCorrelation object should be constructed with the same configuration parameters as the one being read. e.g. the same min_sep, max_sep, etc. This is not checked by the read function.

Parameters:
  • file_name (str) – The name of the file to read in.

  • file_type (str) – The type of file (‘ASCII’, ‘FITS’, or ‘HDF’). (default: determine the type automatically from the extension of file_name.)

write(file_name, *, rr=None, dr=None, rd=None, file_type=None, precision=None, write_patch_results=False, write_cov=False)[source]

Write the correlation function to the file, file_name.

rr is the NNCorrelation function for random points. If dr is None, the simple correlation function \(\xi = (DD - RR)/RR\) is used. If dr is given and rd is None, then \(\xi = (DD - 2DR + RR)/RR\) is used. If dr and rd are both given, then \(\xi = (DD - DR - RD + RR)/RR\) is used.

Normally, at least rr should be provided, but if this is also None, then only the basic accumulated number of pairs is output (along with the separation columns).

The output file will include the following columns:

Column      Description
---------   ------------------------------------------------------------
r_nom       The nominal center of the bin in r
meanr       The mean value \(\langle r\rangle\) of pairs that fell into each bin
meanlogr    The mean value \(\langle \log(r)\rangle\) of pairs that fell into each bin
xi          The estimator \(\xi\) (if rr is given, or calculateXi has been called)
sigma_xi    The sqrt of the variance estimate of xi (if rr is given or calculateXi has been called)
DD          The total weight of pairs in each bin.
RR          The total weight of RR pairs in each bin (if rr is given)
DR          The total weight of DR pairs in each bin (if dr is given)
RD          The total weight of RD pairs in each bin (if rd is given)
npairs      The total number of pairs in each bin

If sep_units was given at construction, then the distances will all be in these units. Otherwise, they will be in either the same units as x,y,z (for flat or 3d coordinates) or radians (for spherical coordinates).
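For spherical coordinates without sep_units, the separation columns would be in radians, so a reader of the output file may want to convert them afterwards. A minimal sketch of converting to arcminutes follows; the helper name and the sample values are made up for illustration.

```python
import math

def radians_to_arcmin(sep_rad):
    """Hypothetical helper: convert an angular separation from radians to arcminutes."""
    return math.degrees(sep_rad) * 60.0

# Made-up meanr values in radians, corresponding to 1 and 5 arcminutes.
meanr_rad = [math.pi / 10800.0, math.pi / 2160.0]
meanr_arcmin = [radians_to_arcmin(r) for r in meanr_rad]
```

Giving sep_units at construction avoids this step, since the distances are then written in those units directly.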

Parameters:
  • file_name (str) – The name of the file to write to.

  • rr (NNCorrelation) – The auto-correlation of the random field (RR)

  • dr (NNCorrelation) – The cross-correlation of the data with randoms (DR), if desired. (default: None)

  • rd (NNCorrelation) – The cross-correlation of the randoms with data (RD), if desired. (default: None, which means use rd=dr)

  • file_type (str) – The type of file to write (‘ASCII’ or ‘FITS’). (default: determine the type automatically from the extension of file_name.)

  • precision (int) – For ASCII output catalogs, the desired precision. (default: 4; this value can also be given in the constructor in the config dict.)

  • write_patch_results (bool) – Whether to write the patch-based results as well. (default: False)

  • write_cov (bool) – Whether to write the covariance matrix as well. (default: False)