Using configuration files

Most of the TreeCorr classes can take a config parameter in lieu of a set of keyword arguments. This is of limited use when driving the code from Python; however, it enables running the code from the executable scripts described below.

Specifically, the parameters defined in the configuration file are loaded into a Python dict, which is passed to each class as needed. The advantage of this is that TreeCorr will only use the parameters it actually needs when initializing each object. Any additional parameters (e.g. those that are relevant to a different class) are ignored.
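
As a rough illustration of that behaviour, the sketch below shows one shared dict serving two consumers, each of which picks out only the keys it recognizes. The helper select_params and the two key sets are hypothetical stand-ins chosen for this example, not TreeCorr's internals; the real lists of valid keys are defined by the TreeCorr classes themselves.

```python
# Hypothetical illustration: one config dict, two consumers with different needs.
CATALOG_KEYS = {"file_name", "x_col", "y_col"}   # illustrative subset only
CORR_KEYS = {"min_sep", "max_sep", "nbins"}      # illustrative subset only

def select_params(config, valid_keys):
    """Return only the entries of config whose keys are in valid_keys."""
    return {k: v for k, v in config.items() if k in valid_keys}

config = {
    "file_name": "catalog.dat",
    "x_col": 1,
    "y_col": 2,
    "min_sep": 1.0,
    "max_sep": 100.0,
    "nbins": 20,
}

# Each consumer sees only what it needs; the rest is silently ignored.
catalog_params = select_params(config, CATALOG_KEYS)
corr_params = select_params(config, CORR_KEYS)
print(catalog_params)
print(corr_params)
```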

The corr2 and corr3 executables

Along with the installed Python library, TreeCorr also includes two executable scripts, called corr2 and corr3. Each script takes one required command-line argument, which is the name of a configuration file:

corr2 config.yaml
corr3 config.yaml

A sample configuration file is provided, called sample_config.yaml.

For the complete documentation about the allowed parameters, see:

Configuration Parameters

YAML is the recommended format for the configuration file. JSON files are also accepted, as is a legacy format that resembles an .ini file without section headings, consisting of key = value lines. The three formats are normally distinguished by their extensions (.yaml, .json, or .params respectively), but you can also give the file type explicitly with the -f option. For example:

corr2 my_config_file.txt -f params

would specify that the configuration file my_config_file.txt uses the legacy “params” format.
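
To make the JSON and legacy formats concrete, the following sketch writes the same two settings in both layouts using only the Python standard library. The values are placeholders, and the file names config.json and config.params are chosen here simply to match the extensions mentioned above.

```python
import json
import os
import tempfile

# Placeholder settings; file_name and gg_file_name are the parameters used
# in the command-line examples in this section.
settings = {"file_name": "file1.dat", "gg_file_name": "file1.out"}

out_dir = tempfile.mkdtemp()

# JSON format: a single object mapping parameter names to values.
json_path = os.path.join(out_dir, "config.json")
with open(json_path, "w") as f:
    json.dump(settings, f, indent=2)

# Legacy "params" format: one "key = value" line per parameter,
# with no section headings.
params_path = os.path.join(out_dir, "config.params")
with open(params_path, "w") as f:
    for key, value in settings.items():
        f.write(f"{key} = {value}\n")

with open(params_path) as f:
    params_text = f.read()
print(params_text)
```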

You can also specify parameters on the command line after the name of the configuration file. For example:

corr2 config.yaml file_name=file1.dat gg_file_name=file1.out
corr2 config.yaml file_name=file2.dat gg_file_name=file2.out
...

This can be useful when running the program from a script for many input files.
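
A driver script for this pattern might look like the sketch below. The helper build_corr2_commands is hypothetical, and deriving the output name by swapping the extension for .out is a naming convention assumed here for illustration. The sketch only prints the commands; running them requires corr2 to be on your PATH.

```python
import shlex

def build_corr2_commands(config_file, input_files):
    """Build one corr2 argument list per input file.

    The output name is the input name with its extension replaced by
    '.out' (an assumption of this sketch, not a TreeCorr rule).
    """
    commands = []
    for file_name in input_files:
        stem = file_name.rsplit(".", 1)[0]
        commands.append([
            "corr2", config_file,
            f"file_name={file_name}",
            f"gg_file_name={stem}.out",
        ])
    return commands

commands = build_corr2_commands("config.yaml", ["file1.dat", "file2.dat"])
for cmd in commands:
    # To actually run each job: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```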

Python API and corr2/corr3 parity

The corr2 and corr3 executables and the Python treecorr.corr2 / treecorr.corr3 functions use the same configuration logic and core processing pipeline, so they produce matching results when given equivalent configs and inputs.

What is equivalent:

- Same parameter names and meanings from Configuration Parameters.
- Same correlation calculations and estimators.
- Same output products when the same output file options are set.

What differs:

- The executables are file-driven, one-shot runs from the command line.
- Direct class usage in Python (Catalog, Corr2, Corr3) provides lower-level control, such as:
  - splitting processing into process_auto / process_cross / finalize
  - custom post-processing with estimate_cov(..., func=...) and estimate_multi_cov(..., func=...)
  - explicit in-memory workflows without writing intermediate files

When to choose which interface:

- Use corr2/corr3 for reproducible, config-driven batch jobs.
- Use direct Python classes for iterative analysis, custom data vectors, and non-standard control flow.

The corr2 function from Python

The same functionality that you have from the corr2 executable is available in Python via the corr2 function:

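import treecorr
config_file = 'config.yaml'  # path to your configuration file
config = treecorr.read_config(config_file)
config['file_name'] = 'catalog.dat'  # add or override parameters in the dict
config['gg_file_name'] = 'gg.out'
treecorr.corr2(config)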

treecorr.corr2(config, logger=None)

Run the full two-point correlation function code based on the parameters in the given config dict.

The function print_corr2_params will output information about the valid parameters that are expected to be in the config dict.

Optionally a logger parameter may be given, in which case it is used for logging. If not given, the logging will be based on the verbose and log_file parameters.

Parameters:

config – The configuration dict which defines what to do.
logger – If desired, a Logger object for logging. (default: None, in which case one will be built according to the config dict’s verbose level.)

The corr3 function from Python

treecorr.corr3(config, logger=None)

Run the full three-point correlation function code based on the parameters in the given config dict.

The function print_corr3_params will output information about the valid parameters that are expected to be in the config dict.

Optionally a logger parameter may be given, in which case it is used for logging. If not given, the logging will be based on the verbose and log_file parameters.

Parameters:

config – The configuration dict which defines what to do.
logger – If desired, a Logger object for logging. (default: None, in which case one will be built according to the config dict’s verbose level.)

Utilities related to the configuration dict

treecorr.config.check_config(config, params, aliases=None, logger=None)

Check (and update) a config dict to conform to the given parameter rules. The params dict has an entry for each valid config parameter whose value is a tuple with the following items:

type
can be a list?
default value
valid values
description (Multiple entries here are allowed for longer strings)

The file corr2.py has a list of parameters for the corr2 program.

Parameters:

config – The config dict to check.
params – A dict of valid parameters with information about each one.
aliases – A dict of deprecated parameters that are still aliases for new names. (default: None)
logger – If desired, a Logger object for logging any warnings here. (default: None)

Returns:

The updated config dict.
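
The tuple layout described above can be pictured with the following sketch. The three entries in example_params and the function simple_check are hypothetical and far simpler than check_config itself (no aliases, no logging, no unit handling). Rejecting unknown keys is a choice made for this sketch rather than a statement about the real function.

```python
# Hypothetical params dict following the layout described above:
# (type, can be a list?, default value, valid values, description...)
example_params = {
    "nbins": (int, False, None, None, "Number of bins to use."),
    "bin_type": (str, False, "Log", ["Log", "Linear"], "Which binning to use."),
    "x_col": (str, True, "0", None, "Column(s) holding the x positions."),
}

def simple_check(config, params):
    """Simplified stand-in: coerce types, enforce valid values, fill defaults."""
    checked = {}
    for key, value in config.items():
        if key not in params:
            raise TypeError(f"Invalid parameter {key!r}")
        # Slice to 4 items so that multi-part descriptions are tolerated.
        value_type, may_be_list, _, valid_values = params[key][:4]
        if isinstance(value, list):
            if not may_be_list:
                raise ValueError(f"{key!r} may not be a list")
            value = [value_type(v) for v in value]
        else:
            value = value_type(value)
        if valid_values is not None and value not in valid_values:
            raise ValueError(f"{value!r} is not a valid value for {key!r}")
        checked[key] = value
    # Fill in the default for every parameter that was not supplied.
    for key, spec in params.items():
        checked.setdefault(key, spec[2])
    return checked

result = simple_check({"nbins": "20", "x_col": [1, 2]}, example_params)
print(result)
```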

treecorr.config.convert(value, value_type, key)

Convert the given value to the given type.

The key helps determine what kind of conversion should be performed. Specifically if ‘unit’ is in the key value, then a unit conversion is done. Otherwise, it just parses the value according to the value_type.

Parameters:

value – The input value to be converted. Usually a string.
value_type – The type to convert to.
key – The key for this value. Only used to see if it includes ‘unit’.

Returns:

The converted value.

treecorr.config.get(config, key, value_type=str, default=None)

A helper function to get a key from config converting to a particular type

Parameters:

config – The configuration dict from which to get the key value.
key – Which key to get from config.
value_type – Which type should the value be converted to. (default: str)
default – What value should be used if the key is not in the config dict, or the value corresponding to the key is None. (default: None)

Returns:

The specified value, converted as needed.

treecorr.config.get_from_list(config, key, num, value_type=str, default=None)

A helper function to get a key from config that is allowed to be a list

Some of the config values are allowed to be lists of values, in which case we take item num from the list. If the value is not a list, then the given value is used for all values of num.

Parameters:

config – The configuration dict from which to get the key value.
key – What key to get from config.
num – Which number element to use if the item is a list.
value_type – What type should the value be converted to. (default: str)
default – What value should be used if the key is not in the config dict, or the value corresponding to the key is None. (default: None)

Returns:

The specified value, converted as needed.
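
The scalar-or-list rule can be sketched as follows. The function pick is a hypothetical simplification, and zero-based indexing of num is an assumption of this sketch.

```python
def pick(config, key, num, value_type=str, default=None):
    """Simplified sketch: use item num of a list, or the scalar for every num."""
    value = config.get(key)
    if value is None:
        return default
    if isinstance(value, list):
        value = value[num]  # zero-based indexing assumed here
    return value_type(value)

config = {"x_col": [1, 3], "sep_units": "arcmin"}

first = pick(config, "x_col", 0, value_type=int)    # list: item 0
second = pick(config, "x_col", 1, value_type=int)   # list: item 1
shared = [pick(config, "sep_units", n) for n in range(2)]  # scalar: same for all num
missing = pick(config, "w_col", 0, default="0")     # absent key: default
print(first, second, shared, missing)
```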

treecorr.config.make_minimal_config(config, valid_params)

Make a minimal version of a config dict, excluding any values that are the default.

Parameters:

config (dict) – The source config (will not be modified)
valid_params (dict) – A dict of valid parameters that are allowed for this usage.

Returns:

minimal_config – The dict without any default values.
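
A minimal sketch of the idea, assuming valid_params uses the same tuple layout described for check_config (default value at index 2). The function minimal and the two example entries are hypothetical.

```python
def minimal(config, valid_params):
    """Simplified sketch: keep valid keys whose value differs from the default."""
    result = {}
    for key, value in config.items():
        if key in valid_params and value != valid_params[key][2]:
            result[key] = value
    return result

# Hypothetical entries: (type, can be a list?, default, valid values, description)
valid_params = {
    "nbins": (int, False, None, None, "Number of bins to use."),
    "bin_type": (str, False, "Log", ["Log", "Linear"], "Which binning to use."),
}

full = {"nbins": 20, "bin_type": "Log"}
small = minimal(full, valid_params)
print(small)  # bin_type is dropped because it equals its default
```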

treecorr.config.merge_config(config, kwargs, valid_params, aliases=None)

Merge in the values from kwargs into config.

If either of these is None, then the other one is returned. If they are both dicts, then the values in kwargs take precedence over ones in config if there are any keys that are in both. Also, the kwargs dict will be modified in this case.

Parameters:

config – The root config (will not be modified)
kwargs – A second dict with more or updated values
valid_params – A dict of valid parameters that are allowed for this usage. The config dict is allowed to have extra items, but kwargs is not.
aliases – An optional dict of aliases. (default: None)

Returns:

The merged dict, including only items that are in valid_params.
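
The precedence rules can be sketched like this. The function merge is a hypothetical simplification: it ignores aliases, takes a plain set of valid keys rather than a params dict, and builds a new dict instead of modifying kwargs.

```python
def merge(config, kwargs, valid_keys):
    """Simplified sketch of the precedence rules described above."""
    if config is None:
        return kwargs
    if kwargs is None:
        return config
    # kwargs may only contain recognised keys.
    for key in kwargs:
        if key not in valid_keys:
            raise TypeError(f"Invalid parameter {key!r}")
    # config may carry extra items, which are dropped; kwargs wins on conflicts.
    merged = {k: v for k, v in config.items() if k in valid_keys}
    merged.update(kwargs)
    return merged

valid_keys = {"nbins", "min_sep", "max_sep"}
config = {"nbins": 10, "min_sep": 1.0, "file_name": "cat.dat"}
merged = merge(config, {"nbins": 20}, valid_keys)
print(merged)
```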

treecorr.config.parse(value, value_type, name)

Parse the input value as the given type.

Parameters:

value – The value to parse.
value_type – The type expected for this.
name – The name of this value. Only used for error reporting.

Returns:

The parsed value.

treecorr.config.parse_bool(value)

Parse a value as a boolean.

Valid string values for True are: ‘true’, ‘yes’, ‘t’, ‘y’
Valid string values for False are: ‘false’, ‘no’, ‘f’, ‘n’, ‘none’
Capitalization is ignored.

If value is an integer (or a string that parses as an integer), the integer value is returned. This preserves special integer semantics used by some parameters.

Parameters:

value – The value to parse.

Returns:

The parsed boolean (or integer, as described above).
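
The rules above can be mirrored in a few lines. The function parse_bool_sketch is a hypothetical re-implementation written from the description in this section, not a copy of the library code.

```python
TRUE_STRINGS = {"true", "yes", "t", "y"}
FALSE_STRINGS = {"false", "no", "f", "n", "none"}

def parse_bool_sketch(value):
    """Sketch of the documented rules: known strings map to bool, integers pass through."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        text = value.strip().lower()  # capitalization is ignored
        if text in TRUE_STRINGS:
            return True
        if text in FALSE_STRINGS:
            return False
    try:
        # Integers (or strings that parse as integers) are returned as integers.
        return int(value)
    except (TypeError, ValueError):
        raise ValueError(f"Unable to parse {value!r} as a bool") from None

print(parse_bool_sketch("Yes"), parse_bool_sketch("NONE"), parse_bool_sketch("2"))
```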

treecorr.config.parse_unit(value)

Parse the input value as a string that should be one of the valid angle units in coord.AngleUnit.valid_names.

The value is allowed to merely start with one of the unit names. So ‘deg’, ‘degree’, ‘degrees’ all convert to ‘deg’ which is the name in coord.AngleUnit.valid_names. The return value in this case would be coord.AngleUnit.from_name(‘deg’).value, which has the value pi/180.

Parameters:

value – The unit as a string value to parse.

Returns:

The given unit in radians.
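
The prefix matching can be sketched without the coord package. The table UNIT_IN_RADIANS is an assumed subset of unit names used only for this example (the authoritative list is coord.AngleUnit.valid_names), and parse_unit_sketch is hypothetical. Lowercasing the input is a choice made in this sketch.

```python
import math

# Assumed subset of unit names and their size in radians.
UNIT_IN_RADIANS = {
    "rad": 1.0,
    "deg": math.pi / 180.0,
    "arcmin": math.pi / (180.0 * 60.0),
    "arcsec": math.pi / (180.0 * 3600.0),
}

def parse_unit_sketch(value):
    """Return the size in radians of the unit whose name the input starts with."""
    text = value.strip().lower()
    for name, radians in UNIT_IN_RADIANS.items():
        if text.startswith(name):
            return radians
    raise ValueError(f"Unable to parse {value!r} as an angle unit")

# 'deg', 'degree' and 'degrees' all resolve to the same unit.
values = [parse_unit_sketch(s) for s in ("deg", "degree", "degrees")]
print(values)
```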

treecorr.config.parse_variable(config, v)

Parse a configuration variable from a string that should look like ‘key = value’ and write that value to config[key].

Parameters:

config – The configuration dict to which to write the key,value pair
v – A string of the form ‘key = value’
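
This is the kind of parsing needed for command-line overrides such as file_name=file1.dat. The sketch below is a hypothetical simplification that stores the value as a plain string; any further interpretation of the value (for example list syntax) is outside its scope.

```python
def parse_variable_sketch(config, v):
    """Split 'key = value' on the first '=' and store the stripped value."""
    if "=" not in v:
        raise ValueError(f"Improper variable specification: {v!r}")
    key, value = v.split("=", 1)
    config[key.strip()] = value.strip()

config = {}
for arg in ["file_name=file1.dat", "gg_file_name = file1.out"]:
    parse_variable_sketch(config, arg)
print(config)
```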

treecorr.config.print_params(params)

Print the information about the valid parameters, given by the given params dict. See check_config for the structure of the params dict.

Parameters:

params – A dict of valid parameters with information about each one.

treecorr.config.read_config(file_name, file_type='auto')

Read a configuration dict from a file.

Parameters:

file_name – The file name from which the configuration dict should be read.
file_type – The type of config file. Options are ‘auto’, ‘yaml’, ‘json’, ‘params’. (default: ‘auto’, which tries to determine the type from the extension)

Returns:

A config dict built from the configuration file.

treecorr.config.setup_logger(verbose, log_file=None, name=None)

Build a logger, converting the integer verbosity level from the command line args into the corresponding logging level.

Parameters:

verbose – An integer indicating what verbosity level to use.
log_file – If given, a file name to which to write the logging output. If omitted or None, then output to stdout.

Returns:

The logging.Logger object to use.
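
The general pattern can be sketched with the standard logging module. The mapping in LEVELS from verbosity integers to logging levels is an assumption of this sketch, and make_logger is a hypothetical stand-in rather than the library function.

```python
import logging
import sys

# Assumed mapping from integer verbosity to logging level.
LEVELS = {0: logging.CRITICAL, 1: logging.WARNING, 2: logging.INFO, 3: logging.DEBUG}

def make_logger(verbose, log_file=None, name="sketch"):
    """Return a logger writing to stdout, or to log_file if one is given."""
    logger = logging.getLogger(name)
    logger.handlers.clear()  # avoid duplicate output on repeated calls
    if log_file is None:
        handler = logging.StreamHandler(sys.stdout)
    else:
        handler = logging.FileHandler(log_file, mode="w")
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(LEVELS[verbose])
    return logger

logger = make_logger(2)
logger.info("shown at verbose=2")
logger.debug("suppressed at verbose=2")
```

A logger built this way could then be passed as the logger argument of corr2 or corr3 in place of the one built from the config dict.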

File Writers

class treecorr.writer.FitsWriter(file_name, *, logger=None)

Writer interface for FITS files.

write(col_names, columns, *, params=None, ext=None)

Write some columns to an output FITS table with the given column names.

If ext is not None, then it is used as the name of the extension for these data.

Parameters:

col_names – A list of column names for the given columns.
columns – A list of numpy arrays with the data to write.
params – A dict of extra parameters to write in the extension header.
ext – Optional extension name for these data. (default: None)

write_array(data, *, ext=None)

Write a (typically 2-d) numpy array to the output file.

Parameters:

data – The array to write.
ext – Optional extension name for these data. (default: None)

class treecorr.writer.HdfWriter(file_name, *, logger=None)

Writer interface for HDF5 files. Uses h5py to write columns, etc.

write(col_names, columns, *, params=None, ext=None)

Write some columns to an output HDF5 group with the given column names.

If ext is not None, then it is used as the name of the group for these data.

Parameters:

col_names – A list of column names for the given columns.
columns – A list of numpy arrays with the data to write.
params – A dict of extra parameters to write in the group attributes.
ext – Optional group name for these data. (default: None)

write_array(data, *, ext=None)

Write a (typically 2-d) numpy array to the output file.

Parameters:

data – The array to write.
ext – Optional group name for these data. (default: None)

class treecorr.writer.AsciiWriter(file_name, *, precision=4, logger=None)

Write data to an ASCII (text) file.

write(col_names, columns, *, params=None, ext=None)

Write some columns to an output ASCII file with the given column names.

Parameters:

col_names – A list of column names for the given columns. These will be written in a header comment line at the top of the output file.
columns – A list of numpy arrays with the data to write.
params – A dict of extra parameters to write at the top of the output file.
ext – Optional extension name for these data. (default: None)

write_array(data, *, ext=None)

Write a (typically 2-d) numpy array to the output file.

Parameters:

data – The array to write.
ext – Optional extension name for these data. (default: None)
