Data Generation

Module to generate/prepare data, likelihood, and priors for parallel runs.

This creates a directory structure in which your parallel runs store their output files, logs and plots. It also generates a data_dump that stores the run settings and the data to be analysed.

Note that several of these arguments are inherited from bilby_pipe.

Command line interface for data generation

usage: parallel_bilby_generation [--version] [-n NLIVE] [--dlogz DLOGZ]
                                 [--n-effective N_EFFECTIVE]
                                 [--dynesty-sample DYNESTY_SAMPLE]
                                 [--dynesty-bound DYNESTY_BOUND]
                                 [--walks WALKS] [--maxmcmc MAXMCMC]
                                 [--nact NACT] [--min-eff MIN_EFF]
                                 [--facc FACC] [--vol-dec VOL_DEC]
                                 [--vol-check VOL_CHECK] [--enlarge ENLARGE]
                                 [--n-check-point N_CHECK_POINT]
                                 [--max-its MAX_ITS]
                                 [--max-run-time MAX_RUN_TIME]
                                 [--fast-mpi FAST_MPI]
                                 [--mpi-timing MPI_TIMING]
                                 [--mpi-timing-interval MPI_TIMING_INTERVAL]
                                 [--nestcheck] [--nsamples NSAMPLES]
                                 [--ntemps NTEMPS] [--nwalkers NWALKERS]
                                 [--max-iterations MAX_ITERATIONS]
                                 [--ncheck NCHECK]
                                 [--burn-in-nact BURN_IN_NACT]
                                 [--thin-by-nact THIN_BY_NACT]
                                 [--frac-threshold FRAC_THRESHOLD]
                                 [--nfrac NFRAC] [--min-tau MIN_TAU]
                                 [--Tmax TMAX] [--safety SAFETY]
                                 [--autocorr-c AUTOCORR_C]
                                 [--autocorr-tol AUTOCORR_TOL] [--adapt]
                                 [--bilby-zero-likelihood-mode]
                                 [--sampling-seed SAMPLING_SEED] [-c]
                                 [--no-plot] [--do-not-save-bounds-in-resume]
                                 [--check-point-deltaT CHECK_POINT_DELTAT]
                                 [-h] [-v]
                                 [--calibration-model {CubicSpline,None}]
                                 [--spline-calibration-envelope-dict SPLINE_CALIBRATION_ENVELOPE_DICT]
                                 [--spline-calibration-nodes SPLINE_CALIBRATION_NODES]
                                 [--spline-calibration-amplitude-uncertainty-dict SPLINE_CALIBRATION_AMPLITUDE_UNCERTAINTY_DICT]
                                 [--spline-calibration-phase-uncertainty-dict SPLINE_CALIBRATION_PHASE_UNCERTAINTY_DICT]
                                 [--ignore-gwpy-data-quality-check IGNORE_GWPY_DATA_QUALITY_CHECK]
                                 [--gps-tuple GPS_TUPLE] [--gps-file GPS_FILE]
                                 [--timeslide-file TIMESLIDE_FILE]
                                 [--timeslide-dict TIMESLIDE_DICT]
                                 [--trigger-time TRIGGER_TIME]
                                 [--gaussian-noise]
                                 [--n-simulation N_SIMULATION]
                                 [--data-dict DATA_DICT]
                                 [--data-format DATA_FORMAT]
                                 [--channel-dict CHANNEL_DICT]
                                 [--coherence-test] [--detectors DETECTORS]
                                 [--duration DURATION]
                                 [--generation-seed GENERATION_SEED]
                                 [--psd-dict PSD_DICT]
                                 [--psd-fractional-overlap PSD_FRACTIONAL_OVERLAP]
                                 [--post-trigger-duration POST_TRIGGER_DURATION]
                                 [--sampling-frequency SAMPLING_FREQUENCY]
                                 [--psd-length PSD_LENGTH]
                                 [--psd-maximum-duration PSD_MAXIMUM_DURATION]
                                 [--psd-method PSD_METHOD]
                                 [--psd-start-time PSD_START_TIME]
                                 [--maximum-frequency MAXIMUM_FREQUENCY]
                                 [--minimum-frequency MINIMUM_FREQUENCY]
                                 [--zero-noise ZERO_NOISE]
                                 [--tukey-roll-off TUKEY_ROLL_OFF]
                                 [--resampling-method {lal,gwpy}]
                                 [--injection]
                                 [--injection-dict INJECTION_DICT | --injection-file INJECTION_FILE]
                                 [--injection-numbers INJECTION_NUMBERS]
                                 [--injection-waveform-approximant INJECTION_WAVEFORM_APPROXIMANT]
                                 [--label LABEL] [--outdir OUTDIR]
                                 [--periodic-restart-time PERIODIC_RESTART_TIME]
                                 [--scheduler-analysis-time SCHEDULER_ANALYSIS_TIME]
                                 [--submit]
                                 [--condor-job-priority CONDOR_JOB_PRIORITY]
                                 [--log-directory LOG_DIRECTORY]
                                 [--distance-marginalization]
                                 [--distance-marginalization-lookup-table DISTANCE_MARGINALIZATION_LOOKUP_TABLE]
                                 [--phase-marginalization]
                                 [--time-marginalization]
                                 [--jitter-time JITTER_TIME]
                                 [--reference-frame REFERENCE_FRAME]
                                 [--time-reference TIME_REFERENCE]
                                 [--likelihood-type LIKELIHOOD_TYPE]
                                 [--roq-folder ROQ_FOLDER]
                                 [--roq-weights ROQ_WEIGHTS]
                                 [--roq-scale-factor ROQ_SCALE_FACTOR]
                                 [--extra-likelihood-kwargs EXTRA_LIKELIHOOD_KWARGS]
                                 [--create-plots] [--create-summary]
                                 [--notification NOTIFICATION]
                                 [--existing-dir EXISTING_DIR]
                                 [--webdir WEBDIR]
                                 [--summarypages-arguments SUMMARYPAGES_ARGUMENTS]
                                 [--default-prior DEFAULT_PRIOR]
                                 [--deltaT DELTAT]
                                 [--prior-file PRIOR_FILE | --prior-dict PRIOR_DICT]
                                 [--convert-to-flat-in-component-mass CONVERT_TO_FLAT_IN_COMPONENT_MASS]
                                 [--single-postprocessing-executable SINGLE_POSTPROCESSING_EXECUTABLE]
                                 [--single-postprocessing-arguments SINGLE_POSTPROCESSING_ARGUMENTS]
                                 [--n-parallel N_PARALLEL]
                                 [--waveform-generator WAVEFORM_GENERATOR]
                                 [--reference-frequency REFERENCE_FREQUENCY]
                                 [--waveform-approximant WAVEFORM_APPROXIMANT]
                                 [--catch-waveform-errors CATCH_WAVEFORM_ERRORS]
                                 [--pn-spin-order PN_SPIN_ORDER]
                                 [--pn-tidal-order PN_TIDAL_ORDER]
                                 [--pn-phase-order PN_PHASE_ORDER]
                                 [--pn-amplitude-order PN_AMPLITUDE_ORDER]
                                 [--mode-array MODE_ARRAY]
                                 [--frequency-domain-source-model FREQUENCY_DOMAIN_SOURCE_MODEL]
                                 [--sampler {dynesty,ptemcee}] --nodes NODES
                                 --ntasks-per-node NTASKS_PER_NODE --time TIME
                                 [--mem-per-cpu MEM_PER_CPU]
                                 [--extra-lines EXTRA_LINES]
                                 [--slurm-extra-lines SLURM_EXTRA_LINES]
                                 ini

Positional Arguments

ini

Configuration ini file
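
As a rough sketch of how the required parts of the usage line fit together, the snippet below assembles an invocation from the positional ini file and the three options shown without brackets (--nodes, --ntasks-per-node, --time). The helper name build_generation_command and the file name config.ini are assumptions made for illustration; only the executable name and the flag names are taken from the usage text.

```python
import shlex


def build_generation_command(ini, nodes, ntasks_per_node, time, extra=()):
    """Assemble an argument list for parallel_bilby_generation.

    Hypothetical helper: only the executable and flag names come from
    the usage text; the function itself is illustrative.
    """
    cmd = [
        "parallel_bilby_generation",
        str(ini),
        "--nodes", str(nodes),
        "--ntasks-per-node", str(ntasks_per_node),
        "--time", str(time),
    ]
    # Any optional flags are appended unchanged.
    cmd.extend(extra)
    return cmd


# "config.ini" is a placeholder file name, not a file shipped with the package.
cmd = build_generation_command(
    "config.ini", nodes=2, ntasks_per_node=16, time="24:00:00"
)
print(shlex.join(cmd))
```

On a machine where parallel_bilby is installed, a list like this could be handed to subprocess.run.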

Named Arguments

--version

show program’s version number and exit

-v, --verbose

Verbose output

Default: False

--injection-dict

A single injection dictionary given in the ini file

--injection-file

Injection file to use. See bilby_pipe_create_injection_file --help for supported formats

--prior-file

The prior file

--prior-dict

A dictionary of priors

--sampler

Possible choices: dynesty, ptemcee

The parallelised sampler to use, defaults to dynesty

Default: “dynesty”

Dynesty Settings

-n, --nlive

Number of live points

Default: 1000

--dlogz

Stopping criterion: remaining evidence (default=0.1)

Default: 0.1

--n-effective

Stopping criterion: effective number of samples (default=inf)

Default: inf

--dynesty-sample

Dynesty sampling method (default=rwalk). Note that parallel bilby overrides the dynesty rwalk method with an optimised version

Default: “rwalk”

--dynesty-bound

Dynesty bounding method (default=multi)

Default: “multi”

--walks

Minimum number of walks, defaults to 100

Default: 100

--maxmcmc

Maximum number of walks, defaults to 5000

Default: 5000

--nact

Number of autocorrelation times to take, defaults to 5

Default: 5

--min-eff

The minimum efficiency at which to switch from uniform sampling.

Default: 10

--facc

See dynesty.NestedSampler

Default: 0.5

--vol-dec

See dynesty.NestedSampler

Default: 0.5

--vol-check

See dynesty.NestedSampler

Default: 8

--enlarge

See dynesty.NestedSampler

Default: 1.5

--n-check-point

Steps to take before attempting checkpoint

Default: 100

--max-its

Maximum number of iterations to sample for (default=1.e10)

Default: 10000000000

--max-run-time

Maximum time to run for (default=1.e10 s)

Default: 10000000000.0

--fast-mpi

Fast MPI communication pattern (default=False)

Default: False

--mpi-timing

Print MPI timing when finished (default=False)

Default: False

--mpi-timing-interval

Interval to write timing snapshot to disk (default=0 – disabled)

Default: 0

--nestcheck

Save a ‘nestcheck’ pickle in the outdir (default=False). This pickle stores a nestcheck.data_processing.process_dynesty_run object, which can be used during post processing to compute the implementation and bootstrap errors explained by Higson et al (2018) in “Sampling Errors In Nested Sampling Parameter Estimation”.

Default: False

PTEmcee Settings

--nsamples

Number of samples to draw

Default: 10000

--ntemps

Number of temperatures

Default: 20

--nwalkers

Number of walkers

Default: 100

--max-iterations

Maximum number of iterations

Default: 100000

--ncheck

Period with which to check convergence

Default: 500

--burn-in-nact

Number of autocorrelation times to discard for burn-in

Default: 50.0

--thin-by-nact

Thin-by number of autocorrelation times

Default: 1.0

--frac-threshold

Threshold on the fractional change in ACT required for convergence

Default: 0.01

--nfrac

The number of checks passing the frac-threshold for convergence

Default: 5

--min-tau

The minimum tau to accept: used to prevent early convergence

Default: 30
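
One possible reading of how --frac-threshold, --nfrac and --min-tau combine is sketched below. This is an illustration under stated assumptions, not the sampler's actual convergence test: the function is_converged and the idea of keeping a list of autocorrelation-time estimates (one per check) are hypothetical.

```python
def is_converged(tau_history, frac_threshold=0.01, nfrac=5, min_tau=30):
    """Illustrative convergence check on a history of ACT (tau) estimates.

    Assumption: convergence requires the last `nfrac` fractional changes
    in tau to be below `frac_threshold`, and the latest tau to be at
    least `min_tau`. All tau values are assumed to be positive.
    """
    if len(tau_history) < nfrac + 1:
        return False
    recent = tau_history[-(nfrac + 1):]
    frac_changes = [abs(b - a) / a for a, b in zip(recent[:-1], recent[1:])]
    return recent[-1] >= min_tau and all(f < frac_threshold for f in frac_changes)


print(is_converged([40.0] * 6))
```

Under this reading, min-tau guards against declaring convergence while the estimated tau is still implausibly small.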

--Tmax

The maximum temperature to use, default=10000

Default: 10000

--safety

Multiplicative safety factor on the estimated tau

Default: 1.0

--autocorr-c

The step size for the window search when calculating tau. Default: 5

Default: 5.0

--autocorr-tol

The minimum number of autocorrelation times needed to trust the autocorrelation estimate. Setting this to 0 always returns a result.

Default: 50.0

--adapt

If True, the temperature ladder is dynamically adapted as the sampler runs to achieve uniform swap acceptance ratios between adjacent chains. See arXiv:1501.05823 for details.

Default: False

Misc. Settings

--bilby-zero-likelihood-mode

Default: False

--sampling-seed

Random seed for sampling; the seed is incremented for parallel runs

Default: 1234

-c, --clean

Run clean: ignore any resume files

Default: False

--no-plot

If true, don’t generate check-point plots

Default: False

--do-not-save-bounds-in-resume

If true, do not store bounds in the resume file. Storing the bounds can make resume files large (~GB)

Default: False

--check-point-deltaT

Write a checkpoint resume file and diagnostic plots every deltaT [s].

Default: 3600

Calibration arguments

Which calibration model and settings to use.

--calibration-model

Possible choices: CubicSpline, None

Choice of calibration model, if None, no calibration is used

--spline-calibration-envelope-dict

Dictionary pointing to the spline calibration envelope files

--spline-calibration-nodes

Number of calibration nodes

Default: 5

--spline-calibration-amplitude-uncertainty-dict

Dictionary of the amplitude uncertainties for the constant uncertainty model

--spline-calibration-phase-uncertainty-dict

Dictionary of the phase uncertainties for the constant uncertainty model

Data generation arguments

How to generate the data, e.g., from a list of gps times or simulated Gaussian noise.

--ignore-gwpy-data-quality-check

Ignores the check to see if data queried from GWpy (i.e. not Gaussian noise) is obtained from times when the IFOs are in science mode.

Default: True

--gps-tuple

Tuple of the (start, step, number) of GPS start times. For example, (10, 1, 3) produces the gps start times [10, 11, 12]. If given, gps-file is ignored.
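
The expansion described above can be sketched in a couple of lines. The function name gps_start_times is hypothetical; the behaviour follows the (10, 1, 3) example in the help text.

```python
def gps_start_times(start, step, number):
    """Expand a (start, step, number) tuple into a list of GPS start times."""
    return [start + i * step for i in range(number)]


print(gps_start_times(10, 1, 3))  # [10, 11, 12]
```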

--gps-file

File containing segment GPS start times. This can be a multi-column file if (a) it is comma-separated and (b) the zeroth column contains the gps-times to use
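
A minimal sketch of reading such a multi-column file, assuming plain comma-separated rows with no header or comment lines (the real parser may handle more cases). The helper read_gps_times and the second column in the example are made up for illustration.

```python
import csv
import io


def read_gps_times(handle):
    """Return the zeroth column of a comma-separated file as floats."""
    return [float(row[0]) for row in csv.reader(handle) if row]


# In-memory stand-in for a GPS file; the second column is arbitrary extra data.
example = io.StringIO("1126259462.4,H1L1\n1128678900.4,H1L1\n")
times = read_gps_times(example)
print(times)
```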

--timeslide-file

File containing detector timeslides. Requires a GPS time file to also be provided. One column for each detector. Order of detectors specified by --detectors argument. Number of timeslides must correspond to the number of GPS times provided.

--timeslide-dict

Dictionary containing detector timeslides: applies a fixed offset per detector. E.g. to apply +1s in H1, {H1: 1}

--trigger-time

Either a GPS trigger time, or the event name (e.g. GW150914). For event names, the gwosc package is used to identify the trigger time

--gaussian-noise

If true, use simulated Gaussian noise

Default: False

--n-simulation

Number of simulated segments to use with gaussian-noise. Note, this must match the number of injections specified

Default: 0

--data-dict

Dictionary of paths to gwf, or hdf5 data files

--data-format

If given, the data format to pass to gwpy.timeseries.TimeSeries.read(), see gwpy.github.io/docs/stable/timeseries/io.html

--channel-dict

Channel dictionary: keys relate to the detector with values the channel name, e.g. ‘GDS-CALIB_STRAIN’. For GWOSC open data, set the channel-dict values to ‘GWOSC’. Note, the dictionary should follow basic python dict syntax.

Detector arguments

How to set up the interferometers and power spectral density.

--coherence-test

Run the analysis for all detectors together and for each detector separately

Default: False

--detectors

The names of detectors to use. If given in the ini file, detectors are specified by detectors=[H1, L1]. If given at the command line, as --detectors H1 --detectors L1

--duration

The duration of data around the event to use

Default: 4

--generation-seed

Random seed used during data generation. If no generation seed is provided, a random seed between 1 and 1e6 is selected. If a seed is provided, it is used as the base seed and all generation jobs will have their seeds set as {generation_seed = base_seed + job_idx}.
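
The seed rule above can be illustrated as follows. The helper generation_seeds is hypothetical, and the sketch assumes that job_idx starts at zero, which the help text does not state.

```python
def generation_seeds(base_seed, n_jobs):
    """Per-job seeds following generation_seed = base_seed + job_idx.

    Assumption: job indices run from 0 to n_jobs - 1.
    """
    return [base_seed + job_idx for job_idx in range(n_jobs)]


print(generation_seeds(1000, 3))
```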

--psd-dict

Dictionary of PSD files to use

--psd-fractional-overlap

Fractional overlap of segments used in estimating the PSD

Default: 0.5

--post-trigger-duration

Time (in s) after the trigger_time to the end of the segment

Default: 2.0
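
Taken together, --duration and --post-trigger-duration fix where the analysis segment sits relative to the trigger. The sketch below works this out from the two descriptions; the function segment_bounds is a hypothetical helper, not part of the package.

```python
def segment_bounds(trigger_time, duration=4, post_trigger_duration=2.0):
    """Return (start, end) of the segment around a trigger time.

    The segment ends post_trigger_duration seconds after the trigger
    and spans `duration` seconds in total.
    """
    end_time = trigger_time + post_trigger_duration
    start_time = end_time - duration
    return start_time, end_time


print(segment_bounds(100.0))
```

With the defaults, the segment covers the 2 s before and the 2 s after the trigger.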

--sampling-frequency

Default: 4096

--psd-length

Sets the PSD duration (up to psd-maximum-duration). The PSD duration is calculated as psd-length x duration [s]. Default is 32.

Default: 32

--psd-maximum-duration

The maximum allowed PSD duration in seconds, default is 1024s.

Default: 1024
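
The relation between psd-length, duration and psd-maximum-duration amounts to a capped product. A minimal sketch, with the hypothetical function psd_duration standing in for whatever the package does internally:

```python
def psd_duration(duration=4, psd_length=32, psd_maximum_duration=1024):
    """PSD duration in seconds: psd_length x duration, capped at the maximum."""
    return min(psd_length * duration, psd_maximum_duration)


print(psd_duration())             # 128, from 32 x 4 s
print(psd_duration(duration=64))  # 1024, capped by the maximum
```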

--psd-method

PSD method; see gwpy.timeseries.TimeSeries.psd for options

Default: “median”

--psd-start-time

Start time of data (relative to the segment start) used to generate the PSD. Defaults to psd-duration before the segment start time

--maximum-frequency

The maximum frequency, given either as a float for all detectors or as a dictionary (see minimum-frequency)

--minimum-frequency

The minimum frequency, given either as a float for all detectors or as a dictionary where all keys relate the detector with values of the minimum frequency, e.g. {H1: 10, L1: 20}. If the waveform generation should start at a frequency different from the minimum frequency of the detectors, add another entry to the dictionary, e.g., {H1: 40, L1: 60, waveform: 20}.

Default: “20”
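
To make the two accepted forms concrete, the sketch below resolves either a single float or a dictionary into per-detector values plus a waveform starting frequency. The helper resolve_minimum_frequency is hypothetical, and falling back to the lowest detector value when no waveform entry is given is an assumption, not documented behaviour.

```python
def resolve_minimum_frequency(minimum_frequency, detectors):
    """Return (per-detector minimum frequencies, waveform start frequency)."""
    if isinstance(minimum_frequency, dict):
        per_detector = {det: float(minimum_frequency[det]) for det in detectors}
        # Assumption: without a "waveform" entry, start at the lowest detector value.
        fallback = min(per_detector.values())
        waveform = float(minimum_frequency.get("waveform", fallback))
    else:
        per_detector = {det: float(minimum_frequency) for det in detectors}
        waveform = float(minimum_frequency)
    return per_detector, waveform


print(resolve_minimum_frequency({"H1": 40, "L1": 60, "waveform": 20}, ["H1", "L1"]))
```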

--zero-noise

Use a zero noise realisation

Default: False

--tukey-roll-off

Roll off duration of tukey window in seconds, default is 0.4s

Default: 0.4

--resampling-method

Possible choices: lal, gwpy

Resampling method to use: lal matches the resampling used by lalinference/BayesWave

Default: “lal”

Injection arguments

Whether to include software injections and how to generate them.

--injection

Create data from an injection file

Default: False

--injection-numbers

Specific injection rows to use from the injection_file, e.g. injection_numbers=[0,3] selects the zeroth and third row

--injection-waveform-approximant

The name of the waveform approximant to use to create injections. If none is specified, then the waveform-approximant will be used as the injection-waveform-approximant.

Job submission arguments

How the jobs should be formatted, e.g., which job scheduler to use.

--label

Output label

Default: “label”

--outdir

Output directory

Default: “.”

--periodic-restart-time

Time after which the job will self-evict when scheduler=condor. After this, condor will restart the job. Default is 28800. This is used to decrease the chance of HTCondor hard evictions

Default: 28800

--scheduler-analysis-time

Default: 7-00:00:00

--submit

Attempt to submit the job after the build

Default: False

--condor-job-priority

Job priorities allow a user to sort their HTCondor jobs to determine which ones are attempted first. A job priority can be any integer: larger values denote better priority. By default HTCondor job priority=0.

Default: 0

--log-directory

If given, an alternative path for the log output

Likelihood arguments

Options for setting up the likelihood.

--distance-marginalization

Boolean. If true, use a distance-marginalized likelihood

Default: False

--distance-marginalization-lookup-table

Path to the distance-marginalization lookup table

--phase-marginalization

Boolean. If true, use a phase-marginalized likelihood

Default: False

--time-marginalization

Boolean. If true, use a time-marginalized likelihood

Default: False

--jitter-time

Boolean. If true, and a time-marginalized likelihood is used, ‘time jittering’ will be performed

Default: True

--reference-frame

Reference frame for the sky parameterisation, either ‘sky’ (default) or, e.g., ‘H1L1’

Default: “sky”

--time-reference

Time parameter to sample in, either ‘geocent’ (default) or, e.g., ‘H1’

Default: “geocent”

--likelihood-type

The likelihood. Can be one of [GravitationalWaveTransient, ROQGravitationalWaveTransient] or a python path to a bilby likelihood class available in the user's installation. --roq-folder must be specified if the ROQ likelihood is used

Default: “GravitationalWaveTransient”

--roq-folder

The data for ROQ

--roq-weights

If given, the ROQ weights to use (rather than building them). This must be given along with the roq-folder for checking

--roq-scale-factor

Rescaling factor for the ROQ, default is 1 (no rescaling)

Default: 1

--extra-likelihood-kwargs

Additional keyword arguments to pass to the likelihood. Any arguments which are named bilby_pipe arguments, e.g., distance_marginalization should NOT be included. This is only used if you are not using the GravitationalWaveTransient or ROQGravitationalWaveTransient likelihoods

Output arguments

What kind of output/summary to generate.

--create-plots

Create diagnostic and posterior plots

Default: False

--create-summary

Create a PESummary page

Default: False

--notification

Notification setting for HTCondor jobs. One of ‘Always’, ‘Complete’, ‘Error’, ‘Never’. If defined by ‘Always’, the owner will be notified whenever the job produces a checkpoint, as well as when the job completes. If defined by ‘Complete’, the owner will be notified when the job terminates. If defined by ‘Error’, the owner will only be notified if the job terminates abnormally, or if the job is placed on hold because of a failure, and not by user request. If defined by ‘Never’ (the default), the owner will not receive e-mail, regardless of what happens to the job. Note, an email arg is also required for notifications to be emailed.

Default: Never

--existing-dir

If given, add results to a directory with an existing summary.html file

--webdir

Directory to store summary pages. If not given, defaults to outdir/results_page

--summarypages-arguments

Arguments (in the form of a dictionary) to pass to the summarypages executable

Prior arguments

Specify the prior settings.

--default-prior

The name of the prior set to base the prior on. Can be one of [PriorDict, BBHPriorDict, BNSPriorDict, CalibrationPriorDict]

Default: “BBHPriorDict”

--deltaT

The symmetric width (in s) around the trigger time to search over the coalescence time

Default: 0.2
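
As an illustration of what a symmetric width implies, the sketch below computes the coalescence-time window around a trigger. It assumes deltaT is the total width, split equally on either side of the trigger time; the function coalescence_time_bounds is a hypothetical helper.

```python
def coalescence_time_bounds(trigger_time, deltaT=0.2):
    """Return (minimum, maximum) coalescence time around the trigger.

    Assumption: deltaT is the full width of the window, centred on the trigger.
    """
    half_width = deltaT / 2
    return trigger_time - half_width, trigger_time + half_width


lower, upper = coalescence_time_bounds(100.0)
print(lower, upper)
```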

--convert-to-flat-in-component-mass

Convert a flat-in chirp mass and mass-ratio prior file to flat in component mass during the post-processing. Note, the prior must be uniform in Mc and q with constraints in m1 and m2 for this to work

Default: False

Post processing arguments

What post-processing to perform.

--single-postprocessing-executable

An executable name for postprocessing. A single postprocessing job is run as a child of each analysis job: note the difference with respect to postprocessing-executable

--single-postprocessing-arguments

Arguments to pass to the single postprocessing executable. The str ‘$RESULT’ will be replaced by the path to the individual result file
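
The ‘$RESULT’ substitution can be pictured as a plain string replacement. In the sketch below the function name, the argument string and the result path are all hypothetical examples.

```python
def fill_postprocessing_arguments(arguments, result_path):
    """Replace the literal string '$RESULT' with the path to a result file."""
    return arguments.replace("$RESULT", result_path)


# Both the flags and the path are placeholders chosen for illustration.
filled = fill_postprocessing_arguments(
    "--result $RESULT --outdir plots", "outdir/result/label_result.json"
)
print(filled)
```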

Sampler arguments

--n-parallel

Number of identical parallel jobs to run per event

Default: 1

Waveform arguments

Setting for the waveform generator

--waveform-generator

The waveform generator class, should be a python path. This will not be able to use any arguments not passed to the default.

Default: “bilby.gw.waveform_generator.WaveformGenerator”

--reference-frequency

The reference frequency

Default: 20

--waveform-approximant

The name of the waveform approximant to use for PE.

Default: “IMRPhenomPv2”

--catch-waveform-errors

Turns on waveform error catching

Default: False

--pn-spin-order

Post-Newtonian order to use for the spin

Default: -1

--pn-tidal-order

Post-Newtonian order to use for tides

Default: -1

--pn-phase-order

Post-Newtonian order to use for the phase

Default: -1

--pn-amplitude-order

Post-Newtonian order to use for the amplitude. Also used to determine the waveform starting frequency.

Default: 0

--mode-array

Array of modes to use for the waveform. Should be a list of lists, e.g. [[2,2], [2,-2]]

--frequency-domain-source-model

Name of the frequency domain source model. Can be one of [lal_binary_black_hole, lal_binary_neutron_star, lal_eccentric_binary_black_hole_no_spins, sinegaussian, supernova, supernova_pca_model] or any python path to a bilby source function in the user's installation, e.g. examp.source.bbh

Default: “lal_binary_black_hole”

Slurm Settings

--nodes

Number of nodes to use

--ntasks-per-node

Number of tasks per node

--time

Maximum wall time (defaults to 24:00:00)

Default: “24:00:00”

--mem-per-cpu

Memory per CPU (defaults to None)

--extra-lines

Additional lines, separated by ‘;’, used for setting up the environment (e.g. a conda env)

--slurm-extra-lines

Additional slurm args (args that need #SBATCH in front) of the form arg=val, separated by spaces
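
The two separator conventions (‘;’ for --extra-lines, spaces for --slurm-extra-lines) can be sketched as below. Both helper functions are hypothetical, and the exact form of the lines written to the submit script (here "#SBATCH --arg=val") is an assumption; the example values are placeholders.

```python
def format_extra_lines(extra_lines):
    """Split a ';'-separated string into individual setup lines."""
    return [line.strip() for line in extra_lines.split(";") if line.strip()]


def format_slurm_extra_lines(slurm_extra_lines):
    """Turn space-separated arg=val pairs into #SBATCH directives.

    Assumption: each pair becomes a line of the form '#SBATCH --arg=val'.
    """
    return [f"#SBATCH --{arg}" for arg in slurm_extra_lines.split()]


print(format_extra_lines("module load anaconda; conda activate pbilby"))
print(format_slurm_extra_lines("partition=gpu account=myproject"))
```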