Data Generation¶
Module to generate/prepare data, likelihood, and priors for parallel runs.
This will create a directory structure for your parallel runs to store the output files, logs and plots. It will also generate a data_dump that stores information on the run settings and data to be analysed.
Note that several of these arguments are inherited from bilby_pipe.
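As an illustrative sketch only (assuming the usual bilby_pipe-style ini syntax, in which keys mirror the option names without the leading dashes; the file name and values below are hypothetical), a minimal generation run might look like:

    # my_run.ini (hypothetical)
    label = my-run
    outdir = outdir
    trigger-time = GW150914
    detectors = [H1, L1]
    channel-dict = {H1: GWOSC, L1: GWOSC}
    nodes = 10
    ntasks-per-node = 16
    time = 24:00:00

    # prepare the data, likelihood, and priors
    parallel_bilby_generation my_run.ini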
Command line interface for data generation¶
usage: parallel_bilby_generation [--version] [-n NLIVE] [--dlogz DLOGZ]
[--n-effective N_EFFECTIVE]
[--dynesty-sample DYNESTY_SAMPLE]
[--dynesty-bound DYNESTY_BOUND]
[--walks WALKS] [--maxmcmc MAXMCMC]
[--nact NACT] [--min-eff MIN_EFF]
[--facc FACC] [--vol-dec VOL_DEC]
[--vol-check VOL_CHECK] [--enlarge ENLARGE]
[--n-check-point N_CHECK_POINT]
[--max-its MAX_ITS]
[--max-run-time MAX_RUN_TIME]
[--fast-mpi FAST_MPI]
[--mpi-timing MPI_TIMING]
[--mpi-timing-interval MPI_TIMING_INTERVAL]
[--nestcheck] [--nsamples NSAMPLES]
[--ntemps NTEMPS] [--nwalkers NWALKERS]
[--max-iterations MAX_ITERATIONS]
[--ncheck NCHECK]
[--burn-in-nact BURN_IN_NACT]
[--thin-by-nact THIN_BY_NACT]
[--frac-threshold FRAC_THRESHOLD]
[--nfrac NFRAC] [--min-tau MIN_TAU]
[--Tmax TMAX] [--safety SAFETY]
[--autocorr-c AUTOCORR_C]
[--autocorr-tol AUTOCORR_TOL] [--adapt]
[--bilby-zero-likelihood-mode]
[--sampling-seed SAMPLING_SEED] [-c]
[--no-plot] [--do-not-save-bounds-in-resume]
[--check-point-deltaT CHECK_POINT_DELTAT]
[-h] [-v]
[--calibration-model {CubicSpline,None}]
[--spline-calibration-envelope-dict SPLINE_CALIBRATION_ENVELOPE_DICT]
[--spline-calibration-nodes SPLINE_CALIBRATION_NODES]
[--spline-calibration-amplitude-uncertainty-dict SPLINE_CALIBRATION_AMPLITUDE_UNCERTAINTY_DICT]
[--spline-calibration-phase-uncertainty-dict SPLINE_CALIBRATION_PHASE_UNCERTAINTY_DICT]
[--ignore-gwpy-data-quality-check IGNORE_GWPY_DATA_QUALITY_CHECK]
[--gps-tuple GPS_TUPLE] [--gps-file GPS_FILE]
[--timeslide-file TIMESLIDE_FILE]
[--timeslide-dict TIMESLIDE_DICT]
[--trigger-time TRIGGER_TIME]
[--gaussian-noise]
[--n-simulation N_SIMULATION]
[--data-dict DATA_DICT]
[--data-format DATA_FORMAT]
[--channel-dict CHANNEL_DICT]
[--coherence-test] [--detectors DETECTORS]
[--duration DURATION]
[--generation-seed GENERATION_SEED]
[--psd-dict PSD_DICT]
[--psd-fractional-overlap PSD_FRACTIONAL_OVERLAP]
[--post-trigger-duration POST_TRIGGER_DURATION]
[--sampling-frequency SAMPLING_FREQUENCY]
[--psd-length PSD_LENGTH]
[--psd-maximum-duration PSD_MAXIMUM_DURATION]
[--psd-method PSD_METHOD]
[--psd-start-time PSD_START_TIME]
[--maximum-frequency MAXIMUM_FREQUENCY]
[--minimum-frequency MINIMUM_FREQUENCY]
[--zero-noise ZERO_NOISE]
[--tukey-roll-off TUKEY_ROLL_OFF]
[--resampling-method {lal,gwpy}]
[--injection]
[--injection-dict INJECTION_DICT | --injection-file INJECTION_FILE]
[--injection-numbers INJECTION_NUMBERS]
[--injection-waveform-approximant INJECTION_WAVEFORM_APPROXIMANT]
[--label LABEL] [--outdir OUTDIR]
[--periodic-restart-time PERIODIC_RESTART_TIME]
[--scheduler-analysis-time SCHEDULER_ANALYSIS_TIME]
[--submit]
[--condor-job-priority CONDOR_JOB_PRIORITY]
[--log-directory LOG_DIRECTORY]
[--distance-marginalization]
[--distance-marginalization-lookup-table DISTANCE_MARGINALIZATION_LOOKUP_TABLE]
[--phase-marginalization]
[--time-marginalization]
[--jitter-time JITTER_TIME]
[--reference-frame REFERENCE_FRAME]
[--time-reference TIME_REFERENCE]
[--likelihood-type LIKELIHOOD_TYPE]
[--roq-folder ROQ_FOLDER]
[--roq-weights ROQ_WEIGHTS]
[--roq-scale-factor ROQ_SCALE_FACTOR]
[--extra-likelihood-kwargs EXTRA_LIKELIHOOD_KWARGS]
[--create-plots] [--create-summary]
[--notification NOTIFICATION]
[--existing-dir EXISTING_DIR]
[--webdir WEBDIR]
[--summarypages-arguments SUMMARYPAGES_ARGUMENTS]
[--default-prior DEFAULT_PRIOR]
[--deltaT DELTAT]
[--prior-file PRIOR_FILE | --prior-dict PRIOR_DICT]
[--convert-to-flat-in-component-mass CONVERT_TO_FLAT_IN_COMPONENT_MASS]
[--single-postprocessing-executable SINGLE_POSTPROCESSING_EXECUTABLE]
[--single-postprocessing-arguments SINGLE_POSTPROCESSING_ARGUMENTS]
[--n-parallel N_PARALLEL]
[--waveform-generator WAVEFORM_GENERATOR]
[--reference-frequency REFERENCE_FREQUENCY]
[--waveform-approximant WAVEFORM_APPROXIMANT]
[--catch-waveform-errors CATCH_WAVEFORM_ERRORS]
[--pn-spin-order PN_SPIN_ORDER]
[--pn-tidal-order PN_TIDAL_ORDER]
[--pn-phase-order PN_PHASE_ORDER]
[--pn-amplitude-order PN_AMPLITUDE_ORDER]
[--mode-array MODE_ARRAY]
[--frequency-domain-source-model FREQUENCY_DOMAIN_SOURCE_MODEL]
[--sampler {dynesty,ptemcee}] --nodes NODES
--ntasks-per-node NTASKS_PER_NODE --time TIME
[--mem-per-cpu MEM_PER_CPU]
[--extra-lines EXTRA_LINES]
[--slurm-extra-lines SLURM_EXTRA_LINES]
ini
Positional Arguments¶
- ini
Configuration ini file
Named Arguments¶
- --version
show program’s version number and exit
- -v, --verbose
Verbose output
Default: False
- --injection-dict
A single injection dictionary given in the ini file
- --injection-file
Injection file to use. See bilby_pipe_create_injection_file --help for supported formats
- --prior-file
The prior file
- --prior-dict
A dictionary of priors
- --sampler
Possible choices: dynesty, ptemcee
The parallelised sampler to use, defaults to dynesty
Default: “dynesty”
Dynesty Settings¶
- -n, --nlive
Number of live points
Default: 1000
- --dlogz
Stopping criterion: remaining evidence (default=0.1)
Default: 0.1
- --n-effective
Stopping criterion: effective number of samples (default=inf)
Default: inf
- --dynesty-sample
Dynesty sampling method (default=rwalk). Note, parallel bilby replaces the dynesty rwalk method with an optimised version
Default: “rwalk”
- --dynesty-bound
Dynesty bounding method (default=multi)
Default: “multi”
- --walks
Minimum number of walks, defaults to 100
Default: 100
- --maxmcmc
Maximum number of walks, defaults to 5000
Default: 5000
- --nact
Number of autocorrelation times to take, defaults to 5
Default: 5
- --min-eff
The minimum efficiency at which to switch from uniform sampling.
Default: 10
- --facc
See dynesty.NestedSampler
Default: 0.5
- --vol-dec
See dynesty.NestedSampler
Default: 0.5
- --vol-check
See dynesty.NestedSampler
Default: 8
- --enlarge
See dynesty.NestedSampler
Default: 1.5
- --n-check-point
Steps to take before attempting checkpoint
Default: 100
- --max-its
Maximum number of iterations to sample for (default=1.e10)
Default: 10000000000
- --max-run-time
Maximum time to run for (default=1.e10 s)
Default: 10000000000.0
- --fast-mpi
Fast MPI communication pattern (default=False)
Default: False
- --mpi-timing
Print MPI timing when finished (default=False)
Default: False
- --mpi-timing-interval
Interval to write timing snapshot to disk (default=0 – disabled)
Default: 0
- --nestcheck
Save a ‘nestcheck’ pickle in the outdir (default=False). This pickle stores a nestcheck.data_processing.process_dynesty_run object, which can be used during post-processing to compute the implementation and bootstrap errors explained by Higson et al. (2018) in “Sampling Errors In Nested Sampling Parameter Estimation”.
Default: False
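As a sketch of how these options might appear together (ini keys are assumed to mirror the option names above; the values shown are the listed defaults), a dynesty block could read:

    sampler = dynesty
    nlive = 1000
    dlogz = 0.1
    dynesty-sample = rwalk
    dynesty-bound = multi
    walks = 100
    maxmcmc = 5000
    nact = 5
    n-check-point = 100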
PTEmcee Settings¶
- --nsamples
Number of samples to draw
Default: 10000
- --ntemps
Number of temperatures
Default: 20
- --nwalkers
Number of walkers
Default: 100
- --max-iterations
Maximum number of iterations
Default: 100000
- --ncheck
Period with which to check convergence
Default: 500
- --burn-in-nact
Number of autocorrelation times to discard for burn-in
Default: 50.0
- --thin-by-nact
Thin-by number of autocorrelation times
Default: 1.0
- --frac-threshold
Threshold on the fractional change in ACT required for convergence
Default: 0.01
- --nfrac
The number of checks passing the frac-threshold for convergence
Default: 5
- --min-tau
The minimum tau to accept: used to prevent early convergence
Default: 30
- --Tmax
The maximum temperature to use, default=10000
Default: 10000
- --safety
Multiplicative safety factor on the estimated tau
Default: 1.0
- --autocorr-c
The step size for the window search when calculating tau. Default: 5
Default: 5.0
- --autocorr-tol
The minimum number of autocorrelation times needed to trust the autocorrelation estimate; a value of 0 always returns a result
Default: 50.0
- --adapt
If True, the temperature ladder is dynamically adapted as the sampler runs to achieve uniform swap acceptance ratios between adjacent chains. See arXiv:1501.05823 for details.
Default: False
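Similarly, a ptemcee configuration (a sketch, using the defaults listed above as illustrative values) might read:

    sampler = ptemcee
    nsamples = 10000
    ntemps = 20
    nwalkers = 100
    ncheck = 500
    burn-in-nact = 50.0
    thin-by-nact = 1.0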
Misc. Settings¶
- --bilby-zero-likelihood-mode
Default: False
- --sampling-seed
Random seed for sampling; the seeds of parallel runs are incremented from this value
Default: 1234
- -c, --clean
Run clean: ignore any resume files
Default: False
- --no-plot
If true, don’t generate check-point plots
Default: False
- --do-not-save-bounds-in-resume
If true, do not store bounds in the resume file. Saving the bounds can make resume files large (~GB)
Default: False
- --check-point-deltaT
Write a checkpoint resume file and diagnostic plots every deltaT [s].
Default: 3600
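For instance (a sketch; ini keys are assumed to mirror the option names, including the clean key for the -c/--clean flag), the seeding and checkpointing behaviour could be pinned down with:

    sampling-seed = 1234
    check-point-deltaT = 3600
    no-plot = False
    clean = False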
Calibration arguments¶
Which calibration model and settings to use.
- --calibration-model
Possible choices: CubicSpline, None
Choice of calibration model, if None, no calibration is used
- --spline-calibration-envelope-dict
Dictionary pointing to the spline calibration envelope files
- --spline-calibration-nodes
Number of calibration nodes
Default: 5
- --spline-calibration-amplitude-uncertainty-dict
Dictionary of the amplitude uncertainties for the constant uncertainty model
- --spline-calibration-phase-uncertainty-dict
Dictionary of the phase uncertainties for the constant uncertainty model
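For example (a sketch; the envelope file paths are hypothetical), calibration with the cubic spline model might be configured as:

    calibration-model = CubicSpline
    spline-calibration-envelope-dict = {H1: H1_calibration_envelope.txt, L1: L1_calibration_envelope.txt}
    spline-calibration-nodes = 10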
Data generation arguments¶
How to generate the data, e.g., from a list of gps times or simulated Gaussian noise.
- --ignore-gwpy-data-quality-check
Ignore the check that data queried from GWpy (i.e., not simulated Gaussian noise) was obtained while the IFOs were in science mode.
Default: True
- --gps-tuple
Tuple of the (start, step, number) of GPS start times. For example, (10, 1, 3) produces the gps start times [10, 11, 12]. If given, gps-file is ignored.
- --gps-file
File containing segment GPS start times. This can be a multi-column file if (a) it is comma-separated and (b) the zeroth column contains the gps-times to use
- --timeslide-file
File containing detector timeslides. Requires a GPS time file to also be provided. One column for each detector. Order of detectors specified by the --detectors argument. Number of timeslides must correspond to the number of GPS times provided.
- --timeslide-dict
Dictionary containing detector timeslides: applies a fixed offset per detector. E.g. to apply +1s in H1, {H1: 1}
- --trigger-time
Either a GPS trigger time, or the event name (e.g. GW150914). For event names, the gwosc package is used to identify the trigger time
- --gaussian-noise
If true, use simulated Gaussian noise
Default: False
- --n-simulation
Number of simulated segments to use with gaussian-noise. Note, this must match the number of injections specified
Default: 0
- --data-dict
Dictionary of paths to gwf, or hdf5 data files
- --data-format
If given, the data format to pass to gwpy.timeseries.TimeSeries.read(); see gwpy.github.io/docs/stable/timeseries/io.html
- --channel-dict
Channel dictionary: keys are detector names and values are channel names, e.g. ‘GDS-CALIB_STRAIN’. For GWOSC open data, set each detector’s channel to ‘GWOSC’. Note, the dictionary should follow basic python dict syntax (see the example below).
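Putting these together, a sketch of a data block for open GWOSC data (illustrative values) is:

    trigger-time = GW150914
    channel-dict = {H1: GWOSC, L1: GWOSC}

    # or, for simulated Gaussian noise instead of real data:
    # gaussian-noise = True
    # n-simulation = 1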
Detector arguments¶
How to set up the interferometers and power spectral density.
- --coherence-test
Run the analysis for all detectors together and for each detector separately
Default: False
- --detectors
The names of detectors to use. If given in the ini file, detectors are specified by detectors=[H1, L1]. If given at the command line, as --detectors H1 --detectors L1
- --duration
The duration of data around the event to use
Default: 4
- --generation-seed
Random seed used during data generation. If no generation seed is provided, a random seed between 1 and 1e6 is selected. If a seed is provided, it is used as the base seed and all generation jobs will have their seeds set as {generation_seed = base_seed + job_idx}.
- --psd-dict
Dictionary of PSD files to use
- --psd-fractional-overlap
Fractional overlap of segments used in estimating the PSD
Default: 0.5
- --post-trigger-duration
Time (in s) after the trigger_time to the end of the segment
Default: 2.0
- --sampling-frequency
Default: 4096
- --psd-length
Sets the PSD duration (up to psd-maximum-duration). The PSD duration is calculated as psd-length x duration [s]. Default is 32.
Default: 32
- --psd-maximum-duration
The maximum allowed PSD duration in seconds, default is 1024s.
Default: 1024
- --psd-method
PSD method; see gwpy.timeseries.TimeSeries.psd for options
Default: “median”
- --psd-start-time
Start time of data (relative to the segment start) used to generate the PSD. Defaults to psd-duration before the segment start time
- --maximum-frequency
The maximum frequency, given either as a float for all detectors or as a dictionary (see minimum-frequency)
- --minimum-frequency
The minimum frequency, given either as a float for all detectors or as a dictionary where the keys are detector names and the values are the minimum frequencies, e.g. {H1: 10, L1: 20}. If the waveform generation should start below the minimum frequency for any of the detectors, add another entry to the dictionary, e.g., {H1: 40, L1: 60, waveform: 20} (see the example at the end of this section).
Default: “20”
- --zero-noise
Use a zero noise realisation
Default: False
- --tukey-roll-off
Roll-off duration of the Tukey window in seconds, default is 0.4s
Default: 0.4
- --resampling-method
Possible choices: lal, gwpy
Resampling method to use: lal matches the resampling used by lalinference/BayesWave
Default: “lal”
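A sketch of the detector and PSD settings (the PSD file paths are hypothetical; the dictionary forms follow the syntax shown above) might read:

    detectors = [H1, L1]
    duration = 4
    sampling-frequency = 4096
    minimum-frequency = {H1: 20, L1: 20, waveform: 20}
    maximum-frequency = 1024
    psd-dict = {H1: H1_psd.txt, L1: L1_psd.txt}
    tukey-roll-off = 0.4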
Injection arguments¶
Whether to include software injections and how to generate them.
- --injection
Create data from an injection file
Default: False
- --injection-numbers
Specific injection rows to use from the injection_file, e.g. injection_numbers=[0,3] selects the zeroth and third rows (see the example below)
- --injection-waveform-approximant
The name of the waveform approximant to use to create injections. If none is specified, the waveform-approximant will be used as the injection-waveform-approximant.
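A sketch of an injection block (the injection file name is hypothetical; see bilby_pipe_create_injection_file --help for how to build one) could be:

    injection = True
    injection-file = injections.json
    injection-numbers = [0, 3]
    injection-waveform-approximant = IMRPhenomPv2
    gaussian-noise = True
    n-simulation = 2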
Job submission arguments¶
How the jobs should be formatted, e.g., which job scheduler to use.
- --label
Output label
Default: “label”
- --outdir
Output directory
Default: “.”
- --periodic-restart-time
Time after which the job will self-evict when scheduler=condor. After this, condor will restart the job. Default is 28800. This is used to decrease the chance of HTCondor hard evictions
Default: 28800
- --scheduler-analysis-time
Default: 7-00:00:00
- --submit
Attempt to submit the job after the build
Default: False
- --condor-job-priority
Job priorities allow a user to rank their HTCondor jobs to determine which are attempted first. A job priority can be any integer: larger values denote higher priority. By default, the HTCondor job priority is 0.
Default: 0
- --log-directory
If given, an alternative path for the log output
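For example (a sketch with illustrative values and paths), the output location and scheduler bookkeeping might be set with:

    label = GW150914
    outdir = outdir
    log-directory = outdir/logs
    scheduler-analysis-time = 7-00:00:00
    submit = False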
Likelihood arguments¶
Options for setting up the likelihood.
- --distance-marginalization
Boolean. If true, use a distance-marginalized likelihood
Default: False
- --distance-marginalization-lookup-table
Path to the distance-marginalization lookup table
- --phase-marginalization
Boolean. If true, use a phase-marginalized likelihood
Default: False
- --time-marginalization
Boolean. If true, use a time-marginalized likelihood
Default: False
- --jitter-time
Boolean. If true, and using a time-marginalized likelihood, ‘time jittering’ will be performed
Default: True
- --reference-frame
Reference frame for the sky parameterisation, either ‘sky’ (default) or, e.g., ‘H1L1’
Default: “sky”
- --time-reference
Time parameter to sample in, either ‘geocent’ (default) or, e.g., ‘H1’
Default: “geocent”
- --likelihood-type
The likelihood. Can be one of [GravitationalWaveTransient, ROQGravitationalWaveTransient] or a python path to a bilby likelihood class available in the user’s installation. --roq-folder must be specified if the ROQ likelihood is used
Default: “GravitationalWaveTransient”
- --roq-folder
The data for ROQ
- --roq-weights
If given, the ROQ weights to use (rather than building them). This must be given along with the roq-folder for checking
- --roq-scale-factor
Rescaling factor for the ROQ, default is 1 (no rescaling)
Default: 1
- --extra-likelihood-kwargs
Additional keyword arguments to pass to the likelihood. Any arguments which are named bilby_pipe arguments, e.g., distance_marginalization, should NOT be included. This is only used if you are not using the GravitationalWaveTransient or ROQGravitationalWaveTransient likelihoods
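A sketch of a typical likelihood block (the ROQ folder path is hypothetical and only needed for the ROQ likelihood) might be:

    distance-marginalization = True
    phase-marginalization = True
    time-marginalization = True
    jitter-time = True
    likelihood-type = GravitationalWaveTransient

    # for an ROQ analysis instead:
    # likelihood-type = ROQGravitationalWaveTransient
    # roq-folder = /path/to/roq_basis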
Output arguments¶
What kind of output/summary to generate.
- --create-plots
Create diagnostic and posterior plots
Default: False
- --create-summary
Create a PESummary page
Default: False
- --notification
Notification setting for HTCondor jobs. One of ‘Always’, ‘Complete’, ‘Error’, ‘Never’. If defined by ‘Always’, the owner will be notified whenever the job produces a checkpoint, as well as when the job completes. If defined by ‘Complete’, the owner will be notified when the job terminates. If defined by ‘Error’, the owner will only be notified if the job terminates abnormally, or if the job is placed on hold because of a failure, and not by user request. If defined by ‘Never’ (the default), the owner will not receive e-mail, regardless of what happens to the job. Note, an email arg is also required for notifications to be emailed.
Default: Never
- --existing-dir
If given, add results to a directory with an existing summary.html file
- --webdir
Directory to store summary pages. If not given, defaults to outdir/results_page
- --summarypages-arguments
Arguments (in the form of a dictionary) to pass to the summarypages executable
Prior arguments¶
Specify the prior settings.
- --default-prior
The name of the prior set to base the prior on. Can be one of [PriorDict, BBHPriorDict, BNSPriorDict, CalibrationPriorDict]
Default: “BBHPriorDict”
- --deltaT
The symmetric width (in s) around the trigger time to search over the coalescence time
Default: 0.2
- --convert-to-flat-in-component-mass
Convert a prior that is flat in chirp mass and mass ratio to one that is flat in component mass during post-processing. Note, the prior must be uniform in Mc and q with constraints in m1 and m2 for this to work
Default: False
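For instance (a sketch; the prior file name is hypothetical), the prior could be specified through the default prior set plus a prior file:

    default-prior = BBHPriorDict
    deltaT = 0.2
    prior-file = my_run.prior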
Post processing arguments¶
What post-processing to perform.
- --single-postprocessing-executable
An executable name for postprocessing. A single postprocessing job is run as a child of each analysis job: note the difference with respect to postprocessing-executable
- --single-postprocessing-arguments
Arguments to pass to the single postprocessing executable. The str ‘$RESULT’ will be replaced by the path to the individual result file
Sampler arguments¶
- --n-parallel
Number of identical parallel jobs to run per event
Default: 1
Waveform arguments¶
Setting for the waveform generator
- --waveform-generator
The waveform generator class, given as a python path. It cannot take any arguments beyond those passed to the default waveform generator.
Default: “bilby.gw.waveform_generator.WaveformGenerator”
- --reference-frequency
The reference frequency
Default: 20
- --waveform-approximant
The name of the waveform approximant to use for PE.
Default: “IMRPhenomPv2”
- --catch-waveform-errors
Turns on waveform error catching
Default: False
- --pn-spin-order
Post-Newtonian order to use for the spin
Default: -1
- --pn-tidal-order
Post-Newtonian order to use for tides
Default: -1
- --pn-phase-order
Post-Newtonian order to use for the phase
Default: -1
- --pn-amplitude-order
Post-Newtonian order to use for the amplitude. Also used to determine the waveform starting frequency.
Default: 0
- --mode-array
Array of modes to use for the waveform. Should be a list of lists, e.g. [[2,2], [2,-2]]
- --frequency-domain-source-model
Name of the frequency domain source model. Can be one of [lal_binary_black_hole, lal_binary_neutron_star, lal_eccentric_binary_black_hole_no_spins, sinegaussian, supernova, supernova_pca_model] or any python path to a bilby source function in the user’s installation, e.g. examp.source.bbh
Default: “lal_binary_black_hole”
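A sketch of the waveform settings (using the mode-array syntax shown above; values are illustrative) might read:

    waveform-approximant = IMRPhenomPv2
    reference-frequency = 20
    frequency-domain-source-model = lal_binary_black_hole
    # restrict to particular modes (only meaningful for multipole-aware approximants):
    # mode-array = [[2,2], [2,-2]]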
Slurm Settings¶
- --nodes
Number of nodes to use
- --ntasks-per-node
Number of tasks per node
- --time
Maximum wall time (defaults to 24:00:00)
Default: “24:00:00”
- --mem-per-cpu
Memory per CPU (defaults to None)
- --extra-lines
Additional lines, separated by ‘;’, used for setting up the conda environment
- --slurm-extra-lines
Additional Slurm arguments (arguments that need #SBATCH in front) of the form arg=val, separated by spaces (see the example below)
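Finally, a sketch of the required Slurm block (the node counts, memory, partition, account, module, and environment names are placeholders for your own cluster):

    nodes = 10
    ntasks-per-node = 16
    time = 24:00:00
    mem-per-cpu = 2G
    slurm-extra-lines = partition=my-partition account=my-account
    extra-lines = module load anaconda; source activate pbilby-env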