Using tf.data.Datasets for loading

Loading data with datasets is a little complicated because of the precise way in which data has to be fed to the different devices. There are several helpers, though, to try to make this a bit easier.

DatasetGradientIMNN

class imnn.DatasetGradientIMNN(n_s, n_d, n_params, n_summaries, input_shape, θ_fid, model, optimiser, key_or_state, main, remaining, host, devices, n_per_device, validation_main=None, validation_remaining=None)

Information maximising neural network fit using known derivatives

The outline of the fitting procedure is that a set of \(i\in[1, n_s]\) simulations \({\bf d}^i\), originally generated at fiducial model parameters \({\bf\theta}^{\rm fid}\), and their derivatives \(\partial{\bf d}^i/\partial\theta_\alpha\) with respect to the model parameters are used. The fiducial simulations, \({\bf d}^i\), are passed through a network to obtain summaries, \({\bf x}^i\), and the jax automatic derivative of these summaries with respect to the inputs, \(\partial{\bf x}^i/\partial{\bf d}^j\delta_{ij}\), is calculated. The chain rule is then used to calculate

\[\frac{\partial{\bf x}^i}{\partial\theta_\alpha} = \frac{\partial{\bf x}^i}{\partial{\bf d}^j} \frac{\partial{\bf d}^j}{\partial\theta_\alpha}\]
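
As an illustration of this chain rule (a minimal sketch, not the library's internal implementation; apply_fn and w are placeholders for whatever network function and parameters are being used), the known simulation derivative can be pushed through the network with jax.jvp, one model parameter at a time:

import jax
import jax.numpy as np

def summary_derivative(apply_fn, w, d, dd_dθ):
    # dd_dθ has shape input_shape + (n_params,): the known derivative of the
    # simulation with respect to each model parameter
    x = apply_fn(w, d)
    dx_dθ = np.stack([
        jax.jvp(lambda data: apply_fn(w, data), (d,), (dd_dθ[..., a],))[1]
        for a in range(dd_dθ.shape[-1])], axis=-1)
    return x, dx_dθ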

With \({\bf x}^i\) and \(\partial{{\bf x}^i}/\partial\theta_\alpha\) the covariance

\[C_{ab} = \frac{1}{n_s-1}\sum_{i=1}^{n_s}(x^i_a-\mu_a) (x^i_b-\mu_b)\]

and the derivative of the mean of the network outputs with respect to the model parameters

\[\frac{\partial\mu_a}{\partial\theta_\alpha} = \frac{1}{n_d} \sum_{i=1}^{n_d}\frac{\partial{x^i_a}}{\partial\theta_\alpha}\]

can be calculated and used to form the Fisher information matrix

\[F_{\alpha\beta} = \frac{\partial\mu_a}{\partial\theta_\alpha} C^{-1}_{ab}\frac{\partial\mu_b}{\partial\theta_\beta}.\]
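
To make these definitions concrete, here is a minimal sketch (with assumed names, not the library's internal code) computing the covariance, the derivative of the mean and the Fisher information from network summaries x of shape (n_s, n_summaries) and summary derivatives dx_dθ of shape (n_d, n_summaries, n_params):

import jax.numpy as np

def fisher_statistics(x, dx_dθ):
    n_s = x.shape[0]
    # mean and covariance of the network outputs over the fiducial summaries
    μ = np.mean(x, axis=0)
    Δ = x - μ
    C = Δ.T @ Δ / (n_s - 1)
    invC = np.linalg.inv(C)
    # derivative of the mean of the outputs with respect to the parameters
    dμ_dθ = np.mean(dx_dθ, axis=0)
    # F_{αβ} = dμ_a/dθ_α C^{-1}_{ab} dμ_b/dθ_β
    F = np.einsum("ai,ab,bj->ij", dμ_dθ, invC, dμ_dθ)
    return F, C, invC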

The loss function is then defined as

\[\Lambda = -\log|{\bf F}| + r(\Lambda_2) \Lambda_2\]

Since any linear rescaling of a sufficient statistic is also a sufficient statistic, the negative logarithm of the determinant of the Fisher information matrix needs to be regularised to fix the scale of the network outputs. We choose to fix this scale by constraining the covariance of network outputs as

\[\Lambda_2 = ||{\bf C}-{\bf I}|| + ||{\bf C}^{-1}-{\bf I}||\]

This constraint is chosen because it forces the covariance to be approximately parameter independent, which justifies choosing the covariance-independent Gaussian Fisher information as above. To avoid having a dual optimisation objective, we use a smooth and dynamic regularisation strength which turns off the regularisation, focusing on maximising the Fisher information once the covariance has set the scale

\[r(\Lambda_2) = \frac{\lambda\Lambda_2}{\Lambda_2-\exp (-\alpha\Lambda_2)}.\]
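
Continuing the sketch above (again with assumed names, as an illustration of the formulas rather than the library's implementation; the Frobenius norm is assumed), the regularisation term, its coupling strength and the full loss can be written as:

import jax.numpy as np

def regularised_loss(F, C, invC, λ, α):
    I = np.eye(C.shape[0])
    # Λ2 = ||C - I|| + ||C^{-1} - I||
    Λ2 = np.linalg.norm(C - I) + np.linalg.norm(invC - I)
    # smooth, dynamic regularisation strength r(Λ2)
    r = λ * Λ2 / (Λ2 - np.exp(-α * Λ2))
    # Λ = -log|F| + r(Λ2) Λ2
    return -np.linalg.slogdet(F)[1] + r * Λ2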

To enable the use of large data (or networks) the whole procedure is aggregated. This means that the passing of the simulations through the network is farmed out to the desired XLA devices and recollected, n_per_device inputs at a time. These are then used to calculate the automatic gradient of the loss function with respect to the calculated summaries and derivatives, \(\partial\Lambda/\partial{\bf x}^i\) (which is a fairly small computation as long as n_summaries and n_s (and n_d) are not huge). Once this is calculated, the simulations are passed through the network again, this time calculating the Jacobian of the network outputs with respect to the network parameters, \(\partial{\bf x}^i/\partial{\bf w}\), which is then combined via the chain rule to get

\[\frac{\partial\Lambda}{\partial{\bf w}} = \frac{\partial\Lambda}{\partial{\bf x}^i} \frac{\partial{\bf x}^i}{\partial{\bf w}}\]

This can then be passed to the optimiser.
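
A minimal sketch of this two-pass aggregation (with assumed names; the library's own version lives in get_gradient and _construct_gradient): the gradient of the loss with respect to the summaries is computed first, then each n_per_device batch of simulations is passed through the network again and contracted with it using jax.vjp.

import jax

def aggregated_gradient(apply_fn, w, batches, dΛ_dx_batches):
    # accumulate dΛ/dw = Σ_i dΛ/dx^i · dx^i/dw, one batch at a time
    grad = None
    for d, dΛ_dx in zip(batches, dΛ_dx_batches):
        # vjp returns a function contracting a cotangent with dx/dw
        _, vjp_fn = jax.vjp(lambda w_: apply_fn(w_, d), w)
        (g,) = vjp_fn(dΛ_dx)
        grad = g if grad is None else jax.tree_util.tree_map(
            lambda a, b: a + b, grad, g)
    return grad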

In DatasetGradientIMNN the input datasets should be lists of n_devices tf.data.Datasets. Please note that, due to the many different ways of constructing datasets to load data, there is no checking, and any improperly made dataset will either fail (best case) or provide the wrong result (worst case!). For this reason it is advised to use AggregatedGradientIMNN() if the data will fit into CPU memory. If not, the next safest way is to construct a set of TFRecords and build the datasets from those.

Examples

Here are various ways to construct the datasets for passing to DatasetGradientIMNN. Note that these are not the only ways, but they should give something to follow to generate your own datasets. First we’ll generate some data (just random noise with zero mean and unit variance). We’ll generate 1000 simulations at the fiducial parameter values and we’ll use jax to calculate the derivatives with respect to the mean and variance for 100 of these. We’ll save each of these simulations into its own individual file (named by seed value).

import glob
import jax
import jax.numpy as np
import tensorflow as tf
from imnn import TFRecords
from functools import partial
from imnn.utils import value_and_jacfwd

n_s = 1000
n_d = 100
n_params = 2
input_shape = (10,)

def simulator(key, θ):
    return θ[0] + (jax.random.normal(key, shape=input_shape)
        * np.sqrt(θ[1]))

θ_fid = np.array([0., 1.])

get_sims_and_ders = value_and_jacfwd(simulator, argnums=1)

rng = jax.random.PRNGKey(0)
rng, data_key = jax.random.split(rng)
data_keys = np.array(jax.random.split(rng, num=2 * n_s))

fiducial, derivative = jax.vmap(get_sims_and_ders)(
    data_keys[:n_d], np.repeat(np.expand_dims(θ_fid, 0), n_d, axis=0))

remaining = jax.vmap(simulator)(
    data_keys[n_d:n_s],
    np.repeat(np.expand_dims(θ_fid, 0), n_s - n_d, axis=0))

validation_fiducial, validation_derivative = jax.vmap(
    get_sims_and_ders)(
        data_keys[n_s:n_s + n_d],
        np.repeat(np.expand_dims(θ_fid, 0), n_d, axis=0))

validation_remaining = jax.vmap(simulator)(
    data_keys[n_s + n_d:],
    np.repeat(np.expand_dims(θ_fid, 0), n_s - n_d, axis=0))

for i, (simulation, validation_simulation) in enumerate(zip(
        fiducial, validation_fiducial)):
    np.save(f"tmp/fiducial_{i:04d}.npy", simulation)
    np.save(f"tmp/validation_fiducial_{i:04d}.npy",
            validation_simulation)

for i, (simulation, validation_simulation) in enumerate(zip(
        derivative, validation_derivative)):
    np.save(f"tmp/derivative_{i:04d}.npy", simulation)
    np.save(f"tmp/validation_derivative_{i:04d}.npy",
            validation_simulation)
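
The generator- and TFRecord-based examples below also read the remaining simulations from disk, so (assuming the same tmp directory) we save those in the same way:

for i, (simulation, validation_simulation) in enumerate(zip(
        remaining, validation_remaining)):
    np.save(f"tmp/remaining_{i:04d}.npy", simulation)
    np.save(f"tmp/validation_remaining_{i:04d}.npy",
            validation_simulation)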

Now we’ll define how many devices to farm our calculations out to. We need to know this because we want to make a separate dataset for each device. We’ll also set the number of simulations which can be processed at once on each device; for the quickest fitting this should be as high as possible without running out of memory on any individual device.

devices = jax.devices("gpu")
n_devices = len(devices)
n_per_device = 100

To best accelerate the aggregation of the gradient calculation, the computation is split into two parts: a main loop which loops through the n_d simulations together with their derivatives with respect to the model parameters, and a remaining loop of n_s - n_d iterations, where just the simulations are looped through to calculate any other summaries needed to estimate the covariance. Note this is true even if n_s = n_d; the remaining loop then simply has zero iterations. So, to construct the datasets, define the shapes the data should be reshaped into for fitting the IMNN (a quick check of these shapes follows the definitions).

batch_shape = (
    n_devices,
    n_d // (n_devices * n_per_device),
    n_per_device) + input_shape

remaining_batch_shape = (
    n_devices,
    (n_s - n_d) // (n_devices * n_per_device),
    n_per_device) + input_shape
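
As a quick check of the bookkeeping (an illustration only, assuming a single device so n_devices = 1, with the example numbers above):

# with n_s = 1000, n_d = 100, n_per_device = 100 and n_devices = 1:
print(batch_shape)            # (1, 1, 100, 10) -> 1 iteration of the main loop
print(remaining_batch_shape)  # (1, 9, 100, 10) -> 9 iterations of the remaining loop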

The simplest way to construct a dataset is directly from the numpy arrays in memory (note that if you’re going to do this you should really just use AggregatedGradientIMNN, it’s more or less the same!), i.e.

main = [
    tf.data.Dataset.from_tensor_slices(
        (fiducial, derivative)).repeat().as_numpy_iterator()
    for fiducial, derivative in zip(
        fiducial.reshape(batch_shape),
        derivative.reshape(batch_shape + (n_params,)))]

remaining = [
    tf.data.Dataset.from_tensor_slices(fiducial
        ).repeat().as_numpy_iterator()
    for fiducial in remaining.reshape(
        remaining_batch_shape)]

validation_main = [
    tf.data.Dataset.from_tensor_slices(
        (fiducial, derivative)).repeat().as_numpy_iterator()
    for fiducial, derivative in zip(
        validation_fiducial.reshape(batch_shape),
        validation_derivative.reshape(batch_shape + (n_params,)))]

validation_remaining = [
    tf.data.Dataset.from_tensor_slices(fiducial
        ).repeat().as_numpy_iterator()
    for fiducial in validation_remaining.reshape(
        remaining_batch_shape)]

However, if the data is too large to fit in memory then we can use the npy files that we saved by loading them via a generator

def generator(directory, filename, total):
    i = 0
    while i < total:
        yield np.load(f"{directory}/{filename}_{i:04d}.npy")
        i += 1

We can then build the datasets like:

main = [
    tf.data.Dataset.zip((
         tf.data.Dataset.from_generator(
             partial(
                 generator,
                 "tmp",
                 "fiducial",
                 n_d),
             tf.float32),
        tf.data.Dataset.from_generator(
             partial(
                 generator,
                 "tmp",
                 "derivative",
                 n_d),
             tf.float32))
        ).take(n_d // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for _ in range(n_devices)]

remaining = [
    tf.data.Dataset.from_generator(
        partial(
            generator,
            "tmp",
            "remaining",
            n_s - n_d),
        tf.float32
        ).take((n_s - n_d) // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for _ in range(n_devices)]

validation_main = [
    tf.data.Dataset.zip((
         tf.data.Dataset.from_generator(
             partial(
                 generator,
                 "tmp",
                 "validation_fiducial",
                 n_d),
             tf.float32),
        tf.data.Dataset.from_generator(
             partial(
                 generator,
                 "tmp",
                 "validation_derivative",
                 n_d),
             tf.float32))
        ).take(n_d // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for _ in range(n_devices)]

validation_remaining = [
    tf.data.Dataset.from_generator(
        partial(
            generator,
            "tmp",
            "validation_remaining",
            n_s - n_d),
        tf.float32
        ).take((n_s - n_d) // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for _ in range(n_devices)]

The datasets must be built exactly like this, with the taking and batching and repeating. The zipping of the main datasets is equally important: it passes both the fiducial simulation and its derivative at once, which is needed to calculate the final gradient. To prefetch and cache the loaded files we can add the extra steps to the datasets, e.g.

main = [
    tf.data.Dataset.zip((
         tf.data.Dataset.from_generator(
             partial(
                 generator,
                 "tmp",
                 "fiducial",
                 n_d),
             tf.float32),
        tf.data.Dataset.from_generator(
             partial(
                 generator,
                 "tmp",
                 "derivative",
                 n_d),
             tf.float32))
        ).take(n_d // n_devices
        ).batch(n_per_device
        ).cache(
        ).prefetch(tf.data.AUTOTUNE
        ).repeat(
        ).as_numpy_iterator()
    for _ in range(n_devices)]

and similarly for the remaining, validation_main and validation_remaining datasets.

This loading will be quite slow because the files need to be opened each time, but we can build TFRecords which are quicker to load. There is a writer, TFRecords, able to write the correct format. The TFRecords should be a couple of hundred Mb each for best flow-through, so we keep filling a record until this size is reached.

record_size = 200 #Mb
writer = TFRecords(record_size=record_size)

We need a function which grabs single simulations from an array (or file) to add to the record

def get_simulation(seed, directory=None, filename=None):
    return np.load(f"{directory}/{filename}_{seed:04d}.npy")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="fiducial"),
    directory="tmp",
    filename="fiducial")

writer.write_record(
    n_sims=n_s - n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="remaining"),
    directory="tmp",
    filename="remaining")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="derivative"),
    directory="tmp",
    filename="derivative")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="validation_fiducial"),
    directory="tmp",
    filename="validation_fiducial")

writer.write_record(
    n_sims=n_s - n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="validation_remaining"),
    directory="tmp",
    filename="validation_remaining")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="validation_derivative"),
    directory="tmp",
    filename="validation_derivative")

We can then read these into a dataset, using the parser from the TFRecords class to map the serialised data to 32-bit floats

fiducial = [
    tf.data.TFRecordDataset(
            sorted(glob.glob("tmp/fiducial_*.tfrecords")),
            num_parallel_reads=1
        ).map(writer.parser
        ).skip(i * n_s // n_devices
        ).take(n_s // n_devices)
    for i in range(n_devices)]

main = [
    tf.data.Dataset.zip((
        fiducial[i],
        tf.data.TFRecordDataset(
            sorted(glob.glob("tmp/derivative_*.tfrecords")),
            num_parallel_reads=1).map(
                lambda example: writer.derivative_parser(
                    example, n_params=n_params)))
        ).take(n_d // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for i in range(n_devices)]

remaining = [
    fiducial[i].skip(n_d // n_devices
        ).take((n_s - n_d) // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for i in range(n_devices)]

validation_fiducial = [
    tf.data.TFRecordDataset(
            sorted(glob.glob("tmp/validation_fiducial_*.tfrecords")),
            num_parallel_reads=1
        ).map(writer.parser
        ).skip(i * n_s // n_devices
        ).take(n_s // n_devices)
    for i in range(n_devices)]

validation_main = [
    tf.data.Dataset.zip((
        validation_fiducial[i],
        tf.data.TFRecordDataset(
            sorted(glob.glob("tmp/validation_derivative_*.tfrecords")),
            num_parallel_reads=1).map(
                lambda example: writer.derivative_parser(
                    example, n_params=n_params)))
        ).take(n_d // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for i in range(n_devices)]

validation_remaining = [
    validation_fiducial[i].skip(n_d // n_devices
        ).take((n_s - n_d) // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for i in range(n_devices)]
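
With any of the main, remaining, validation_main and validation_remaining lists built as above, the IMNN itself can then be constructed. This is only a sketch: n_summaries, model, optimiser, model_key and host are placeholders for the network, optimiser, initialisation key and host device described in the parameter list below, and the fit values are purely illustrative.

import imnn

IMNN = imnn.DatasetGradientIMNN(
    n_s=n_s, n_d=n_d, n_params=n_params, n_summaries=n_summaries,
    input_shape=input_shape, θ_fid=θ_fid, model=model,
    optimiser=optimiser, key_or_state=model_key,
    host=host, devices=devices, n_per_device=n_per_device,
    main=main, remaining=remaining,
    validation_main=validation_main,
    validation_remaining=validation_remaining)

# fitting then proceeds as usual, e.g.
# IMNN.fit(λ=10., ε=0.1, rng=rng)
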
Parameters
  • main (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs and their derivatives with respect to the physical model parameters (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

  • remaining (list of tf.data.Dataset().as_numpy_iterators()) – The n_s - n_d simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs without a derivative counterpart (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

  • validation_main (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs and their derivatives with respect to the physical model parameters (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

  • validation_remaining (list of tf.data.Dataset().as_numpy_iterators()) – The n_s - n_d simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs without a derivative counterpart (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

  • n_remaining (int) – The number of simulations where only the fiducial simulations are calculated. This is zero if n_s is equal to n_d.

  • n_iterations (int) – Number of iterations through the main summarising loop

  • n_remaining_iterations (int) – Number of iterations through the remaining simulations used for quick loops with no derivatives

  • batch_shape (tuple) – The shape which n_d should be reshaped to for aggregating: (n_devices, n_d // (n_devices * n_per_device), n_per_device)

  • remaining_batch_shape (tuple) – The shape which n_s - n_d should be reshaped to for aggregating: (n_devices, (n_s - n_d) // (n_devices * n_per_device), n_per_device)

Public Methods:

__init__(n_s, n_d, n_params, n_summaries, …)

Constructor method

Inherited from AggregatedGradientIMNN

__init__(n_s, n_d, n_params, n_summaries, …)

Constructor method

get_summary(input, w, θ[, derivative, gradient])

Returns a single summary of a simulation or its gradient

get_summaries(w[, key, validate])

Gets all network outputs and derivatives wrt model parameters

get_gradient(dΛ_dx, w[, key])

Aggregates gradients together to update the network parameters

Inherited from _AggregatedIMNN

__init__(n_s, n_d, n_params, n_summaries, …)

Constructor method

fit(λ, ε[, rng, patience, min_iterations, …])

Fitting routine for the IMNN

get_summaries(w[, key, validate])

Gets all network outputs and derivatives wrt model parameters

get_gradient(dΛ_dx, w[, key])

Aggregates gradients together to update the network parameters

Inherited from GradientIMNN

__init__(n_s, n_d, n_params, n_summaries, …)

Constructor method

get_summaries(w[, key, validate])

Gets all network outputs and derivatives wrt model parameters

Inherited from _IMNN

__init__(n_s, n_d, n_params, n_summaries, …)

Constructor method

fit(λ, ε[, rng, patience, min_iterations, …])

Fitting routine for the IMNN

get_α(λ, ε)

Calculate rate parameter for regularisation from closeness criterion

set_F_statistics([w, key, validate])

Set necessary attributes for calculating score compressed summaries

get_summaries(w[, key, validate])

Gets all network outputs and derivatives wrt model parameters

get_estimate(d)

Calculate score compressed parameter estimates from network outputs

plot([ax, expected_detF, colour, figsize, …])

Plot fitting history

Private Methods:

_set_data(fiducial, derivative, …)

Overwritten function to prevent setting fiducial attributes

_set_dataset([prefetch, cache])

Overwritten function to prevent building dataset, does list check

_set_prebuilt_dataset(main, remaining, …)

Set preconstructed dataset iterators

Inherited from AggregatedGradientIMNN

_set_shapes()

Calculates the shapes for batching over different devices

_set_dataset([prefetch, cache])

Overwritten function to prevent building dataset, does list check

_collect_input(key[, validate])

Returns validation or fitting sets

_split_dΛ_dx(dΛ_dx)

Returns the gradient of loss function wrt summaries (derivatives)

Inherited from _AggregatedIMNN

_set_devices(devices, n_per_device)

Checks that devices exist and that reshaping onto devices can occur

_set_batch_functions()

Creates jitted functions placed on desired XLA devices

_set_shapes()

Calculates the shapes for batching over different devices

_setup_progress_bar(print_rate, max_iterations)

Construct progress bar

_update_progress_bar(pbar, counter, …[, close])

Updates (and closes) progress bar

_collect_input(key[, validate])

Returns validation or fitting sets

_get_batch_summaries(inputs, w, θ[, …])

Vectorised batch calculation of summaries or gradients

_split_dΛ_dx(dΛ_dx)

Returns the gradient of loss function wrt summaries (derivatives)

_construct_gradient(layers[, aux, func])

Multiuse function to iterate over tuple of network parameters

Inherited from GradientIMNN

_set_data(fiducial, derivative, …)

Overwritten function to prevent setting fiducial attributes

Inherited from _IMNN

_initialise_parameters(n_s, n_d, n_params, …)

Performs type checking and initialisation of class attributes

_initialise_model(model, optimiser, key_or_state)

Initialises neural network parameters or loads optimiser state

_initialise_history()

Initialises history dictionary attribute

_set_history(results)

Places results from fitting into the history dictionary

_set_inputs(rng, max_iterations)

Builds list of inputs for the XLA compilable fitting routine

_get_fitting_keys(rng)

Generates random numbers for simulation generation if needed

_fit(inputs, λ=None, α=None[, min_iterations])

Single iteration fitting algorithm

_fit_cond(inputs, patience, max_iterations)

Stopping condition for the fitting loop

_update_loop_vars(inputs)

Updates input parameters if max_detF is increased

_check_loop_vars(inputs, min_iterations)

Updates patience_counter if max_detF not increased

_update_history(inputs, history, counter, ind)

Puts current fitting statistics into history arrays

_slogdet(matrix)

Combined summed logarithmic determinant

_construct_derivatives(derivatives)

Builds derivatives of the network outputs wrt model parameters

_get_F_statistics([w, key, validate])

Calculates the Fisher information and returns all statistics used

_calculate_F_statistics(summaries, derivatives)

Calculates the Fisher information matrix from network outputs

_get_regularisation_strength(Λ2, λ, α)

Coupling strength of the regularisation (amplified sigmoid)

_get_regularisation(C, invC)

Difference of the covariance (and its inverse) from identity

_get_loss(w, λ, α[, key])

Calculates the loss function and returns auxiliary variables

_calculate_loss(summaries, derivatives, λ, α)

Calculates the loss function from network summaries and derivatives

_setup_plot([ax, expected_detF, figsize])

Builds axes for history plot


_set_data(fiducial, derivative, validation_fiducial, validation_derivative)

Overwritten function to prevent setting fiducial attributes

Parameters
  • fiducial (None) –

  • derivative (None) –

  • validation_fiducial (None) –

  • validation_derivative (None) –

_set_dataset(prefetch=None, cache=None)

Overwritten function to prevent building dataset, does list check

Parameters
  • prefetch (None) –

  • cache (None) –

_set_prebuilt_dataset(main, remaining, validation_main, validation_remaining)

Set preconstructed dataset iterators

Parameters
  • main (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs and their derivatives with respect to the physical model parameters (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

  • remaining (list of tf.data.Dataset().as_numpy_iterators()) – The n_s - n_d simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs without a derivative counterpart (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

  • validation_main (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs and their derivatives with respect to the physical model parameters (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

  • validation_remaining (list of tf.data.Dataset().as_numpy_iterators()) – The n_s - n_d simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs without a derivative counterpart (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

Raises
  • ValueError – if main or remaining are None

  • ValueError – if length of any input list is not equal to number of devices

  • TypeError – if any input is not a list

DatasetNumericalGradientIMNN

class imnn.DatasetNumericalGradientIMNN(n_s, n_d, n_params, n_summaries, input_shape, θ_fid, model, optimiser, key_or_state, fiducial, derivative, δθ, host, devices, n_per_device, validation_fiducial=None, validation_derivative=None)

Information maximising neural network fit using numerical derivatives

The outline of the fitting procedure is that a set of \(i\in[1, n_s]\) simulations \({\bf d}^i\), originally generated at fiducial model parameters \({\bf\theta}^{\rm fid}\), is used together with a set of \(i\in[1, n_d]\) simulations, \(\{{\bf d}_{\alpha^-}^i, {\bf d}_{\alpha^+}^i\}\), generated with the same seed at each \(i\) at \({\bf\theta}^{\rm fid}\) apart from at parameter label \(\alpha\), which takes the values

\[\theta_{\alpha^-} = \theta_\alpha^{\rm fid}-\delta\theta_\alpha\]

and

\[\theta_{\alpha^+} = \theta_\alpha^{\rm fid}+\delta\theta_\alpha\]

where \(\delta\theta_\alpha\) is an \(n_{params}\)-length vector whose \(\alpha\) element has a value which perturbs the parameter \(\theta^{\rm fid}_\alpha\). This means there are \(2\times n_{params}\times n_d\) simulations used to calculate the numerical derivatives (this is extremely cheap compared to other machine learning methods). All these simulations are passed through a network \(f_{{\bf w}}({\bf d})\) with network parameters \({\bf w}\) to obtain network outputs \({\bf x}^i\) and \(\{{\bf x}_{\alpha^-}^i,{\bf x}_{\alpha^+}^i\}\). These perturbed values are combined to obtain

\[\frac{\partial{{\bf x}^i}}{\partial\theta_\alpha} = \frac{{\bf x}_{\alpha^+}^i - {\bf x}_{\alpha^-}^i} {\delta\theta_\alpha}\]
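
In code, this combination of the perturbed summaries is simply a finite difference (a sketch with assumed names and shapes; the library does this in _construct_derivatives):

import jax.numpy as np

def numerical_derivative(x_minus, x_plus, δθ):
    # x_minus, x_plus: summaries of the simulations made below and above the
    # fiducial, shape (n_d, n_params, n_summaries); δθ: shape (n_params,)
    return (x_plus - x_minus) / δθ[np.newaxis, :, np.newaxis]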

With \({\bf x}^i\) and \(\partial{{\bf x}^i}/\partial\theta_\alpha\) the covariance

\[C_{ab} = \frac{1}{n_s-1}\sum_{i=1}^{n_s}(x^i_a-\mu_a) (x^i_b-\mu_b)\]

and the derivative of the mean of the network outputs with respect to the model parameters

\[\frac{\partial\mu_a}{\partial\theta_\alpha} = \frac{1}{n_d} \sum_{i=1}^{n_d}\frac{\partial{x^i_a}}{\partial\theta_\alpha}\]

can be calculated and used to form the Fisher information matrix

\[F_{\alpha\beta} = \frac{\partial\mu_a}{\partial\theta_\alpha} C^{-1}_{ab}\frac{\partial\mu_b}{\partial\theta_\beta}.\]

The loss function is then defined as

\[\Lambda = -\log|{\bf F}| + r(\Lambda_2) \Lambda_2\]

Since any linear rescaling of a sufficient statistic is also a sufficient statistic, the negative logarithm of the determinant of the Fisher information matrix needs to be regularised to fix the scale of the network outputs. We choose to fix this scale by constraining the covariance of network outputs as

\[\Lambda_2 = ||{\bf C}-{\bf I}|| + ||{\bf C}^{-1}-{\bf I}||\]

This constraint is chosen because it forces the covariance to be approximately parameter independent, which justifies choosing the covariance-independent Gaussian Fisher information as above. To avoid having a dual optimisation objective, we use a smooth and dynamic regularisation strength which turns off the regularisation, focusing on maximising the Fisher information once the covariance has set the scale

\[r(\Lambda_2) = \frac{\lambda\Lambda_2}{\Lambda_2-\exp (-\alpha\Lambda_2)}.\]

To enable the use of large data (or networks) the whole procedure is aggregated. This means that the passing of the simulations through the network is farmed out to the desired XLA devices and recollected, n_per_device inputs at a time. These are then used to calculate the automatic gradient of the loss function with respect to the calculated summaries and derivatives, \(\partial\Lambda/\partial{\bf x}^i\) (which is a fairly small computation as long as n_summaries and n_s (and n_d) are not huge). Once this is calculated, the simulations are passed through the network again, this time calculating the Jacobian of the network outputs with respect to the network parameters, \(\partial{\bf x}^i/\partial{\bf w}\), which is then combined via the chain rule to get

\[\frac{\partial\Lambda}{\partial{\bf w}} = \frac{\partial\Lambda}{\partial{\bf x}^i} \frac{\partial{\bf x}^i}{\partial{\bf w}}\]

This can then be passed to the optimiser.

In DatasetNumericalGradientIMNN the input datasets should be lists of n_devices tf.data.Datasets. Please note that, due to the many different ways of constructing datasets to load data, there is no checking, and any improperly made dataset will either fail (best case) or provide the wrong result (worst case!). For this reason it is advised to use AggregatedNumericalGradientIMNN() if the data will fit into CPU memory. If not, the next safest way is to construct a set of TFRecords and build the datasets from those.

Examples

Here are various ways to construct the datasets for passing to DatasetNumericalGradientIMNN. Note that these are not the only ways, but they should give something to follow to generate your own datasets. First we’ll generate some data (just random noise with zero mean and unit variance) and perturb the mean and the variance of this noise to calculate numerical derivatives with respect to the model parameters. We’ll generate 1000 simulations at the fiducial parameter values and 100 for each parameter varied above and below the fiducial. We’ll save each of these simulations into its own individual file (named by seed value).

import glob
import jax
import jax.numpy as np
import tensorflow as tf
from imnn import TFRecords
from functools import partial

n_s = 1000
n_d = 100
n_params = 2
input_shape = (10,)

def simulator(key, θ):
    return θ[0] + (jax.random.normal(key, shape=input_shape)
        * np.sqrt(θ[1]))

θ_fid = np.array([0., 1.])
δθ = np.array([0.1, 0.1])
θ_der = (θ_fid + np.einsum(
    "i,jk->ijk",
    np.array([-1., 1.]),
    np.diag(δθ) / 2.)).reshape((-1, 2))
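
# For reference (illustration only), θ_der now has shape (4, 2); its rows are
# the parameter points perturbed by ∓δθ/2 in each parameter:
#   [[-0.05, 1.  ],    mean perturbed down
#    [ 0.  , 0.95],    variance perturbed down
#    [ 0.05, 1.  ],    mean perturbed up
#    [ 0.  , 1.05]]    variance perturbed up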

rng = jax.random.PRNGKey(0)
rng, data_key = jax.random.split(rng)
data_keys = np.array(jax.random.split(rng, num=2 * n_s))

fiducial = jax.vmap(simulator)(
    data_keys[:n_s],
    np.repeat(np.expand_dims(θ_fid, 0), n_s, axis=0))

validation_fiducial = jax.vmap(simulator)(
    data_keys[n_s:],
    np.repeat(np.expand_dims(θ_fid, 0), n_s, axis=0))

numerical_derivative = jax.vmap(simulator)(
    np.repeat(data_keys[:n_d], θ_der.shape[0], axis=0),
    np.tile(θ_der, (n_d, 1))).reshape(
        (n_d, 2, n_params) + input_shape)

validation_numerical_derivative = jax.vmap(simulator)(
    np.repeat(data_keys[n_s:n_d + n_s], θ_der.shape[0], axis=0),
    np.tile(θ_der, (n_d, 1))).reshape(
        (n_d, 2, n_params) + input_shape)

for i, (simulation, validation_simulation) in enumerate(
        zip(fiducial, validation_fiducial)):
    np.save(f"tmp/fiducial_{i:04d}.npy", simulation)
    np.save(f"tmp/validation_fiducial_{i:04d}.npy",
            validation_simulation)

for i, (simulation, validation_simulation) in enumerate(
        zip(numerical_derivative, validation_numerical_derivative)):
    np.save(f"tmp/numerical_derivative_{i:04d}.npy", simulation)
    np.save(f"tmp/validation_numerical_derivative_{i:04d}.npy",
            validation_simulation)

Now we’ll define how many devices to farm our calculations out to. We need to know this because we want to make a separate dataset for each device. We’ll also set the number of simulations which can be processed at once on each device; for the quickest fitting this should be as high as possible without running out of memory on any individual device.

devices = jax.devices("gpu")
n_devices = len(devices)
n_per_device = 100

Using this we can define the shapes the data should be reshaped into for proper construction of the datasets used when fitting the IMNN.

fiducial_shape = (
    n_devices,
    n_s // (n_devices * n_per_device),
    n_per_device) + input_shape

derivative_shape = (
    n_devices,
    2 * n_params * n_d // (n_devices * n_per_device),
    n_per_device) + input_shape

The simplest way to construct a dataset is directly from the numpy arrays in memory (note that if you’re going to do this you should really just use AggregatedNumericalGradientIMNN, it’s more or less the same!), i.e.

fiducial = [
    tf.data.Dataset.from_tensor_slices(
        fid).repeat().as_numpy_iterator()
    for fid in fiducial.reshape(fiducial_shape)]

numerical_derivative = [
    tf.data.Dataset.from_tensor_slices(
        der).repeat().as_numpy_iterator()
    for der in numerical_derivative.reshape(derivative_shape)]

validation_fiducial = [
    tf.data.Dataset.from_tensor_slices(
        fid).repeat().as_numpy_iterator()
    for fid in validation_fiducial.reshape(fiducial_shape)]

validation_numerical_derivative = [
    tf.data.Dataset.from_tensor_slices(
        der).repeat().as_numpy_iterator()
    for der in validation_numerical_derivative.reshape(
        derivative_shape)]

However, if the data is too large to fit in memory then we can use the npy files that we saved by loading them via a generator

def generator(directory, filename, total):
    i = 0
    while i < total:
        yield np.load(f"{directory}/{filename}_{i:04d}.npy")
        i += 1

We can then build the datasets like:

fiducial = [
    tf.data.Dataset.from_generator(
        partial(
            generator,
            "tmp",
            "fiducial",
            n_s),
        tf.float32
        ).take(n_s // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for _ in range(n_devices)]

numerical_derivative = [
    tf.data.Dataset.from_generator(
        partial(
            generator,
            "tmp",
            "numerical_derivative",
            n_d),
        tf.float32
        ).flat_map(
            lambda x: tf.data.Dataset.from_tensor_slices(x)
        ).flat_map(
            lambda x: tf.data.Dataset.from_tensor_slices(x)
        ).take(2 * n_params * n_d // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for _ in range(n_devices)]

validation_fiducial = [
    tf.data.Dataset.from_generator(
        partial(
            generator,
            "tmp",
            "validation_fiducial",
            n_s),
        tf.float32
        ).take(n_s // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for _ in range(n_devices)]

validation_numerical_derivative = [
    tf.data.Dataset.from_generator(
        partial(
            generator,
            "tmp",
            "validation_numerical_derivative",
            n_d),
        tf.float32
        ).flat_map(
            lambda x: tf.data.Dataset.from_tensor_slices(x)
        ).flat_map(
            lambda x: tf.data.Dataset.from_tensor_slices(x)
        ).take(2 * n_params * n_d // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for _ in range(n_devices)]

The datasets must be built exactly like this, with the taking and batching and repeating. Importantly, both flat_map calls over the numerical-derivative datasets are needed, to unwrap the perturbation direction and the parameter direction stored in each numpy file. To prefetch and cache the loaded files we can add the extra steps to the datasets, e.g.

fiducial = [
    tf.data.Dataset.from_generator(
        partial(
            generator,
            "tmp",
            "fiducial",
            n_s),
        tf.float32
        ).take(n_s // n_devices
        ).batch(n_per_device
        ).cache(
        ).prefetch(tf.data.AUTOTUNE
        ).repeat(
        ).as_numpy_iterator()
    for _ in range(n_devices)]

This loading will be quite slow because the files need to be opened each time, but we can build TFRecords which are quicker to load. There is a writer, TFRecords, able to write the correct format. The TFRecords should be a couple of hundred Mb each for best flow-through, so we keep filling a record until this size is reached.

record_size = 200 #Mb
writer = TFRecords(record_size=record_size)

We need a function which grabs single simulations from an array (or file) to add to the record

def get_fiducial(seed, directory=None, filename=None):
    return np.load(f"{directory}/{filename}_{seed:04d}.npy")

def get_derivative(seed, der, params, directory=None, filename=None):
    return np.load(
        f"{directory}/{filename}_{seed:04d}.npy")[der, params]

writer.write_record(
    n_sims=n_s,
    get_simulation=lambda seed: get_fiducial(
        seed, directory="tmp", filename="fiducial"),
    fiducial=True,
    directory="tmp",
    filename="fiducial")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed, der, param: get_derivative(
        seed, der, param, directory="tmp",
        filename="numerical_derivative"),
    fiducial=False,
    n_params=n_params,
    directory="tmp",
    filename="numerical_derivative")

writer.write_record(
    n_sims=n_s,
    get_simulation=lambda seed: get_fiducial(
        seed, directory="tmp",
        filename="validation_fiducial"),
    fiducial=True,
    directory="tmp",
    filename="validation_fiducial")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed, der, param: get_derivative(
        seed, der, param, directory="tmp",
        filename="validation_numerical_derivative"),
    fiducial=False,
    n_params=n_params,
    directory="tmp",
    filename="validation_numerical_derivative")

We can then read these into a dataset, using the parser from the TFRecords class to map the serialised data to 32-bit floats

fiducial = [
    tf.data.TFRecordDataset(
            sorted(glob.glob("tmp/fiducial_*.tfrecords")),
            num_parallel_reads=1
        ).map(writer.parser
        ).skip(i * n_s // n_devices
        ).take(n_s // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for i in range(n_devices)]

numerical_derivative = [
    tf.data.TFRecordDataset(
            sorted(glob.glob("tmp/numerical_derivative_*.tfrecords")),
            num_parallel_reads=1
        ).map(writer.parser
        ).skip(i * 2 * n_params * n_d // n_devices
        ).take(2 * n_params * n_d // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for i in range(n_devices)]

validation_fiducial = [
    tf.data.TFRecordDataset(
            sorted(glob.glob("tmp/validation_fiducial_*.tfrecords")),
            num_parallel_reads=1
        ).map(writer.parser
        ).skip(i * n_s // n_devices
        ).take(n_s // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for i in range(n_devices)]

validation_numerical_derivative = [
    tf.data.TFRecordDataset(
            sorted(glob.glob(
                "tmp/validation_numerical_derivative_*.tfrecords")),
            num_parallel_reads=1
        ).map(writer.parser
        ).skip(i * 2 * n_params * n_d // n_devices
        ).take(2 * n_params * n_d // n_devices
        ).batch(n_per_device
        ).repeat(
        ).as_numpy_iterator()
    for i in range(n_devices)]
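
With the fiducial, numerical_derivative, validation_fiducial and validation_numerical_derivative lists built in any of the ways above, the IMNN itself can then be constructed. Again this is only a sketch: n_summaries, model, optimiser, model_key and host are placeholders for the network, optimiser, initialisation key and host device described in the parameter list below, and the fit values are purely illustrative.

import imnn

IMNN = imnn.DatasetNumericalGradientIMNN(
    n_s=n_s, n_d=n_d, n_params=n_params, n_summaries=n_summaries,
    input_shape=input_shape, θ_fid=θ_fid, model=model,
    optimiser=optimiser, key_or_state=model_key, δθ=δθ,
    host=host, devices=devices, n_per_device=n_per_device,
    fiducial=fiducial, derivative=numerical_derivative,
    validation_fiducial=validation_fiducial,
    validation_derivative=validation_numerical_derivative)

# fitting then proceeds as usual, e.g.
# IMNN.fit(λ=10., ε=0.1, rng=rng)
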
Parameters
  • δθ (float(n_params,)) – Size of perturbation to model parameters for the numerical derivative

  • fiducial (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

  • derivative (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at parameter values perturbed from the fiducial used to calculate the numerical derivative of network outputs with respect to model parameters (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

  • validation_fiducial (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

  • validation_derivative (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at parameter values perturbed from the fiducial used to calculate the numerical derivative of network outputs with respect to model parameters (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

  • fiducial_iterations (int) – The number of iterations over the fiducial dataset

  • derivative_iterations (int) – The number of iterations over the derivative dataset

  • derivative_output_shape (tuple) – The shape of the output of the derivatives from the network

  • fiducial_batch_shape (tuple) – The shape of each batch of fiducial simulations (without input or summary shape)

  • derivative_batch_shape (tuple) – The shape of each batch of derivative simulations (without input or summary shape)

Public Methods:

__init__(n_s, n_d, n_params, n_summaries, …)

Constructor method

Inherited from AggregatedNumericalGradientIMNN

__init__(n_s, n_d, n_params, n_summaries, …)

Constructor method

get_summary(inputs, w, θ[, derivative, gradient])

Returns a single summary of a simulation or its gradient

get_summaries(w[, key, validate])

Gets all network outputs and derivatives wrt model parameters

get_gradient(dΛ_dx, w[, key])

Aggregates gradients together to update the network parameters

Inherited from _AggregatedIMNN

__init__(n_s, n_d, n_params, n_summaries, …)

Constructor method

fit(λ, ε[, rng, patience, min_iterations, …])

Fitting routine for the IMNN

get_summaries(w[, key, validate])

Gets all network outputs and derivatives wrt model parameters

get_gradient(dΛ_dx, w[, key])

Aggregates gradients together to update the network parameters

Inherited from NumericalGradientIMNN

__init__(n_s, n_d, n_params, n_summaries, …)

Constructor method

get_summaries(w[, key, validate])

Gets all network outputs and derivatives wrt model parameters

Inherited from _IMNN

__init__(n_s, n_d, n_params, n_summaries, …)

Constructor method

fit(λ, ε[, rng, patience, min_iterations, …])

Fitting routine for the IMNN

get_α(λ, ε)

Calculate rate parameter for regularisation from closeness criterion

set_F_statistics([w, key, validate])

Set necessary attributes for calculating score compressed summaries

get_summaries(w[, key, validate])

Gets all network outputs and derivatives wrt model parameters

get_estimate(d)

Calculate score compressed parameter estimates from network outputs

plot([ax, expected_detF, colour, figsize, …])

Plot fitting history

Private Methods:

_set_data(δθ, fiducial, derivative, …)

Checks and sets data attributes with the correct shape

_set_dataset([prefetch, cache])

Overwritten function to prevent building dataset, does list check

Inherited from AggregatedNumericalGradientIMNN

_set_shapes()

Calculates the shapes for batching over different devices

_set_dataset([prefetch, cache])

Overwritten function to prevent building dataset, does list check

_set_batch_functions()

Creates jitted functions placed on desired XLA devices

_collect_input(key[, validate])

Returns validation or fitting sets

_split_dΛ_dx(dΛ_dx)

Returns the gradient of loss function wrt summaries (derivatives)

Inherited from _AggregatedIMNN

_set_devices(devices, n_per_device)

Checks that devices exist and that reshaping onto devices can occur

_set_batch_functions()

Creates jitted functions placed on desired XLA devices

_set_shapes()

Calculates the shapes for batching over different devices

_setup_progress_bar(print_rate, max_iterations)

Construct progress bar

_update_progress_bar(pbar, counter, …[, close])

Updates (and closes) progress bar

_collect_input(key[, validate])

Returns validation or fitting sets

_get_batch_summaries(inputs, w, θ[, …])

Vectorised batch calculation of summaries or gradients

_split_dΛ_dx(dΛ_dx)

Returns the gradient of loss function wrt summaries (derivatives)

_construct_gradient(layers[, aux, func])

Multiuse function to iterate over tuple of network parameters

Inherited from NumericalGradientIMNN

_set_data(δθ, fiducial, derivative, …)

Checks and sets data attributes with the correct shape

_collect_input(key[, validate])

Returns validation or fitting sets

_construct_derivatives(x_mp)

Builds derivatives of the network outputs wrt model parameters

Inherited from _IMNN

_initialise_parameters(n_s, n_d, n_params, …)

Performs type checking and initialisation of class attributes

_initialise_model(model, optimiser, key_or_state)

Initialises neural network parameters or loads optimiser state

_initialise_history()

Initialises history dictionary attribute

_set_history(results)

Places results from fitting into the history dictionary

_set_inputs(rng, max_iterations)

Builds list of inputs for the XLA compilable fitting routine

_get_fitting_keys(rng)

Generates random numbers for simulation generation if needed

_fit(inputs, λ=None, α=None[, min_iterations])

Single iteration fitting algorithm

_fit_cond(inputs, patience, max_iterations)

Stopping condition for the fitting loop

_update_loop_vars(inputs)

Updates input parameters if max_detF is increased

_check_loop_vars(inputs, min_iterations)

Updates patience_counter if max_detF not increased

_update_history(inputs, history, counter, ind)

Puts current fitting statistics into history arrays

_slogdet(matrix)

Combined summed logarithmic determinant

_construct_derivatives(x_mp)

Builds derivatives of the network outputs wrt model parameters

_get_F_statistics([w, key, validate])

Calculates the Fisher information and returns all statistics used

_calculate_F_statistics(summaries, derivatives)

Calculates the Fisher information matrix from network outputs

_get_regularisation_strength(Λ2, λ, α)

Coupling strength of the regularisation (amplified sigmoid)

_get_regularisation(C, invC)

Difference of the covariance (and its inverse) from identity

_get_loss(w, λ, α[, key])

Calculates the loss function and returns auxiliary variables

_calculate_loss(summaries, derivatives, λ, α)

Calculates the loss function from network summaries and derivatives

_setup_plot([ax, expected_detF, figsize])

Builds axes for history plot


_set_data(δθ, fiducial, derivative, validation_fiducial, validation_derivative)

Checks and sets data attributes with the correct shape

Parameters
  • δθ (float(n_params,)) – Size of perturbation to model parameters for the numerical derivative

  • fiducial (list of tf.data.Datasets) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs (for fitting)

  • derivative (list of tf.data.Datasets) – The derivative of the simulations with respect to the model parameters (for fitting)

  • validation_fiducial (list of tf.data.Datasets or None, default=None) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs (for validation). Sets validate = True attribute if provided

  • validation_derivative (list of tf.data.Datasets or None, default=None) – The derivative of the simulations with respect to the model parameters (for validation). Sets validate = True attribute if provided

Raises
  • ValueError – if δθ is None

  • ValueError – if δθ has wrong shape

  • TypeError – if δθ has wrong type

Notes

No checking is done on the correctness of the tf.data.Dataset

_set_dataset(prefetch=None, cache=None)

Overwritten function to prevent building dataset, does list check

Raises
  • ValueError – if fiducial or derivative are None

  • ValueError – if any dataset has wrong shape

  • TypeError – if any dataset has wrong type

TFRecords (writer)

class imnn.TFRecords(record_size=150.0, padding=5)

Module for writing simulations to TFRecord to be used by the IMNN

Parameters
  • record_size (int) – approximate maximum size of an individual record (in Mb)

  • padding (int) – zero padding size for record name numbering

  • input_shape (tuple) – shape of a single simulation

  • file (str) – filename to save the records to

Public Methods:

__init__([record_size, padding])

Constructor method

write_record(n_sims, get_simulation[, …])

Write all simulations to set of tfrecords

fiducial_serialiser(seed, counter, …)

Serialises a fiducial simulation

derivative_serialiser(simulation, counter, …)

Serialises a numerical derivative simulation

get_serialiser(fiducial, get_simulation, …)

Returns the fiducial or derivative serialiser

parser(example)

Maps a serialised example simulation into its float32 representation

derivative_parser(example[, n_params])

Maps a serialised example derivative into its float32 representation

numerical_derivative_parser(example[, n_params])

Maps a serialised example derivative into its float32 representation

get_file(directory, filename, fiducial, …)

Constructs filepath and name that records will be saved to

check_size(counter)

Checks the size of the current record in Mb to see whether it’s full

get_initial_seed(fiducial, start)

Sets the initial seed index (or set of seeds) to get simulations at

get_seed(simulation, fiducial)

Gets the seed index depending on input

check_func(get_simulation, fiducial)

Checks the simulation grabber takes the correct number of arguments

check_params(n_params, fiducial)

Checks that n_params is actually given

Private Methods:

_bytes_feature(value)

Makes a serialised byte list from value (converts tensors to numpy)

_int64_feature(value)

Makes a serialised int list from value


_bytes_feature(value)

Makes a serialised byte list from value (converts tensors to numpy)

Parameters

value (float (possibly Tensor)) – the simulation to be serialised

Returns

the simulation as a string of bytes

Return type

byte_list

_int64_feature(value)

Makes a serialised int list from value

Parameters

value (int) – the seed to be serialised

Returns

the serialised int list

Return type

int_list

check_func(get_simulation, fiducial)

Checks the simulation grabber takes the correct number of arguments

Parameters
  • get_simulation (fn) – a function to get a single simulation

  • fiducial – if a fiducial simulation record is being constructed

check_params(n_params, fiducial)

Checks that n_params is actually given

Parameters
  • n_params (int) – the number of parameters in the derivative

  • fiducial – if a fiducial simulation record is being constructed

check_size(counter)

Checks the size of the current record in Mb to see whether it’s full

Parameters

counter (int) – The current open record being written to

Returns

True if the record has reached (exceeded) the preassigned record size

Return type

bool

derivative_parser(example, n_params=None)

Maps a serialised example derivative into its float32 representation

This is the parser for an analytical or automatic derivative of the simulation with respect to model parameters (should be serialised with fiducial_serialiser).

Parameters
  • example (str) – The serialised string to be parsed

  • n_params (int or None, default=None) – The number of parameters in the derivative. This is required but named here for use with functools.partial

Returns

The parsed numerical form of the serialised input

Return type

float(input_shape, n_params)

derivative_serialiser(simulation, counter, get_simulation, n_params)

Serialises a numerical derivative simulation

Takes a tuple containing a seed index, an index describing whether the corresponding simulation is produced below or above the fiducial parameter values, and an index describing which parameter the simulation is a derivative with respect to. These are used to sequentially get simulations from a function until either all of the simulations for the derivative of a single seed are collected or until the record is full, at which point it breaks out of the loop to create the next record and continue there.

Parameters
  • simulation (tuple) –

    • (int) – index to grab a simulation at

    • (int) – index describing whether generated above or below fid

    • (int) – index for the parameter of interest

  • counter (int) – the number record which is being written to

  • get_simulation (fn) – a function which takes an index and returns a simulation corresponding to that index

  • n_params (int) – the number of parameters in the numerical derivative

Returns

  • (int) – index up to where the simulation was grabbed

  • (int) – index describing whether generated above or below fid

  • (int) – index for the parameter of interest

Return type

tuple

fiducial_serialiser(seed, counter, get_simulation)

Serialises a fiducial simulation

Takes a seed index and a function to get a simulation at that given index (and a counter for printing the number of files made). This seed and simulation are then serialised and written to file.

Parameters
  • seed (int) – index to grab a simulation at

  • counter (int) – the number record which is being written to

  • get_simulation (fn) – a function which takes an index and returns a simulation corresponding to that index

Returns

the input seed value increased by 1

Return type

int

get_file(directory, filename, fiducial, validation)

Constructs filepath and name that records will be saved to

Parameters
  • directory (str) – The full path to where the files should be saved

  • filename (str or None) – a filename to save the records to, if None default names are given depending on the value of fiducial and validation

  • fiducial (bool) – whether to call a file fiducial (if filename is None) or derivative

  • validation (bool) – whether to prepend validation_ to the filename (if filename is None)

Returns

the filename to save the record to

Return type

str

get_initial_seed(fiducial, start)

Sets the initial seed index (or set of seeds) to get simulations at

When constructing a record with a fiducial (or exact derivative) only a seed index for the simulation is needed, whilst a derivative index and a parameter index are also needed if a numerical derivative simulation is being collected. Seed indices will increase incrementally by 1.

Parameters
  • fiducial (bool) – if a fiducial simulation record is being constructed

  • start (int) – the initial seed index to collect the simulation at

Returns

  • (int) – the initial seed index to collect the simulations at

  • (tuple)
    • (int) – the initial seed index to collect the simulation

    • (int) – whether the simulation is generated below or above the fiducial parameter values

    • (int) – with respect to which parameter the simulation is used to calculate the numerical gradient

Return type

int or tuple

get_seed(simulation, fiducial)

Gets the seed index depending on input

With a fiducial (or exact derivative) only a seed index for the simulation is needed whilst a derivative index and a parameter index are also needed if a numerical derivative simulation is being collected

Parameters
  • simulation (tuple) –

    • (int) – the initial seed index to collect the simulation

    • (int) – whether the simulation is generated below or above the fiducial parameter values

    • (int) – with respect to which parameter the simulation is used to calculate the numerical gradient

  • fiducial (bool) – if a fiducial simulation record is being constructed

Returns

the seed index to collect the simulation at

Return type

int

get_serialiser(fiducial, get_simulation, n_params)

Returns the fiducial or derivative serialiser

Parameters
  • fiducial (bool) – whether the fiducial serialiser should be returned or not

  • get_simulation (fn) – the function which returns either a simulation or a part of a derivative simulation

  • n_params (int) – the number of parameters that the derivative is taken wrt

Returns

either the fiducial serialiser or the derivative serialiser

Return type

fn

numerical_derivative_parser(example, n_params=None)

Maps a serialised example derivative into its float32 representation

This is the parser for all the simulations necessary for making a numerical derivative of a simulation with respect to model parameters (should be serialised with derivative_serialiser).

Parameters
  • example (str) – The serialised string to be parsed

  • n_params (int or None, default=None) – The number of parameters in the derivative. This is required but named here for use with functools.partial

Returns

The parsed numerical form of the serialised input

Return type

float(2, n_params, input_shape)

parser(example)

Maps a serialised example simulation into its float32 representation

Parameters

example (str) – The serialised string to be parsed

Returns

The parsed numerical form of the serialised input

Return type

float(input_shape)

write_record(n_sims, get_simulation, fiducial=True, n_params=None, validation=False, directory=None, filename=None, start=0)

Write all simulations to set of tfrecords

Parameters
  • n_sims (int) – number of simulations to be written to record

  • get_simulation (func) – function (1 or 3 inputs for fiducial or derivative) which returns a single simulation as a numpy array

  • fiducial (bool (default True)) – whether the simulations are in the fiducial or derivative format

  • n_params (int (opt)) – number of parameters in the simulator model (for the derivative)

  • validation (bool (default False)) – tag to automatically prepend validation to the filename

  • directory (str (opt)) – directory to save records. defaults to current directory

  • filename (str (opt)) – filename to save records. defaults to fiducial and derivative depending on the value of fiducial

  • start (int (opt)) – value to start seed at