Using tf.data.Datasets for loading¶
Using datasets for loading is a little complicated because the data must be fed to the different devices in a precise way. There are, however, several helpers to try and make this a bit easier.
The available modules are:
DatasetGradientIMNN¶
class imnn.DatasetGradientIMNN(n_s, n_d, n_params, n_summaries, input_shape, θ_fid, model, optimiser, key_or_state, main, remaining, host, devices, n_per_device, validation_main=None, validation_remaining=None)¶

Information maximising neural network fit using known derivatives
The outline of the fitting procedure is that a set of \(i\in[1, n_s]\) simulations \({\bf d}^i\) originally generated at fiducial model parameter \({\bf\theta}^{\rm fid}\), and their derivatives \(\partial{\bf d}^i/\partial\theta_\alpha\) with respect to model parameters are used. The fiducial simulations, \({\bf d}^i\), are passed through a network to obtain summaries, \({\bf x}^i\), and the jax automatic derivative of these summaries with respect to the inputs is calculated, \(\partial{\bf x}^i/\partial{\bf d}^j\delta_{ij}\). The chain rule is then used to calculate
\[\frac{\partial{\bf x}^i}{\partial\theta_\alpha} = \frac{\partial{\bf x}^i}{\partial{\bf d}^j} \frac{\partial{\bf d}^j}{\partial\theta_\alpha}\]With \({\bf x}^i\) and \(\partial{{\bf x}^i}/\partial\theta_\alpha\) the covariance
\[C_{ab} = \frac{1}{n_s-1}\sum_{i=1}^{n_s}(x^i_a-\mu_a) (x^i_b-\mu_b)\]and the derivative of the mean of the network outputs with respect to the model parameters
\[\frac{\partial\mu_a}{\partial\theta_\alpha} = \frac{1}{n_d} \sum_{i=1}^{n_d}\frac{\partial{x^i_a}}{\partial\theta_\alpha}\]can be calculated and used to form the Fisher information matrix
\[F_{\alpha\beta} = \frac{\partial\mu_a}{\partial\theta_\alpha} C^{-1}_{ab}\frac{\partial\mu_b}{\partial\theta_\beta}.\]The loss function is then defined as
\[\Lambda = -\log|{\bf F}| + r(\Lambda_2) \Lambda_2\]Since any linear rescaling of a sufficient statistic is also a sufficient statistic, the negative logarithm of the determinant of the Fisher information matrix needs to be regularised to fix the scale of the network outputs. We choose to fix this scale by constraining the covariance of network outputs as
\[\Lambda_2 = ||{\bf C}-{\bf I}|| + ||{\bf C}^{-1}-{\bf I}||\]The reason for choosing this constraint is that it forces the covariance to be approximately parameter independent, which justifies choosing the covariance independent Gaussian Fisher information as above. To avoid having a dual optimisation objective, we use a smooth and dynamic regularisation strength which turns off the regularisation to focus on maximising the Fisher information once the covariance has set the scale
\[r(\Lambda_2) = \frac{\lambda\Lambda_2}{\Lambda_2-\exp (-\alpha\Lambda_2)}.\]To enable the use of large data (or networks) the whole procedure is aggregated. This means that the passing of the simulations through the network is farmed out to the desired XLA devices, and recollected, n_per_device inputs at a time. These are then used to calculate the automatic gradient of the loss function with respect to the calculated summaries and derivatives, \(\partial\Lambda/\partial{\bf x}^i\) (which is a fairly small computation as long as n_summaries, n_s and n_d are not huge). Once this is calculated, the simulations are passed through the network again, this time calculating the Jacobian of the network outputs with respect to the network parameters, \(\partial{\bf x}^i/\partial{\bf w}\), which is then combined via the chain rule to get
\[\frac{\partial\Lambda}{\partial{\bf w}} = \frac{\partial\Lambda}{\partial{\bf x}^i} \frac{\partial{\bf x}^i}{\partial{\bf w}}\]This can then be passed to the optimiser.
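As a rough illustration of how these statistics fit together, here is a minimal sketch of the loss calculation, assuming summaries of shape (n_s, n_summaries) and their parameter derivatives of shape (n_d, n_summaries, n_params). This is illustrative only and not the library's exact implementation (e.g. the choice of matrix norm may differ):

import jax.numpy as np

def loss(summaries, derivatives, λ, α):
    # covariance of the network outputs, C_ab
    C = np.cov(summaries, rowvar=False)
    invC = np.linalg.inv(C)
    # derivative of the mean of the outputs, ∂μ_a/∂θ_α
    dμ_dθ = np.mean(derivatives, axis=0)
    # Gaussian Fisher information, F_αβ = ∂μ_a/∂θ_α C^{-1}_ab ∂μ_b/∂θ_β
    F = np.einsum("ai,ab,bj->ij", dμ_dθ, invC, dμ_dθ)
    # regularisation Λ2 = ||C - I|| + ||C^{-1} - I||
    I = np.eye(C.shape[0])
    Λ2 = np.linalg.norm(C - I) + np.linalg.norm(invC - I)
    # dynamic regularisation strength r(Λ2)
    r = λ * Λ2 / (Λ2 - np.exp(-α * Λ2))
    # Λ = -log|F| + r(Λ2) Λ2
    return -np.linalg.slogdet(F)[1] + r * Λ2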
In DatasetGradientIMNN the input datasets should be lists of n_devices tf.data.Datasets. Please note, due to the many ways of constructing datasets to load data, there is no checking, and any improperly made dataset will either fail (best result) or provide the wrong result (worst case scenario!). For this reason it is advised to use AggregatedGradientIMNN() if the data will fit into CPU memory. If not, the next safest way is to construct a set of TFRecords and construct the dataset from that.

Examples
Here are various ways to construct the datasets for passing to DatasetGradientIMNN. Note that these are not the only ways, but they should give something to follow to generate your own datasets. First we’ll generate some data (just random noise with zero mean and unit variance). We’ll generate 1000 simulations at the fiducial and we’ll use jax to calculate the derivatives with respect to the mean and variance for 100 of these. We’ll save each of these simulations into its own individual file (named by seed value).
import glob
import jax
import jax.numpy as np
import tensorflow as tf
from imnn import TFRecords
from functools import partial
from imnn.utils import value_and_jacfwd

n_s = 1000
n_d = 100
n_params = 2
input_shape = (10,)

def simulator(key, θ):
    return θ[0] + (jax.random.normal(key, shape=input_shape)
                   * np.sqrt(θ[1]))

θ_fid = np.array([0., 1.])

get_sims_and_ders = value_and_jacfwd(simulator, argnums=1)

rng = jax.random.PRNGKey(0)
rng, data_key = jax.random.split(rng)
data_keys = np.array(jax.random.split(rng, num=2 * n_s))

fiducial, derivative = jax.vmap(get_sims_and_ders)(
    data_keys[:n_d],
    np.repeat(np.expand_dims(θ_fid, 0), n_d, axis=0))

remaining = jax.vmap(simulator)(
    data_keys[n_d:n_s],
    np.repeat(np.expand_dims(θ_fid, 0), n_s - n_d, axis=0))

validation_fiducial, validation_derivative = jax.vmap(
    get_sims_and_ders)(
        data_keys[n_s:n_s + n_d],
        np.repeat(np.expand_dims(θ_fid, 0), n_d, axis=0))

validation_remaining = jax.vmap(simulator)(
    data_keys[n_s + n_d:],
    np.repeat(np.expand_dims(θ_fid, 0), n_s - n_d, axis=0))

for i, (simulation, validation_simulation) in enumerate(zip(
        fiducial, validation_fiducial)):
    np.save(f"tmp/fiducial_{i:04d}.npy", simulation)
    np.save(f"tmp/validation_fiducial_{i:04d}.npy",
            validation_simulation)

for i, (simulation, validation_simulation) in enumerate(zip(
        derivative, validation_derivative)):
    np.save(f"tmp/derivative_{i:04d}.npy", simulation)
    np.save(f"tmp/validation_derivative_{i:04d}.npy",
            validation_simulation)

# the remaining simulations are also saved to file since the generator
# and TFRecord examples below read them back from "tmp/remaining_*"
for i, (simulation, validation_simulation) in enumerate(zip(
        remaining, validation_remaining)):
    np.save(f"tmp/remaining_{i:04d}.npy", simulation)
    np.save(f"tmp/validation_remaining_{i:04d}.npy",
            validation_simulation)
Now we’ll define how many devices to farm our calculations out to. We need to know this because we want to make a separate dataset for each device. We’ll also set the number of simulations which can be processed at once on each device; this should be as high as possible without running out of memory on any individual device, for quickest fitting.
devices = jax.devices("gpu")
n_devices = len(devices)
n_per_device = 100
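Note that jax.devices("gpu") raises a RuntimeError if no GPU backend is available; a hedged fallback to the host CPU might look like:

try:
    devices = jax.devices("gpu")
except RuntimeError:
    # no GPU backend available, fall back to the host CPU
    devices = jax.devices("cpu")
n_devices = len(devices)
n_per_device = 100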
To best accelerate the aggregation of the gradient calculation the computation is split into two parts: a main loop which loops through the n_d simulations with their derivatives with respect to model parameters, and a remaining loop of n_s - n_d iterations, where just simulations are looped through to calculate any other necessary summaries to estimate the covariance. Note this is true even if n_s = n_d; the remaining loop then simply has zero iterations. So to construct the dataset, define the shapes for the data to be reshaped into for proper construction of the datasets to be used when fitting the IMNN.

batch_shape = (
    n_devices,
    n_d // (n_devices * n_per_device),
    n_per_device) + input_shape
remaining_batch_shape = (
    n_devices,
    (n_s - n_d) // (n_devices * n_per_device),
    n_per_device) + input_shape
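For instance, with the numbers used in this example and a hypothetical single-GPU setup (n_devices = 1), the shapes evaluate as:

# n_s = 1000, n_d = 100, n_per_device = 100, input_shape = (10,)
assert batch_shape == (1, 1, 100, 10)
assert remaining_batch_shape == (1, 9, 100, 10)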
The simplest way to construct a dataset is simply using the numpy arrays from memory (note, if you’re going to do this you should really just use AggregatedGradientIMNN, it’s more or less the same!), i.e.

main = [
    tf.data.Dataset.from_tensor_slices(
        (fiducial, derivative)).repeat().as_numpy_iterator()
    for fiducial, derivative in zip(
        fiducial.reshape(batch_shape),
        derivative.reshape(batch_shape + (n_params,)))]

remaining = [
    tf.data.Dataset.from_tensor_slices(fiducial
        ).repeat().as_numpy_iterator()
    for fiducial in remaining.reshape(
        remaining_batch_shape)]

validation_main = [
    tf.data.Dataset.from_tensor_slices(
        (fiducial, derivative)).repeat().as_numpy_iterator()
    for fiducial, derivative in zip(
        validation_fiducial.reshape(batch_shape),
        validation_derivative.reshape(batch_shape + (n_params,)))]

validation_remaining = [
    tf.data.Dataset.from_tensor_slices(fiducial
        ).repeat().as_numpy_iterator()
    for fiducial in validation_remaining.reshape(
        remaining_batch_shape)]
However, if the data is too large to fit in memory then we can use the npy files that we saved by loading them via a generator
def generator(directory, filename, total):
    i = 0
    while i < total:
        yield np.load(f"{directory}/{filename}_{i:04d}.npy")
        i += 1
We can then build the datasets like:
main = [
    tf.data.Dataset.zip((
        tf.data.Dataset.from_generator(
            partial(
                generator, "tmp", "fiducial", n_d),
            tf.float32),
        tf.data.Dataset.from_generator(
            partial(
                generator, "tmp", "derivative", n_d),
            tf.float32))
    ).take(n_d // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for _ in range(n_devices)]

remaining = [
    tf.data.Dataset.from_generator(
        partial(
            generator, "tmp", "remaining", n_s - n_d),
        tf.float32
    ).take((n_s - n_d) // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for _ in range(n_devices)]

validation_main = [
    tf.data.Dataset.zip((
        tf.data.Dataset.from_generator(
            partial(
                generator, "tmp", "validation_fiducial", n_d),
            tf.float32),
        tf.data.Dataset.from_generator(
            partial(
                generator, "tmp", "validation_derivative", n_d),
            tf.float32))
    ).take(n_d // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for _ in range(n_devices)]

validation_remaining = [
    tf.data.Dataset.from_generator(
        partial(
            generator, "tmp", "validation_remaining", n_s - n_d),
        tf.float32
    ).take((n_s - n_d) // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for _ in range(n_devices)]
The datasets must be built exactly like this, with the taking and batching and repeating. The zipping of the main datasets is equally important, so that both the fiducial simulation and its derivative are passed at once, which is needed to calculate the final gradient. To prefetch and cache the loaded files we can add the extra steps in the datasets, e.g.
main = [
    tf.data.Dataset.zip((
        tf.data.Dataset.from_generator(
            partial(
                generator, "tmp", "fiducial", n_d),
            tf.float32),
        tf.data.Dataset.from_generator(
            partial(
                generator, "tmp", "derivative", n_d),
            tf.float32))
    ).take(n_d // n_devices
    ).batch(n_per_device
    ).cache(
    ).prefetch(tf.data.AUTOTUNE
    ).repeat(
    ).as_numpy_iterator()
    for _ in range(n_devices)]
etc.
This loading will be quite slow because the files need to be opened each time, but we can build TFRecords which are quicker to load. There is a writer able to produce the correct format. The TFRecords should be a couple of hundred Mb for best flow-through, so we can keep filling each record until this size is reached.
record_size = 200  # Mb
writer = TFRecords.TFRecords(record_size=record_size)
We need a function which grabs single simulations from an array (or file) to add to the record
def get_simulation(seed, directory=None, filename=None):
    return np.load(f"{directory}/{filename}_{seed:04d}.npy")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="fiducial"),
    directory="tmp",
    filename="fiducial")

writer.write_record(
    n_sims=n_s - n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="remaining"),
    directory="tmp",
    filename="remaining")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="derivative"),
    directory="tmp",
    filename="derivative")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="validation_fiducial"),
    directory="tmp",
    filename="validation_fiducial")

writer.write_record(
    n_sims=n_s - n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="validation_remaining"),
    directory="tmp",
    filename="validation_remaining")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed: get_simulation(
        seed, directory="tmp", filename="validation_derivative"),
    directory="tmp",
    filename="validation_derivative")
We can then read these into a dataset using the parser from the TFRecords class, mapping the format of the data to a 32-bit float
fiducial = [
    tf.data.TFRecordDataset(
        sorted(glob.glob("tmp/fiducial_*.tfrecords")),
        num_parallel_reads=1
    ).map(writer.parser
    ).skip(i * n_s // n_devices
    ).take(n_s // n_devices)
    for i in range(n_devices)]

main = [
    tf.data.Dataset.zip((
        fiducial[i],
        tf.data.TFRecordDataset(
            sorted(glob.glob("tmp/derivative_*.tfrecords")),
            num_parallel_reads=1).map(
                lambda example: writer.derivative_parser(
                    example, n_params=n_params)))
    ).take(n_d // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for i in range(n_devices)]

remaining = [
    fiducial[i].skip(n_d // n_devices
    ).take((n_s - n_d) // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for i in range(n_devices)]

validation_fiducial = [
    tf.data.TFRecordDataset(
        sorted(glob.glob("tmp/validation_fiducial_*.tfrecords")),
        num_parallel_reads=1
    ).map(writer.parser
    ).skip(i * n_s // n_devices
    ).take(n_s // n_devices)
    for i in range(n_devices)]

validation_main = [
    tf.data.Dataset.zip((
        validation_fiducial[i],
        tf.data.TFRecordDataset(
            sorted(glob.glob("tmp/validation_derivative_*.tfrecords")),
            num_parallel_reads=1).map(
                lambda example: writer.derivative_parser(
                    example, n_params=n_params)))
    ).take(n_d // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for i in range(n_devices)]

validation_remaining = [
    validation_fiducial[i].skip(n_d // n_devices
    ).take((n_s - n_d) // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for i in range(n_devices)]
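With any of the constructions above, the iterators can be passed straight to the class. The network and optimiser below are hypothetical placeholders (a small stax model and an adam optimiser), purely to sketch the call; n_summaries = n_params is just a common choice and the fitting hyperparameters are illustrative:

import jax
from jax.example_libraries import stax, optimizers
import imnn

# hypothetical network returning n_params summaries
model = stax.serial(
    stax.Dense(128), stax.LeakyRelu,
    stax.Dense(128), stax.LeakyRelu,
    stax.Dense(n_params))
optimiser = optimizers.adam(step_size=1e-3)

IMNN = imnn.DatasetGradientIMNN(
    n_s=n_s, n_d=n_d, n_params=n_params, n_summaries=n_params,
    input_shape=input_shape, θ_fid=θ_fid, model=model,
    optimiser=optimiser, key_or_state=jax.random.PRNGKey(42),
    host=jax.devices("cpu")[0], devices=devices,
    n_per_device=n_per_device, main=main, remaining=remaining,
    validation_main=validation_main,
    validation_remaining=validation_remaining)
IMNN.fit(λ=10., ε=0.1)  # hyperparameter values are illustrative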
- Parameters
main (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs and their derivatives with respect to the physical model parameters (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

remaining (list of tf.data.Dataset().as_numpy_iterators()) – The n_s - n_d simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs with no derivative counterpart (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

validation_main (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs and their derivatives with respect to the physical model parameters (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

validation_remaining (list of tf.data.Dataset().as_numpy_iterators()) – The n_s - n_d simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs with no derivative counterpart (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

n_remaining (int) – The number of simulations where only the fiducial simulations are calculated. This is zero if n_s is equal to n_d.

n_iterations (int) – Number of iterations through the main summarising loop

n_remaining_iterations (int) – Number of iterations through the remaining simulations used for quick loops with no derivatives

batch_shape (tuple) – The shape which n_d should be reshaped to for aggregating, n_d // (n_devices * n_per_device), n_devices, n_per_device

remaining_batch_shape (tuple) – The shape which n_s - n_d should be reshaped to for aggregating, (n_s - n_d) // (n_devices * n_per_device), n_devices, n_per_device
Public Methods:

__init__(n_s, n_d, n_params, n_summaries, …) – Constructor method

Inherited from AggregatedGradientIMNN

__init__(n_s, n_d, n_params, n_summaries, …) – Constructor method
get_summary(input, w, θ[, derivative, gradient]) – Returns a single summary of a simulation or its gradient
get_summaries(w[, key, validate]) – Gets all network outputs and derivatives wrt model parameters
get_gradient(dΛ_dx, w[, key]) – Aggregates gradients together to update the network parameters

Inherited from _AggregatedIMNN

__init__(n_s, n_d, n_params, n_summaries, …) – Constructor method
fit(λ, ε[, rng, patience, min_iterations, …]) – Fitting routine for the IMNN
get_summaries(w[, key, validate]) – Gets all network outputs and derivatives wrt model parameters
get_gradient(dΛ_dx, w[, key]) – Aggregates gradients together to update the network parameters

Inherited from GradientIMNN

__init__(n_s, n_d, n_params, n_summaries, …) – Constructor method
get_summaries(w[, key, validate]) – Gets all network outputs and derivatives wrt model parameters

Inherited from _IMNN

__init__(n_s, n_d, n_params, n_summaries, …) – Constructor method
fit(λ, ε[, rng, patience, min_iterations, …]) – Fitting routine for the IMNN
get_α(λ, ε) – Calculate rate parameter for regularisation from closeness criterion
set_F_statistics([w, key, validate]) – Set necessary attributes for calculating score compressed summaries
get_summaries(w[, key, validate]) – Gets all network outputs and derivatives wrt model parameters
get_estimate(d) – Calculate score compressed parameter estimates from network outputs
plot([ax, expected_detF, colour, figsize, …]) – Plot fitting history
Private Methods:

_set_data(fiducial, derivative, …) – Overwritten function to prevent setting fiducial attributes
_set_dataset([prefetch, cache]) – Overwritten function to prevent building dataset, does list check
_set_prebuilt_dataset(main, remaining, …) – Set preconstructed dataset iterators

Inherited from AggregatedGradientIMNN

_set_shapes() – Calculates the shapes for batching over different devices
_set_dataset([prefetch, cache]) – Overwritten function to prevent building dataset, does list check
_collect_input(key[, validate]) – Returns validation or fitting sets
_split_dΛ_dx(dΛ_dx) – Returns the gradient of loss function wrt summaries (derivatives)

Inherited from _AggregatedIMNN

_set_devices(devices, n_per_device) – Checks that devices exist and that reshaping onto devices can occur
_set_batch_functions() – Creates jitted functions placed on desired XLA devices
_set_shapes() – Calculates the shapes for batching over different devices
_setup_progress_bar(print_rate, max_iterations) – Construct progress bar
_update_progress_bar(pbar, counter, …[, close]) – Updates (and closes) progress bar
_collect_input(key[, validate]) – Returns validation or fitting sets
_get_batch_summaries(inputs, w, θ[, …]) – Vectorised batch calculation of summaries or gradients
_split_dΛ_dx(dΛ_dx) – Returns the gradient of loss function wrt summaries (derivatives)
_construct_gradient(layers[, aux, func]) – Multiuse function to iterate over tuple of network parameters

Inherited from GradientIMNN

_set_data(fiducial, derivative, …) – Overwritten function to prevent setting fiducial attributes

Inherited from _IMNN

_initialise_parameters(n_s, n_d, n_params, …) – Performs type checking and initialisation of class attributes
_initialise_model(model, optimiser, key_or_state) – Initialises neural network parameters or loads optimiser state
_initialise_history() – Initialises history dictionary attribute
_set_history(results) – Places results from fitting into the history dictionary
_set_inputs(rng, max_iterations) – Builds list of inputs for the XLA compilable fitting routine
_get_fitting_keys(rng) – Generates random numbers for simulation generation if needed
_fit(inputs, λ=None, α=None[, min_iterations]) – Single iteration fitting algorithm
_fit_cond(inputs, patience, max_iterations) – Stopping condition for the fitting loop
_update_loop_vars(inputs) – Updates input parameters if max_detF is increased
_check_loop_vars(inputs, min_iterations) – Updates patience_counter if max_detF not increased
_update_history(inputs, history, counter, ind) – Puts current fitting statistics into history arrays
_slogdet(matrix) – Combined summed logarithmic determinant
_construct_derivatives(derivatives) – Builds derivatives of the network outputs wrt model parameters
_get_F_statistics([w, key, validate]) – Calculates the Fisher information and returns all statistics used
_calculate_F_statistics(summaries, derivatives) – Calculates the Fisher information matrix from network outputs
_get_regularisation_strength(Λ2, λ, α) – Coupling strength of the regularisation (amplified sigmoid)
_get_regularisation(C, invC) – Difference of the covariance (and its inverse) from identity
_get_loss(w, λ, α[, key]) – Calculates the loss function and returns auxiliary variables
_calculate_loss(summaries, derivatives, λ, α) – Calculates the loss function from network summaries and derivatives
_setup_plot([ax, expected_detF, figsize]) – Builds axes for history plot
_set_data(fiducial, derivative, validation_fiducial, validation_derivative)¶

Overwritten function to prevent setting fiducial attributes
- Parameters
fiducial (None) –
derivative (None) –
validation_fiducial (None) –
validation_derivative (None) –
_set_dataset(prefetch=None, cache=None)¶

Overwritten function to prevent building dataset, does list check
- Parameters
prefetch (None) –
cache (None) –
_set_prebuilt_dataset(main, remaining, validation_main, validation_remaining)¶

Set preconstructed dataset iterators
- Parameters
main (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs and their derivatives with respect to the physical model parameters (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

remaining (list of tf.data.Dataset().as_numpy_iterators()) – The n_s - n_d simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs with no derivative counterpart (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

validation_main (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs and their derivatives with respect to the physical model parameters (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

validation_remaining (list of tf.data.Dataset().as_numpy_iterators()) – The n_s - n_d simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs with no derivative counterpart (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.
- Raises
ValueError – if main or remaining are None
ValueError – if length of any input list is not equal to number of devices
TypeError – if any input is not a list
DatasetNumericalGradientIMNN¶
class imnn.DatasetNumericalGradientIMNN(n_s, n_d, n_params, n_summaries, input_shape, θ_fid, model, optimiser, key_or_state, fiducial, derivative, δθ, host, devices, n_per_device, validation_fiducial=None, validation_derivative=None)¶

Information maximising neural network fit using numerical derivatives
The outline of the fitting procedure is that a set of \(i\in[1, n_s]\) simulations \({\bf d}^i\) originally generated at fiducial model parameter \({\bf\theta}^{\rm fid}\), and a set of \(i\in[1, n_d]\) simulations, \(\{{\bf d}_{\alpha^-}^i, {\bf d}_{\alpha^+}^i\}\), generated with the same seed at each \(i\) as the fiducial simulations but at \({\bf\theta}^{\rm fid}\) perturbed at parameter label \(\alpha\) with values
\[\theta_{\alpha^-} = \theta_\alpha^{\rm fid}-\delta\theta_\alpha\]and
\[\theta_{\alpha^+} = \theta_\alpha^{\rm fid}+\delta\theta_\alpha\]where \(\delta\theta_\alpha\) is a \(n_{params}\) length vector with the \(\alpha\) element having a value which perturbs the parameter \(\theta^{\rm fid}_\alpha\). This means there are \(2\times n_{params}\times n_d\) simulations used to calculate the numerical derivatives (this is extremely cheap compared to other machine learning methods). All these simulations are passed through a network \(f_{{\bf w}}({\bf d})\) with network parameters \({\bf w}\) to obtain network outputs \({\bf x}^i\) and \(\{{\bf x}_{\alpha^-}^i,{\bf x}_{\alpha^+}^i\}\). These perturbed values are combined to obtain
\[\frac{\partial{{\bf x}^i}}{\partial\theta_\alpha} = \frac{{\bf x}_{\alpha^+}^i - {\bf x}_{\alpha^-}^i} {\delta\theta_\alpha}\]With \({\bf x}^i\) and \(\partial{{\bf x}^i}/\partial\theta_\alpha\) the covariance
\[C_{ab} = \frac{1}{n_s-1}\sum_{i=1}^{n_s}(x^i_a-\mu_a) (x^i_b-\mu_b)\]and the derivative of the mean of the network outputs with respect to the model parameters
\[\frac{\partial\mu_a}{\partial\theta_\alpha} = \frac{1}{n_d} \sum_{i=1}^{n_d}\frac{\partial{x^i_a}}{\partial\theta_\alpha}\]can be calculated and used to form the Fisher information matrix
\[F_{\alpha\beta} = \frac{\partial\mu_a}{\partial\theta_\alpha} C^{-1}_{ab}\frac{\partial\mu_b}{\partial\theta_\beta}.\]The loss function is then defined as
\[\Lambda = -\log|{\bf F}| + r(\Lambda_2) \Lambda_2\]Since any linear rescaling of a sufficient statistic is also a sufficient statistic, the negative logarithm of the determinant of the Fisher information matrix needs to be regularised to fix the scale of the network outputs. We choose to fix this scale by constraining the covariance of network outputs as
\[\Lambda_2 = ||{\bf C}-{\bf I}|| + ||{\bf C}^{-1}-{\bf I}||\]The reason for choosing this constraint is that it forces the covariance to be approximately parameter independent, which justifies choosing the covariance independent Gaussian Fisher information as above. To avoid having a dual optimisation objective, we use a smooth and dynamic regularisation strength which turns off the regularisation to focus on maximising the Fisher information once the covariance has set the scale
\[r(\Lambda_2) = \frac{\lambda\Lambda_2}{\Lambda_2-\exp (-\alpha\Lambda_2)}.\]To enable the use of large data (or networks) the whole procedure is aggregated. This means that the passing of the simulations through the network is farmed out to the desired XLA devices, and recollected, n_per_device inputs at a time. These are then used to calculate the automatic gradient of the loss function with respect to the calculated summaries and derivatives, \(\partial\Lambda/\partial{\bf x}^i\) (which is a fairly small computation as long as n_summaries, n_s and n_d are not huge). Once this is calculated, the simulations are passed through the network again, this time calculating the Jacobian of the network outputs with respect to the network parameters, \(\partial{\bf x}^i/\partial{\bf w}\), which is then combined via the chain rule to get
\[\frac{\partial\Lambda}{\partial{\bf w}} = \frac{\partial\Lambda}{\partial{\bf x}^i} \frac{\partial{\bf x}^i}{\partial{\bf w}}\]This can then be passed to the optimiser.
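As a concrete illustration of the central difference above, here is a minimal sketch. It assumes the perturbed network outputs are stacked as x_mp with shape (n_d, 2, n_params, n_summaries), a layout matching the (n_d, 2, n_params, …) ordering of the perturbed simulations in the examples below; this is illustrative only:

import jax.numpy as np

def numerical_derivative(x_mp, δθ):
    # x_mp[:, 0] holds outputs below the fiducial, x_mp[:, 1] above;
    # broadcasting divides each parameter direction by its δθ_α
    return (x_mp[:, 1] - x_mp[:, 0]) / δθ[np.newaxis, :, np.newaxis]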
In DatasetNumericalGradientIMNN the input datasets should be lists of n_devices tf.data.Datasets. Please note, due to the many ways of constructing datasets to load data, there is no checking, and any improperly made dataset will either fail (best result) or provide the wrong result (worst case scenario!). For this reason it is advised to use AggregatedNumericalGradientIMNN() if the data will fit into CPU memory. If not, the next safest way is to construct a set of TFRecords and construct the dataset from that.

Examples
Here are various ways to construct the datasets for passing to DatasetNumericalGradientIMNN. Note that these are not the only ways, but they should give something to follow to generate your own datasets. First we’ll generate some data (just random noise with zero mean and unit variance) and perturb the mean and the variance of this noise to calculate numerical derivatives with respect to the model parameters. We’ll generate 1000 simulations at the fiducial and 100 for each parameter varied above and below the fiducial. We’ll save each of these simulations into its own individual file (named by seed value).
import glob
import jax
import jax.numpy as np
import tensorflow as tf
from imnn import TFRecords
from functools import partial

n_s = 1000
n_d = 100
n_params = 2
input_shape = (10,)

def simulator(key, θ):
    return θ[0] + (jax.random.normal(key, shape=input_shape)
                   * np.sqrt(θ[1]))

θ_fid = np.array([0., 1.])
δθ = np.array([0.1, 0.1])
θ_der = (θ_fid + np.einsum(
    "i,jk->ijk",
    np.array([-1., 1.]),
    np.diag(δθ) / 2.)).reshape((-1, 2))

rng = jax.random.PRNGKey(0)
rng, data_key = jax.random.split(rng)
data_keys = np.array(jax.random.split(rng, num=2 * n_s))

fiducial = jax.vmap(simulator)(
    data_keys[:n_s],
    np.repeat(np.expand_dims(θ_fid, 0), n_s, axis=0))

validation_fiducial = jax.vmap(simulator)(
    data_keys[n_s:],
    np.repeat(np.expand_dims(θ_fid, 0), n_s, axis=0))

numerical_derivative = jax.vmap(simulator)(
    np.repeat(data_keys[:n_d], θ_der.shape[0], axis=0),
    np.tile(θ_der, (n_d, 1))).reshape(
        (n_d, 2, n_params) + input_shape)

validation_numerical_derivative = jax.vmap(simulator)(
    np.repeat(data_keys[n_s:n_d + n_s], θ_der.shape[0], axis=0),
    np.tile(θ_der, (n_d, 1))).reshape(
        (n_d, 2, n_params) + input_shape)

for i, (simulation, validation_simulation) in enumerate(
        zip(fiducial, validation_fiducial)):
    np.save(f"tmp/fiducial_{i:04d}.npy", simulation)
    np.save(f"tmp/validation_fiducial_{i:04d}.npy",
            validation_simulation)

for i, (simulation, validation_simulation) in enumerate(
        zip(numerical_derivative, validation_numerical_derivative)):
    np.save(f"tmp/numerical_derivative_{i:04d}.npy", simulation)
    np.save(f"tmp/validation_numerical_derivative_{i:04d}.npy",
            validation_simulation)
Now we’ll define how many devices to farm our calculations out to. We need to know this because we want to make a separate dataset for each device. We’ll also set the number of simulations which can be processed at once on each device; this should be as high as possible without running out of memory on any individual device, for quickest fitting.
devices = jax.devices("gpu")
n_devices = len(devices)
n_per_device = 100
Using this we can define the shapes for the data to be reshaped into for proper construction of the datasets to be used when fitting the IMNN.
fiducial_shape = (
    n_devices,
    n_s // (n_devices * n_per_device),
    n_per_device) + input_shape
derivative_shape = (
    n_devices,
    2 * n_params * n_d // (n_devices * n_per_device),
    n_per_device) + input_shape
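For instance, with the numbers used in this example and a hypothetical single device (n_devices = 1), the shapes evaluate as:

# n_s = 1000, n_d = 100, n_params = 2, n_per_device = 100
assert fiducial_shape == (1, 10, 100, 10)
assert derivative_shape == (1, 4, 100, 10)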
The simplest way to construct a dataset is simply using the numpy arrays from memory (note, if you’re going to do this you should really just use AggregatedNumericalGradientIMNN, it’s more or less the same!), i.e.

fiducial = [
    tf.data.Dataset.from_tensor_slices(
        fid).repeat().as_numpy_iterator()
    for fid in fiducial.reshape(fiducial_shape)]

numerical_derivative = [
    tf.data.Dataset.from_tensor_slices(
        der).repeat().as_numpy_iterator()
    for der in numerical_derivative.reshape(derivative_shape)]

validation_fiducial = [
    tf.data.Dataset.from_tensor_slices(
        fid).repeat().as_numpy_iterator()
    for fid in validation_fiducial.reshape(fiducial_shape)]

validation_numerical_derivative = [
    tf.data.Dataset.from_tensor_slices(
        der).repeat().as_numpy_iterator()
    for der in validation_numerical_derivative.reshape(
        derivative_shape)]
However, if the data is too large to fit in memory then we can use the npy files that we saved by loading them via a generator
def generator(directory, filename, total):
    i = 0
    while i < total:
        yield np.load(f"{directory}/{filename}_{i:04d}.npy")
        i += 1
We can then build the datasets like:
fiducial = [
    tf.data.Dataset.from_generator(
        partial(
            generator, "tmp", "fiducial", n_s),
        tf.float32
    ).take(n_s // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for _ in range(n_devices)]

numerical_derivative = [
    tf.data.Dataset.from_generator(
        partial(
            generator, "tmp", "numerical_derivative", n_d),
        tf.float32
    ).flat_map(
        lambda x: tf.data.Dataset.from_tensor_slices(x)
    ).flat_map(
        lambda x: tf.data.Dataset.from_tensor_slices(x)
    ).take(2 * n_params * n_d // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for _ in range(n_devices)]

validation_fiducial = [
    tf.data.Dataset.from_generator(
        partial(
            generator, "tmp", "validation_fiducial", n_s),
        tf.float32
    ).take(n_s // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for _ in range(n_devices)]

validation_numerical_derivative = [
    tf.data.Dataset.from_generator(
        partial(
            generator, "tmp", "validation_numerical_derivative", n_d),
        tf.float32
    ).flat_map(
        lambda x: tf.data.Dataset.from_tensor_slices(x)
    ).flat_map(
        lambda x: tf.data.Dataset.from_tensor_slices(x)
    ).take(2 * n_params * n_d // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for _ in range(n_devices)]
The datasets must be built exactly like this, with the taking and batching and repeating. Importantly, both of the flat_map calls over the datasets for the numerical derivatives are needed to unwrap the perturbation direction and the parameter direction in each numpy file. To prefetch and cache the loaded files we can add the extra steps in the datasets, e.g.

fiducial = [
    tf.data.Dataset.from_generator(
        partial(
            generator, "tmp", "fiducial", n_s),
        tf.float32
    ).take(n_s // n_devices
    ).batch(n_per_device
    ).cache(
    ).prefetch(tf.data.AUTOTUNE
    ).repeat(
    ).as_numpy_iterator()
    for _ in range(n_devices)]
This loading will be quite slow because the files need to be opened each time, but we can build TFRecords which are quicker to load. There is a writer able to produce the correct format. The TFRecords should be a couple of hundred Mb for best flow-through, so we can keep filling each record until this size is reached.
record_size = 200  # Mb
writer = TFRecords.TFRecords(record_size=record_size)
We need a function which grabs single simulations from an array (or file) to add to the record
def get_fiducial(seed, directory=None, filename=None):
    return np.load(f"{directory}/{filename}_{seed:04d}.npy")

def get_derivative(seed, der, params, directory=None, filename=None):
    return np.load(
        f"{directory}/{filename}_{seed:04d}.npy")[der, params]

writer.write_record(
    n_sims=n_s,
    get_simulation=lambda seed: get_fiducial(
        seed, directory="tmp", filename="fiducial"),
    fiducial=True,
    directory="tmp",
    filename="fiducial")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed, der, param: get_derivative(
        seed, der, param, directory="tmp",
        filename="numerical_derivative"),
    fiducial=False,
    n_params=n_params,
    directory="tmp",
    filename="numerical_derivative")

writer.write_record(
    n_sims=n_s,
    get_simulation=lambda seed: get_fiducial(
        seed, directory="tmp", filename="validation_fiducial"),
    fiducial=True,
    directory="tmp",
    filename="validation_fiducial")

writer.write_record(
    n_sims=n_d,
    get_simulation=lambda seed, der, param: get_derivative(
        seed, der, param, directory="tmp",
        filename="validation_numerical_derivative"),
    fiducial=False,
    n_params=n_params,
    directory="tmp",
    filename="validation_numerical_derivative")
We can then read these into a dataset using the parser from the TFRecords class, mapping the format of the data to a 32-bit float
fiducial = [
    tf.data.TFRecordDataset(
        sorted(glob.glob("tmp/fiducial_*.tfrecords")),
        num_parallel_reads=1
    ).map(writer.parser
    ).skip(i * n_s // n_devices
    ).take(n_s // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for i in range(n_devices)]

numerical_derivative = [
    tf.data.TFRecordDataset(
        sorted(glob.glob("tmp/numerical_derivative_*.tfrecords")),
        num_parallel_reads=1
    ).map(writer.parser
    ).skip(i * 2 * n_params * n_d // n_devices
    ).take(2 * n_params * n_d // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for i in range(n_devices)]

validation_fiducial = [
    tf.data.TFRecordDataset(
        sorted(glob.glob("tmp/validation_fiducial_*.tfrecords")),
        num_parallel_reads=1
    ).map(writer.parser
    ).skip(i * n_s // n_devices
    ).take(n_s // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for i in range(n_devices)]

validation_numerical_derivative = [
    tf.data.TFRecordDataset(
        sorted(glob.glob(
            "tmp/validation_numerical_derivative_*.tfrecords")),
        num_parallel_reads=1
    ).map(writer.parser
    ).skip(i * 2 * n_params * n_d // n_devices
    ).take(2 * n_params * n_d // n_devices
    ).batch(n_per_device
    ).repeat(
    ).as_numpy_iterator()
    for i in range(n_devices)]
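As with DatasetGradientIMNN, the prebuilt iterators are passed straight to the constructor. The model and optimiser below are the same kind of hypothetical placeholders as before (a small stax model and an adam optimiser), purely to sketch the call:

import jax
from jax.example_libraries import stax, optimizers
import imnn

# hypothetical network returning n_params summaries
model = stax.serial(
    stax.Dense(128), stax.LeakyRelu,
    stax.Dense(n_params))
optimiser = optimizers.adam(step_size=1e-3)

IMNN = imnn.DatasetNumericalGradientIMNN(
    n_s=n_s, n_d=n_d, n_params=n_params, n_summaries=n_params,
    input_shape=input_shape, θ_fid=θ_fid, model=model,
    optimiser=optimiser, key_or_state=jax.random.PRNGKey(42),
    fiducial=fiducial, derivative=numerical_derivative, δθ=δθ,
    host=jax.devices("cpu")[0], devices=devices,
    n_per_device=n_per_device,
    validation_fiducial=validation_fiducial,
    validation_derivative=validation_numerical_derivative)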
- Parameters
δθ (float(n_params,)) – Size of perturbation to model parameters for the numerical derivative

fiducial (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

derivative (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at parameter values perturbed from the fiducial used to calculate the numerical derivative of network outputs with respect to model parameters (for fitting). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

validation_fiducial (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

validation_derivative (list of tf.data.Dataset().as_numpy_iterators()) – The simulations generated at parameter values perturbed from the fiducial used to calculate the numerical derivative of network outputs with respect to model parameters (for validation). These are served n_per_device at a time as a numpy iterator from a TensorFlow dataset.

fiducial_iterations (int) – The number of iterations over the fiducial dataset

derivative_iterations (int) – The number of iterations over the derivative dataset

derivative_output_shape (tuple) – The shape of the output of the derivatives from the network

fiducial_batch_shape (tuple) – The shape of each batch of fiducial simulations (without input or summary shape)

derivative_batch_shape (tuple) – The shape of each batch of derivative simulations (without input or summary shape)
Public Methods:

__init__(n_s, n_d, n_params, n_summaries, …) – Constructor method

Inherited from AggregatedNumericalGradientIMNN

__init__(n_s, n_d, n_params, n_summaries, …) – Constructor method
get_summary(inputs, w, θ[, derivative, gradient]) – Returns a single summary of a simulation or its gradient
get_summaries(w[, key, validate]) – Gets all network outputs and derivatives wrt model parameters
get_gradient(dΛ_dx, w[, key]) – Aggregates gradients together to update the network parameters

Inherited from _AggregatedIMNN

__init__(n_s, n_d, n_params, n_summaries, …) – Constructor method
fit(λ, ε[, rng, patience, min_iterations, …]) – Fitting routine for the IMNN
get_summaries(w[, key, validate]) – Gets all network outputs and derivatives wrt model parameters
get_gradient(dΛ_dx, w[, key]) – Aggregates gradients together to update the network parameters

Inherited from NumericalGradientIMNN

__init__(n_s, n_d, n_params, n_summaries, …) – Constructor method
get_summaries(w[, key, validate]) – Gets all network outputs and derivatives wrt model parameters

Inherited from _IMNN

__init__(n_s, n_d, n_params, n_summaries, …) – Constructor method
fit(λ, ε[, rng, patience, min_iterations, …]) – Fitting routine for the IMNN
get_α(λ, ε) – Calculate rate parameter for regularisation from closeness criterion
set_F_statistics([w, key, validate]) – Set necessary attributes for calculating score compressed summaries
get_summaries(w[, key, validate]) – Gets all network outputs and derivatives wrt model parameters
get_estimate(d) – Calculate score compressed parameter estimates from network outputs
plot([ax, expected_detF, colour, figsize, …]) – Plot fitting history
Private Methods:

_set_data(δθ, fiducial, derivative, …) – Checks and sets data attributes with the correct shape
_set_dataset([prefetch, cache]) – Overwritten function to prevent building dataset, does list check

Inherited from AggregatedNumericalGradientIMNN

_set_shapes() – Calculates the shapes for batching over different devices
_set_dataset([prefetch, cache]) – Overwritten function to prevent building dataset, does list check
_set_batch_functions() – Creates jitted functions placed on desired XLA devices
_collect_input(key[, validate]) – Returns validation or fitting sets
_split_dΛ_dx(dΛ_dx) – Returns the gradient of loss function wrt summaries (derivatives)

Inherited from _AggregatedIMNN

_set_devices(devices, n_per_device) – Checks that devices exist and that reshaping onto devices can occur
_set_batch_functions() – Creates jitted functions placed on desired XLA devices
_set_shapes() – Calculates the shapes for batching over different devices
_setup_progress_bar(print_rate, max_iterations) – Construct progress bar
_update_progress_bar(pbar, counter, …[, close]) – Updates (and closes) progress bar
_collect_input(key[, validate]) – Returns validation or fitting sets
_get_batch_summaries(inputs, w, θ[, …]) – Vectorised batch calculation of summaries or gradients
_split_dΛ_dx(dΛ_dx) – Returns the gradient of loss function wrt summaries (derivatives)
_construct_gradient(layers[, aux, func]) – Multiuse function to iterate over tuple of network parameters

Inherited from NumericalGradientIMNN

_set_data(δθ, fiducial, derivative, …) – Checks and sets data attributes with the correct shape
_collect_input(key[, validate]) – Returns validation or fitting sets
_construct_derivatives(x_mp) – Builds derivatives of the network outputs wrt model parameters

Inherited from _IMNN

_initialise_parameters(n_s, n_d, n_params, …) – Performs type checking and initialisation of class attributes
_initialise_model(model, optimiser, key_or_state) – Initialises neural network parameters or loads optimiser state
_initialise_history() – Initialises history dictionary attribute
_set_history(results) – Places results from fitting into the history dictionary
_set_inputs(rng, max_iterations) – Builds list of inputs for the XLA compilable fitting routine
_get_fitting_keys(rng) – Generates random numbers for simulation generation if needed
_fit(inputs, λ=None, α=None[, min_iterations]) – Single iteration fitting algorithm
_fit_cond(inputs, patience, max_iterations) – Stopping condition for the fitting loop
_update_loop_vars(inputs) – Updates input parameters if max_detF is increased
_check_loop_vars(inputs, min_iterations) – Updates patience_counter if max_detF not increased
_update_history(inputs, history, counter, ind) – Puts current fitting statistics into history arrays
_slogdet(matrix) – Combined summed logarithmic determinant
_construct_derivatives(x_mp) – Builds derivatives of the network outputs wrt model parameters
_get_F_statistics([w, key, validate]) – Calculates the Fisher information and returns all statistics used
_calculate_F_statistics(summaries, derivatives) – Calculates the Fisher information matrix from network outputs
_get_regularisation_strength(Λ2, λ, α) – Coupling strength of the regularisation (amplified sigmoid)
_get_regularisation(C, invC) – Difference of the covariance (and its inverse) from identity
_get_loss(w, λ, α[, key]) – Calculates the loss function and returns auxiliary variables
_calculate_loss(summaries, derivatives, λ, α) – Calculates the loss function from network summaries and derivatives
_setup_plot([ax, expected_detF, figsize]) – Builds axes for history plot
_set_data(δθ, fiducial, derivative, validation_fiducial, validation_derivative)¶

Checks and sets data attributes with the correct shape
- Parameters
δθ (float(n_params,)) – Size of perturbation to model parameters for the numerical derivative
fiducial (list of tf.data.Datasets) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs (for fitting)

derivative (list of tf.data.Datasets) – The derivative of the simulations with respect to the model parameters (for fitting)

validation_fiducial (list of tf.data.Datasets or None, default=None) – The simulations generated at the fiducial model parameter values used for calculating the covariance of network outputs (for validation). Sets validate = True attribute if provided

validation_derivative (list of tf.data.Datasets or None, default=None) – The derivative of the simulations with respect to the model parameters (for validation). Sets validate = True attribute if provided
- Raises
ValueError – if δθ is None
ValueError – if δθ has wrong shape
TypeError – if δθ has wrong type
Notes
No checking is done on the correctness of the tf.data.Dataset
_set_dataset(prefetch=None, cache=None)¶

Overwritten function to prevent building dataset, does list check
- Raises
ValueError – if fiducial or derivative are None
ValueError – if any dataset has wrong shape
TypeError – if any dataset has wrong type
TFRecords (writer)¶
class imnn.TFRecords(record_size=150.0, padding=5)¶

Module for writing simulations to TFRecord to be used by the IMNN
- Parameters
record_size (int) – approximate maximum size of an individual record (in Mb)
padding (int) – zero padding size for record name numbering
input_shape (tuple) – shape of a single simulation
file (str) – filename to save the records to
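A minimal usage sketch (the path and filename here are hypothetical, and the parser usage mirrors the examples in the classes above):

import glob
import numpy as np
import tensorflow as tf
from imnn import TFRecords

writer = TFRecords.TFRecords(record_size=150., padding=5)
# write 1000 fiducial simulations stored as tmp/fiducial_0000.npy, ...
writer.write_record(
    n_sims=1000,
    get_simulation=lambda seed: np.load(f"tmp/fiducial_{seed:04d}.npy"),
    directory="tmp",
    filename="fiducial")
# read them back, parsing each serialised example to float32
dataset = tf.data.TFRecordDataset(
    sorted(glob.glob("tmp/fiducial_*.tfrecords"))).map(writer.parser)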
Public Methods:

__init__([record_size, padding]) – Constructor method
write_record(n_sims, get_simulation[, …]) – Write all simulations to set of tfrecords
fiducial_serialiser(seed, counter, …) – Serialises a fiducial simulation
derivative_serialiser(simulation, counter, …) – Serialises a numerical derivative simulation
get_serialiser(fiducial, get_simulation, …) – Returns the fiducial or derivative serialiser
parser(example) – Maps a serialised example simulation into its float32 representation
derivative_parser(example[, n_params]) – Maps a serialised example derivative into its float32 representation
numerical_derivative_parser(example[, n_params]) – Maps a serialised example derivative into its float32 representation
get_file(directory, filename, fiducial, …) – Constructs filepath and name that records will be saved to
check_size(counter) – Checks the size of the current record in Mb to see whether it is full
get_initial_seed(fiducial, start) – Sets the initial seed index (or set of seeds) to get simulations at
get_seed(simulation, fiducial) – Gets the seed index depending on input
check_func(get_simulation, fiducial) – Checks the simulation grabber takes the correct number of arguments
check_params(n_params, fiducial) – Checks that n_params is actually given
Private Methods:

_bytes_feature(value) – Makes a serialised byte list from value (converts tensors to numpy)
_int64_feature(value) – Makes a serialised int list from value
_bytes_feature(value)¶

Makes a serialised byte list from value (converts tensors to numpy)
- Parameters
value (float (possibly Tensor)) – the simulation to be serialised
- Returns
the simulation as a string of bytes
- Return type
byte_list
_int64_feature(value)¶

Makes a serialised int list from value
- Parameters
value (int) – the seed to be serialised
- Returns
the serialised int list
- Return type
int_list
check_func(get_simulation, fiducial)¶

Checks the simulation grabber takes the correct number of arguments
- Parameters
get_simulation (fn) – a function to get a single simulation
fiducial – if a fiducial simulation record is being constructed
check_params(n_params, fiducial)¶

Checks that n_params is actually given
- Parameters
n_params (int) – the number of parameters in the derivative
fiducial – if a fiducial simulation record is being constructed
check_size(counter)¶

Checks the size of the current record in Mb to see whether it is full
- Parameters
counter (int) – The current open record being written to
- Returns
True if the record has reached (exceeded) the preassigned record size
- Return type
bool
derivative_parser(example, n_params=None)¶

Maps a serialised example derivative into its float32 representation

This is the parser for an analytical or automatic derivative of the simulation with respect to model parameters (should be serialised with fiducial_serialiser).

- Parameters
example (str) – The serialised string to be parsed
n_params (int or None, default=None) – The number of parameters in the derivative. This is required but named here for use with functools.partial
- Returns
The parsed numerical form of the serialised input
- Return type
float(input_shape, n_params)
derivative_serialiser(simulation, counter, get_simulation, n_params)¶

Serialises a numerical derivative simulation

Takes a tuple containing a seed index, an index describing whether the corresponding simulation is produced below or above the fiducial parameter values and an index describing which parameter the simulation is going to be a derivative of. These are used to sequentially get simulations from a function until either all of the simulations for the derivative of a single seed are collected or until the record is full, at which point it breaks out of the loop to create the next record and continue there.
- Parameters
simulation (tuple) –
(int) – index to grab a simulation at
(int) – index describing whether generated above or below fid
(int) – index for the parameter of interest
counter (int) – the number record which is being written to
get_simulation (fn) – a function which takes an index and returns a simulation corresponding to that index
n_params (int) – the number of parameters in the numerical derivative
- Returns
(int) – index up to where the simulation was grabbed
(int) – index describing whether generated above or below fid
(int) – index for the parameter of interest
- Return type
tuple
fiducial_serialiser(seed, counter, get_simulation)¶

Serialises a fiducial simulation
Takes a seed index and a function to get a simulation at that given index (and a counter for printing the number of files made). This seed and simulation are then serialised and written to file.
- Parameters
seed (int) – index to grab a simulation at
counter (int) – the number record which is being written to
get_simulation (fn) – a function which takes an index and returns a simulation corresponding to that index
- Returns
the input seed value increased by 1
- Return type
int
get_file(directory, filename, fiducial, validation)¶

Constructs filepath and name that records will be saved to
- Parameters
directory (str) – The full path to where the files should be saved
filename (str or None) – a filename to save the records to; if None, default names are given depending on the values of fiducial and validation

fiducial (bool) – whether to call a file fiducial (if filename is None) or derivative

validation (bool) – whether to prepend validation_ to the filename (if filename is None)
- Returns
the filename to save the record to
- Return type
str
get_initial_seed(fiducial, start)¶

Sets the initial seed index (or set of seeds) to get simulations at

When constructing a record with a fiducial (or exact derivative) only a seed index for the simulation is needed, whilst a derivative index and a parameter index are also needed if a numerical derivative simulation is being collected. Seed indices will increase incrementally by 1.
- Parameters
fiducial (bool) – if a fiducial simulation record is being constructed
start (int) – the initial seed index to collect the simulation at
- Returns
(int) – the initial seed index to collect the simulations at
- (tuple)
(int) – the initial seed index to collect the simulation
(int) – whether the simulation is generated below or above the fiducial parameter values
(int) – with respect to which parameter the simulation is used to calculate the numerical gradient
- Return type
int or tuple
get_seed(simulation, fiducial)¶

Gets the seed index depending on input
With a fiducial (or exact derivative) only a seed index for the simulation is needed whilst a derivative index and a parameter index are also needed if a numerical derivative simulation is being collected
- Parameters
simulation (tuple) –
(int) – the initial seed index to collect the simulation
(int) – whether the simulation is generated below or above the fiducial parameter values
(int) – with respect to which parameter the simulation is used to calculate the numerical gradient
fiducial (bool) – if a fiducial simulation record is being constructed
- Returns
the seed index to collect the simulation at
- Return type
int
get_serialiser(fiducial, get_simulation, n_params)¶

Returns the fiducial or derivative serialiser
- Parameters
fiducial (bool) – whether the fiducial serialiser should be returned or not
get_simulation (fn) – the function which returns either a simulation or a part of a derivative simulation
n_params (int) – the number of parameters that the derivative is taken wrt
- Returns
either the fiducial serialiser or the derivative serialiser
- Return type
fn
numerical_derivative_parser(example, n_params=None)¶

Maps a serialised example derivative into its float32 representation

This is the parser for all the simulations necessary for making a numerical derivative of a simulation with respect to model parameters (should be serialised with derivative_serialiser).

- Parameters
example (str) – The serialised string to be parsed
n_params (int or None, default=None) – The number of parameters in the derivative. This is required but named here for use with functools.partial
- Returns
The parsed numerical form of the serialised input
- Return type
float(2, n_params, input_shape)
parser(example)¶

Maps a serialised example simulation into its float32 representation
- Parameters
example (str) – The serialised string to be parsed
- Returns
The parsed numerical form of the serialised input
- Return type
float(input_shape)
write_record(n_sims, get_simulation, fiducial=True, n_params=None, validation=False, directory=None, filename=None, start=0)¶

Write all simulations to set of tfrecords
- Parameters
n_sims (int) – number of simulations to be written to record
get_simulation (func) – function (1 or 3 inputs for fiducial or derivative) which returns a single simulation as a numpy array
fiducial (bool (default True)) – whether the simulations are in the fiducial or derivative format
n_params (int (opt)) – number of parameters in the simulator model (for the derivative)
validation (bool (default False)) – tag to automatically prepend validation to the filename
directory (str (opt)) – directory to save records. defaults to current directory
filename (str (opt)) – filename to save records. defaults to fiducial and derivative depending on the value of fiducial
start (int (opt)) – value to start seed at
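For example, a record set might be extended from a given seed using start (a hedged sketch; the file layout is hypothetical):

# append 500 more fiducial simulations beginning at seed index 1000
writer.write_record(
    n_sims=500,
    get_simulation=lambda seed: np.load(f"tmp/fiducial_{seed:04d}.npy"),
    directory="tmp",
    filename="fiducial",
    start=1000)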