profit.sur.gp.gpy_surrogate

Module Contents

Classes

GPySurrogate

Surrogate for https://github.com/SheffieldML/GPy.

CoregionalizedGPySurrogate

Surrogate for https://github.com/SheffieldML/GPy.

class profit.sur.gp.gpy_surrogate.GPySurrogate[source]

Bases: profit.sur.gp.GaussianProcess

Surrogate for https://github.com/SheffieldML/GPy.

model

Model object of GPy.

Type:

GPy.models

train(X, y, kernel=defaults['kernel'], hyperparameters=defaults['hyperparameters'], fixed_sigma_n=base_defaults['fixed_sigma_n'])[source]

Trains the model on the dataset.

After initializing the model with a kernel function and initial hyperparameters, it can be trained on input data X and observed output data y by optimizing the model’s hyperparameters. This is done by minimizing the negative log likelihood.

Parameters:
  • X (ndarray) – (n, d) array of input training data.

  • y (ndarray) – (n, D) array of training output.

  • kernel (str/object) – Kernel identifier such as ‘RBF’, or the kernel object of the surrogate itself.

  • hyperparameters (dict) – Hyperparameters such as length scale, variance and noise. Taken from the given parameter, from the config file, or inferred from the training data. The hyperparameters can differ depending on the kernel. For example, the length scale can be a scalar, a vector of the size of the training data, or, for the custom LinearEmbedding kernel, a matrix.

  • fixed_sigma_n (bool) – Indicates if the data noise should be optimized or not.

  • return_hess_inv (bool) – Whether to set the attribute hess_inv after optimization. This is important for active learning.
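The quantity minimized during training can be illustrated with a small NumPy sketch. It is not the GPy or proFit implementation: it assumes a zero-mean GP with a squared-exponential kernel shared by all output columns, and the names rbf_kernel, negative_log_likelihood and the candidate grid are hypothetical. A coarse grid search over the length scale stands in for the gradient-based optimizer.

```python
import numpy as np


def rbf_kernel(Xa, Xb, length_scale, sigma_f):
    """Squared-exponential kernel matrix between (n, d) and (m, d) arrays."""
    sq_dist = np.sum((Xa[:, None, :] - Xb[None, :, :]) ** 2, axis=-1)
    return sigma_f**2 * np.exp(-0.5 * sq_dist / length_scale**2)


def negative_log_likelihood(X, y, length_scale, sigma_f, sigma_n):
    """Negative log marginal likelihood for (n, d) inputs and (n, D) outputs.

    Assumption: the D output columns are independent and share one kernel.
    """
    n, D = y.shape
    K = rbf_kernel(X, X, length_scale, sigma_f) + sigma_n**2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    data_fit = 0.5 * np.sum(y * alpha)
    complexity = np.sum(np.log(np.diag(L)))  # equals 0.5 * log det K
    constant = 0.5 * n * np.log(2.0 * np.pi)
    return float(data_fit + D * (complexity + constant))


# Synthetic data: a noisy sine on [0, 1].
rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X) + 0.05 * rng.standard_normal(X.shape)

# Grid search over the length scale; variance and noise are held fixed.
candidates = [0.01, 0.05, 0.2, 1.0, 5.0]
nll_values = [negative_log_likelihood(X, y, ls, 1.0, 0.05) for ls in candidates]
best_length_scale = candidates[int(np.argmin(nll_values))]
```

Holding sigma_n fixed while searching over the other hyperparameters mirrors the role of fixed_sigma_n, under the assumptions stated above.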

add_training_data(X, y)[source]

Adds training points to the existing dataset.

This is important for Active Learning. The data is added but the hyperparameters are not optimized yet.

Parameters:
  • X (ndarray) – Input points to add.

  • y (ndarray) – Observed output to add.
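The idea of extending the dataset without touching the hyperparameters can be sketched as follows. The class TrainingStore and its attribute names are hypothetical and only illustrate the behaviour described above. They are not the proFit implementation.

```python
import numpy as np


class TrainingStore:
    """Hypothetical container holding training data and hyperparameters."""

    def __init__(self, X, y, hyperparameters):
        self.Xtrain = np.asarray(X, dtype=float)
        self.ytrain = np.asarray(y, dtype=float)
        self.hyperparameters = dict(hyperparameters)

    def add_training_data(self, X, y):
        # Append the new rows; the hyperparameters are left as they are.
        self.Xtrain = np.concatenate([self.Xtrain, np.asarray(X, dtype=float)], axis=0)
        self.ytrain = np.concatenate([self.ytrain, np.asarray(y, dtype=float)], axis=0)


store = TrainingStore(
    X=[[0.0], [0.5], [1.0]],
    y=[[0.0], [1.0], [0.0]],
    hyperparameters={"length_scale": 0.2, "sigma_n": 0.05},
)
store.add_training_data(X=[[0.25]], y=[[0.7]])
```

A separate optimization step would be needed afterwards to refit the hyperparameters to the enlarged dataset.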

set_ytrain(y)[source]

Set the observed training outputs. This is important for active learning.

Parameters:

y (np.array) – Full training output data.

predict(Xpred, add_data_variance=True)[source]

Predicts the output at test points Xpred.

Parameters:
  • Xpred (ndarray/list) – Input points for prediction.

  • add_data_variance (bool) – Adds the data noise \(\sigma_n^2\) to the prediction variance. This is especially useful for plotting.

Returns:

A tuple containing:
  • ymean (ndarray) – Predicted output values at the test input points.

  • yvar (ndarray) – Diagonal of the predicted covariance matrix.

Return type:

tuple
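The effect of add_data_variance can be illustrated with a NumPy sketch of standard GP regression. This is not the GPy code path: the function predict_sketch, its default hyperparameter values and the squared-exponential kernel are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(Xa, Xb, length_scale, sigma_f):
    """Squared-exponential kernel matrix between (n, d) and (m, d) arrays."""
    sq_dist = np.sum((Xa[:, None, :] - Xb[None, :, :]) ** 2, axis=-1)
    return sigma_f**2 * np.exp(-0.5 * sq_dist / length_scale**2)


def predict_sketch(Xtrain, ytrain, Xpred, length_scale=0.2, sigma_f=1.0,
                   sigma_n=0.05, add_data_variance=True):
    """Return the posterior mean and the diagonal of the posterior covariance."""
    Xpred = np.asarray(Xpred, dtype=float)
    n = Xtrain.shape[0]
    K = rbf_kernel(Xtrain, Xtrain, length_scale, sigma_f) + sigma_n**2 * np.eye(n)
    Ks = rbf_kernel(Xtrain, Xpred, length_scale, sigma_f)  # shape (n, m)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, ytrain))
    ymean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    # Prior variance at each test point minus the part explained by the data.
    yvar = sigma_f**2 - np.sum(v**2, axis=0)
    if add_data_variance:
        yvar = yvar + sigma_n**2  # variance of a noisy observation
    return ymean, yvar.reshape(-1, 1)


Xtrain = np.linspace(0.0, 1.0, 10).reshape(-1, 1)
ytrain = np.sin(2.0 * np.pi * Xtrain)
Xpred = [[0.25], [0.5]]

ymean, yvar = predict_sketch(Xtrain, ytrain, Xpred, add_data_variance=True)
_, yvar_latent = predict_sketch(Xtrain, ytrain, Xpred, add_data_variance=False)
```

In this sketch the two variances differ by exactly sigma_n squared at every test point, so the flag decides whether an uncertainty band describes the latent function or new noisy observations.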

save_model(path)[source]

Saves the model as a dict to a .hdf5 file.

Parameters:

path (str) – Path including the file name, where the model should be saved.

classmethod load_model(path)[source]

Loads a saved model from a .hdf5 file and updates its attributes. In case of a multi-output model, the .pkl file is loaded, since .hdf5 is not supported yet.

Parameters:

path (str) – Path including the file name, from where the model should be loaded.

Returns:

Instantiated surrogate model.

Return type:

GPy.models
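The save and load round trip, with load_model as a classmethod that returns a new instance, can be sketched with the standard library. Here pickle stands in for both file formats, since no HDF5 library is assumed to be available. The class TinySurrogate and the stored dict layout are hypothetical and do not reflect the actual file structure.

```python
import os
import pickle
import tempfile


class TinySurrogate:
    """Hypothetical stand-in for a surrogate; not the proFit class."""

    def __init__(self, hyperparameters=None):
        self.hyperparameters = dict(hyperparameters or {})

    def save_model(self, path):
        # Store the attributes as a plain dict.
        with open(path, "wb") as handle:
            pickle.dump({"hyperparameters": self.hyperparameters}, handle)

    @classmethod
    def load_model(cls, path):
        # Build a fresh instance and update its attributes from the file.
        with open(path, "rb") as handle:
            state = pickle.load(handle)
        return cls(hyperparameters=state["hyperparameters"])


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    TinySurrogate({"length_scale": 0.2, "sigma_n": 0.05}).save_model(path)
    restored = TinySurrogate.load_model(path)
```

Because load_model is a classmethod, it can be called on the class itself without creating an instance first.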

select_kernel(kernel)[source]

Get the GPy.kern kernel by matching the given string kernel identifier.

Parameters:

kernel (str) – Kernel identifier such as ‘RBF’ or, depending on the surrogate, a sum or product kernel such as ‘RBF+Matern52’.

Returns:

GPy kernel object. Currently, for sum and product kernels, the initial hyperparameters are the same for all kernels.

Return type:

GPy.kern
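Matching a string identifier to a kernel, including sums and products, can be sketched in plain NumPy. This does not use GPy.kern: the registry KERNELS, the function select_kernel_sketch and the kernel functions are hypothetical. As in the note above, every part of a combined kernel receives the same initial hyperparameters.

```python
import numpy as np


def _scaled_distance(Xa, Xb, length_scale):
    diff = Xa[:, None, :] - Xb[None, :, :]
    return np.sqrt(np.sum(diff**2, axis=-1)) / length_scale


def rbf(Xa, Xb, length_scale=1.0, variance=1.0):
    r = _scaled_distance(Xa, Xb, length_scale)
    return variance * np.exp(-0.5 * r**2)


def matern52(Xa, Xb, length_scale=1.0, variance=1.0):
    s = np.sqrt(5.0) * _scaled_distance(Xa, Xb, length_scale)
    return variance * (1.0 + s + s**2 / 3.0) * np.exp(-s)


# Hypothetical registry mapping identifiers to kernel functions.
KERNELS = {"RBF": rbf, "Matern52": matern52}


def select_kernel_sketch(identifier, **hyperparameters):
    """Build a kernel function from 'RBF', 'RBF+Matern52' or 'RBF*Matern52'.

    Expressions mixing '+' and '*' are not handled in this sketch.
    """
    for symbol, combine in (("+", np.add), ("*", np.multiply)):
        if symbol in identifier:
            parts = [KERNELS[name.strip()] for name in identifier.split(symbol)]

            def combined(Xa, Xb, parts=parts, combine=combine):
                # The same hyperparameters are passed to every part.
                K = parts[0](Xa, Xb, **hyperparameters)
                for part in parts[1:]:
                    K = combine(K, part(Xa, Xb, **hyperparameters))
                return K

            return combined
    base = KERNELS[identifier.strip()]
    return lambda Xa, Xb: base(Xa, Xb, **hyperparameters)


X = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
K_sum = select_kernel_sketch("RBF+Matern52", length_scale=0.5)(X, X)
K_prod = select_kernel_sketch("RBF*Matern52", length_scale=0.5)(X, X)
```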

optimize(return_hess_inv=False, **opt_kwargs)[source]

For hyperparameter optimization the GPy base optimization is used.

Currently, the inverse Hessian cannot be retrieved, which limits the effectiveness of active learning.

Parameters:
  • return_hess_inv (bool) – Currently ignored.

  • opt_kwargs – Keyword arguments used directly in the GPy base optimization.

_set_hyperparameters_from_model()[source]

Helper function to set the hyperparameter dict from the model.

It depends on whether it is a single kernel or a combined one.

special_hyperparameter_decoding(key, value)[source]

class profit.sur.gp.gpy_surrogate.CoregionalizedGPySurrogate[source]

Bases: GPySurrogate

Surrogate for https://github.com/SheffieldML/GPy.

model

Model object of GPy.

Type:

GPy.models

pre_train(X, y, kernel=defaults['kernel'], hyperparameters=defaults['hyperparameters'], fixed_sigma_n=base_defaults['fixed_sigma_n'])[source]

Check the training data, initialize the hyperparameters and set the kernel either from the given parameter, from config or from the default values.

Parameters:
  • X – (n, d) or (n,) array of input training data.

  • y – (n, D) or (n,) array of training output.

  • kernel (str/object) – Kernel identifier such as ‘RBF’, or the kernel object of the specific surrogate itself.

  • hyperparameters (dict) – Hyperparameters such as length scale, variance and noise. Taken from the given parameter, from the config file, or inferred from the training data. The hyperparameters can differ depending on the kernel. For example, the length scale can be a scalar, a vector of the size of the training data, or, for the custom LinearEmbedding kernel, a matrix.

  • fixed_sigma_n (bool/float/ndarray) – Indicates if the data noise should be optimized or not. If an ndarray is given, its length must match the training data.
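The shape checks described for the training data can be sketched as below. The function check_training_data is hypothetical and only illustrates the accepted shapes and the length requirement on an ndarray noise argument. It is not the proFit implementation.

```python
import numpy as np


def check_training_data(X, y, fixed_sigma_n=False):
    """Return X and y as 2-D arrays after basic consistency checks."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Promote (n,) arrays to column vectors of shape (n, 1).
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    if y.ndim == 1:
        y = y.reshape(-1, 1)
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y must contain the same number of samples.")
    # A per-point noise array needs one entry per training sample.
    if isinstance(fixed_sigma_n, np.ndarray) and fixed_sigma_n.shape[0] != X.shape[0]:
        raise ValueError("fixed_sigma_n must match the length of the training data.")
    return X, y


X_checked, y_checked = check_training_data([0.0, 0.5, 1.0], [1.0, 2.0, 3.0])
```

Passing fixed_sigma_n=np.array([0.1, 0.2]) together with three training points would raise a ValueError in this sketch.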

train(X, y, kernel=defaults['kernel'], hyperparameters=defaults['hyperparameters'], fixed_sigma_n=base_defaults['fixed_sigma_n'])[source]

Trains the model on the dataset.

After initializing the model with a kernel function and initial hyperparameters, it can be trained on input data X and observed output data y by optimizing the model’s hyperparameters. This is done by minimizing the negative log likelihood.

Parameters:
  • X (ndarray) – (n, d) array of input training data.

  • y (ndarray) – (n, D) array of training output.

  • kernel (str/object) – Kernel identifier such as ‘RBF’, or the kernel object of the surrogate itself.

  • hyperparameters (dict) – Hyperparameters such as length scale, variance and noise. Taken from the given parameter, from the config file, or inferred from the training data. The hyperparameters can differ depending on the kernel. For example, the length scale can be a scalar, a vector of the size of the training data, or, for the custom LinearEmbedding kernel, a matrix.

  • fixed_sigma_n (bool) – Indicates if the data noise should be optimized or not.

  • return_hess_inv (bool) – Whether to set the attribute hess_inv after optimization. This is important for active learning.

add_training_data(X, y)[source]

Adds training points to the existing dataset.

This is important for Active Learning. The data is added but the hyperparameters are not optimized yet.

Parameters:
  • X (ndarray) – Input points to add.

  • y (ndarray) – Observed output to add.

set_ytrain(y)[source]

Set the observed training outputs. This is important for active learning.

Parameters:

y (np.array) – Full training output data.

predict(Xpred, add_data_variance=True)[source]

Predicts the output at test points Xpred.

Parameters:
  • Xpred (ndarray/list) – Input points for prediction.

  • add_data_variance (bool) – Adds the data noise \(\sigma_n^2\) to the prediction variance. This is especially useful for plotting.

Returns:

A tuple containing:
  • ymean (ndarray) – Predicted output values at the test input points.

  • yvar (ndarray) – Diagonal of the predicted covariance matrix.

Return type:

tuple

save_model(path)[source]

Saves the model as a dict to a .hdf5 file.

Parameters:

path (str) – Path including the file name, where the model should be saved.

classmethod load_model(path)[source]

Loads a saved model from a .hdf5 file and updates its attributes. In case of a multi-output model, the .pkl file is loaded, since .hdf5 is not supported yet.

Parameters:

path (str) – Path including the file name, from where the model should be loaded.

Returns:

Instantiated surrogate model.

Return type:

GPy.models

special_hyperparameter_decoding(key, value)[source]