profit.sur.encoders

Module Contents

Classes

Encoder

Base class to handle encoding and decoding of the input and output data before creating the surrogate model.

ExcludeEncoder

Excludes specific columns from the fit. Afterwards they are inserted at the same position.

Log10Encoder

Transforms the specified columns with \(\log_{10}\). This is done for LogUniform variables by default.

Normalization

Normalization of the specified columns. Usually this is done for all input and output, so the surrogate can fit on a \((0, 1)^n\) cube with zero mean and unit variance.

PCA

Principal component analysis. Encodes sample vectors as expansion coefficients in an eigenbasis.

KarhunenLoeve

Karhunen-Loève expansion, derived from PCA. Encodes sample vectors as expansion coefficients in an eigenbasis.

class profit.sur.encoders.Encoder(columns, parameters=None)[source]

Bases: profit.util.base_class.CustomABC

Base class to handle encoding and decoding of the input and output data before creating the surrogate model.

The base class itself does nothing. It delegates the encoding process to its child classes, which are called by their registered labels.

Parameters:
  • columns (list[int]) – Columns of the data the encoder acts on.

  • parameters (dict) – Miscellaneous parameters stored during encoding, which are needed for decoding. E.g. the scaling factor during normalization.

label

Label of the encoder class.

Type:

str

property repr

Easy-to-handle representation of the encoder for saving and loading.

Returns:

List of all relevant information to reconstruct the encoder: (label, columns, parameters dict).

Return type:

list

labels

encode(x)[source]

Applies the encoding function on given columns.

Parameters:

x (ndarray) – Array to which the encoding is applied.

Returns:

An encoded copy of the array x.

Return type:

ndarray

decode(x)[source]

Applies the decoding function on given columns.

Parameters:

x (ndarray) – Array to which the decoding is applied.

Returns:

A decoded copy of the array x.

Return type:

ndarray

decode_hyperparameters(value)[source]

Decoder for the surrogate hyperparameters, as the direct model uses encoded values. As a default, the unchanged value is returned.

Parameters:

value (np.array) – The (encoded) value of the hyperparameter.

Returns:

Decoded value.

Return type:

np.array

decode_variance(variance)[source]

encode_func(x)[source]

Returns:

Function used for encoding the data, e.g. \(\log_{10}(x)\).

Return type:

ndarray

decode_func(x)[source]

Returns:

Inverse transform of the encoding function. For an encoding of \(\log_{10}(x)\) this would be \(10^x\).

Return type:

ndarray
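The interplay of `encode`, `decode`, `encode_func`, `decode_func`, and `repr` described above can be sketched roughly as follows. This is a minimal illustration using only NumPy, not the profit implementation: the names `SketchEncoder`, `SketchScale`, `register`, `registry`, and the `"factor"` parameter are hypothetical, and the real registration mechanism may differ.

```python
import numpy as np


class SketchEncoder:
    """Hypothetical stand-in for the base class (not the profit code)."""

    registry = {}  # assumed mapping: label -> encoder subclass

    def __init__(self, columns, parameters=None):
        self.columns = list(columns)
        self.parameters = {} if parameters is None else dict(parameters)

    @classmethod
    def register(cls, label):
        """Decorator that stores a subclass under its label."""
        def decorator(subclass):
            subclass.label = label
            cls.registry[label] = subclass
            return subclass
        return decorator

    @property
    def repr(self):
        # Everything needed to rebuild the encoder later.
        return [self.label, self.columns, self.parameters]

    def encode(self, x):
        out = np.array(x, dtype=float)  # work on a copy
        out[:, self.columns] = self.encode_func(out[:, self.columns])
        return out

    def decode(self, x):
        out = np.array(x, dtype=float)  # work on a copy
        out[:, self.columns] = self.decode_func(out[:, self.columns])
        return out

    def encode_func(self, x):
        return x  # the base class does nothing

    def decode_func(self, x):
        return x


@SketchEncoder.register("Scale")
class SketchScale(SketchEncoder):
    """Hypothetical child class: divides by a stored scaling factor."""

    def encode_func(self, x):
        return x / self.parameters["factor"]

    def decode_func(self, x):
        return x * self.parameters["factor"]


x = np.array([[1.0, 10.0], [2.0, 1000.0]])
enc = SketchEncoder.registry["Scale"](columns=[1], parameters={"factor": 10.0})
z = enc.encode(x)  # only column 1 is transformed

# Rebuild an equivalent encoder from its repr and invert the encoding.
label, columns, parameters = enc.repr
restored = SketchEncoder.registry[label](columns, parameters).decode(z)
```

In this sketch the `parameters` dict carries the scaling factor from encoding to decoding, which is why it is part of `repr`.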

class profit.sur.encoders.ExcludeEncoder(columns, parameters=None)[source]

Bases: Encoder

Excludes specific columns from the fit. Afterwards they are inserted at the same position.

Variables:

excluded_values (np.array) – Slice of the input data which is excluded.

encode(x)[source]

Applies the encoding function on given columns.

Parameters:

x (ndarray) – Array to which the encoding is applied.

Returns:

An encoded copy of the array x.

Return type:

ndarray

decode(x)[source]

Applies the decoding function on given columns.

Parameters:

x (ndarray) – Array to which the decoding is applied.

Returns:

A decoded copy of the array x.

Return type:

ndarray
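The exclude-and-reinsert behaviour can be illustrated with plain NumPy. This is only a sketch of the idea under the assumption that the excluded columns are listed in ascending order; the variable names are illustrative and the library code may handle this differently.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
columns = [1]  # columns to exclude, assumed sorted in ascending order

# "Encode": remember the excluded slice and drop it from the array.
excluded_values = x[:, columns].copy()
encoded = np.delete(x, columns, axis=1)

# "Decode": put each stored column back at its original position.
decoded = encoded.copy()
for position, column_values in zip(columns, excluded_values.T):
    decoded = np.insert(decoded, position, column_values, axis=1)
```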

class profit.sur.encoders.Log10Encoder(columns, parameters=None)[source]

Bases: Encoder

Transforms the specified columns with \(\log_{10}\). This is done for LogUniform variables by default.

encode_func(x)[source]

Returns:

Function used for encoding the data, e.g. \(\log_{10}(x)\).

Return type:

ndarray

decode_func(x)[source]

Returns:

Inverse transform of the encoding function. For an encoding of \(\log_{10}(x)\) this would be \(10^x\).

Return type:

ndarray
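A short NumPy illustration of why a \(\log_{10}\) transform suits variables that span several orders of magnitude: the encoded values become evenly spread, and \(10^x\) recovers the original data. The sample values are made up for this sketch.

```python
import numpy as np

# Values spanning six orders of magnitude, as a log-uniform variable might.
x = np.array([1e-3, 1e-1, 1e1, 1e3])

encoded = np.log10(x)      # exponents, evenly spaced for this input
decoded = 10.0 ** encoded  # inverse transform
```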

class profit.sur.encoders.Normalization(columns, parameters=None)[source]

Bases: Encoder

Normalization of the specified columns. Usually this is done for all input and output, so the surrogate can fit on a \((0, 1)^n\) cube with zero mean and unit variance.

\[\begin{aligned} x' &= \frac{x - x_{min}}{x_{max} - x_{min}} \\ x &= (x_{max} - x_{min}) \, x' + x_{min} \end{aligned}\]

Parameters:
  • xmax (np.array) – Max. value of the data for each column.

  • xmin (np.array) – Min. value of the data for each column.

  • xmean (np.array) – Mean value of the data for each column.

  • xstd (np.array) – Standard deviation of the data for each column.

  • xmax_centered (np.array) – Max. value of the data after mean and variance standardization.

  • xmin_centered (np.array) – Min. value of the data after mean and variance standardization.
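The min-max formula above can be checked with a small NumPy example. This sketch covers only the \(x_{min}\)/\(x_{max}\) scaling shown in the equation and ignores the mean and standard deviation parameters; the dict keys mirror the parameter names listed above, but how the library fills and uses them is an assumption.

```python
import numpy as np

x = np.array([[0.0, 10.0],
              [5.0, 20.0],
              [10.0, 40.0]])

# Column-wise extrema, stored so that decoding can undo the scaling.
parameters = {"xmin": x.min(axis=0), "xmax": x.max(axis=0)}
span = parameters["xmax"] - parameters["xmin"]

encoded = (x - parameters["xmin"]) / span        # x' in [0, 1] per column
decoded = span * encoded + parameters["xmin"]    # back to the original x
```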

encode(x)[source]

Applies the encoding function on given columns.

Parameters:

x (ndarray) – Array to which the encoding is applied.

Returns:

An encoded copy of the array x.

Return type:

ndarray

encode_func(x)[source]

Returns:

Function used for encoding the data, e.g. \(\log_{10}(x)\).

Return type:

ndarray

decode_func(x)[source]

Returns:

Inverse transform of the encoding function. For an encoding of \(\log_{10}(x)\) this would be \(10^x\).

Return type:

ndarray

decode_hyperparameters(value)[source]

Decodes the surrogate's hyperparameters. The distinction between length_scale (input encoders only) and sigma_f, sigma_n (output encoders only) is made in profit.sur.gp.gaussian_process.GaussianProcess.

Parameters:

value (np.array) – The (encoded) value of the hyperparameter.

Returns:

Decoded value.

Return type:

np.array
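As a rough illustration of why hyperparameters need decoding at all: under a purely linear min-max scaling, distances in the original data are the encoded distances multiplied by \(x_{max} - x_{min}\), so a length scale scales by the same factor and a variance by its square. The sketch below assumes exactly this min-max scaling and made-up values; whether the library applies these factors in this form is an assumption.

```python
import numpy as np

xmin = np.array([0.0, 10.0])
xmax = np.array([10.0, 40.0])
span = xmax - xmin


def decode(x_encoded):
    """Inverse of the min-max scaling for this sketch."""
    return span * x_encoded + xmin


# Distances between two points grow by the factor `span` when decoded.
a, b = np.array([0.1, 0.2]), np.array([0.3, 0.7])
distance_encoded = np.abs(b - a)
distance_decoded = np.abs(decode(b) - decode(a))

# A length scale is a distance, so it scales the same way.
length_scale_encoded = np.array([0.2, 0.5])
length_scale = span * length_scale_encoded

# A variance scales with the square of the linear factor.
variance_encoded = np.array([0.01, 0.04])
variance = span**2 * variance_encoded
```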

decode_variance(variance)[source]

class profit.sur.encoders.PCA(columns=(), parameters=None)[source]

Bases: Encoder

Principal component analysis (PCA) encoder. Encodes sample vectors as expansion coefficients in an eigenbasis and decodes them by reconstructing the sample vectors from those coefficients.

Parameters:
  • columns (list[int]) – Columns of the data the encoder acts on.

  • parameters (dict) – Miscellaneous parameters stored during encoding, which are needed for decoding. E.g. the scaling factor during normalization.

label

Label of the encoder class.

Type:

str

property features

Returns: neig feature vectors of length N.

init_eigvalues(y)[source]

encode(y)[source]

Parameters:

y – ntest sample vectors of length N.

Returns:

Expansion coefficients of y in eigenbasis.

decode(z)[source]

Parameters:

z – Expansion coefficients of y in eigenbasis.

Returns:

Reconstructed ntest sample vectors of length N.
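The encode/decode pair described here corresponds to the usual PCA projection and reconstruction. The following NumPy sketch shows the idea under stated assumptions: the eigenbasis comes from the sample covariance of mean-centered training data, and the names `ntrain`, `basis`, `ymean`, and the free functions `encode`/`decode` are illustrative, not the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
ntrain, N, neig = 50, 8, 3

# Synthetic training data lying in a neig-dimensional affine subspace.
basis = rng.normal(size=(neig, N))
offset = rng.normal(size=N)
ytrain = offset + rng.normal(size=(ntrain, neig)) @ basis

# Eigendecomposition of the N x N sample covariance matrix.
ymean = ytrain.mean(axis=0)
centered = ytrain - ymean
cov = centered.T @ centered / (ntrain - 1)
eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1][:neig]   # indices of the neig largest
features = eigvecs[:, order].T             # neig feature vectors of length N


def encode(y):
    """Expansion coefficients of y in the eigenbasis."""
    return (y - ymean) @ features.T


def decode(z):
    """Reconstruct sample vectors of length N from the coefficients."""
    return z @ features + ymean


ytest = offset + rng.normal(size=(5, neig)) @ basis
z = encode(ytest)   # shape (ntest, neig)
yrec = decode(z)    # shape (ntest, N)
```

Because the synthetic data lies exactly in a `neig`-dimensional subspace, the reconstruction here is exact up to rounding; for general data, truncating to `neig` components loses the variance carried by the discarded eigenvectors.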

decode_variance(variance)[source]

class profit.sur.encoders.KarhunenLoeve(columns=(), parameters=None)[source]

Bases: PCA

Karhunen-Loève encoder, derived from PCA. Encodes sample vectors as expansion coefficients in an eigenbasis and decodes them by reconstructing the sample vectors from those coefficients.

Parameters:
  • columns (list[int]) – Columns of the data the encoder acts on.

  • parameters (dict) – Miscellaneous parameters stored during encoding, which are needed for decoding. E.g. the scaling factor during normalization.

label

Label of the encoder class.

Type:

str

property features

Returns: neig feature vectors of length N.

encode(y)[source]

Parameters:

y – ntest sample vectors of length N.

Returns:

Expansion coefficients of y in eigenbasis.

init_eigvalues(y)[source]

decode(z)[source]

Parameters:

z – Expansion coefficients of y in eigenbasis.

Returns:

Reconstructed ntest sample vectors of length N.

decode_variance(variance)[source]