alibi_detect.ad.adversarialae module
-
class alibi_detect.ad.adversarialae.AdversarialAE(threshold=None, ae=None, model=None, encoder_net=None, decoder_net=None, model_hl=None, hidden_layer_kld=None, w_model_hl=None, temperature=1.0, data_type=None)
Bases: alibi_detect.base.BaseDetector, alibi_detect.base.FitMixin, alibi_detect.base.ThresholdMixin
-
__init__(threshold=None, ae=None, model=None, encoder_net=None, decoder_net=None, model_hl=None, hidden_layer_kld=None, w_model_hl=None, temperature=1.0, data_type=None)
Autoencoder (AE) based adversarial detector.
- Parameters
threshold (Optional[float]) – Threshold applied to the adversarial score to determine adversarial instances.
ae (Optional[tensorflow.keras.Model]) – A trained tf.keras autoencoder model, if available.
model (Optional[tensorflow.keras.Model]) – A trained tf.keras classification model.
encoder_net (Optional[tensorflow.keras.Sequential]) – Layers for the encoder wrapped in a tf.keras.Sequential class. Used if no ‘ae’ is specified.
decoder_net (Optional[tensorflow.keras.Sequential]) – Layers for the decoder wrapped in a tf.keras.Sequential class. Used if no ‘ae’ is specified.
model_hl (Optional[List[tensorflow.keras.Model]]) – List of tf.keras models used for the hidden layer K-L divergence computation.
hidden_layer_kld (Optional[dict]) – Dictionary whose keys are the hidden layer(s) of the model that are extracted and used during training of the AE, and whose values are the output dimensions for those hidden layers.
w_model_hl (Optional[list]) – Weights assigned to the loss of each model in model_hl.
temperature (float) – Temperature used for model prediction scaling. A temperature < 1 sharpens the prediction probability distribution.
data_type (Optional[str]) – Optionally specify the data type (tabular, image or time-series). Added to metadata.
- Return type
None
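The effect of the temperature parameter can be illustrated with a small NumPy sketch. It assumes that scaling means raising each class probability to the power 1/temperature and renormalizing; the helper name scale_predictions is hypothetical and is not part of the alibi_detect API.

```python
import numpy as np


def scale_predictions(probs: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Rescale each row of class probabilities with a temperature (assumed scheme)."""
    if temperature == 1.0:
        return probs
    # The exponent is > 1 when temperature < 1, which widens the gap between classes.
    scaled = probs ** (1.0 / temperature)
    # Renormalize so that every row sums to 1 again.
    return scaled / scaled.sum(axis=-1, keepdims=True)


probs = np.array([[0.6, 0.3, 0.1]])
sharp = scale_predictions(probs, temperature=0.5)
```

With temperature=0.5 the largest probability grows and the smaller ones shrink, which is the sharpening described for temperature < 1.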
-
correct(X, batch_size=10000000000, return_instance_score=True, return_all_predictions=True)
Correct adversarial instances if the adversarial score is above the threshold.
- Parameters
- Return type
- Returns
Dictionary with the corrected predictions and a flag indicating whether each instance is adversarial.
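A minimal sketch of what such a correction step could look like, under the assumption that a flagged instance has its predicted class replaced by the class the model predicts on the autoencoder reconstruction. The function name, argument names and dictionary keys are hypothetical illustrations, not the library's actual output format.

```python
import numpy as np


def correct_predictions(preds, preds_recon, scores, threshold):
    """Swap in the reconstruction-based class wherever the score exceeds the threshold."""
    y = preds.argmax(axis=-1)              # classes predicted on the original instances
    y_recon = preds_recon.argmax(axis=-1)  # classes predicted on the reconstructions
    is_adv = scores > threshold            # strictly above the threshold
    corrected = np.where(is_adv, y_recon, y)
    return {"corrected": corrected, "is_adversarial": is_adv.astype(int)}


preds = np.array([[0.9, 0.1], [0.2, 0.8]])
preds_recon = np.array([[0.8, 0.2], [0.7, 0.3]])
scores = np.array([0.1, 0.9])
result = correct_predictions(preds, preds_recon, scores, threshold=0.5)
```

Only the second instance exceeds the threshold here, so only its prediction is replaced.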
-
fit(X, loss_fn=<function loss_adv_ae>, w_model=1.0, w_recon=0.0, optimizer=tensorflow.keras.optimizers.Adam, epochs=20, batch_size=128, verbose=True, log_metric=None, callbacks=None, preprocess_fn=None)
Train Adversarial AE model.
- Parameters
X (numpy.ndarray) – Training batch.
loss_fn (tensorflow.keras.losses) – Loss function used for training.
w_model (float) – Weight on model prediction loss term.
w_recon (float) – Weight on MSE reconstruction error loss term.
optimizer (tensorflow.keras.optimizers) – Optimizer used for training.
epochs (int) – Number of training epochs.
batch_size (int) – Batch size used for training.
verbose (bool) – Whether to print training progress.
log_metric (Optional[Tuple[str, tensorflow.keras.metrics]]) – Additional metrics whose progress will be displayed if verbose equals True.
callbacks (Optional[tensorflow.keras.callbacks]) – Callbacks used during training.
preprocess_fn (Optional[Callable]) – Preprocessing function applied to each training batch.
- Return type
None
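How w_model and w_recon trade off the two loss terms can be sketched in NumPy. This is an illustration only: it assumes the model prediction term is a K-L divergence between the classifier's predictions on the original and on the reconstructed instances, and the function names are hypothetical. It is not the implementation of loss_adv_ae.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """Row-wise K-L divergence between two sets of probability vectors."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)


def weighted_loss(x, x_recon, y, y_recon, w_model=1.0, w_recon=0.0):
    """Combine a prediction-divergence term and an MSE reconstruction term."""
    loss_model = kl_divergence(y, y_recon).mean()  # assumed prediction loss term
    loss_recon = np.mean((x - x_recon) ** 2)       # MSE reconstruction error
    return w_model * loss_model + w_recon * loss_recon


x, x_recon = np.zeros((2, 3)), np.ones((2, 3))
y = np.array([[0.5, 0.5], [0.5, 0.5]])
loss_recon_only = weighted_loss(x, x_recon, y, y, w_model=1.0, w_recon=0.5)
loss_default = weighted_loss(x, x_recon, y, np.array([[0.9, 0.1], [0.9, 0.1]]))
```

With the default w_recon=0.0, the reconstruction error does not contribute, so only the disagreement between predictions drives the loss.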
-
infer_threshold(X, threshold_perc=99.0, margin=0.0, batch_size=10000000000)
Update threshold by a value inferred from the percentage of instances considered to be adversarial in a sample of the dataset.
- Parameters
X (numpy.ndarray) – Batch of instances.
threshold_perc (float) – Percentage of X considered to be normal based on the adversarial score.
margin (float) – Add margin to threshold. Useful if adversarial instances have significantly higher scores and there is no adversarial instance in X.
batch_size (int) – Batch size used when computing scores.
- Return type
None
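The roles of threshold_perc and margin can be sketched as a percentile over adversarial scores plus an offset. The sketch assumes the scores have already been computed and uses a hypothetical helper, threshold_from_scores, rather than the detector itself.

```python
import numpy as np


def threshold_from_scores(scores, threshold_perc=99.0, margin=0.0):
    """Take the given percentile of the scores and add a safety margin."""
    return float(np.percentile(scores, threshold_perc) + margin)


# Hypothetical adversarial scores 0, 1, ..., 100 for a sample of normal data.
scores = np.arange(101, dtype=float)
threshold = threshold_from_scores(scores, threshold_perc=95.0, margin=0.5)
```

Here 95% of the sample scores fall at or below 95.0, and the margin moves the threshold up to 95.5.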
-
predict(X, batch_size=10000000000, return_instance_score=True)
Predict whether instances are adversarial or not.
- Parameters
- Return type
- Returns
Dictionary containing ‘meta’ and ‘data’ dictionaries.
‘meta’ has the model’s metadata.
‘data’ contains the adversarial predictions and instance level adversarial scores.
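The shape of the returned dictionary can be sketched as follows. The key names inside ‘data’ and the contents of ‘meta’ are assumptions made for illustration, and flag_adversarial is a hypothetical helper that works on precomputed scores.

```python
import numpy as np


def flag_adversarial(scores, threshold, return_instance_score=True):
    """Build a 'meta'/'data' dictionary from adversarial scores and a threshold."""
    data = {"is_adversarial": (scores > threshold).astype(int)}  # assumed key name
    if return_instance_score:
        data["instance_score"] = scores                          # assumed key name
    meta = {"name": "AdversarialAE"}  # placeholder metadata
    return {"meta": meta, "data": data}


scores = np.array([0.1, 0.9, 0.5])
preds = flag_adversarial(scores, threshold=0.5)
flags = preds["data"]["is_adversarial"]
```

In this sketch, only instances with a score strictly above the threshold are flagged, so a score equal to the threshold counts as normal.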
-
-
class alibi_detect.ad.adversarialae.DenseHidden(model, hidden_layer, output_dim, hidden_dim=None)
Bases: tensorflow.keras.Model