seldonio / alibi-detect
Algorithms for outlier, adversarial and drift detection
Home Page: https://docs.seldon.io/projects/alibi-detect/en/stable/
License: Other
The example can follow the existing ones for outlier detection in the repo.
Hello everyone.
I have been trying to run the tutorial https://docs.seldon.io/projects/alibi-detect/en/stable/examples/ad_advvae_mnist.html#
However, I have been running into some issues. I will list them below:
If I try to download the model from the Google Cloud library, load_detector(filepath_ad) does not work, even though the same procedure using load_tf_model worked for loading the MNIST model. The problem seems to lie in the naming: tracing it back through the source code, the name changes from AdversarialVAE to AdversarialAE, so perhaps there is a problem in the pickle files?
I then tried to train the adversarial detector myself, following the code in the link. However, I ran into issues again. When calling the init method for AdversarialAE (not AdversarialVAE as written in the tutorial), I write:
ad = AdversarialVAE(threshold=.5,  # threshold for adversarial score
                    model=model,
                    encoder_net=encoder_net,  # can also pass VAE model instead
                    decoder_net=decoder_net,  # of separate encoder and decoder
                    latent_dim=latent_dim,
                    samples=2,  # nb of samples drawn by VAE
                    beta=0.  # weight on KL-divergence loss term of latent space
                    )
However, I get an error saying that there is no latent_dim keyword, and if I remove that line I get an error saying that there is no samples keyword. Looking at the source code, this seems to be the case:
def __init__(self,
             threshold: float = None,
             ae: tf.keras.Model = None,
             model: tf.keras.Model = None,
             encoder_net: tf.keras.Sequential = None,
             decoder_net: tf.keras.Sequential = None,
             model_hl: List[tf.keras.Model] = None,
             hidden_layer_kld: dict = None,
             w_model_hl: list = None,
             temperature: float = 1.,
             data_type: str = None
             ) -> None:
If I remove all the missing keywords, calling the fit method results in a dimension error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [2048,512], In[1]: [50,1568] [Op:MatMul] name: ae/decoder/sequential_1/dense/Tensordot/MatMul/
Could someone look into this?
The std_clip parameter makes the mean and covariance updates more stable. This can be important when the outliers arrive in batches rather than, e.g., uniformly over time. The effect is already visualized in https://github.com/SeldonIO/seldon-core/blob/master/components/outlier-detection/mahalanobis/doc.ipynb.
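As a rough illustration of why clipping can stabilise running statistics, the sketch below keeps an online mean and variance for a scalar stream and clips each incoming value to within std_clip standard deviations of the current mean before updating. The class name ClippedRunningStats, the start_clip warm-up argument and the update rule are assumptions made for this example, not the detector's actual implementation, and the data stream is made up.

```python
import math


class ClippedRunningStats:
    """Online mean/variance (Welford) with optional clipping of new values.

    Hypothetical helper for illustration only; not the library's code.
    """

    def __init__(self, std_clip=None, start_clip=10):
        self.std_clip = std_clip      # clip width in standard deviations; None disables clipping
        self.start_clip = start_clip  # observations to see before clipping starts (assumed warm-up)
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0                 # running sum of squared deviations from the mean

    @property
    def std(self):
        return math.sqrt(self.m2 / self.n) if self.n > 0 else 0.0

    def update(self, x):
        # Clip the new value to [mean - k*std, mean + k*std] once warmed up
        if self.std_clip is not None and self.n >= self.start_clip:
            lo = self.mean - self.std_clip * self.std
            hi = self.mean + self.std_clip * self.std
            x = min(max(x, lo), hi)
        # Standard Welford update with the (possibly clipped) value
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return self.mean


def run(stream, std_clip):
    stats = ClippedRunningStats(std_clip=std_clip)
    for x in stream:
        stats.update(x)
    return stats.mean


# Made-up stream: 50 inliers alternating around 0, then a batch of 10 large outliers
stream = [(-1.0) ** i for i in range(50)] + [100.0] * 10
mean_unclipped = run(stream, std_clip=None)
mean_clipped = run(stream, std_clip=3.0)
print(f"mean without clipping: {mean_unclipped:.3f}")
print(f"mean with std_clip=3:  {mean_clipped:.3f}")
```

In this toy stream the batch of outliers drags the unclipped mean far from the inliers, while the clipped estimate moves only a little per observation.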
From TF 2.2, override Model.train_step with custom logic. This allows custom training loops while keeping all the model.fit functionality such as callbacks.
rename ODCD server to alibi-detect
links on README not working and typo in example
setup.py contains URLs to odcd/ocdc instead of alibi-detect
renaming also needs to happen in kfserving samples; e.g. model.py and makefiles
README examples: add signs example
pyaload
and a not an outlier
rename mnistar directory to mnistad
change tensorflow_probability requirement to same as in main library (>=0.8)
Download the MNIST model:
to Download the traffic signs model:
python #label = 'Width: {}\nHeight: {}'.format(widths[col], heights[row]) #ax.annotate(label, (0.1, 0.5), xycoords='axes fraction', va='center')
CIFAR10 Outlier Model
:
print statements for debugging?
Hi,
Is there a reason why there is no vanilla AE outlier detector in the od module? This would be nice to have as a baseline.
Hi,
Why is it reduce_sum and not reduce_mean in the elbo loss? Doesn't it overbalance the KL divergence loss term?
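To make the question concrete, here is a toy calculation of how the reduction over the feature dimension changes the relative size of the reconstruction and KL terms. All numbers are made up and the helper reconstruction_terms is hypothetical; this is plain arithmetic, not the library's loss code.

```python
def reconstruction_terms(x, x_recon):
    """Return (sum, mean) of the squared errors over the feature dimension."""
    sq_err = [(a - b) ** 2 for a, b in zip(x, x_recon)]
    return sum(sq_err), sum(sq_err) / len(sq_err)


# Made-up instance: 784 features (e.g. a flattened 28x28 image), each off by 0.1
n_features = 784
x = [0.5] * n_features
x_recon = [0.6] * n_features
kl = 2.0  # made-up KL divergence value for this instance

recon_sum, recon_mean = reconstruction_terms(x, x_recon)

# Share of the total loss contributed by the KL term under each reduction
kl_share_sum = kl / (recon_sum + kl)
kl_share_mean = kl / (recon_mean + kl)

print(f"sum reduction:  recon={recon_sum:.3f}, KL share={kl_share_sum:.3f}")
print(f"mean reduction: recon={recon_mean:.5f}, KL share={kl_share_mean:.3f}")
```

The two reductions differ by a factor equal to the number of features, so switching between them is equivalent to rescaling the weight on the KL term by that factor.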
Can put a temporary fix in place by wiping the stan backend logger as an attribute: facebook/prophet#1361
When I install alibi-detect, I get:
ERROR: Could not find a version that satisfies the requirement tensorboard<2.2.0,>=2.1.0 (from tensorflow>=2->alibi-detect) (from versions: 1.6.0rc0, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.10.0, 1.11.0, 1.12.0, 1.12.1, 1.12.2, 1.13.0, 1.13.1, 1.14.0, 1.15.0, 2.0.0, 2.0.1, 2.0.2)
ERROR: No matching distribution found for tensorboard<2.2.0,>=2.1.0 (from tensorflow>=2->alibi-detect)
My environment is:
Python 3.6.2 |Anaconda custom (64-bit)| (default, Jul 20 2017, 12:30:02) [MSC v.1900 64 bit (AMD64)] on win32
We should use tensorflow.keras exclusively from now on. Mixing keras and tf.keras operations can break stuff:
Benchmark outlier and adversarial detection algorithms across datasets covering at least tabular (numerical/categorical features and low/high dimensionality), image and time series datasets.
I ran pip install alibi-detect, which completed successfully, but I am not able to import the outlierProphet files.
The second issue is that I cannot import KSDrift, although all requirements are satisfied.
I am getting the following exception:
from .creme_to_sklearn import convert_creme_to_sklearn
File "t/lib/python3.5/site-packages/creme/compat/creme_to_sklearn.py", line 221
raise ValueError(f'n_classes is more than 2 but {self.creme_estimator} is a ' +
^
SyntaxError: invalid syntax
ImportError Traceback (most recent call last)
<ipython-input-38-254c7647900a> in <module>
----> 1 from alibi_detect.cd import KSDrift
ImportError: cannot import name 'KSDrift'
Thanks
I have a problem running the "AEGMM and VAEGMM outlier detection on KDD Cup '99 dataset" notebook. For either model, after downloading its parameters from the cloud, loading fails with this error:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-9-7cf85ac01c21> in <module>
1 filepath = './od_aegmm_kddcup/' # change to directory where model is downloaded
2 if load_outlier_detector: # load pretrained outlier detector
----> 3 od = load_detector(filepath)
4 else: # define model, initialize, train and save outlier detector
5 # the model defined here is similar to the one defined in the original paper
~/dev/alibi-detect/alibi_detect/utils/saving.py in load_detector(filepath)
414
415 # load outlier detector specific parameters
--> 416 state_dict = pickle.load(open(os.path.join(filepath, detector_name + '.pickle'), 'rb'))
417
418 # initialize outlier detector
ModuleNotFoundError: No module named 'odcd'
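The traceback suggests that the pickled state still refers to classes under the old package name odcd. One general workaround for this kind of error, sketched below, is an Unpickler subclass that rewrites the stale package prefix before the class lookup. The names RenamingUnpickler and load_renamed are hypothetical, the default mapping from odcd to alibi_detect is an assumption based on the renaming mentioned elsewhere in this listing, and it only helps if the submodule layout is otherwise unchanged. This is not the fix adopted by the project.

```python
import pickle


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that rewrites a stale top-level package name before importing."""

    def __init__(self, file, old_root, new_root):
        super().__init__(file)
        self.old_root = old_root
        self.new_root = new_root

    def find_class(self, module, name):
        # Rewrite 'old_root' or 'old_root.sub.module' so it points at the new package
        if module == self.old_root or module.startswith(self.old_root + "."):
            module = self.new_root + module[len(self.old_root):]
        return super().find_class(module, name)


def load_renamed(path, old_root="odcd", new_root="alibi_detect"):
    """Load a pickle whose classes were saved under a renamed package (assumed mapping)."""
    with open(path, "rb") as f:
        return RenamingUnpickler(f, old_root, new_root).load()
```

As with any pickle, this should only be used on files from a trusted source, since unpickling can execute arbitrary code.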
The holidays package (a dependency of fbprophet) returns an error (https://github.com/dr-prodigy/python-holidays/issues/277). I fixed the issue by pinning the holidays package to version 0.9.11 (pip install --upgrade holidays==0.9.11).
Investigate whether Isolation Forests (and more generally naive decision tree models like those in scikit-learn) can be improved by first mapping the categorical features to numerical space using ABDM and multidimensional scaling, similar to the example in #83 or the Mahalanobis outlier detector. The reason is that splitting on ordinally encoded categorical variables is more natural when the preprocessing technique infers an order between the categories.
It would be nice to be able to access prediction scores as preds.data.instance_score instead of preds["data"]["instance_score"]. This could be accomplished with addict https://github.com/mewwts/addict (Bunch could also be replaced with addict if it doesn't serve another purpose). This is just a personal opinion though, feel free to ignore. :)
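A minimal sketch of what attribute-style access could look like, using a small dict subclass rather than addict to keep the example dependency-free. AttrDict is a hypothetical name, and the prediction dictionary (including the is_outlier and meta entries) is made up for illustration.

```python
class AttrDict(dict):
    """dict whose keys are also readable as attributes; nested dicts are wrapped on access."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails, so dict methods keep working
        try:
            value = self[key]
        except KeyError:
            raise AttributeError(key) from None
        return AttrDict(value) if isinstance(value, dict) else value


# Made-up prediction dictionary shaped like the one in the request above
preds = AttrDict({
    "data": {"instance_score": [0.1, 0.7], "is_outlier": [0, 1]},
    "meta": {"name": "example"},
})

# Both access styles return the same object
assert preds.data.instance_score == preds["data"]["instance_score"]
```

One limitation of this approach is that keys which collide with dict method names (for example items or keys) remain reachable only through the bracket syntax.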
I was hoping to use Alibi-detect in order to detect concept drift in a text datastream. However, based on the documentation, it doesn't appear that this is possible. If possible, can you suggest any other project(s) that would allow for this functionality or would your library work if I were to tokenize my data first?
We've got a resnet32 version of the cifar10 classifier in h5 format. Would be great to get in tfserving format like other models we have so we can run as a SeldonDeployment.
Hi,
When training AEGMM on the KDDCUP99 data I often get:
InvalidArgumentError: Cholesky decomposition was not successful. The input might not be valid. [Op:Cholesky]
(raised when computing gmm_params in the loss function).
Do you know what might cause this? From my experiments it seems to be related to the size of latent_dim.
Cheers :)
Basic functionality to cover:
Add separate section to docs with utility functions which could be useful as standalone functions (e.g. fetching the datasets, transforming categorical to numerical values, data perturbation functions etc).
Current extrapolation method to get the estimated points after the last timestep T ignores e.g. potential seasonalities and can be improved upon.
Installing fbprophet always seems to take a long time, and it seems to be used by only one detector.
I suggest making it an optional dependency, defining it in extras_require and adding a simple check that would disable that one detector if fbprophet is not installed.
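The suggested check could look roughly like the sketch below. The helper names is_available and require, the extra name prophet (which would correspond to something like extras_require={"prophet": ["fbprophet"]} in setup.py) and the detector names in the list are assumptions for illustration, not the project's actual layout.

```python
import importlib
import importlib.util


def is_available(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require(module_name, extra):
    """Import an optional dependency, or raise an error naming the extra to install."""
    if not is_available(module_name):
        raise ImportError(
            f"'{module_name}' is required for this detector. "
            f"Install it with: pip install alibi-detect[{extra}]"  # hypothetical extra name
        )
    return importlib.import_module(module_name)


# Detectors that need the optional package are only exposed when it is installed
HAS_PROPHET = is_available("fbprophet")
available_detectors = ["IForest", "Mahalanobis"]  # illustrative names
if HAS_PROPHET:
    available_detectors.append("OutlierProphet")
```

With this pattern the base install stays light, and a user who tries the disabled detector gets an error message that says how to enable it.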