
astetik's Introduction


Autonomio provides a very high-level abstraction layer for rapidly testing research ideas and instantly creating neural-network-based decision-making models. Autonomio is built on top of Keras, using TensorFlow as a backend and spaCy for word vectorization. Autonomio makes deep learning and state-of-the-art linguistic processing accessible to anyone with basic computer skills. This document focuses on an overview of Autonomio's capabilities.

If you want something higher level, visit the website.

Getting Started

The simplest way is to install with pip from the repo directly.

pip install git+https://github.com/autonomio/core-module.git

User Documentation

You can find comprehensive user documentation with code examples here.

Contribution Guidelines

Contributions are most welcome; read more here.

Examples

  • capabilities overview link
  • data transformation link
  • hyperparameter search link

(more examples coming soon / dated 31st of July, 2017)

Key Features

  • intuitive single-command user interface
  • hyperparameter grid search
  • comprehensive automated data transformation
  • optimized for Jupyter notebook use
  • NN shape selection and other unique configurations
  • create MLP, LSTM and Regression models
  • seamlessly integrates word2vec with Keras deep learning
  • interactive plots specifically designed for deep learning model evaluation

For most use cases, running a neural network works out of the box with zero configuration, yielding a model that can be used to predict outcomes later.

Out-of-the-box use cases

Autonomio is the only deep learning workbench 100% focused on data science applications as opposed to perception problems (e.g. image detection), and has been used in a wide range of industrial and academic use cases.

  • Sentiment analysis
  • Social media account classification
  • Spam detection
  • Website classification
  • Fraud detection
  • Employee satisfaction evaluation
  • Popular Kaggle challenges (e.g. Titanic)

One line use examples

Training a model

First take care of the imports:

from autonomio.commands import train, predictor
%matplotlib inline

Then train the model:

train(x, y, data)

Training an LSTM model is even simpler:

train(x, model='lstm')

Making a prediction

predictor(data, saved_model_name)  

Visualization

Standard Training Output

mlp and regression training result

LSTM Training output

lstm training output

Hyperscan Output

4 dimensional hyperscan result

Tested Systems

Autonomio has been tested in several Mac OS X and Ubuntu environments (both server and desktop). Travis builds use Ubuntu Precise.

Minimum Hardware

You need a machine with at least 4 GB of memory if you want to do text processing; otherwise 2 GB is fine and 1 GB might be enough. Even a very low-spec AWS instance runs Autonomio just fine.

Recommended setup

For research and production environments we recommend one server with at least 4 GB of memory as a 'work station' and a separate instance with a high-end CUDA-supported GPU. The GPU instance costs roughly $1 per hour and can be shut down when not in use. As setting up the GPU station from the ground up can be a bit of a headache, we recommend using the AWS Machine Learning AMI to get set up quickly.

Dependencies

Data Manipulation

Numpy

Pandas

Word Processing

spaCy

Deep Learning

Keras

Tensorflow

Visualization

Matplotlib

mpld3

Major credits to all the contributors to these amazing packages. Autonomio would definitely not be possible without them.

astetik's People

Contributors

c0ntribute, mikkokotila


astetik's Issues

compare visual bug

If there are many values in label_col, the height of the graphic is not right.

KeyError: 'savefig.frameon is not a valid rc parameter (see rcParams.keys() for a list of valid parameters)'

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/wip/lib/python3.8/site-packages/matplotlib/__init__.py in __setitem__(self, key, val)
    676             try:
--> 677                 cval = self.validate[key](val)
    678             except ValueError as ve:

KeyError: 'savefig.frameon'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-11-3c6b5ba0f4ae> in <module>
----> 1 a.plot_box(t.data.activation, t.data.val_f1score)

~/dev/talos/talos/commands/analyze.py in plot_box(self, x, y, hue)
    137         try:
    138             import astetik as ast
--> 139             return ast.box(self.data, x, y, hue)
    140         except RuntimeError:
    141             print('Matplotlib Runtime Error. Plots will not work.')

~/miniconda3/envs/wip/lib/python3.8/site-packages/astetik/plots/box.py in box(data, x, y, hue, palette, style, dpi, title, sub_title, x_label, y_label, legend, x_scale, y_scale, x_limit, y_limit, save)
    114 
    115     # HEADER STARTS >>>
--> 116     palette = _header(palette, style, n_colors=n, dpi=dpi)  # NOTE: y exception
    117     # <<< HEADER ENDS
    118     p, ax = plt.subplots(figsize=(params()['fig_width'],

~/miniconda3/envs/wip/lib/python3.8/site-packages/astetik/style/template.py in _header(palette, style, n_colors, dpi, fig_width, fig_height)
     48     style_dic = styles(dpi)
     49     for key in style_dic.keys():
---> 50         rcParams[key] = style_dic[key]
     51 
     52     return palette

~/miniconda3/envs/wip/lib/python3.8/site-packages/matplotlib/__init__.py in __setitem__(self, key, val)
    680             dict.__setitem__(self, key, cval)
    681         except KeyError as err:
--> 682             raise KeyError(
    683                 f"{key} is not a valid rc parameter (see rcParams.keys() for "
    684                 f"a list of valid parameters)") from err

KeyError: 'savefig.frameon is not a valid rc parameter (see rcParams.keys() for a list of valid parameters)'
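
The traceback above shows the loop in `_header` assigning every key of the style dictionary to `rcParams`, and the installed matplotlib rejecting `savefig.frameon`. One possible guard, sketched here under assumptions, is to set only the keys the target mapping already knows and to report the rest. The helper name `apply_known_params` is hypothetical and not part of astetik; a plain dict stands in for `matplotlib.rcParams` so the sketch runs without matplotlib.

```python
def apply_known_params(rc, style_dic):
    """Set only the keys that `rc` already contains; return the skipped keys.

    Hypothetical helper: `rc` is any dict-like object, for example
    matplotlib.rcParams, which raises KeyError on unknown keys.
    """
    skipped = []
    for key, value in style_dic.items():
        if key in rc:
            rc[key] = value
        else:
            skipped.append(key)
    return skipped


# A plain dict stands in for matplotlib.rcParams (assumption for this demo).
fake_rc = {"figure.dpi": 100, "savefig.dpi": 100}
style = {"figure.dpi": 72, "savefig.frameon": False}

skipped = apply_known_params(fake_rc, style)
print("skipped keys:", skipped)
```

Whether unknown keys should be skipped silently, logged, or removed from the style definition itself is a design choice this sketch does not settle.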

scaling needs to be checked for the line plot

Maybe scaling in general (to log and so forth) should have two options: one where it's done at the plot level after creating the axes, and one where it's done on the data beforehand.

handling of long x labels and legend labels

There seem to be three options that the user should have:

  • truncate (to a fixed length)
  • rotate (does not work for legends)
  • insert manually shorter ones (already added to line())
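
As an illustration of the first option, truncating to a fixed length could look roughly like the sketch below. The function name `truncate_label` and its parameters are assumptions made for this example, not existing astetik API.

```python
def truncate_label(label, max_len=12, suffix="..."):
    """Shorten a label to at most `max_len` characters.

    Hypothetical helper: labels that already fit are returned unchanged,
    longer ones are cut and marked with `suffix`.
    """
    text = str(label)
    if len(text) <= max_len:
        return text
    if max_len <= len(suffix):
        # No room for the suffix, so fall back to a hard cut.
        return text[:max_len]
    return text[:max_len - len(suffix)] + suffix


labels = ["relu", "a_very_long_activation_name"]
short = [truncate_label(label, max_len=10) for label in labels]
print(short)
```

The same function could be applied to legend labels, where rotation is not available.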

OLS should allow more features

Instead of just the current 3 dv, there should be a way to add as many as one likes. The other option is to drop OLS entirely.

Change Legend location and/or Legend Visualization in bargrid-function

Hey,

First of all, thanks for creating and supporting talos/astetik! I came to astetik via Talos, and both tools are highly useful in machine learning for someone like me who is not that advanced in coding.

While creating bargrid plots I noticed that the legend of the hue variable often partially blocks some of the subplots (picture 1).

grafik

The docs for the bargrid function show that there was once a legend_position variable, which is currently commented out.

My questions are:

  • Is it possible to (re-)enable setting the legend position?
  • Or is it feasible to add an option to change the legend background to solid, as you did in your machinelearningmastery article "Hyperparameter Optimization with Keras" (picture 2)?

grafik

I think that adding this functionality (back?) would greatly help the visualization, especially when sharing it with other people.
If these features are already implemented and I just couldn't find them, I apologize. I'm quite new to Python data visualization and probably didn't notice the solutions.

Greetings
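
For reference, plain matplotlib already supports both things asked about in this issue: placing the legend outside the plotting area and giving it an opaque background. The sketch below uses a bare matplotlib Axes with made-up data; it is not astetik's bargrid, whose internals may handle legends differently.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Made-up data for illustration only.
ax.bar(["a", "b"], [1, 2], label="group 1")
ax.bar(["a", "b"], [2, 1], bottom=[1, 2], label="group 2")

legend = ax.legend(
    loc="upper left",
    bbox_to_anchor=(1.02, 1.0),  # anchor just outside the right edge of the axes
    frameon=True,
    framealpha=1.0,              # fully opaque legend background
    facecolor="white",
)
fig.tight_layout()
```

Exposing something like the commented-out legend_position argument would then mostly be a matter of passing the user's choice through to `loc` and `bbox_to_anchor`.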

pip install astetik is failing due to sklearn in requirements.txt instead of scikit-learn

pip install astetik fails because it cannot install the deprecated sklearn dependency. The PyPI page for sklearn carries a notice to use scikit-learn instead of sklearn.
Please change this in requirements.txt and setup.py.

Error message of pip installation in a fresh python environment

pip install astetik
Collecting astetik
  Using cached astetik-1.13-py3-none-any.whl (5.4 MB)
Collecting wrangle
  Using cached wrangle-0.7.2-py3-none-any.whl (52 kB)
Collecting pandas
  Using cached pandas-2.0.0-cp310-cp310-win_amd64.whl (11.2 MB)
Collecting geonamescache
  Using cached geonamescache-1.5.0-py3-none-any.whl (26.4 MB)
Collecting seaborn
  Using cached seaborn-0.12.2-py3-none-any.whl (293 kB)
Collecting statsmodels
  Using cached statsmodels-0.13.5-cp310-cp310-win_amd64.whl (9.1 MB)
Collecting sklearn
  Using cached sklearn-0.0.post4.tar.gz (3.6 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [8 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "C:\Users\mathavraj.j\AppData\Local\Temp\pip-install-bz3pwm5u\sklearn_08f62083d18c41bc8e66d870ee553b3f\setup.py", line 10, in <module>
          LONG_DESCRIPTION = f.read()
        File "C:\Users\mathavraj.j\AppData\Local\Programs\HCLTech\AION\2.7.0.3\python\lib\encodings\cp1252.py", line 23, in decode
          return codecs.charmap_decode(input,self.errors,decoding_table)[0]
      UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 7: character maps to <undefined>
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.

Issue with images in docs

There seems to be some issue with the images on the main page of the docs:

image

I'm viewing the page with Microsoft Edge 95.0.1020.53 (64-bit).

add scaling for 'log' and 'symlog' for special cases

Some plots, like multikde(), do not use the standard scaling module. Such cases need a separate scaling function. Maybe it makes sense to take scaling out of matplotlib completely and handle it all in numpy before passing the data. This would also allow more scaling options.
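
A data-level transform of the kind suggested here could be sketched as follows. The function names and the `linthresh` parameter are assumptions for this example. The symlog formula used, sign(v) * log10(1 + |v| / linthresh), is a common smooth approximation and not the exact piecewise transform behind matplotlib's 'symlog' scale. The standard library `math` module is used to keep the sketch dependency-free; a numpy version would apply the same formula element-wise.

```python
import math


def log_scale(values):
    """Base-10 log of each value; only valid for strictly positive input."""
    return [math.log10(v) for v in values]


def symlog_scale(values, linthresh=1.0):
    """Signed log transform that also accepts zero and negative values.

    Computes sign(v) * log10(1 + |v| / linthresh) for each value, so the
    result is roughly linear near zero and logarithmic far from it.
    """
    return [
        math.copysign(math.log10(1.0 + abs(v) / linthresh), v)
        for v in values
    ]


scaled = symlog_scale([-99.0, 0.0, 9.0, 999.0])
print(scaled)
```

Transforming the data up front means tick labels show the transformed values, so a plot-level option remains useful when the original units should stay on the axis.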

Indentation of labels in analyze_object.correlate

Hello,

For some reason I get indented labels (with different indentation for different Scan runs). The figure below shows the label indentation in the correlation heatmap, for example.

How could this be solved?

Thank you

image
