Sample application for the ATK space shuttle demo. The default version of the application uses a gateway and Kafka as the streaming source. If you want to use MQTT instead, look here.
- The space-shuttle-demo application listens to a Kafka topic and waits for feature vectors.
- When a Kafka message appears, the application asks the Scoring Engine to classify the received feature vector.
- The application stores the scoring result in InfluxDB.
- The web application asks the backend application (space-shuttle-demo) for an anomalies chart.
- Space-shuttle-demo gets the count of anomalies (classes other than 1) per minute from InfluxDB.
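
The per-minute anomaly count in the last step can be illustrated with a minimal in-memory sketch. In the real application this aggregation is done with a query against InfluxDB; the function name and the `(timestamp, class)` input format below are assumptions made only for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

NORMAL_CLASS = 1  # per the description above, any other class is an anomaly


def anomalies_per_minute(results):
    """Count anomalous scoring results per minute.

    `results` is an iterable of (unix_timestamp_seconds, predicted_class)
    pairs (a hypothetical format, not the application's storage schema).
    Returns a dict mapping the start of each minute (UTC, ISO 8601) to the
    number of anomalies seen in that minute.
    """
    counts = Counter()
    for timestamp, predicted_class in results:
        if predicted_class == NORMAL_CLASS:
            continue  # normal readings are not charted
        # Truncate the timestamp to the start of its minute.
        minute_start = int(timestamp) - int(timestamp) % 60
        label = datetime.fromtimestamp(minute_start, tz=timezone.utc).isoformat()
        counts[label] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [(0, 1), (10, 2), (59, 5), (60, 1), (61, 3)]
    for minute, count in sorted(anomalies_per_minute(sample).items()):
        print(minute, count)
```

Minutes without anomalies are simply absent from the result, so a chart built on it would need to fill those gaps with zeros.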
- Upload the model to HDFS:
  - Download the already prepared model from https://s3.amazonaws.com/trustedanalytics/v0.7.1/models/space-shuttle-model.tar
  - Log in to the TAP console and select the **Data catalog** page in the navigation panel
  - Select the **Submit Transfer** tab
  - Select input type **Local path** and choose the previously downloaded model
  - Enter a **Title**
  - Click **Upload**

  Alternatively, you can create the TAP Analytics Toolkit model yourself. Please refer to the instructions in the Creating TAP Analytics Toolkit model section.
- Create the required service instances (if they do not exist already). The application connects to these service instances using Spring Cloud Connectors.

  Note: If you use the recommended names, the service instances are bound to the application automatically when it is pushed to Cloud Foundry. Otherwise, adjust the service instance names in the `manifest.yml` file, or remove them from `manifest.yml` and bind them manually after the application is pushed to Cloud Foundry.
  - Instance of InfluxDB (recommended name: `space-shuttle-db`)
  - Instance of Zookeeper (recommended name: `space-shuttle-zookeeper`)
  - Instance of Gateway (recommended name: `space-shuttle-gateway`)
  - Instance of Scoring Engine (recommended name: `space-shuttle-scoring-engine`). The instructions below describe how to create the Scoring Engine service instance.

  To create the Scoring Engine service instance, take the following actions:
  - Select the **Marketplace** tab in the TAP Console navigation panel
  - Select the **TAP Scoring Engine** service offering
  - Type the name `space-shuttle-scoring-engine`
  - Click **+ Add an extra parameter** and add the TAP Analytics Toolkit model URL: key: `uri`, value: `hdfs://path_to_model`
  - Click **Create new instance**
- Create the Java package: `mvn package`
- (Optional) Edit the auto-generated `manifest.yml`. If you created service instances with names other than the recommended ones, adjust the names in the `services` section to match those you created. You can also remove the `services` section and bind the instances manually later. You may also want to change the application host/name.
- Push the application to the platform using the command: `cf push`
- (Optional) If you removed the `services` section from `manifest.yml`, the application will not be started yet. Bind the required service instances (`cf bind-service`) to the application and restage (`cf restage`) the application.
- The application is up and running.
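
When services are bound manually, it can be useful to check which of the recommended instances are actually visible to the application. Cloud Foundry exposes bound services to an application through the `VCAP_SERVICES` environment variable, a JSON object that maps each service label to a list of bound instances. The following is a minimal sketch of such a check, not part of the demo itself; the helper names are hypothetical.

```python
import json
import os

# Recommended instance names from the list above.
RECOMMENDED_NAMES = {
    "space-shuttle-db",
    "space-shuttle-zookeeper",
    "space-shuttle-gateway",
    "space-shuttle-scoring-engine",
}


def bound_instance_names(vcap_services_json):
    """Return the names of all service instances in a VCAP_SERVICES document."""
    services = json.loads(vcap_services_json or "{}")
    return {
        instance["name"]
        for instances in services.values()  # one list per service label
        for instance in instances
        if "name" in instance
    }


def missing_instances(vcap_services_json, expected=RECOMMENDED_NAMES):
    """Return the expected instance names that are not bound, sorted."""
    return sorted(set(expected) - bound_instance_names(vcap_services_json))


if __name__ == "__main__":
    missing = missing_instances(os.environ.get("VCAP_SERVICES", "{}"))
    print("missing service instances:", ", ".join(missing) or "none")
```

If you chose different instance names, pass them as `expected` instead of relying on the recommended defaults.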
- Switch to the `deploy` directory: `cd deploy`
- Download the model and rename it to `model.tar`
- Install tox: `sudo -E pip install --upgrade tox`
- Run: `tox`
- Activate the virtualenv with the installed dependencies: `. .tox/py27/bin/activate`
- Run the deployment script: `python deploy.py`. The script will use the parameters provided on input. Alternatively, provide the parameters when running the script (run `python deploy.py -h` to see the script parameters with their descriptions).
To send data to Kafka through a gateway, you can either push space_shuttle_client from the `client` directory to a space with an existing gateway instance, or run `python space_shuttle_client.py` locally, passing the gateway URL as a parameter.
- Log in to the space containing `space-shuttle-gateway`.
- Go to: `client/`
- Push the app to Cloud Foundry using `cf push`.

Note: in case of a name conflict during the push, add a name parameter: `cf push <custom_name>`
- Python 2.7
- tox (installation details)
To determine the URL of the gateway you are going to send data to:
- Go to the **Applications** page
- Search for `space-shuttle-gateway`
- Copy the application URL

- Go to: `client/`
- Run tox: `tox`
- Activate the created virtualenv: `. .tox/py27/bin/activate`
- Run: `python space_shuttle_client.py --gateway-url <gateway_url>`
## Creating TAP Analytics Toolkit model
To create the model for the Scoring Engine, take the following actions:
- Log in to the TAP console and select the **Data catalog** page in the navigation panel
- Select the **Submit Transfer** tab
- Select input type: **Local path**
- Select the sample training data file, which can be found here: src/main/atkmodelgenerator/train-data.csv
- Enter a **Title**
- Click **Upload**

When the upload is complete, click the **Data sets** tab and view the details of the uploaded data set by clicking its name. Copy the value of `targetUri`, which contains the path to the uploaded data set in HDFS. You will need it to create the TAP Analytics Toolkit model in the Jupyter notebook.
- In the TAP console, select **Data Science** and then the **TAP Analytics Toolkit** tab
- If an instance of **TAP Analytics Toolkit** is installed, you will see it in the instances list and no action is needed. If there are no instances, you will be asked whether you want to create one: select **Yes** and wait until the application is created (this can take a minute or two). The application will then appear in the **TAP Analytics Toolkit** instances list
- In the **Data Science** tab, select **Jupyter**. Create a new **Jupyter** instance.
- Copy the password for the created Jupyter instance.
- Log in to Jupyter by clicking the **App Url** link.
- Create a new notebook
- Install the TAP Analytics Toolkit client in your notebook by executing the command: `!pip install <link-to-atk-server>/client`. `<link-to-atk-server>` can be copied from the URL column in the **TAP Analytics Toolkit** instances list.
- Copy the contents of src/main/atkmodelgenerator/atk_model_generator.py into your notebook.
- After the imports section, set the URI of the TAP Analytics Toolkit server: `ta.server.uri = <link-to-atk-server>`
- Set the value of `ds` to the link to the data set previously uploaded to HDFS (`targetUri`).
- Run the script. The link to the created model in HDFS will be printed in the output.
- The application can be run in three different configurations, depending on the chosen data provider (streaming source).
- There is one special Spring `@Profile` (local), which was created to enable local development
- The cloud, kafka and mqtt profiles should be inactive during local development
- The random profile should be active instead during local development. It uses a simple random number generator instead of a streaming source such as Kafka or MQTT
- Note: The streaming data here are random numbers, so this profile generates a lot of anomalies
- You can find instructions on how to install and run InfluxDB here: http://influxdb.com/docs/v0.8/introduction/installation.html
- The easiest way is to run it in a Docker container: `docker run -d -p 8083:8083 -p 8086:8086 tutum/influxdb:0.8.8`
- Note: `influxdb:0.9` is NOT backwards compatible with `0.8.x`.
Instructions on how to push the Scoring Engine to the platform: instruction
- Make sure that both the local and random profiles are active
- `export SE_URL=<scoring engine URL>` NOTE: the link should not contain the `http://` protocol prefix
- `mvn spring-boot:run`
- In a web browser, enter `localhost:8080`
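
Because `SE_URL` must not include the `http://` prefix, a URL copied from the console may need its scheme removed before it is exported. A minimal sketch of that clean-up follows; the helper name and the example host are placeholders, not values from the project.

```python
def strip_scheme(url):
    """Return `url` without a leading scheme such as http:// or https://.

    A URL that has no scheme is returned unchanged (apart from
    surrounding whitespace being trimmed).
    """
    url = url.strip()
    scheme, separator, rest = url.partition("://")
    return rest if separator else url


if __name__ == "__main__":
    # Placeholder host; substitute the URL of your Scoring Engine instance.
    copied = "http://scoring-engine.example.com"
    print("export SE_URL=" + strip_scheme(copied))
```

The printed line can be pasted into the shell before running `mvn spring-boot:run`.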