konduitai / konduit-serving-docs

Documentation for https://github.com/KonduitAI/konduit-serving

License: Apache License 2.0

docs documentation examples java konduit konduit-serving konduitai python

konduit-serving-docs's Issues

Clarify when to use NDARRAY as data type

TL;DR It is not clear when to use NDARRAYs as data types, and why.

In the documentation, we list a set of possible data types, of which NDARRAY is one. There are several problems with the current approach:

NDARRAY is not always acceptable. For example, TensorFlow steps accept a limited set of values, and NDARRAY is not one of these (org.nd4j.tensorflow.conversion.TensorDataType).
It's not clear what happens in a Python step if a non-NDARRAY data type is used for a NUMPY-formatted input. Presumably the scalar type of the NumPy array is cast, but this should be explicit.

I don't have a clear action plan in mind, but eventually it should be clear to the user what format-type combinations should be used for their use case.
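To make the casting question concrete, here is a minimal NumPy-only sketch of what an explicit scalar-type cast does to the data. It does not call Konduit Serving, and `cast_input` is a hypothetical helper; whether a Python step performs a cast like this is exactly what the documentation would need to confirm.

```python
import numpy as np


def cast_input(arr, dtype):
    """Hypothetical helper: explicitly cast an array to a declared scalar type."""
    return np.asarray(arr).astype(dtype)


x = np.array([1.7, -2.3, 3.9], dtype=np.float64)

# Casting float -> int truncates toward zero, so values change silently.
as_int = cast_input(x, np.int32)

# Casting to a narrower float keeps the shape but reduces precision.
as_float = cast_input(x, np.float32)

print(as_int, as_int.dtype)
print(as_float.dtype)
```

If a cast does happen implicitly, this kind of silent value change is why the docs should state it per format-type combination.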

Create a FAQ or troubleshooting page for common problems users might encounter

Things like:

common pitfalls
known issues
how to debug a pipeline (need to discuss it more)

Documentation on implicit or explicit data conversions between pipeline steps

Although the pipeline passes records between steps, there are a few places where record data is converted from one type into another. For instance, this happens when we send an NDArrayWritable to a PythonStep: the array data is converted into a NumPy array that the Python script can consume. There are also implicit conversions between Python and Java lists.

We need to look for other implicit or explicit (configurable) conversions of this kind and document them.

Delete or fill up empty placeholder pages in the documentation site

The main konduit.ai site doesn't link to docs

Need to add documentation links (https://serving.oss.konduit.ai) on https://konduit.ai

Document PMML converter endpoints

Can be looked at over here: https://github.com/KonduitAI/konduit-serving/blob/b247b211d5e2441e781ddc960bfed12dff446890/konduit-serving-converter/src/main/java/ai/konduit/serving/input/converter/ConverterVerticle.java

Update Gitbook page (CSS) style for easier readability and better visuals.

This can include a bunch of items like:

Code blocks styling
Main and accent colors

GitBook reverses changes

Keeping this issue open for reference.

Editing on the GitBook web interface can be risky, because GitBook doesn't save just the changes you make - it saves the page as it is. On the web interface, if you create a draft for one portion of a page (draft 1), edit another portion of the same page in a second draft (draft 2), and merge draft 2 before draft 1, then merging draft 1 removes the changes made in draft 2.

From now on I'll make changes exclusively in GitHub.
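The overwrite described above can be reproduced with a toy model. This is only a sketch of the observed behaviour, assuming each draft is a full snapshot of the page and a merge replaces the live page wholesale; it is not GitBook's actual implementation, and `open_draft` and `merge` are made-up names.

```python
def open_draft(page):
    """A draft starts as a full copy of the live page, not as a diff."""
    return dict(page)


def merge(draft):
    """Merging saves the draft's whole page, replacing whatever is live."""
    return dict(draft)


live = {"intro": "v1", "install": "v1"}

# Both drafts are created from the same live page and touch different sections.
draft1 = open_draft(live)
draft1["intro"] = "v2 (draft 1)"

draft2 = open_draft(live)
draft2["install"] = "v2 (draft 2)"

live = merge(draft2)  # live now contains draft 2's change
live = merge(draft1)  # draft 1 still holds the old "install" section

print(live)
```

After the second merge, the "install" section is back to its old content, which matches the reverted changes seen on the site.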

Think through how to present setting up instructions

Alongside the examples, there are statements such as

Before running this notebook, run the build_jar.py script and copy the JAR (konduit.jar) to this folder. Refer to the Python SDK README for details.

Packaging decisions are still being made, and a PyPI install of Konduit Serving may end up installing the uberjar automatically, so that the user does not need to run build_jar.py or konduit build.

Once this is finalised, make sure all references to installation are updated pointing to the Installation page, and the installation steps are comprehensive.

Documentation Improvements

Recently I read through the Konduit Serving documentation available on gitbook (and synced to this repo): https://serving.oss.konduit.ai/
Overall, what has been done so far is pretty good. However, there are some things I think we can improve, most of them easy fixes. We can split some of these items out into their own GitHub issues for tracking if necessary.

Also I could be wrong on some of this (still learning codebase and current status).

General thoughts/comments:

https://github.com/KonduitAI/konduit-serving - readme.md and repo description - doesn't link to docs at serving.oss.konduit.ai
konduit.ai - doesn't link to serving.oss.konduit.ai (not sure if intentional, given we haven't officially launched/released yet?)
~~No mention of the "data type conversion between pipeline steps" that (I believe) Konduit serving supports~~ Edit: looks like this is for the output only (not between intermediate pipeline steps). Although, there are some implicit conversions between dl4j related and python steps. (For example, ND4J -> numpy and vice versa).
No mention of PMML conversion endpoints in the docs (not sure if the CLI supports this too?)
GitBook styling - I'm personally not a fan of the dark blue code blocks... IMO the strong contrast draws too much attention to the code relative to the text. Maybe light gray instead?

Later/future:

Should we have a technical explanation page?
Add debugging page - self-help for common issues?
PMML examples
"Konduit Zoo" / "Konduit Hub" - easy setup/recipes for standard pretrained models?
Finish all empty/placeholder pages

Page: https://serving.oss.konduit.ai/

IMO we need a diagram (flow chart) here showing pipeline steps, with logos - TF, ONNX, PMML, DL4J, Python, Java etc. Maybe Numpy, Arrow, CUDA etc.
- First page is only text ATM; explaining it visually too should help users more quickly understand what it is and why they should care
Clarify what PMML means in practice - i.e., "can be used to serve models from libraries X, Y and Z (using built-in conversion)"
Maybe mention also Java as an option for pipeline steps (1 line after datavec but before usage probably)

Page: https://serving.oss.konduit.ai/quickstart/quickstart-python

Is it possible to define the configuration programmatically, instead of via a YAML file? If so, we should show this first/too
A list of API methods on the server and client would be nice. I see server.start(), client.predict() - let's get the full list in a table perhaps?

Page: https://serving.oss.konduit.ai/installation

Mentions CUDA 10.1. But TF <2.1.0 is CUDA 10.0, so not correct? (>=2.1.0 is 10.1)
Clarify that PMML doesn't support CUDA
Do I need to have CUDA installed? CUDNN?
"Ensure that you have JDK 8.0 installed"
- This should probably be 8.0 or higher
- Do we need a link on how to get Java? (I'm just thinking of Python devs who have never touched Java before)
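For the "8.0 or higher" point, a Python-side check could look like the sketch below. It assumes `java` is on the PATH and relies on the standard version-string scheme (`1.8.0_252` for Java 8, `11.0.7` for Java 11 and later). `parse_java_major` and `installed_java_major` are hypothetical helpers, not part of Konduit Serving.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a version string in: %r" % version_output)
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example 1.8.0_252).
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Run `java -version` and return its major version (needs Java on PATH)."""
    # `java -version` writes its output to stderr, not stdout.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr)


# Example use in a setup script:
#   if installed_java_major() < 8:
#       raise RuntimeError("JDK 8 or higher is required")
```

Parsing is kept separate from the subprocess call so that the version logic can be tested without Java installed.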

Page: https://serving.oss.konduit.ai/building-from-source

Clarify at the very top: When does / doesn't the user need this? If they don't need it, what's the easier alternative?
There's lots of different modules here... as a user/developer, how do I decide which ones I need? Table, diagram, and/or flow chart etc?
"Python bundling is not encouraged on ARM platforms" - that's vague. Why? Is it unstable? Broken? Limited?
"> konduit init --os " - is cross-compilation possible? - i.e., say I want to build on Mac but deploy on Linux+CUDA (clarify either way)
"Known issues - konduit init fails for linux-86_64-gpu (#115)" - issue is already closed.

Page: https://serving.oss.konduit.ai/yaml-configurations

Clarify what "NDARRAY" type actually means. Is this NumPy format?

Page: https://serving.oss.konduit.ai/model-monitoring/monitoring-grafana

"...to assist with troubleshooting" -> mention monitoring and debugging; it's not just (or even primarily) about problems
Maybe a better example image of the pipeline charts would be good, so it's a bit more appealing to users?

Page: https://serving.oss.konduit.ai/examples/python/tensorflow-model-serving/tf-bert

"This notebook illustrates..." -> it's a page, not a notebook

Create PMML examples

There are no PMML-related examples for konduit-serving. We need to create a dedicated documentation page for them.

Refactor or update present documentation pages

Based on Alex's review:

Page: https://serving.oss.konduit.ai/

IMO we need a diagram (flow chart) here showing pipeline steps, with logos - TF, ONNX, PMML, DL4J, Python, Java etc. Maybe Numpy, Arrow, CUDA etc.
- First page is only text ATM; explaining it visually too should help users more quickly understand what it is and why they should care
Clarify what PMML means in practice - i.e., "can be used to serve models from libraries X, Y and Z (using built-in conversion)"
Maybe mention also Java as an option for pipeline steps (1 line after datavec but before usage probably)

Page: https://serving.oss.konduit.ai/quickstart/quickstart-python

Is it possible to define the configuration programmatically, instead of via a YAML file? If so, we should show this first/too
A list of API methods on the server and client would be nice. I see server.start(), client.predict() - let's get the full list in a table perhaps?

Page: https://serving.oss.konduit.ai/installation

Mentions CUDA 10.1. But TF <2.1.0 is CUDA 10.0, so not correct? (>=2.1.0 is 10.1)
Clarify that PMML doesn't support CUDA
Do I need to have CUDA installed? CUDNN?
"Ensure that you have JDK 8.0 installed"
- This should probably be 8.0 or higher
- Do we need a link on how to get Java? (I'm just thinking of Python devs who have never touched Java before)

Page: https://serving.oss.konduit.ai/building-from-source

Clarify at the very top: When does / doesn't the user need this? If they don't need it, what's the easier alternative?
There's lots of different modules here... as a user/developer, how do I decide which ones I need? Table, diagram, and/or flow chart etc?
"Python bundling is not encouraged on ARM platforms" - that's vague. Why? Is it unstable? Broken? Limited?
"> konduit init --os " - is cross-compilation possible? - i.e., say I want to build on Mac but deploy on Linux+CUDA (clarify either way)
"Known issues - konduit init fails for linux-86_64-gpu (#115)" - issue is already closed.

Page: https://serving.oss.konduit.ai/yaml-configurations

Clarify what "NDARRAY" type actually means. Is this NumPy format?

Page: https://serving.oss.konduit.ai/model-monitoring/monitoring-grafana

"...to assist with troubleshooting" -> mention monitoring and debugging; it's not just (or even primarily) about problems
Maybe a better example image of the pipeline charts would be good, so it's a bit more appealing to users?

Page: https://serving.oss.konduit.ai/examples/python/tensorflow-model-serving/tf-bert

"This notebook illustrates..." -> it's a page, not a notebook

Documentation on commonly used pipelines that can be used as templates.

Something like a Konduit Zoo where folks can pick up stuff that falls in a common day to day workflow.

Link konduit-serving to the documentation page.

There's currently no link in the main https://github.com/KonduitAI/konduit-serving repo that points to the documentation site (https://serving.oss.konduit.ai). We need to add the link to the repo description and readme.md.