konduit-serving-docs's Issues

Clarify when to use NDARRAY as data type

TL;DR It is not clear when to use NDARRAYs as data types, and why.

In the documentation, we list a set of possible data types, of which NDARRAY is one. There are several problems with the current approach:

  1. NDARRAY is not always acceptable. For example, TensorFlow steps accept a limited set of values, and NDARRAY is not one of these (org.nd4j.tensorflow.conversion.TensorDataType).
  2. It's not clear what happens in a Python step when a non-NDARRAY data type is used for a NUMPY-formatted input. Presumably the scalar type of the NumPy array is cast, but this should be explicit.

I don't have a clear action plan in mind, but eventually it should be clear to the user what format-type combinations should be used for their use case.
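To make the second point concrete, here is a minimal sketch in plain NumPy (not Konduit Serving itself) of what a scalar-type cast does to array values. Whether a Python step actually performs such a cast is exactly what the docs need to confirm; the input values and target dtypes below are illustrative assumptions.

```python
import numpy as np

# Illustrative input only: a float64 array such as a client might send.
values = np.array([1.7, -2.3, 300.0], dtype=np.float64)

# Casting to a narrower float type keeps the shape but can lose precision.
as_float32 = values.astype(np.float32)

# Casting to an integer type truncates toward zero.
as_int64 = values.astype(np.int64)

print(as_float32.dtype, as_int64.dtype)
print(as_int64.tolist())
```

If a cast like the integer one happens silently, fractional inputs change value, which is why the format-type combination should be spelled out for the user.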

Documentation on implicit or explicit data conversions between pipeline steps

Although the pipeline passes records between steps, there are a few places where record data is converted from one type into another. For instance, this can happen when we send an NDArrayWritable to a PythonStep: the array data is converted into a NumPy array that the Python script can consume. There are also implicit conversions between Python and Java lists.

We need to look for other implicit or explicit (configurable) conversions of this kind and document them.
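As a rough illustration of the Python side of such conversions, the sketch below uses only NumPy to move between nested lists and arrays. It does not call Konduit Serving, and it is an assumption that the conversions inside a Python step behave the same way; the variable names are hypothetical.

```python
import numpy as np

# A nested Python list, as a script might receive after a list conversion.
nested = [[1, 2, 3], [4, 5, 6]]

# list -> ndarray: the shape and dtype are inferred from the contents.
arr = np.asarray(nested)

# ndarray -> list: returns nested lists of plain Python scalars.
round_trip = arr.tolist()

# A list mixing ints and floats is promoted to a single float dtype.
mixed = np.asarray([1, 2.5])

print(arr.shape, mixed.dtype)
```

The dtype promotion in the last step is the kind of implicit behaviour that is worth documenting explicitly.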

GitBook reverses changes

Keeping this issue open for reference.

Editing on the GitBook web interface can be risky, because GitBook doesn't save just the changes you make - it saves the page as it is. On the web interface, if you create a draft for a portion of the page (draft 1), edit a portion of a page (draft 2) and merge draft 2 before draft 1 is merged, merging draft 1 would remove the changes made in draft 2.

From now on I'll make changes exclusively in GitHub.
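The failure mode can be sketched with a toy model in Python. This is only a simulation of the behaviour described above (each draft stores a full-page snapshot rather than a diff), not GitBook's actual implementation; the `save_draft` helper and the page sections are hypothetical.

```python
def save_draft(base_page, edit):
    """Return a full-page snapshot: the base page with one section changed."""
    snapshot = dict(base_page)
    snapshot.update(edit)
    return snapshot


# The published page before any edits.
page = {"intro": "v1", "usage": "v1"}

# Both drafts are created from the same original page.
draft_1 = save_draft(page, {"intro": "v2"})
draft_2 = save_draft(page, {"usage": "v2"})

# Merging a draft replaces the whole page with that draft's snapshot.
page = draft_2  # draft 2 merged first
page = draft_1  # draft 1 merged afterwards

print(page)
```

After both merges the page holds draft 1's snapshot, so the `usage` change from draft 2 is gone even though it was merged.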

Think through how to present setup instructions

Alongside the examples, there are statements such as

Before running this notebook, run the build_jar.py script and copy the JAR (konduit.jar) to this folder. Refer to the Python SDK README for details.

Packaging decisions are being determined, and a PyPI install of Konduit Serving may install the uberjar automatically, such that the user does not need to run build_jar.py or konduit build.

Once this is finalised, make sure all references to installation are updated pointing to the Installation page, and the installation steps are comprehensive.

Documentation Improvements

Recently I read through the Konduit Serving documentation available on GitBook (and synced to this repo): https://serving.oss.konduit.ai/
Overall, what has been done so far is pretty good. However, there are some things I think we can improve, most of them easy fixes. We can split some of these items out into their own GitHub issues for tracking if necessary.

Also I could be wrong on some of this (still learning codebase and current status).

General thoughts/comments:

  • https://github.com/KonduitAI/konduit-serving - readme.md and repo description - doesn't link to docs at serving.oss.konduit.ai
  • konduit.ai - doesn't link to serving.oss.konduit.ai (not sure if intentional, given we haven't officially launched/released yet?)
  • No mention of the "data type conversion between pipeline steps" that (I believe) Konduit Serving supports. Edit: it looks like this is for the output only (not between intermediate pipeline steps), although there are some implicit conversions between DL4J-related and Python steps (for example, ND4J -> NumPy and vice versa).
  • No mention of PMML conversion endpoints in the docs (not sure if the CLI supports this too?)
  • GitBook styling - I'm personally not a fan of the dark blue code blocks... IMO the strong contrast draws too much attention to the code relative to the text. Maybe light gray instead?

Later/future:

  • Should we have a technical explanation page?
  • Add debugging page - self-help for common issues?
  • PMML examples
  • "Konduit Zoo" / "Konduit Hub" - easy setup/recipes for standard pretrained models?
  • Finish all empty/placeholder pages


Page: https://serving.oss.konduit.ai/

  • IMO we need a diagram (flow chart) here showing pipeline steps, with logos - TF, ONNX, PMML, DL4J, Python, Java etc. Maybe Numpy, Arrow, CUDA etc.
    • First page is only text ATM; explaining it visually too should help users to more quickly understand what it is and why they should care
  • Clarify what PMML means in practice - i.e., "can be used to serve models from libraries X, Y and Z (using built-in conversion)"
  • Maybe mention also Java as an option for pipeline steps (1 line after datavec but before usage probably)

Page: https://serving.oss.konduit.ai/quickstart/quickstart-python

  • Is it possible to define the configuration programmatically, instead of via a YAML file? If so, we should show this first/too
  • A list of API methods on the server and client would be nice. I see server.start(), client.predict() - let's get the full list in a table perhaps?

Page: https://serving.oss.konduit.ai/installation

  • Mentions CUDA 10.1. But TF <2.1.0 is CUDA 10.0, so not correct? (>=2.1.0 is 10.1)
  • Clarify that PMML doesn't support CUDA
  • Do I need to have CUDA installed? cuDNN?
  • "Ensure that you have JDK 8.0 installed"
    • This should probably be 8.0 or higher
    • Do we need a link on how to get Java? (I'm just thinking of Python devs who have never touched Java before)

Page: https://serving.oss.konduit.ai/building-from-source

  • Clarify at the very top: When does / doesn't the user need this? If they don't need it, what's the easier alternative?
  • There are lots of different modules here... as a user/developer, how do I decide which ones I need? Table, diagram, and/or flow chart etc?
  • "Python bundling is not encouraged on ARM platforms" - that's vague. Why? Is it unstable? Broken? Limited?
  • "> konduit init --os " - is cross-compilation possible? - i.e., say I want to build on Mac but deploy on Linux+CUDA (clarify either way)
  • "Known issues - konduit init fails for linux-86_64-gpu (#115)" - issue is already closed.

Page: https://serving.oss.konduit.ai/yaml-configurations

  • Clarify what "NDARRAY" type actually means. Is this NumPy format?

Page: https://serving.oss.konduit.ai/model-monitoring/monitoring-grafana

  • "...to assist with troubleshooting" -> mention monitoring and debugging; it's not just (or even primarily) about problems
  • Maybe a better example image of the pipeline charts would be good, so it's a bit more appealing to users?

Page: https://serving.oss.konduit.ai/examples/python/tensorflow-model-serving/tf-bert

  • "This notebook illustrates..." -> it's a page, not a notebook

Create PMML examples

There are no PMML-related examples for konduit-serving. We need to create a dedicated documentation page for them.

Refactor or update present documentation pages

Based on Alex's review:

Page: https://serving.oss.konduit.ai/

  • IMO we need a diagram (flow chart) here showing pipeline steps, with logos - TF, ONNX, PMML, DL4J, Python, Java etc. Maybe Numpy, Arrow, CUDA etc.
    • First page is only text ATM; explaining it visually too should help users to more quickly understand what it is and why they should care
  • Clarify what PMML means in practice - i.e., "can be used to serve models from libraries X, Y and Z (using built-in conversion)"
  • Maybe mention also Java as an option for pipeline steps (1 line after datavec but before usage probably)

Page: https://serving.oss.konduit.ai/quickstart/quickstart-python

  • Is it possible to define the configuration programmatically, instead of via a YAML file? If so, we should show this first/too
  • A list of API methods on the server and client would be nice. I see server.start(), client.predict() - let's get the full list in a table perhaps?

Page: https://serving.oss.konduit.ai/installation

  • Mentions CUDA 10.1. But TF <2.1.0 is CUDA 10.0, so not correct? (>=2.1.0 is 10.1)
  • Clarify that PMML doesn't support CUDA
  • Do I need to have CUDA installed? cuDNN?
  • "Ensure that you have JDK 8.0 installed"
    • This should probably be 8.0 or higher
    • Do we need a link on how to get Java? (I'm just thinking of Python devs who have never touched Java before)

Page: https://serving.oss.konduit.ai/building-from-source

  • Clarify at the very top: When does / doesn't the user need this? If they don't need it, what's the easier alternative?
  • There are lots of different modules here... as a user/developer, how do I decide which ones I need? Table, diagram, and/or flow chart etc?
  • "Python bundling is not encouraged on ARM platforms" - that's vague. Why? Is it unstable? Broken? Limited?
  • "> konduit init --os " - is cross-compilation possible? - i.e., say I want to build on Mac but deploy on Linux+CUDA (clarify either way)
  • "Known issues - konduit init fails for linux-86_64-gpu (#115)" - issue is already closed.

Page: https://serving.oss.konduit.ai/yaml-configurations

  • Clarify what "NDARRAY" type actually means. Is this NumPy format?

Page: https://serving.oss.konduit.ai/model-monitoring/monitoring-grafana

  • "...to assist with troubleshooting" -> mention monitoring and debugging; it's not just (or even primarily) about problems
  • Maybe a better example image of the pipeline charts would be good, so it's a bit more appealing to users?

Page: https://serving.oss.konduit.ai/examples/python/tensorflow-model-serving/tf-bert

  • "This notebook illustrates..." -> it's a page, not a notebook
