
Comments (15)

kirilg commented on May 16, 2024

We haven't done any work with CUDA in a Docker image yet, but I can give some pointers on where to start, and external contributions to make it work are welcome. Since TensorFlow already has working Dockerfiles with GPU support and their files are similar to ours, that would be a good example to use.

Our Dockerfile is based on TensorFlow's Dockerfile.devel. If you compare the two and see what had to be changed, you can apply similar changes to TensorFlow's Dockerfile.devel-gpu and we can add it here. Since they have already worked out the CUDA and cuDNN issues in their Dockerfile, basing ours on their example will hopefully work here as well.
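The comparison suggested above can be sketched as a small shell helper. This is only an illustration: the function name compare_dockerfiles is hypothetical, and the two paths in the usage comment are assumptions based on the repository layout mentioned elsewhere in this thread (TensorFlow checked out as the tensorflow/ submodule of the serving repo).

```shell
# Print a unified diff between a base Dockerfile and one derived from it.
# diff exits with status 1 when the files differ, which is the expected
# case here, so only a status above 1 (a real error) counts as failure.
compare_dockerfiles() {
  base=$1      # e.g. TensorFlow's Dockerfile.devel
  derived=$2   # e.g. TensorFlow Serving's Dockerfile.devel
  status=0
  diff -u "$base" "$derived" || status=$?
  [ "$status" -le 1 ]
}

# Usage from the serving repo root (paths are assumptions, not run here):
#   compare_dockerfiles \
#     tensorflow/tensorflow/tools/docker/Dockerfile.devel \
#     tensorflow_serving/tools/docker/Dockerfile.devel
```

The lines marked with + in the output are what Serving added on top of TensorFlow's file, and those are the candidates to carry over to Dockerfile.devel-gpu.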

from serving.

revilokeb commented on May 16, 2024

@kirilg I have compiled my steps to set up TF serving with CUDA on docker: https://gist.github.com/revilokeb/58f3419340652bbe73ab07903fdc9426.
However, those steps still fail at the last step, bazel build -c opt --config=cuda tensorflow_serving/..., with the following error:

ERROR: /root/.cache/bazel/_bazel_root/f8d1071c69ea316497c31e40fe01608c/external/org_tensorflow/tensorflow/core/kernels/BUILD:72:1: undeclared inclusion(s) in rule '@org_tensorflow//tensorflow/core/kernels:concat_lib_gpu':
this rule is missing dependency declarations for the following files included by 'external/org_tensorflow/tensorflow/core/kernels/concat_lib_gpu.cu.cc':
'/serving/tensorflow/third_party/eigen3/Eigen/Core'.

Any idea what could be missing?
(Compilation without CUDA and running the mnist and inception examples work smoothly. I needed to manually set CROSSTOOL as in #17, otherwise that error occurred.)


kirilg commented on May 16, 2024

I took a look and I'm still not entirely sure why this is happening. I see your Dockerfile uses a different technique to install CUDA and Bazel than the one TensorFlow uses. Installing Bazel the tested way may be worth a try, since the result may differ from what apt-get installs.

@damienmg do you know why this error is showing up in a Docker container using CUDA?


revilokeb commented on May 16, 2024

Thanks for your reply. I went back and installed bazel-0.2.0 as described in https://tensorflow.github.io/serving/setup and then also bazel-0.3.0 using the tested way, but neither makes the above error go away.

On the other hand, I have no problem building TensorFlow with GPU support, i.e. git clone --recursive https://github.com/tensorflow/tensorflow.git and then ./configure && bazel build -c opt --config=cuda tensorflow/tools/pip_package:build_pip_package.

So the error does not seem to be related to the Bazel or TensorFlow part.


revilokeb commented on May 16, 2024

I have now used Craig Citro's recipe (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/docker/Dockerfile.devel-gpu) to build the Docker container. I can confirm that this works perfectly for TensorFlow, but I get the exact same error as above when trying to build TF Serving.


LiberiFatali commented on May 16, 2024

I managed to build the Serving Docker GPU image successfully after some modifications to

https://github.com/tensorflow/serving/blob/master/tensorflow_serving/tools/docker/Dockerfile.devel

  • Changed FROM ubuntu:14.04 to FROM nvidia/cuda:7.5
  • Manually added cxx_builtin_include_directory: "/usr/local/cuda-7.5/include" in $SERVING_ROOT/tensorflow/third_party/gpus/crosstool
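The two modifications above could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not the exact procedure used in this thread: the function names use_cuda_base_image and add_builtin_include_dir are hypothetical, the new entry is placed right after the first existing cxx_builtin_include_directory line (the thread does not say where it was added), and the name of the file to patch inside the crosstool directory is assumed to be CROSSTOOL.

```shell
# Swap the base image in a Dockerfile from ubuntu:14.04 to nvidia/cuda:7.5.
use_cuda_base_image() {
  dockerfile=$1
  tmp=$(mktemp)
  sed 's|^FROM ubuntu:14\.04$|FROM nvidia/cuda:7.5|' "$dockerfile" > "$tmp" &&
    cat "$tmp" > "$dockerfile"
  rm -f "$tmp"
}

# Add a cxx_builtin_include_directory entry to a crosstool file.
# Assumption: inserting after the first existing entry, with the same
# indentation, puts it in the toolchain block used for the build.
add_builtin_include_dir() {
  crosstool_file=$1
  include_dir=$2
  # Do nothing if the directory is already listed.
  if grep -qF "\"$include_dir\"" "$crosstool_file"; then
    return 0
  fi
  tmp=$(mktemp)
  awk -v dir="$include_dir" '
    { print }
    /cxx_builtin_include_directory/ && !done {
      match($0, /^[ \t]*/)
      printf "%scxx_builtin_include_directory: \"%s\"\n", substr($0, 1, RLENGTH), dir
      done = 1
    }
  ' "$crosstool_file" > "$tmp" && cat "$tmp" > "$crosstool_file"
  rm -f "$tmp"
}

# Usage (file name CROSSTOOL is an assumption, not run here):
#   use_cuda_base_image tensorflow_serving/tools/docker/Dockerfile.devel
#   add_builtin_include_dir \
#     "$SERVING_ROOT/tensorflow/third_party/gpus/crosstool/CROSSTOOL" \
#     /usr/local/cuda-7.5/include
```

Taking the include directory as a parameter keeps the helper usable on setups where the CUDA headers live under a different path.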


revilokeb commented on May 16, 2024

@LiberiFatali good to hear! Unfortunately I cannot confirm it from my side as the above error keeps appearing. I have also checked with a second machine, but again the same error.

I have tried your recipe - here is what I did:

  • docker build -t tfs_cuda . (with https://github.com/tensorflow/serving/blob/master/tensorflow_serving/tools/docker/Dockerfile.devel as Dockerfile in current directory)
  • docker run --rm -v /myhosthome:/mydockerhome -ti tfs_cuda:latest bash
  • copy cudnn5 from /mydockerhome to /usr/local/cuda/include and /usr/local/cuda/lib64
  • git clone --recurse-submodules https://github.com/tensorflow/serving
  • set cxx_builtin_include_directory: "/usr/local/cuda-7.5/targets/x86_64-linux/include" (setting to "/usr/local/cuda-7.5/include" does not work for me, i.e. gives error "this rule is missing dependency declarations for the following files included by 'external/org_tensorflow/tensorflow/core/kernels/resize_nearest_neighbor_op_gpu.cu.cc': '/usr/local/cuda-7.5/targets/x86_64-linux/include/cuda_runtime.h' etc")
  • cd /serving/tensorflow; ./configure; cd ..
  • bazel build -c opt --config=cuda tensorflow_serving/...

Results in: ERROR: /root/.cache/bazel/_bazel_root/f8d1071c69ea316497c31e40fe01608c/external/org_tensorflow/tensorflow/core/kernels/BUILD:585:1: undeclared inclusion(s) in rule '@org_tensorflow//tensorflow/core/kernels:transpose_functor_gpu':
this rule is missing dependency declarations for the following files included by 'external/org_tensorflow/tensorflow/core/kernels/transpose_functor_gpu.cu.cc':
'/serving/tensorflow/third_party/eigen3/Eigen/Core'.

Anything obvious that you have done differently?


LiberiFatali commented on May 16, 2024

@revilokeb I use CUDA 7.5 with cudnn-7.0-linux-x64-v4.0-prod on Ubuntu 14.04.
My local TensorFlow Serving checkout is not up to date with the official GitHub repo. I searched my local repo and there is no build rule that contains "tensorflow/core/kernels:transpose_functor_gpu".


revilokeb commented on May 16, 2024

@LiberiFatali many thanks! No clue yet why this strange error is coming back to me consistently on two different machines. Will keep investigating...


nhe150 commented on May 16, 2024

LiberiFatali, how do you run the Docker image build? Using nvidia-docker?


LiberiFatali commented on May 16, 2024

Not with the nvidia-docker command, although I did go through the process of installing nvidia-docker.

I ran it like this:
docker build --pull -t tfserving_gpu -f tensorflow_serving/tools/docker/Dockerfile.devel-gpu .

Note that my local repo is not the latest GitHub code.


LiberiFatali commented on May 16, 2024

I have tried the latest code and also got the error "undeclared inclusion(s) in rule".

Here is the version of my local repo that can be used to build the Docker GPU image successfully:

b4e9815
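Pinning a checkout to that revision could look like the sketch below. The helper name clone_at_revision and the target directory in the usage comment are hypothetical; the repository URL and the revision come from this thread. The submodule update after the checkout is an assumption about what is needed so that the tensorflow/ submodule matches the older Serving revision.

```shell
# Clone a repository and pin it, together with its submodules, to one revision.
clone_at_revision() {
  repo_url=$1   # repository to clone
  revision=$2   # commit to check out
  dest=$3       # target directory (hypothetical name in the usage below)
  git clone --recurse-submodules "$repo_url" "$dest" &&
    git -C "$dest" checkout "$revision" &&
    git -C "$dest" submodule update --init --recursive
}

# Usage (needs network access, not run here):
#   clone_at_revision https://github.com/tensorflow/serving b4e9815 serving
```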


revilokeb commented on May 16, 2024

@LiberiFatali many thanks for checking latest code, too, and also providing your local repo version. I can confirm that with your above version I am also able to build TF serving on docker with GPU.

Complete steps using nvidia-docker can be found here. Of course using nvidia-docker is only one way of doing it.

I was then curious whether I could use that Docker image to run the inception example, in particular with the server in Docker and the client on the host (or any other IP). It does work; steps are here.

I have had a quick look at the diff between b4e9815 and the current master branch, but so far I have not been able to resolve the above build error.


nhe150 commented on May 16, 2024

Great work, revilokeb. I tried it out and everything worked, except that you need to specify LD_LIBRARY_PATH.
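The comment above does not say which value was used. A minimal sketch, assuming the CUDA libraries sit in /usr/local/cuda/lib64 (the directory cuDNN was copied to earlier in this thread), might look like this. The function name prepend_ld_library_path is hypothetical.

```shell
# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
prepend_ld_library_path() {
  dir=$1
  case ":${LD_LIBRARY_PATH:-}:" in
    *":$dir:"*) ;;  # already present, leave the variable unchanged
    *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
  esac
  export LD_LIBRARY_PATH
}

# Assumed location of the CUDA and cuDNN shared libraries.
prepend_ld_library_path /usr/local/cuda/lib64
```

Running this in the container shell before starting the model server lets the dynamic linker find the CUDA libraries; the check against the existing value avoids piling up duplicate entries when the snippet is sourced more than once.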


kirilg commented on May 16, 2024

Closing out this old issue. There is a GPU Docker container here now.

