
Comments (3)

eine avatar eine commented on August 16, 2024

Hi @GlenNicholls! First off, thanks a lot for reaching out to us. This is the first issue opened by 'just a new user', so it will be really helpful to see the issues you find along the way. Do not hesitate to ask for any further clarification.

I am trying to use ghdl/ext:vunit-master and am running into issues because my VUnit run.py scripts expect git and GitPython to be present.

We try not to ship images with non-required tools, to keep them as small as possible. There are too many 'heavy' tools that could be useful for HDL design to include them all, so we expect users to build their own custom development images, using ours as a base.

Furthermore, we expect almost every user to have git available on the host. In this context, adding git to the image means transferring and storing a few hundred additional MB (more than the size of GHDL itself). The point of docker is to avoid installing tools on the host, not to duplicate installations.

Nevertheless, I think there are several ways to address your use case, without modifying our images:

In my use case, I am using a python function that uses GitPython to figure out where my git root directory is to enable each run.py script to grab the correct libraries/packages/source .vhd files for each test.

Do you really need to know the location of the root of your git repository? Is it not enough to know the location of your run.py file? I expect it to be at some fixed path relative to the root of the repo. Let's say that it is at <root_of_git_repo>/test/py/run.py. You can use root = join(dirname(__file__), '..', '..'), and keep the same code for the VHDL files.
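As a minimal sketch of that relative-path approach (assuming the script really sits at <root_of_git_repo>/test/py/run.py; no git needed):

```python
# run.py -- locate the repo root relative to this file, without git.
# Assumes this script lives at <root_of_git_repo>/test/py/run.py.
from os.path import abspath, dirname, join, normpath

# dirname(__file__) is <root>/test/py, so two '..' reach the root
root = normpath(join(dirname(abspath(__file__)), "..", ".."))
```

If the script moves, only the number of '..' components changes.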

E.g., below is an example of a run.py script that needs to know where the git root directory is to find all my .vhd files without hard-coding every file path:

I think that the example is not complete: the line root = get_git_root() is used, but get_git_root itself is not shown.

Moreover, you might want to simplify the code to:

vu.add_library("<some_lib>").add_source_files([
    join(root, "src/utils/src/*.vhd"),
    join(root, "src/misc/src/*.vhd"),
    join(root, "src/memory/src/*.vhd"),
    join(root, "src/bert/src/*.vhd"),
    join(root, "src/bert/test/*.vhd")
])

or

vu.add_library("<some_lib>").add_source_files([
    join(root, "src/**/src/*.vhd"),
    join(root, "src/bert/test/*.vhd")
])

At the same time, note that those paths will not work on Windows. You should use join(root, "src", "utils", "src", "*.vhd") instead of join(root, "src/utils/src/*.vhd").
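The per-component form can be built like this (a sketch; the root value is a placeholder computed elsewhere in the run.py):

```python
# Build the glob patterns from path components so os.path.join picks
# the correct separator on each platform (\\ on Windows, / elsewhere).
from os.path import join

root = "/path/to/repo"  # placeholder; computed elsewhere in run.py
patterns = [
    join(root, "src", "utils", "src", "*.vhd"),
    join(root, "src", "misc", "src", "*.vhd"),
    join(root, "src", "bert", "test", "*.vhd"),
]
```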

Once my repo is more stable, I will be using a different method for finding libraries and it will be extremely important for my build environment to know where the git root is.

I think you should be able to have a stable layout without requiring git to know the location of the root. As a rule of thumb, build/run scripts should not depend on git, which is a versioning tool. It is ok to make it optional, though; i.e., to override some defaults when it is present.
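One way to keep git optional is to try it and fall back to a fixed, repo-relative default. A sketch using only the standard library (the fallback depth is an assumption about the layout):

```python
# Prefer a repo-relative default; use git only as an optional override.
import subprocess
from os.path import abspath, dirname, join, normpath


def repo_root(default_levels_up=2):
    """Return the project root.

    Falls back to a path relative to this file when git is not
    installed or the file is not inside a git checkout.
    """
    fallback = normpath(
        join(dirname(abspath(__file__)), *[".."] * default_levels_up)
    )
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return fallback
```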

Nevertheless, you can have your custom image:

FROM ghdl/ext:vunit-master

RUN apt-get update -qq \
  && DEBIAN_FRONTEND=noninteractive apt-get -y install --no-install-recommends \
      ca-certificates \
      git \
  && apt-get autoclean && apt-get clean && apt-get -y autoremove \
  && update-ca-certificates \
  && pip3 install GitPython

Then,

docker pull ghdl/ext:vunit-master
docker build -t ghdl/gitpython .

and

docker run --rm -t -v $(pwd):/work -w /work \
  ghdl/gitpython sh -c 'for f in $(find ./ -name 'run.py'); do python3 $f -v; done'

For reference, this is what I am doing now when I run docker

Note that, on GNU/Linux, you don't need -v /$(pwd)://work -w //work; you can use -v $(pwd):/work -w /work instead. The former is just a workaround for MSYS2 on Windows.

When the apt-get install git is executed, it hangs while waiting for the [Y/n] user input, but when this is forced, it is still unable to install git.

It is hard to know why it fails without seeing the error message. It might be because you didn't set DEBIAN_FRONTEND=noninteractive, because ca-certificates are not up to date, because you are using sh instead of bash... You might try running the commands from the Dockerfile above interactively, before actually building the image.

Lastly, if you see a compelling reason for me to avoid this method, I am interested in hearing suggestions. I am fairly new to CI with docker, so this was just the method I thought would be easiest to integrate and maintain.

I think this is a good approach. You need to first guess how to install it interactively, before writing a Dockerfile. Then, I would suggest that you build your own image, as it will allow you to avoid installing git and gitpython each time you want to start a container.

In order to use your custom image in a CI service, you will need to push it to some registry, typically hub.docker.com. You need a different name to do so, because you do not have permissions in hub.docker.com/r/ghdl. So, use docker build -t glennicholls/gitpython . or any other namespace where you have access.


GlenNicholls avatar GlenNicholls commented on August 16, 2024

Hi @GlenNicholls! First off, thanks a lot for reaching out to us. This is the first issue opened by 'just a new user', so it will be really helpful to see the issues you find along the way. Do not hesitate to ask for any further clarification.

Awesome, thank you! I am an FPGA DSP engineer and am currently working on utilizing open source tools to make FPGA development more efficient where I work. We, unfortunately, suck at testing where I work, so I am going through the academic exercise in my free time to better understand these tools to see where I can take advantage of what they offer.

We have some very nice tools in house, but I am of the mindset that we should all be contributing to and using open source tools for things like testing. Sure, our competitors can then use this too, but our "secret sauce" should be our IP deliverable, not the tools that help us waste less time clicking buttons in a GUI and give us insight into the code that breaks our projects.

We try not to ship images with non-required tools, to keep them as small as possible. There are too many 'heavy' tools that could be useful for HDL design to include them all, so we expect users to build their own custom development images, using ours as a base.

Furthermore, we expect almost every user to have git available on the host. In this context, adding git to the image means transferring and storing a few hundred additional MB (more than the size of GHDL itself). The point of docker is to avoid installing tools on the host, not to duplicate installations.

Yes, I completely understand your point now as this is something specific to my use case.

Do you really need to know the location of the root of your git repository? Is it not enough to know the location of your run.py file? I expect it to be at some fixed path relative to the root of the repo. Let's say that it is at <root_of_git_repo>/test/py/run.py. You can use root = join(dirname(__file__), '..', '..'), and keep the same code for the VHDL files.

I am not a software person, so when I made that decision I did not know the "correct" answer. However, I am trying to bridge that gap to make my skills more applicable while crossing this barrier to take advantage of software test methodology. Your point does make sense, as there is no guarantee we would always stick with Git, SVN, or any other revision control tool for that matter. We will more than likely continue changing as new engineers and new technology influence our workflow.

After looking around a bit, I will just use an __init__.py in my python test package that will contain this information. Thus, if the package ever moves around, none of my python or test scripts will break, and the __file__ method you pointed out will take care of the path problem I was trying to solve. This also has the advantage of dropping git as a dependency.
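For reference, such an __init__.py could look like the following sketch (the directory layout, with the test package at <repo_root>/test/, is an assumption):

```python
# test/__init__.py -- single place that records where the repo root is.
# If the test package ever moves, only this file needs updating.
from pathlib import Path

# Assumes this package sits at <repo_root>/test/
ROOT = Path(__file__).resolve().parents[1]
SRC = ROOT / "src"
```

Other scripts can then do `from test import ROOT` instead of asking git.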

Moreover, you might want to simplify the code to:

Good call, I am always looking for ways to make code hardware agnostic so things like this are a huge help!

In order to use your custom image in a CI service, you will need to push it to some registry, typically hub.docker.com. You need a different name to do so, because you do not have permissions in hub.docker.com/r/ghdl. So, use docker build -t glennicholls/gitpython . or any other namespace where you have access.

I see the objective now, and ultimately we will probably have a docker image with different simulation/build tools in the near future. However, I do have one question regarding that... When making the docker image as concise as possible, what is the "right" way to deal with dependencies? One thing I am thinking of is things like PyTest/Tox. I would expect these to be a requirement for the CI environment to install and not the docker image, but how do I allow the docker image to use dependencies CI has installed?


eine avatar eine commented on August 16, 2024

Awesome, thank you! I am an FPGA DSP engineer and am currently working on utilizing open source tools to make FPGA development more efficient where I work. We, unfortunately, suck at testing where I work, so I am going through the academic exercise in my free time to better understand these tools to see where I can take advantage of what they offer.

I think that most GHDL + VUnit users, including me, have a similar background to yours. Thankfully, VUnit is quite well thought out, both the Python part and the VHDL libraries. Hence, it is a good project to take as a reference when you feel that you lack knowledge about 'the software side of things'.

However, I do have one question regarding that... When making the docker image as concise as possible, what is the "right" way to deal with dependencies?

This is not an easy question to answer. For now, docker images are linear: if we install two tools in two separate images, there is no automatic way to build a third image that contains both. You need either:

base img -> img A ------> img D
         -> img B ----|

or

base img -> img A ----|
         -> img B ------> img D

In other words, it is not possible to 'merge' two images, even if they share a common parent.

This is expected to be supported in the future with BuildKit. Meanwhile, it makes it difficult to have images with a small granularity.
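In practice, the linear workaround means that one image simply repeats the other's install steps. A hedged sketch (the image and package names are placeholders, not real images):

```dockerfile
# 'img D': start from img A and repeat img B's install steps,
# because the two images cannot be merged automatically.
FROM example/img-a

# ...the same RUN commands that img B's Dockerfile uses...
RUN apt-get update -qq \
  && DEBIAN_FRONTEND=noninteractive apt-get -y install --no-install-recommends \
      tool-b \
  && apt-get clean
```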

One thing I am thinking of is things like PyTest/Tox. I would expect these to be a requirement for the CI environment to install and not the docker image, but how do I allow the docker image to use dependencies CI has installed?

No, the docker image cannot and should not have access to resources installed on the host; that is the whole point of using a container or VM. The exception is when the resources are just a bunch of scripts, which you can easily share with the container through a bind mount.

The regular approach is to install pytest/tox in each image. However, pytest/tox are sometimes used to test multiple Python versions from a single environment; this should be avoided in containers. There are official Python images for each version, so the suggested approach is to have one image per Python version you want to test.

Nevertheless, it is also possible to install pytest/tox once on the CI host, and replace some of the lower-level commands with docker containers; i.e., you can alias python3 run.py to docker run --rm....
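A sketch of that aliasing idea as a shell function (the image name ghdl/gitpython comes from the build step above; the function name is made up):

```shell
# Wrap the containerized interpreter so CI scripts can call it like a
# local command. 'ghdl/gitpython' is the custom image built earlier.
python3_in_docker() {
  docker run --rm -t -v "$(pwd)":/work -w /work ghdl/gitpython python3 "$@"
}

# usage: python3_in_docker run.py -v
```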

Overall, docker lets you decouple where the versions of libraries and dependencies are defined (in the Dockerfile) from where the tests are defined (in the project).

EDIT

I'm closing this issue, because I think that the questions are already answered. But feel free to continue the conversation if you want.

