Some work on the translator has already been started but is stalled: …

Create translator and builder for rust/cargo (nix-community/dream2nix)

Comments (29)

yusdacra commented on July 4, 2024 3

It seems to generate the dream lock fine now! 🎉

And btw, is there a reason why a "main package" is needed in a dream lock? For example, a Cargo.lock can contain multiple packages that are in a workspace. So it makes more sense to choose a package to build in the builder instead of the translation process. Not sure if there is already a way to do that.

from dream2nix.

DavHau commented on July 4, 2024 1

Thanks for the info.
Just pinging you @yusdacra as this might be relevant to you.

yusdacra commented on July 4, 2024 1

I think we can leverage the Cargo.lock / Cargo.toml parsing from naersk for the translator and the "builder" from crate2nix (granular builds).

DavHau commented on July 4, 2024 1

Sounds good, though the checksum field is always called hash in dream2nix so far.

Maybe a good way to start is to just copy src/fetchers/http/default.nix to src/fetchers/cargo/default.nix and change a few things. This file defines the behavior of the new fetcher. Also, the data format needs to be incorporated into src/specifications/dream-lock-schema.json.

happysalada commented on July 4, 2024

One package that would be worth testing is meilisearch (github.com/meilisearch/meiliSearch/).
Currently in nixpkgs, the update to the latest version fails because a feature override is needed.
(It doesn't work properly with buildRustPackage, but crate2nix has exactly the right features: granular builds and per-dependency overrides.)
Ideally we should model dream2nix after what is done in crate2nix.

A bit more context on meilisearch: I'm the maintainer of it in nixpkgs. Originally I was using crate2nix to build it, but transitioned to a more classic buildRustPackage build to simplify updates. It worked for a bit but now it's problematic again to the point I couldn't do the last update.

yusdacra commented on July 4, 2024

I have started working on the translator here: https://github.com/yusdacra/dream2nix/tree/feat/cargo-translator

My main questions now are:

What exactly is a "subsystem"? How does this relate to the building process? I see the nodejs package.lock translator having a nodejs version option (?). Shouldn't these be specified in a builder?
How does dream2nix make use of serializePackages? The template says it can be "a list of package objects of arbitrary structure", so this only concerns builders that will use the translator? How does it relate to "dream lock"? Or i guess in general, what is a translator expected to output?
And i guess i would like an overall process of how things go from parsing to building (translator -> builder? what are the steps in between of these?) 🙂

DavHau commented on July 4, 2024

> And i guess i would like an overall process of how things go from parsing to building (translator -> builder? what are the steps in between of these?)

See this little chart for an overview of the architecture. There is also an example walking through the steps.

Translators are expected to output the dream lock format. This format is specified by a jsonschema defined in ./src/specifications. There is also an example next to the schema. The output is validated against the schema by the Python CLI when the translation is triggered through the CLI.
The dream lock only contains data, no logic. This allows the data to be stored as JSON.
Later the dream lock will be read by a builder module which translates it to nix derivations.
Enforcing a generic output format on all our translators allows us to have generic builder interfaces and generic interfaces for customization.
The dream lock format contains generic data where the structure is identical independent of the subsystem, and a subsystem specific section which differs depending on the subsystem.

The generated translator template you started working with uses utils.simpleTranslate. The idea of simpleTranslate is to help generate the dream lock format without you having to deal with that format in detail. simpleTranslate wants you to implement all these little functions that tell it how to extract the relevant information from the upstream lock file, so it can then generate the dream lock for you. You just have to implement the extraction of data and can skip the reassembling part.
This abstraction felt reasonable to me, but if you feel more comfortable generating the dream lock on your own, you can do that and just not use simpleTranslate.

> What exactly is a "subsystem"? How does this relate to the building process?

Subsystem is a category so that all packages within one subsystem can be built the same way, even if the data (dream-lock.json) which defines those packages has been created by different translators. As an example, for nodejs we have different translators as there exist different lock file formats. But all data created by these translators must be buildable by the same builder.
Practically this would mean that we have one subsystem for each language or build system. Currently we have nodejs, python, go, etc...
Within one subsystem all translators must agree on a common structure for the subsystemAttrs. Currently this is not enforced. We should add some additional jsonschema for that at some point.
So as you are the one initializing the rust subsystem, you can put whatever you want inside subsystemAttrs. It should contain all data which is required for the build apart from the generic data that simpleTranslate already forces you to extract.
You will likely start with an empty subsystemAttrs, and later when building, you might notice that some crucial data is missing, at which point you can extend the translator to put extra data inside subsystemAttrs, which will be stored inside the dream lock and passed to the builder.

> I see the nodejs package.lock translator having a nodejs version option (?). Shouldn't these be specified in a builder?

You are 100% right, this is a design issue which needs to be fixed. The option should be removed.

> How does dream2nix make use of serializePackages?

serializePackages is expected by the simpleTranslate helper. Sometimes upstream lock files do not express the dependencies in a flat list and instead use deeper structures. serializePackages should flatten that structure to a list, so simpleTranslate can handle each dependency object individually, without having to understand upstream's structure.
It will then execute all the little extractor functions like getName or getVersion on all dependencyObjects to retrieve the relevant information.
Therefore each dependencyObject must contain all information which is relevant to itself.

Example: given this upstream lock file:

{
  dependencies = {
    requests = {
      version = "1.2.3";
      url = "https://...requests.tar.gz";
      hash = "deadbeef...";
    };
    async = {
      version = "2.3.4";
      url = "https://...async.tar.gz";
      hash = "deadbeef2...";
    };
  };
}

...processed via serializePackages should result in:

[
  {
    name = "requests";
    version = "1.2.3";
    url = "https://...requests.tar.gz";
    hash = "deadbeef...";
  }
  {
    name = "async";
    version = "2.3.4";
    url = "https://...async.tar.gz";
    hash = "deadbeef2...";
  }
]

The upstream structure was already relatively flat, so we just needed to incorporate the name into each object.
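
The flattening shown above could be written roughly as follows. This is only a sketch: the argument name `parsed` and the exact way simpleTranslate calls serializePackages are assumptions, and the data is copied from the example.

```nix
let
  # Upstream lock data from the example above, already parsed into Nix.
  parsed = {
    dependencies = {
      requests = {
        version = "1.2.3";
        url = "https://...requests.tar.gz";
        hash = "deadbeef...";
      };
      async = {
        version = "2.3.4";
        url = "https://...async.tar.gz";
        hash = "deadbeef2...";
      };
    };
  };

  # Sketch of a serializePackages implementation (argument name assumed):
  # merge each attribute name into its value so every object is self-contained.
  serializePackages = parsed:
    map
      (name: parsed.dependencies.${name} // { inherit name; })
      (builtins.attrNames parsed.dependencies);
in
  serializePackages parsed
```

Saved to a file, this can be evaluated with `nix-instantiate --eval --strict`. Note that builtins.attrNames returns names in sorted order, so async comes before requests in the resulting list; the order should not matter, since each object is handled individually.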

Please let me know if you have more questions.
Getting your feedback on the contribution process is valuable as I really want to optimize that, so people can contribute more easily.

yusdacra commented on July 4, 2024

Ok, i see. So the "getSourceType" and "sourceConstructors" need to output the source types specified in the dream.lock schema. Would it make sense to add a cargo-registry type for cargo registries like crates.io, or should http be used? I see there is npm and pypi so that's why i asked.

Also about getDependencies, mainPackageDependencies and such, do they also want flat lists? And do they want all transitive dependencies or only the direct dependencies?

DavHau commented on July 4, 2024

> Would it make sense to add a cargo-registry type for cargo registries like crates.io, or should http be used? I see there is npm and pypi so that's why i asked.

It will not be required for packaging projects from github. Translating a github project will result in a list of http sources, so the crates.io specific fetcher would not be used.
The reason I included npm, for example, as a fetcher type is that the CLI is now capable of packaging projects directly from npm by referencing npm:prettier/prettier, for example.
If crates.io hosts applications that we might want to package, then the fetcher type is useful, otherwise not.

> Also about getDependencies, mainPackageDependencies and such, do they also want flat lists? And do they want all transitive dependencies or only the direct dependencies?

getDependencies and mainPackageDependencies want a flat list of direct dependencies only, where each returned dependency has this structure:

{ name = "package-name"; version = "1.2.3"; }
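
For Cargo.lock, where each package lists its direct dependencies as strings such as "log" or "log 0.4.14" (optionally followed by a source in parentheses), extracting that structure might look like the sketch below. The package data is made up for illustration, and the single-argument signature of getDependencies is a simplification, not necessarily what simpleTranslate passes.

```nix
let
  # Hypothetical flattened Cargo.lock entries (illustrative data only).
  packages = [
    { name = "app"; version = "0.1.0"; dependencies = [ "serde" "log 0.4.14" ]; }
    { name = "serde"; version = "1.0.130"; }
    { name = "log"; version = "0.4.14"; }
  ];

  # Turn a Cargo.lock dependency string into { name, version }.
  # If the version is omitted, look it up among the locked packages.
  parseDep = entry:
    let
      # builtins.split also returns the regex matches as lists; keep strings only.
      parts = builtins.filter builtins.isString (builtins.split " " entry);
      name = builtins.elemAt parts 0;
    in
    {
      inherit name;
      version =
        if builtins.length parts > 1
        then builtins.elemAt parts 1
        else (builtins.head (builtins.filter (p: p.name == name) packages)).version;
    };

  # Direct dependencies only, as a flat list (signature simplified).
  getDependencies = dependencyObject:
    map parseDep (dependencyObject.dependencies or [ ]);
in
  getDependencies (builtins.head packages)
```

For the app package this evaluates to a list with serde 1.0.130 and log 0.4.14; packages without a dependencies field yield an empty list.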

yusdacra commented on July 4, 2024

I'm not sure what you meant by "packaging projects from github". I meant that regardless of whether a project is from github or crates-io, it will have dependencies from crates-io. They can be downloaded through http, but they need to be decompressed (https://github.com/nix-community/naersk/blob/master/build.nix#L433-L449). So i meant it in that sense; i suppose my question is whether the builder would decompress it?

Also yeah there are plenty of applications on crates.io (ripgrep, bottom, exa and many more) so i guess it would be useful.

happysalada commented on July 4, 2024

Getting assets from npm also requires decompression. It should be fine for crates.io too.

You are right about the rust applications. Having a fetcher would enable easy download of those applications. It's just like npm though, it means you have to trust the binary.

DavHau commented on July 4, 2024

@yusdacra I meant that we only need a specific fetcher if the initial source (the main project's source) comes from crates.io, because the CLI needs to understand references like crates-io:ripgrep/12.0.1, which it can only do if there is a fetcher for crates-io.
As you mentioned, for all rust sub-dependencies we could just use the http fetcher and compute the .cargo-checksum.json in the builder.

But, as I now understand that there are applications hosted on crates.io, we need that crates.io fetcher anyway.

Since b8dc44d, each fetcher in dream2nix is expected to return a decompressed source, so the decompression and the creation of the .cargo-checksum.json should be part of this fetcher.

So after all it might be better to just use the crates.io fetcher for all crates.io sources instead of the http fetcher. Then we won't have to deal with the .cargo-checksum.json at build time anymore.
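
A rough sketch of what such a fetcher could do is shown below. It is not the dream2nix fetcher interface: the function shape, the use of `<nixpkgs>`, and the argument names are assumptions. It also assumes that `hash` is the hex sha256 checksum recorded in Cargo.lock and that a minimal .cargo-checksum.json with an empty files map is sufficient.

```nix
# Sketch of a crates.io fetcher that returns a decompressed source.
# Assumptions: nixpkgs is available as <nixpkgs>, and `hash` is the hex
# sha256 checksum that Cargo.lock records for the crate.
{ pkgs ? import <nixpkgs> { } }:
{ name, version, hash }:
pkgs.runCommand "${name}-${version}-source"
  {
    src = pkgs.fetchurl {
      name = "${name}-${version}.tar.gz";
      url = "https://crates.io/api/v1/crates/${name}/${version}/download";
      sha256 = hash;
    };
  }
  ''
    mkdir -p $out
    # Crate tarballs contain a single top-level directory; strip it.
    tar -xzf $src --strip-components=1 -C $out
    # Record the crate checksum next to the unpacked sources.
    printf '{"files":{},"package":"%s"}' "${hash}" > $out/.cargo-checksum.json
  ''
```

With this shape, the builder receives a directory that already contains the checksum file, so nothing has to be computed at build time.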

> It's just like npm though, it means you have to trust the binary.

I think we are talking about package sources here, not binaries.

yusdacra commented on July 4, 2024

> @yusdacra I meant that we only need a specific fetcher if the initial source (the main project's source) comes from crates.io, because the CLI needs to understand references like crates-io:ripgrep/12.0.1, which it can only do if there is a fetcher for crates-io.
> As you mentioned, for all rust sub-dependencies we could just use the http fetcher and compute the .cargo-checksum.json in the builder.
>
> But, as I now understand that there are applications hosted on crates.io, we need that crates.io fetcher anyway.
>
> Since b8dc44d, each fetcher in dream2nix is expected to return a decompressed source, so the decompression and the creation of the .cargo-checksum.json should be part of this fetcher.
>
> So after all it might be better to just use the crates.io fetcher for all crates.io sources instead of the http fetcher. Then we won't have to deal with the .cargo-checksum.json at build time anymore.

Ok, that sounds good. I will write the translator so that it uses a cargo registry fetcher then. We can have a crates-io specific fetcher, but i think a general cargo registry fetcher could be better (if people want to use dream2nix against a different registry). The index format of a registry is defined here https://doc.rust-lang.org/cargo/reference/registries.html#index-format . The problem with this is we would need to clone the git repository (cargo registries are git repos) and then look for the index file to get the download API URL. This would be very inefficient. So for now it's probably best to have just a crates-io fetcher.
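
To illustrate the lookup a general registry fetcher would need: the index format linked above stores each crate's metadata file at a path derived from the lowercased crate name. A small sketch of that path computation follows; the helper name indexPath is made up, and `<nixpkgs>` is only used for lib.toLower.

```nix
let
  lib = (import <nixpkgs> { }).lib;

  # Hypothetical helper: path of a crate's metadata file inside a registry index.
  indexPath = crateName:
    let
      n = lib.toLower crateName;
      len = builtins.stringLength n;
      sub = builtins.substring;
    in
    if len == 1 then "1/${n}"
    else if len == 2 then "2/${n}"
    else if len == 3 then "3/${sub 0 1 n}/${n}"
    else "${sub 0 2 n}/${sub 2 2 n}/${n}";
in
  # Evaluates to [ "1/a" "3/s/syn" "ri/pg/ripgrep" ]
  map indexPath [ "a" "syn" "ripgrep" ]
```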

DavHau commented on July 4, 2024

Sounds good. Then let's start with the crates-io fetcher only.

yusdacra commented on July 4, 2024

> Sounds good. Then let's start with the crates-io fetcher only.

I was thinking of:

{
    checksum = "";
    name = "";
    version = "";
}

for the schema format.
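
As a sketch of how the translator side could produce such a source, the snippet below maps a Cargo.lock entry to these three fields. It uses hash instead of checksum, since the field is called hash elsewhere in dream2nix. The example entry, the "unknown" fallback type, and the exact shapes of getSourceType and sourceConstructors are assumptions for illustration.

```nix
let
  # Source string that Cargo.lock records for packages from crates.io.
  cratesIoRegistry = "registry+https://github.com/rust-lang/crates.io-index";

  # Hypothetical dependency object as it might come out of serializePackages.
  example = {
    name = "log";
    version = "0.4.14";
    source = cratesIoRegistry;
    checksum = "<sha256 from Cargo.lock>";
  };

  # Decide the source type of a dependency object.
  # The "unknown" fallback is a placeholder, not a real dream2nix type.
  getSourceType = dependencyObject:
    if (dependencyObject.source or null) == cratesIoRegistry
    then "crates-io"
    else "unknown";

  # One constructor per source type; each returns the fields for that type.
  sourceConstructors = {
    crates-io = dependencyObject: {
      inherit (dependencyObject) name version;
      hash = dependencyObject.checksum;
    };
  };
in
  sourceConstructors.${getSourceType example} example
```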

yusdacra commented on July 4, 2024

I pushed some more changes to the feat/cargo-translator branch of https://github.com/yusdacra/dream2nix. I would appreciate it if you could check it and see if something is obviously wrong. I didn't test it yet, wanted to get your feedback first. I added it so that a crates-io source type is constructed with hash, name and version here. Should adding it to the schema be done in the same PR?

DavHau commented on July 4, 2024

Thanks, I will review it tomorrow. No need to split it into several PRs. It is up to you.

DavHau commented on July 4, 2024

I see you made use of inputFiles. There is no real need for this, as the Cargo.toml file is contained in the project source.
I think for now you can just ignore inputFiles completely and read all files from the root of the project's source, which is passed as the first element of inputDirectories.

This way, your translator will be compatible with the current CLI. You can then test your translator by just executing, for example:

nix run . -- add github:BurntSushi/ripgrep/13.0.0

This will generate the dream-lock.json for the given input.

Maybe it is a bit confusing that the translator interface already allows for several files and directories being passed. It's not really used anywhere yet, but I wanted to keep the possibilities open for the future.
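
Reading the Cargo files from the project root, as suggested above, could start out like this. Only inputDirectories comes from the translator interface described here; the returned attribute names are made up for the sketch.

```nix
# Sketch of the first lines of a translator that ignores inputFiles.
# Only inputDirectories is taken from the comment above; the rest is assumed.
{ inputDirectories, ... }:
let
  # The project source root is the first input directory.
  projectRoot = builtins.head inputDirectories;
  cargoToml = builtins.fromTOML (builtins.readFile "${projectRoot}/Cargo.toml");
  cargoLock = builtins.fromTOML (builtins.readFile "${projectRoot}/Cargo.lock");
in
{
  # `package` may be absent from a workspace-only Cargo.toml.
  rootPackageName = cargoToml.package.name or null;
  # Cargo.lock stores one [[package]] table per locked crate.
  lockedPackages = cargoLock.package or [ ];
}
```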

yusdacra commented on July 4, 2024

> I see you made use of inputFiles. There is no real need for this, as the Cargo.toml file is contained in the project source. I think for now you can just ignore inputFiles completely and read all files from the root of the project's source, which is passed as the first element of inputDirectories.
>
> This way, your translator will be compatible with the current CLI. You can then test your translator by just executing, for example:
> nix run . -- add github:BurntSushi/ripgrep/13.0.0
> This will generate the dream-lock.json for the given input.
>
> Maybe it is a bit confusing that the translator interface already allows for several files and directories being passed. It's not really used anywhere yet, but I wanted to keep the possibilities open for the future.

Ah alright, i see. I pushed more work and tested out the translator, it seems to be passing the translation stage but dream2nix fails afterwards with:

JSONDecodeError

Expecting property name enclosed in double quotes: line 128 column 21 (char 4460)

at /nix/store/97w52ckcjnfiz89h3lh7zf1kysgfm2s8-python3-3.9.6/lib/python3.9/json/decoder.py:353 in raw_decode
    349│         have extraneous data at the end.
    350│
    351│         """
    352│         try:
  → 353│             obj, end = self.scan_once(s, idx)
    354│         except StopIteration as err:
    355│             raise JSONDecodeError("Expecting value", s, err.value) from None
    356│         return obj, end
    357│

Is there any way I can see the JSON? Would like help on that.

And also, i see that the npm fetcher doesn't do decompression, so should it be handled by builders?

yusdacra commented on July 4, 2024

Never mind, i fixed it. Turns out it was a trailing comma in the spec. Wish this had better error reporting, or maybe that JSON allowed trailing commas 😅

DavHau commented on July 4, 2024

> And btw, is there a reason why a "main package" is needed in a dream lock? For example, a Cargo.lock can contain multiple packages that are in a workspace

Could you point me to a github project that uses several packages? From what I have observed in nodejs so far, there is usually a main package even if the repo contains several packages. Usually the extra packages are sub-packages of the main package.

> So it makes more sense to choose a package to build in the builder instead of the translation process. Not sure if there is already a way to do that.

Right now each builder is supposed to return defaultPackage and packages. Therefore you can access all contained packages via packages.

If it is not clear which package is the main package, maybe we should allow the dream-lock to have mainPackage = null.

yusdacra commented on July 4, 2024

Repos like https://gitlab.com/veloren/veloren have separate binary packages in a single workspace; others like https://github.com/iced-rs/iced use separate Cargo packages for their examples. You can't really determine a "main package" if the user doesn't pass a package name and this is a workspace where multiple (binary) packages are available. So it's probably better to just delegate that to the builder where the user can select what they want to build.

I think it would be nice to allow mainPackage to not be set in a similar vein to Cargo.lock, where it doesn't define a "main package", instead Cargo.toml files are used to find out what to build or not. This doesn't mean it wouldn't be set though, it can be set fine in most scenarios where there is a single binary package in a repository.

Ideally IMO it would be a better idea to drop mainPackage from dream-lock, and set a default package in nix code instead. Translators would provide a list of "packages" to the CLI and then it would be displayed like the options so that the user can choose one. The CLI could then generate the nix expression using that. If the translator provides only one package this can be skipped, essentially acting as a "main package".

DavHau commented on July 4, 2024

> Ideally IMO it would be a better idea to drop mainPackage from dream-lock, and set a default package in nix code instead. Translators would provide a list of "packages" to the CLI and then it would be displayed like the options so that the user can choose one. The CLI could then generate the nix expression using that. If the translator provides only one package this can be skipped, essentially acting as a "main package".

The dream2nix CLI is only used for translation, not for building as of now. And I don't see a reason why we would need it for building. As long as all packages are exposed via an attribute, the standard nix CLI should be sufficient.
Also, we don't generate nix expressions. The output of the translator (dream-lock.json) is what gets consumed by the builder. There isn't any package specific nix expression (except for debugging if required).

So I think we should just allow mainPackageName to be null or unset.

yusdacra commented on July 4, 2024

> The dream2nix CLI is only used for translation, not for building as of now. And I don't see a reason why we would need it for building. As long as all packages are exposed via an attribute, the standard nix CLI should be sufficient.
> Also, we don't generate nix expressions. The output of the translator (dream-lock.json) is what gets consumed by the builder. There isn't any package specific nix expression (except for debugging if required).
>
> So I think we should just allow mainPackageName to be null or unset.

That's fine. What i meant was that we remove the main package from the lock; the users can just use whatever package is provided by riseAndShine as a main package in Nix code. It feels weird to have values like that in the lock when they can be expressed in Nix code. I don't think locks should contain configuration (which is basically what the main package is).

Also yeah I was talking about for debugging since that generates an expression. But i suppose setting a main package doesn't really matter since you can just specify the attribute.

DavHau commented on July 4, 2024

OK, removing the main package sounds good.
Just nodejs would need to be changed a bit. Right now it exposes all sub dependencies via packages as well.
It should probably not do that anymore, otherwise there is no distinction between top-level packages and dependencies after your proposed change.
Instead dependencies of a package can be exposed via {package}.packages....

yusdacra commented on July 4, 2024

We could close this issue now, but I will take the opportunity to use this as a tracking issue, as I want to implement a granular builder for Rust.

DavHau commented on July 4, 2024

I am currently working on a feature to support multiple lock files in a single repo. This is quite common in nodejs.
One problem that occurs is that sometimes names of packages collide. If we remove the mainPackageName, there is no possibility to detect and handle these collisions.
I think we should make the list of packages that you implemented for rust a generic field of the dream lock. It is probably useful for all build systems. Instead of having one mainPackageName, a dream-lock will then have a list of (main) packages.

yusdacra commented on July 4, 2024

> I am currently working on a feature to support multiple lock files in a single repo. This is quite common in nodejs. One problem that occurs is that sometimes names of packages collide. If we remove the mainPackageName, there is no possibility to detect and handle these collisions. I think we should make the list of packages that you implemented for rust a generic field of the dream lock. It is probably useful for all build systems. Instead of having one mainPackageName, a dream-lock will then have a list of (main) packages.

That sounds good. packages in the Rust system was for listing packages in a workspace. I assume the nodejs situation is similar. IIRC Go and some other languages have similar situations so it will probably help for those too.

yusdacra commented on July 4, 2024

Current situation is:

There is a translator for Cargo.lock files. Two builders exist, one using buildRustPackage from nixpkgs and the other using crane (https://github.com/ipetkov/crane)

Next we want to have a granular builder, i.e. a builder that will build each crate in a dependency tree as a separate derivation. For this we can use buildRustCrate from nixpkgs (crate2nix also utilizes this). This should make the situation much better since it would be pure and require no IFD and no vendoring steps in between, hopefully making everything much faster and more efficient for caching.

I think this issue can be closed now that we actually have translators and builders for Cargo.
