Giter Club home page Giter Club logo

Comments (19)

jamii avatar jamii commented on June 20, 2024 2

So you are saying the the dependent crates are even built multiple times?

@kristoff3r is right, my language was imprecise. Functions like buildByPackageId are called many times with the same arguments, even though the same derivation is produced each time and only built once.

That can be a lot but I can't quite see how to derive at the exponential explosion that you are hinting at.

Suppose we have a dependency tree like:

A -> B1, B2
B1 -> C1, C2
B2 -> C1, C2
C1 -> D1, D2
C2 -> D1, D2
D1 -> E
D2 -> E

Without memoization, we'll construct the derivation for E 8 times - once for each path like A -> B1 -> C1 -> D1 -> E.

Here are the number of times dependencies is evaluated per package:

  # ... lots of lower counts ... #
   9947 proc-macro2
  10154 lazy_static
  12911 unicode-xid
  13267 futures
  15002 rand_core
  20131 cfg-if
  24425 libc

from crate2nix.

kristoff3r avatar kristoff3r commented on June 20, 2024 1

@jamii I tried looking at it for a few minutes, but I didn't get to the bottom. I think the best approach is to comment out dependencies until you have a minimal example that still doesn't evaluate, and then seeing if you can spot something that doesn't make sense.

I still got the error after commenting out everything except pgwire from materialized, and everything but coord and dataflow-types from pgwire, but removing any of those makes the error go away. I suspect it is a cyclic dependency graph, or simply one that is deep enough that you hit the recursion limit.

from crate2nix.

jamii avatar jamii commented on June 20, 2024 1

I was wrong - ulimit -s unlimited allows it to build. The problem appears to be one of breadth, not depth. I think the problem is that there is no memoization in build*, so we can build a number of derivations exponential in the depth of the graph.

jamie@machine:~/materialize$ ulimit -s unlimited

jamie@machine:~/materialize$ time nix-instantiate Cargo.nix -A workspaceMembers.materialized.build > build-log 2>&1

real	9m21.681s
user	7m3.289s
sys	0m31.606s

jamie@machine:~/materialize$ grep -c 'build deps' ./build-log 
238881

jamie@machine:~/materialize$ grep -c 'deps with renames' ./build-log 
111720

jamie@machine:~/materialize$ grep -c 'futures' ./build-log 
26682

from crate2nix.

kristoff3r avatar kristoff3r commented on June 20, 2024 1

@kolloch Each crate isn't built multiple times but it is evaluated multiple times, and the build only starts after the entire nix expression has been evaluated. I found discussion of a similar issue here: https://discourse.nixos.org/t/memoize-result-of-builtins-exec/2028

from crate2nix.

kolloch avatar kolloch commented on June 20, 2024

Thanks for the report! Sorry that I have no time debugging this currently.

from crate2nix.

jamii avatar jamii commented on June 20, 2024

Not at all, thanks for all the free code :)

Do you know of any resources on how to debug infinite recursion in nix?

from crate2nix.

jamii avatar jamii commented on June 20, 2024

Increasing the ulimit doesn't help, so it's probably not just depth. I'll work my way through commenting more stuff out.

from crate2nix.

kolloch avatar kolloch commented on June 20, 2024

So you are saying the the dependent crates are even built multiple times? Then they are even multiple different derivations for them? (Otherwise the hashes should match and nix should build them only once)

crate2nix walks the whole dependency tree including the buildInputs trees, see listOfPackageFeatures. That can be a lot but I can't quite see how to derive at the exponential explosion that you are hinting at.

Could you explain it in more detail for me? Thanks.

from crate2nix.

jamii avatar jamii commented on June 20, 2024

Here is hacky way to memoize.

        builtByPackageId =
          builtins.listToAttrs (map (packageId: {name = packageId; value = buildByPackageId packageId;}) (builtins.attrNames crateConfigs));
        buildByPackageId = packageId:
          let features = mergedFeatures.${packageId} or [];
              crateConfig = lib.filterAttrs (n: v: n != "resolvedDefaultFeatures") crateConfigs.${packageId};
              dependencies =
                  dependencyDerivations builtByPackageId features (crateConfig.dependencies or {});
              buildDependencies =
                  dependencyDerivations builtByPackageId features (crateConfig.buildDependencies or {});
              dependenciesWithRenames =
                lib.filterAttrs (n: v: v ? "rename")
                  (crateConfig.buildDependencies or {} // crateConfig.dependencies or {});
              crateRenames =
                  lib.mapAttrs (name: value: value.rename or name) dependenciesWithRenames;
          in buildRustCrate (crateConfig // { inherit features dependencies buildDependencies crateRenames; });

It cuts the instantiate time down to 3m43, which still seems pretty high.

from crate2nix.

jamii avatar jamii commented on June 20, 2024

Even with this memoization, if I run with -vvvvv I get:

amie@machine:~$ grep 'instantiated' materialize/build-log3 | cut -d' ' -f 2 | sort | uniq -c | sort -h
   # ... lots of lower counts ... #
   4908 'rust_rand_core-0.3.1'
   6117 'bytes-0.4.12.tar.gz'
   6117 'rust_bytes-0.4.12'
   6288 'either-1.5.2.tar.gz'
   6288 'rust_either-1.5.2'
   6341 'byteorder-1.3.2.tar.gz'
   6341 'rust_byteorder-1.3.2'
   6838 'crossbeam-utils-0.6.6.tar.gz'
   6838 'rust_crossbeam-utils-0.6.6'
   9429 'log-0.4.8.tar.gz'
   9429 'rust_log-0.4.8'
   9710 'iovec-0.1.2.tar.gz'
   9710 'rust_iovec-0.1.2'
   9800 'proc-macro2-1.0.1.tar.gz'
   9800 'rust_proc-macro2-1.0.1'
   9816 'rand_core-0.4.0.tar.gz'
   9816 'rust_rand_core-0.4.0'
  10154 'lazy_static-1.4.0.tar.gz'
  10154 'rust_lazy_static-1.4.0'
  12727 'rust_unicode-xid-0.2.0'
  12727 'unicode-xid-0.2.0.tar.gz'
  13267 'rust_futures-0.1.29'
  20131 'cfg-if-0.1.9.tar.gz'
  20131 'rust_cfg-if-0.1.9'
  24425 'libc-0.2.65.tar.gz'
  24425 'rust_libc-0.2.65'

from crate2nix.

kolloch avatar kolloch commented on June 20, 2024

@jamii I don't think memoization at that point in the code makes a difference. In fact, the last numbers are the same. We need to apply it at a lower level.

from crate2nix.

kolloch avatar kolloch commented on June 20, 2024

Your explanation comment is just awesome, @jamii! Thank you for being so elaborate.

from crate2nix.

kolloch avatar kolloch commented on June 20, 2024

I tried to memoize the feature resolution in

https://github.com/kolloch/crate2nix/tree/cachedFeatureResolve

Unfortunately, I get the following an error while compiling crate2nix. The weird thing is that I thought that I'd only change the feature resolution and the result of the feature resolution seems to be the same. I checked with

nix eval --json '(let cargo = import crate2nix/Cargo.nix {}; in cargo.mergePackageFeatures { packageId = cargo.rootCrate.packageId; features = ["default"]; })' | nix run nixpkgs.jq -c jq

It is also the same as the default resolution of cargo metadata:

nix eval --raw '(let cargo = import crate2nix/Cargo.nix {}; in cargo.diffDefaultPackageFeatures { packageId = cargo.rootCrate.packageId; })' | nix run nixpkgs.jq -c jq

(it outputs some onlyInCargo crates because they are filtered by target)

If you'd find the bug, I'd be very happy :)

from crate2nix.

jamii avatar jamii commented on June 20, 2024

I don't think memoization at that point in the code makes a difference. In fact, the last numbers are the same.

Memoization at that point reduced the runtime from 10m to 3m40 and the number of evaluations per package from several 100k to 1.

I agree that the same problem seems to exist somewhere lower in the stack, but we probably need both.

from crate2nix.

jamii avatar jamii commented on June 20, 2024

Unfortunately, I get the following an error while compiling crate2nix.

What was the error?

from crate2nix.

kolloch avatar kolloch commented on June 20, 2024

Sure, we will memoize on both levels if that improves things :) I just looked at the last numbers of what you posted and they were the same? Probably missed something.

The error is related to the log crate in env_logger.

from crate2nix.

jamii avatar jamii commented on June 20, 2024

I just looked at the last numbers of what you posted and they were the same? Probably missed something.

The first set of numbers was number of calls to buildByPackageId. That went down to 1 after memoization. The second set of numbers was package instantiations after memoization. I assume that they are the same number because at both levels the same graph is being unwound, so the number of paths is the same. But it seems to be two different parts of the code that are doing that same unwinding.

from crate2nix.

kolloch avatar kolloch commented on June 20, 2024

BTW, why do you think that you memoization was hacky? I thought it was quite cool. Accidentially, I didn't know that this kind of forward reference is allowed in nix (buildRustCrate from builtRustCrate).

Anyways, I also found the bug in the other memoization attempt. I am very curious if that improves things for you...

from crate2nix.

jamii avatar jamii commented on June 20, 2024

why do you think that you memoization was hacky?

Oh, just because I did the easiest thing that would test the hypothesis, rather than thinking about how best to write it.

I am very curious if that improves things for you...

It seems to be on par with memoizing buildByPackageId. The number of evaluations per package is sensible, but the number of instantiations is still huge.

from crate2nix.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.