bigtrees's Introduction

BigTrees

A rewrite of gander focusing more on usability of the data structures as a library, rather than on my own "dedup backups" use case.

Quick Start

git clone https://github.com/jefdaj/bigtrees
cd bigtrees

# old way, still works:
nix-shell
stack test

# new way, static build in progress:
nix build

# benchmarking
stack bench --ba --baseline=test/bench/bench.csv --timeout=60s

Done

  • Moved Gander.Cmd -> BigTrees.OldCmd, leaving old commands functional during the rewrite
  • Wrote a meta lint script (hlint, stan, stylish-haskell, weeder) and applied some basic suggestions
  • Started writing haddocks (work in progress)
  • Moved tests into lib/ + app/ alongside the functions they test, wrote more of them
  • Broke HashTree into smaller modules by operation: Build, Write, etc
  • Rewrote my old directory-tree code using a typeclass, started a PR upstream
  • Wrote comparison of text vs binary format file sizes, realized binary is always larger, removed it
  • Added mod time, size (bytes), n files (nodes) to tree data
  • Added header + footer to hashes describing filters, version used, start/end time, table format
  • Renamed data structures: Depth, NFiles, NBytes
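
To make the last few items concrete, here is a minimal sketch of how newtype-wrapped metadata might sit on a tree. The source does not show the actual definitions: the names Name, Hash, Dir, File, nFiles, Depth, NFiles and NBytes appear in this document, but ModTime, the field names modTime and nBytes, the Integer representations, and the helper functions are assumptions made for illustration.

```haskell
-- Newtype wrappers stop the different integer-valued fields from being mixed up.
newtype Depth   = Depth   Int     deriving (Eq, Ord, Show) -- not used below
newtype NFiles  = NFiles  Integer deriving (Eq, Ord, Show)
newtype NBytes  = NBytes  Integer deriving (Eq, Ord, Show)
newtype ModTime = ModTime Integer deriving (Eq, Ord, Show) -- assumed: seconds since the epoch

newtype Name = Name String deriving (Eq, Ord, Show)
newtype Hash = Hash String deriving (Eq, Ord, Show)

-- Hypothetical tree shape: every node carries a mod time and a size,
-- and directories also carry the number of files below them.
data Tree
  = File { name :: Name, hash :: Hash, modTime :: ModTime, nBytes :: NBytes }
  | Dir  { name :: Name, hash :: Hash, modTime :: ModTime, nBytes :: NBytes
         , nFiles :: NFiles, contents :: [Tree] }
  deriving (Eq, Show)

-- | Number of File nodes at or below this node.
countFiles :: Tree -> Integer
countFiles File{}             = 1
countFiles Dir{contents = cs} = sum (map countFiles cs)

-- | Total size in bytes of all File nodes at or below this node.
totalBytes :: Tree -> Integer
totalBytes File{nBytes = NBytes b} = b
totalBytes Dir{contents = cs}      = sum (map totalBytes cs)

-- | Check that the counts stored on every Dir match what its contents imply.
countsConsistent :: Tree -> Bool
countsConsistent File{} = True
countsConsistent d@Dir{nFiles = NFiles n, nBytes = NBytes b, contents = cs} =
  n == countFiles d && b == totalBytes d && all countsConsistent cs
```

Under these assumptions a directory's stored nFiles counts only files, not subdirectories, which matches the test trees in the round-trip issue, where a directory holding one subdirectory with two files has nFiles = 2.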

Todo

  • Static build so it can be used offline without Nix
  • "find mode": list full paths, filter by metadata and glob/regex
  • Rewrite command line interface
  • Add Graft nodes that import other tree files
  • Add Link nodes that indicate whether their target data is present in the tree
  • Add Error nodes to wrap errors, the same way directory-tree does it
  • Intelligent re-hashing of only the files whose mod times have changed
  • Clean up: write haddocks, hide partial constructors, etc
  • Upload to Hackage
  • Example screencasts of using the binary + data structures in repl

bigtrees hash   <src> [-o <tree>]
bigtrees update <tree> [-i <src>]
bigtrees cut    <tree> <branch> [-o <tree>]
bigtrees rm     <tree> <branch>
bigtrees graft  <tree> <branch> [-i <tree>]
bigtrees mv     <tree> <oldbranch> <newbranch>
bigtrees diff   <oldtree> <newtree>
bigtrees dupes  <tree> [<condition>..] [-s <sortby>] [-n <nhits>] [-p <branch>] [-d <script>]
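
The "intelligent re-hashing" item in the Todo list could work roughly as follows: compare the mod times recorded in an existing tree against what is on disk now, reuse the stored hash when the mod time is unchanged, and re-hash everything else. This is only a sketch of the idea, not the project's code. The flat Map representation, the ModTime and Hash type synonyms, and the name planRehash are all assumptions.

```haskell
import qualified Data.Map.Strict as Map

type ModTime = Integer -- assumed: seconds since the epoch
type Hash    = String  -- assumed: hash digest as text

-- | Split the current file listing into hashes that can be reused
-- and paths that need to be hashed again.
planRehash
  :: Map.Map FilePath (ModTime, Hash) -- ^ entries recorded in the old tree
  -> Map.Map FilePath ModTime         -- ^ mod times seen on disk now
  -> (Map.Map FilePath Hash, [FilePath])
planRehash old current = (reused, Map.keys stale)
  where
    -- A file is unchanged only if the old tree knows it with the same mod time.
    unchanged path mtime = fmap fst (Map.lookup path old) == Just mtime
    (keep, stale) = Map.partitionWithKey unchanged current
    -- Take the stored hashes for the unchanged files.
    reused = Map.map snd (Map.intersection old keep)
```

Files that exist in the old tree but are no longer on disk simply drop out, and new files end up in the re-hash list because they have no recorded mod time. Relying on mod times alone trades some safety for speed, since a file can change without its mod time changing.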

bigtrees's People

Contributors

jefdaj

bigtrees's Issues

Doctests not being tried

...
bigtrees      > [INFO   ] [ThreadId 7] Examples: 0  Tried: 0  Errors: 0  Unexpected output: 0
bigtrees      > Test suite test-doctests passed
...
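
For reference, "Examples: 0" means the runner extracted no examples at all from the sources it scanned. Assuming the test-doctests suite uses the doctest package, it looks for lines starting with >>> inside Haddock comments, so either no such comments exist yet or the suite is not being pointed at the modules that contain them. A minimal example of the expected shape, with a hypothetical function double:

```haskell
-- | Double a number.
--
-- >>> double 2
-- 4
double :: Int -> Int
double x = x * 2
```

If a module containing a comment like this is scanned, the summary line should report at least one example.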

flake.nix devShell fails due to nixpkgs overlay bug

I have a static build working with a new flake.nix based on my old gander one, but when I add any package to devShell it fails with a message about Docopt being broken:

...
        devShell = project (executableSystemDepends ++ [
          # TODO *any* package here evaluates the broken docopt? weird
          hello
        ]);
...
...
       error: Package ‘docopt-0.7.0.8’ in /nix/store/z71lmgd0ydfnax1b13zbrls5idf1y7ak-source/pkgs/development/haskell-modules/hackage-packages.nix:92335 is marked as broken, refusing to evaluate.

       a) To temporarily allow broken packages, you can use an environment variable
          for a single invocation of the nix tools.

            $ export NIXPKGS_ALLOW_BROKEN=1

          Note: When using `nix shell`, `nix build`, `nix develop`, etc with a flake,
                then pass `--impure` in order to allow use of environment variables.

       b) For `nixos-rebuild` you can set
         { nixpkgs.config.allowBroken = true; }
       in configuration.nix to override this.

       c) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
         { allowBroken = true; }
       to ~/.config/nixpkgs/config.nix.

I think this is a manifestation of nixpkgs issue #235960: somehow the non-overridden docopt with broken = true is leaking into the final package set. And maybe also being evaluated when it shouldn't be? Will need to do some more detailed investigation.

Round-trip to dir failure

Possibly related to Unicode encoding and/or an ext4 filesystem issue?

Can reproduce in stack repl:

writeTestTreeDir "issue01example1_before" issue01example1
after <- buildProdTree False [] "./issue01example1_before"
diff (dropFileData issue01example1) after

Examples of failing test trees:

issue01example1 :: TestTree
issue01example1 =
  Dir 
    { name = Name "\xf58e6\x1057cc"
    , hash = Hash
        { unHash = "NjAxYWM0OTY1M2RkOGNm" }
    , contents =
        [ Dir 
            { name = Name "𧊯"
            , hash = Hash
                { unHash = "NWQxZTY4ZGRlNmFmNTRj" }
            , contents =
                [ File
                    { name = Name "Ԉ" 
                    , hash = Hash
                        { unHash = "ZTNiMGM0NDI5OGZjMWMx" }
                    , fileData = ""
                    }
                , File
                    { name = Name "쿏]"
                    , hash = Hash
                        { unHash = "YjdkMjUyOTZlN2JjNmE2" }
                    , fileData = "Û" 
                    }
                ]
            , nFiles = 2 
            }
        ]
    , nFiles = 2 
    }   

issue01example2 :: TestTree
issue01example2 =
  Dir
    { name = Name "\xf6847"
    , hash = Hash
        { unHash = "ODkzNTQzYjU1MjljNWFh" }
    , contents =
        [ Dir
            { name = Name "*\xfc5a1-"
            , hash = Hash
                { unHash = "OTkxNjI4OWVhNjUyYmE0" }
            , contents =
                [ File
                    { name = Name "🮡"
                    , hash = Hash
                        { unHash = "ZTNiMGM0NDI5OGZjMWMx" }
                    , fileData = ""
                    }
                , File
                    { name = Name "\xfec76_"
                    , hash = Hash
                        { unHash = "ZDA3NTJiNjBhZGIxNDhj" }
                    , fileData = "ç"
                    }
                ]
            , nFiles = 2
            }
        ]
    , nFiles = 2
    }
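
One way to narrow down the Unicode hypothesis is to look at which code points the failing names contain. The escaped characters in both examples (\xf58e6, \x1057cc, \xf6847, \xfc5a1, \xfec76) fall in the supplementary private use areas, which may or may not be relevant. The helper below is a diagnostic sketch only; describeName and suspicious are hypothetical names and are not part of the library.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, ord, toUpper)
import Numeric (showHex)

-- | List each character of a name as its code point plus Unicode general category.
describeName :: String -> [(String, GeneralCategory)]
describeName = map describe
  where
    describe c = ("U+" ++ map toUpper (showHex (ord c) ""), generalCategory c)

-- | True if the name contains private-use, unassigned, or surrogate code points,
-- which are the ones most likely to be treated differently by other tools.
suspicious :: String -> Bool
suspicious = any ((`elem` [PrivateUse, NotAssigned, Surrogate]) . generalCategory)
```

In stack repl, running describeName on the name strings from the failing trees shows which characters might deserve a closer look. Note that generalCategory depends on the Unicode tables shipped with the GHC version in use, so recently added characters can show up as NotAssigned on older compilers.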

Fix build system

I have a hodgepodge of build setups "mostly working", but none that work in all situations:

  • shell.nix + stack build builds the closest-to-static binary I've been able to achieve on macOS
  • shell.nix + stack build on Linux works, but the binary links against a lot of dynamic libs
  • flake.nix static build works on Linux, but not with the latest GHC (9.8.1), and not with my dev dependencies in the shell

I'd like to migrate everything to the flake if possible, but I need to wait for a couple of upstream issues to be resolved or figure out how to work around them.
