Comments (6)

joeyh commented on May 31, 2024

If I needed this today, I would try to find a way to write a preferred content expression for my local repository that covered the files I wanted to have present. Then I could use git annex get --auto to get new versions of them after syncing. Or, I might try to reorganize my repo, so I worked on a branch that contained only the files I wanted to have present locally, and then I could git annex get.

It would be nice if there was a haveoldversion(*) available in the preferred content language. But, it seems it would be quite expensive. It would require running git log on each file, to get past versions of the key used by the file. So I don't think that querying git for the necessary information on the fly is feasible.

What might be feasible is adding a mapping from a filename to a flag bit. git annex get would set the flag, and git annex drop would unset it. Other updates to the tree, including git annex sync, would not affect it.

The mapping could, I suppose, just be some piece of filesystem metadata for the file, or a .git/annex/blah/path/to/file, but these are pretty hacky approaches. I'm looking at adding databases to git-annex anyway, for http://git-annex.branchable.com/design/caching_database/ , although I think that all the other use cases are of a database of information about a key, not about a file.
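
A minimal sketch of the filename-to-flag mapping described above, to make the intended semantics concrete. Everything here is hypothetical: the class `PresenceFlags`, its method names, and the in-memory set are illustration only and do not correspond to any git-annex code or database schema.

```python
class PresenceFlags:
    """Hypothetical per-file flag store (not part of git-annex)."""

    def __init__(self):
        # Paths whose flag bit is currently set.
        self._flagged = set()

    def on_get(self, path):
        # A "get" of the file sets its flag.
        self._flagged.add(path)

    def on_drop(self, path):
        # A "drop" of the file unsets its flag.
        self._flagged.discard(path)

    def on_sync(self, changed_paths):
        # Other tree updates, including a sync, leave the flags untouched.
        pass

    def to_fetch(self, present_paths):
        # Flagged files whose current content is not present locally.
        return sorted(self._flagged - set(present_paths))


flags = PresenceFlags()
flags.on_get("data/a.dat")
flags.on_get("data/b.dat")
flags.on_drop("data/b.dat")
flags.on_sync(["data/a.dat"])  # new version arrives; content not yet present
print(flags.to_fetch(present_paths=[]))  # ['data/a.dat']
```

Because the flag is keyed by filename rather than by key, it survives a change of the file's key after a sync, which is the property the discussion is after.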

from datalad.

joeyh commented on May 31, 2024

Thinking about this some more, a haveoldversion(*) would only be stable if the source of the data was originally git. Otherwise, git-annex in one repository would not be able to tell if another repository that uses that expression wanted a file or not.

Retrieving the data from git and caching it could work. Fits in more with the caching database plan too. git-annex commands like get/drop that update the location log could also update the cache, which would avoid expensive cache misses sometimes. But often enough for this to be reasonably fast? The cache should ideally also work when checking the preferred content of remotes. Maybe the filename to old version of key mapping would be the thing cached (but branches complicate this).
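
As a rough illustration of the cached "filename to old version of key" mapping, the following sketch assumes an in-memory cache and ignores branches entirely. The names `OldVersionCache`, `record_change`, and `have_old_version` are made up for this example, and the key strings are placeholders, not real git-annex keys.

```python
from collections import defaultdict


class OldVersionCache:
    """Hypothetical cache of path -> keys the path used in the past."""

    def __init__(self):
        self._old_keys = defaultdict(set)

    def record_change(self, path, old_key):
        # Called when a path starts pointing at a new key; remembers the old one.
        self._old_keys[path].add(old_key)

    def have_old_version(self, path, present_keys):
        # True if the content of any earlier key of this path is present.
        return bool(self._old_keys.get(path, set()) & set(present_keys))


cache = OldVersionCache()
cache.record_change("data/a.dat", "key-v1")
print(cache.have_old_version("data/a.dat", present_keys={"key-v1"}))
```

A cache miss for a path would still require the expensive git history lookup, so the open question about whether hits happen often enough is unchanged by this sketch.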

yarikoptic commented on May 31, 2024

On Thu, 25 Sep 2014, Joey Hess wrote:

> If I needed this today, I would try to find a way to write a preferred
> content expression for my local repository that covered the files I wanted
> to have present. Then I could use git annex get --auto to get new versions
> of them after syncing. Or, I might try to reorganize my repo, so I worked
> on a branch that contained only the files I wanted to have present
> locally, and then I could git annex get.

This came up just as a prospective use case -- no immediate (i.e. today)
resolution is really needed. But thanks for outlining workarounds -- I
didn't know about the --auto option and its behavior for the get operation.

> It would be nice if there was a haveoldversion(*) available in the
> preferred content language. But, it seems it would be quite expensive. It
> would require running git log on each file, to get past versions of the
> key used by the file. So I don't think that querying git for the necessary
> information on the fly is feasible.

But what if there were a record of the last treeish for a worktree
branch, together with the corresponding state of the git-annex branch, at
which you know the repository was in the "desired state"? Then upon the
"upgrade" command (via whatever actual command it would happen), annex
would check which files have changed their availability (i.e. for which
the load disappeared) -- that would not require history traversal -- and
make them available, consequently updating both of those references to
the new treeishes for that branch and git-annex. Thinking about it though,
I see that it might interfere with a user-invoked 'drop' command... but
maybe then only the git-annex branch component of those pairs would get
updated, and that would "resolve" it?

just exercising this idea -- if faulty -- just state so and ignore ;)
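
One way to picture the snapshot comparison proposed above, as a sketch under stated assumptions: each tree is modeled as a plain dict from path to key, and `present_keys` is the set of keys whose content is available locally. The function name `files_to_reobtain` and the sample keys are hypothetical; nothing here reflects how git-annex actually stores or compares trees.

```python
def files_to_reobtain(old_tree, new_tree, present_keys):
    """Return paths whose content was present under the recorded tree
    but is missing under the new tree, without walking any history."""
    result = []
    for path, new_key in new_tree.items():
        old_key = old_tree.get(path)
        if old_key is None or old_key == new_key:
            continue  # new file, or unchanged since the recorded state
        if old_key in present_keys and new_key not in present_keys:
            result.append(path)
    return sorted(result)


old_tree = {"a.dat": "k1", "b.dat": "k2", "c.dat": "k3"}
new_tree = {"a.dat": "k1-new", "b.dat": "k2", "c.dat": "k3-new"}
present_keys = {"k1", "k2"}  # c.dat was never fetched locally

print(files_to_reobtain(old_tree, new_tree, present_keys))  # ['a.dat']
```

Only the two recorded snapshots are compared, which is the point of the proposal. The sketch does not address the interaction with a user-invoked drop raised in the same paragraph.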

> What might be feasible is adding a mapping from a filename to a flag bit.
> git annex get would set the flag, and git annex drop would unset it. Other
> updates to the tree, including git annex sync, would not affect it.

Yep -- sounds good to me ;)

> The mapping could, I suppose, just be some piece of filesystem metadata for
> the file, or a .git/annex/blah/path/to/file, but these are pretty hacky
> approaches. I'm looking at adding databases to git-annex anyway, for
> http://git-annex.branchable.com/design/caching_database/ , although I
> think that all the other use cases are of a database of information about
> a key, not about a file.

Thanks -- I will check it out.

Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Research Scientist, Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik

joeyh commented on May 31, 2024

Yes, there is probably the possibility of putting this into git annex sync --content as a special case, and so optimising it. Using preferred content makes the feature much more generic and widely usable though.

joeyh commented on May 31, 2024

Also being discussed at http://git-annex.branchable.com/bugs/present_files__47__directories_are_dropped_after_a_sync/

mih commented on May 31, 2024

I will close this old issue now. We do have update --reobtain-data.
