metastore's People

Contributors

ahmgithubahm, alphix, dfandrich, edacval, kmdawson, przemoc, rfrancoise, xkrug-bubeck

metastore's Issues

Can't change owner when current owner is numeric UID

Attempting to apply metadata to a filesystem where the current UID/GID is not in /etc/passwd results in a "getpwuid failed" error on startup, and a failure to change the ownership of the files specified in .metadata, with a "removed" message when the files were actually there.

(Situation occurred after running an rsync from a host when restoring, and the UID/GID in question restored by rsync was a different UID/GID to the original.)
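A minimal sketch of the fallback this report asks for, assuming a hypothetical helper `owner_to_string` (not an existing metastore function): when `getpwuid` finds no passwd entry, the decimal UID is used as the owner string instead of treating the lookup as a failure.

```c
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: write the owner name for uid into buf.
 * Falls back to the decimal UID when the passwd database has no entry.
 * Returns 1 if a name was found, 0 if the numeric fallback was used. */
int owner_to_string(uid_t uid, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    struct passwd *pw = getpwuid(uid);
    if (pw != NULL) {
        snprintf(buf, buflen, "%s", pw->pw_name);
        return 1;
    }
    /* No passwd entry: keep going with the numeric ID. */
    snprintf(buf, buflen, "%lu", (unsigned long)uid);
    return 0;
}
```

The same idea would apply to groups with `getgrgid`.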

Add action to dump metadata in human-readable form (`-d`, `--dump`)

Output should be similar to:

$ ls -alp --time-style=full-iso
total 4
drwxr-xr-x 1 przemoc przemoc   26 2015-09-06 19:45:34.222175902 +0200 ./
drwxr-xr-x 1 przemoc przemoc 2648 2015-09-07 16:02:10.047591753 +0200 ../
-rw-r--r-- 1 przemoc przemoc   99 2015-09-06 19:45:34.222175902 +0200 .metadata
-rw-r--r-- 1 przemoc przemoc    0 2015-09-06 19:27:41.334181566 +0200 test

Current idea is to output following tab-separated columns:

  • mode in textual form,
  • owner,
  • group,
  • modification time full-iso-style (+%F %T.%N %z),
  • path,
  • xattr name=value.

If a file has extended attributes, each attribute name and its value will be shown on a new line in the 6th column (xattr), with only the 5th column (path) filled in. The value will be shown as text in quotes if all bytes are within the 32-126 range, or as hex prefixed with 0x otherwise. Example:

$ metastore -d 
-rw-r--r--  przemoc przemoc 2015-09-06 19:27:41.334181566 +0200 ./test
                ./test  user.txt="tekst"
                ./test  user.bin=0x020100ff00
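The quoting rule for xattr values can be sketched as follows. `format_xattr_value` is a hypothetical name, and the sketch does not escape a `"` inside a printable value, since the proposal does not say how that case should be handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: render an xattr value as described in the proposal.
 * Quoted text if every byte is in 32..126, otherwise hex prefixed with 0x.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if out is too small. */
int format_xattr_value(const unsigned char *val, size_t len,
                       char *out, size_t outlen)
{
    assert(out != NULL);
    assert(val != NULL || len == 0);

    int printable = 1;
    for (size_t i = 0; i < len; i++) {
        if (val[i] < 32 || val[i] > 126) {
            printable = 0;
            break;
        }
    }

    /* Two quotes around the text, or "0x" plus two hex digits per byte. */
    size_t need = printable ? len + 2 : 2 * len + 2;
    if (outlen < need + 1)
        return -1;

    if (printable) {
        out[0] = '"';
        if (len > 0)
            memcpy(out + 1, val, len);
        out[len + 1] = '"';
    } else {
        out[0] = '0';
        out[1] = 'x';
        for (size_t i = 0; i < len; i++)
            snprintf(out + 2 + 2 * i, 3, "%02x", (unsigned)val[i]);
    }
    out[need] = '\0';
    return (int)need;
}
```

With the two attributes from the example, the bytes of `tekst` come out as `"tekst"` and the bytes 02 01 00 ff 00 come out as `0x020100ff00`.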

Path order will be undefined, but you'll be able to pipe the output to LC_ALL=C sort -t $'\t' -k5 (if you don't have bash/zsh, replace $'\t' with a literal tab in quotes).

By default, dump should use the existing metastore file (typically .metadata), as shown in the example above, but it should also be able to dump the metastore file that would be created if the save action were used with a given path. Example:

$ metastore -d .
-rw-r--r--  przemoc przemoc 2015-09-06 19:45:34.222175902 +0200 ./.metadata
drwxr-xr-x  przemoc przemoc 2015-09-06 19:45:34.222175902 +0200 ./
-rw-r--r--  przemoc przemoc 2015-09-06 19:27:41.334181566 +0200 ./test
                ./test  user.txt="tekst"
                ./test  user.bin=0x0201000000

The dump action is meant only as a debugging facility and merge-conflict helper. Never compare dumps taken with different metastore versions. Do not rely on the current output format (especially in batch scripts), because it may change in the future without prior notice.

Reading metadata file with xattrs leads to corruption in memory on 64-bit platforms

Hi.

After trying to apply xattrs using metastore -a -v -f /somedir/metafile /sourcedir, I get the following errors:

./test/test1/vzd_test:  adding xattr system.posix_acl_default
        lsetxattr failed: Bad address

strace reports this:

lsetxattr("./test/test1/vzd_test", "system.posix_acl_default", 0x148, 52, XATTR_CREATE) = -1 EFAULT (Bad address)
lsetxattr("./test/test1/vzd_test", "system.posix_acl_default", 0x148, 52, XATTR_REPLACE) = -1 EFAULT (Bad address)

If you wonder what the second lsetxattr() with XATTR_REPLACE is doing after the first one: I patched the original metastore.c to add a second lsetxattr call in case the first one fails, but it doesn't help.

File system is ext4, mounted with both acl and user_xattr:

/dev/sda1 on /mnt/disk type ext4 (rw,user_xattr,acl)

Apply is being done on the same filesystem as save.

I am using the latest git clone.

Regards,
Uros

Parallelize store action

Judging from the system monitor, metastore -s uses only one thread. I'm naively assuming that at some point it has to walk a file and directory tree and visit its nodes recursively or iteratively. I propose putting the file paths of a directory, in groups of <= 100, into queues from which n threads can poll and create the file output, which can then be written into a large buffer (to avoid an I/O bottleneck). If the output needs to be ordered, each thread needs a sequence number and the others must not proceed until the lowest has finished (the threads do nothing but stat calls, which should put a fairly equal load on each thread).

Release new version

As the commit history shows, there is barely any development going on, but there have been some bugfixes since the last release, v1.1.2, from 5.5 years ago (the NEWS file mentions them), so it would be good to release a new version, v1.1.3. I feel bad that it didn't happen earlier.

To make it happen, I first need to:

  • find my GPG keys and subkeys (which hopefully are somewhere on some disk or backup of some disk on another disk...)
  • recall passwords that I used for the keys
  • relearn gpg on:
    • how to import keys into some of my currently used machines
    • create new signing subkey (old one surely expired)
  • reach to some folks to sign new subkey

It was a similar struggle in the past. GPG has always seemed a bother to me. Maybe it's just me, or maybe other folks who use it rarely (i.e. not even monthly) can relate.

Side note:
The truth is that I don't really use metastore myself (and haven't for many years), which is why the project has not seen much love beyond bug fixes.

Change default installation prefix to /usr/local

The current default prefix messes up binaries installed by package management (e.g. on Debian-based systems). The prefix should either be /usr/local or, even better, be configurable.

Consider switching to autoconf. It might seem overkill, but it avoids writing a configure script now that will later be replaced by autoconf anyway, or that will keep incurring maintenance costs.

When switching to autoconf #23 has to be reviewed.

New metadata file format (textual)

It's desirable to introduce a new metadata file format that is human-friendly and merge-friendly (when used in a VCS like git), so making it textual is an obvious choice. The format should be compact (no XML!), but not too compact. Below is the current version of my draft amendment.

Data types
----------
SSTRING - `;`-terminated string with special characters (`\n`, etc.)
          and semicolon escaped

" v001t\n\n" file format
------------------------
HEADER
N * PENTRY

PENTRY format
-------------
SSTRING    - Path
BSTRING(1) - Parameter:
             "m" - mode
             "o" - owner
             "t" - mtime
             "x" - xattr
BSTRING(1) - "="
SSTRING    - Parameter value
BSTRING(1) - "\n"

Parameter value formats
-----------------------
mode  - octal mode
owner - "USER:GROUP"
mtime - UTC date+time in basic ISO8601 format (`%Y%m%dT%H%M%S.%NZ`)
xattr - "KEY1=VALUE1[,KEY2=VALUE2...]"
        (keys and values have comma and equals sign characters escaped)

Example:

MeTaSt00r3 v001t

metastore.c;m=644;
metastore.c;o=przemoc:users;
metastore.c;t=20140302T162230.123456789Z;
metastore.c;x=;
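As a rough illustration of the mtime value format in the draft, here is a sketch that formats a `struct timespec` as UTC in the `%Y%m%dT%H%M%S.%NZ` layout. `format_mtime` is a hypothetical name. Standard `strftime` has no `%N` conversion, so the nanoseconds are appended separately.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format ts as UTC in basic ISO 8601 form with
 * nanoseconds, e.g. 20140302T162230.123456789Z.
 * Returns 0 on success, -1 on failure or if out is too small. */
int format_mtime(const struct timespec *ts, char *out, size_t outlen)
{
    assert(ts != NULL && out != NULL);

    struct tm tm;
    char date[32];

    if (gmtime_r(&ts->tv_sec, &tm) == NULL)
        return -1;
    if (strftime(date, sizeof date, "%Y%m%dT%H%M%S", &tm) == 0)
        return -1;

    /* Nine zero-padded digits for the nanosecond part, then the Z suffix. */
    int n = snprintf(out, outlen, "%s.%09ldZ", date, (long)ts->tv_nsec);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Because the value is always UTC and fixed-width, two timestamps can be compared as plain strings, which fits the goal of keeping every line meaningful on its own.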

Why not put all parameters in one line? Well, it would be more space-efficient, sure, but also more error-prone and less merge-friendly. So I say no for all file parameters in one line.

Why not put file name only once followed by parameters, each one in its own line? Because we lose contextlessness of each line then, and meaningful line without context is a really nice asset that I would like to have in such new format, for all your merge, grep, etc. intents and purposes.

OTOH, support for gzipping can still be considered, I think. Git has textconv, so the diff case can be handled well. For the (hopefully rare) merge case, one can gunzip the file, fix it and re-gzip it. Or metastore could do the g(un)zipping conversion itself (it depends on what would be gzipped: the whole metastore file or only the data after the header?). Space savings from gzipping could be substantial for repositories with a lot of files. Maybe disk space usage would then even be similar to the old format? Still, these merges, grr... If only git supported bidirectional textconv... :-)


Backward compatibility dictates that such a new metadata format most likely won't be the default one. There is a growing need for a metastore configuration file, and I'll add a new issue for that.

Remove libattr dependency

While building metastore (on Fedora 20), I found that the include line for xattr.h triggered a "no such file or directory" error. Looking through a number of man pages and the book "The Linux Programming Interface" by Michael Kerrisk, I found an alternate location, which worked properly for me.

Additionally, I needed to add an include for errno.h.

Both of these files are installed by the glibc-headers-* packages.

I couldn't determine on what distros the location <attr/xattr.h> is correct.

Thanks for your attention.

/ken

The patch is:

diff --git a/metaentry.c b/metaentry.c
index b0ea69d..02e5bb8 100644
--- a/metaentry.c
+++ b/metaentry.c
@@ -25,13 +25,14 @@
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <unistd.h>
-#include <attr/xattr.h>
+#include <sys/xattr.h>
 #include <limits.h>
 #include <dirent.h>
 #include <sys/mman.h>
 #include <utime.h>
 #include <fcntl.h>
 #include <stdint.h>
+#include <errno.h>

 #include "metastore.h"
 #include "metaentry.h"
diff --git a/metastore.c b/metastore.c
index de1bf07..0a49e3f 100644
--- a/metastore.c
+++ b/metastore.c
@@ -23,10 +23,11 @@
 #include <sys/stat.h>
 #include <getopt.h>
 #include <utime.h>
-#include <attr/xattr.h>
+#include <sys/xattr.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
+#include <errno.h>

 #include "metastore.h"
 #include "settings.h"

Fails to build on Arch Linux

Error:

gcc -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong --param=ssp-buffer-size=4 -g -Wall -pedantic -std=c99 -D_FILE_OFFSET_BITS=64 -O2 -Wl,-O1,--sort-common,--as-needed,-z,relro -lbsd -o metastore utils.o metastore.o metaentry.o
metaentry.o: In function `mentries_dump':
~/metastore/metaentry.c:643: undefined reference to `strmode'

This was caused by the default LDFLAGS="-Wl,--as-needed" on Arch Linux, Gentoo and, I guess, many other modern distributions.

According to https://wiki.gentoo.org/wiki/Project:Quality_Assurance/As-needed#Importance_of_linking_order, libraries must be passed to the linker after the object files and the static archives.

Introduce OS-agnostic API internally to support different POSIX systems

We have to avoid preprocessor macro spaghetti, such as the kind needed to cover the various OS approaches to extended attributes.
We already have NO_XATTR for systems that don't support them, but for systems that support them differently than Linux, a common OS-agnostic API should be introduced internally, with OS-specific implementations for different POSIX systems.


Let's not tackle non-POSIX systems like Windows, as that would require even bigger changes that may not really be suitable for metastore.

There have been some efforts to make metastore work on systems other than Linux.

  • macOS
    #48, see also comments in #55
  • FreeBSD
    #60, see also discussion in #54
  • OpenBSD
    #43, see also discussion in #55
  • Cygwin (Windows)
    #61, see also discussion in #56

feature request: text file format support

A text file format is needed so I can track metadata changes the same way as the other files in a git repository.
I'm willing to contribute.
Why was a binary file format chosen in the first place? Should we add the text file format as an alternative, or just replace the original binary file format and eliminate the "-c" action?

Mac support

Curious if this is as intended. Compiling on a Mac gives the following:

btmacpro:metastore btorpey$ make
sed: 1: "/^1:/s,,,;/:/{s,[^:]*:, ...": bad flag in substitute command: '}'
gcc  -g -Wall -pedantic -std=c99 -D_FILE_OFFSET_BITS=64 -O2 -DMETASTORE_VER="\"\""  -o metaentry.o -c ./src/metaentry.c
./src/metaentry.c:35:10: fatal error: 'bsd/string.h' file not found
#include <bsd/string.h>
         ^
1 error generated.
make: *** [metaentry.o] Error 1
btmacpro:metastore btorpey$

Thanks in advance for any help. Converting to git and can't live without mtime!

Test and fix example git hooks

As I mostly used metastore outside of git myself, I didn't catch it, but apparently the pre-commit hook's git add .metadata doesn't do what one might think it does, i.e. this example script is in fact broken with non-ancient git versions. It has to be thoroughly tested and fixed along the way.

Managing Git workflows in large repos

I have been looking at your tool and several others, and none of them seems able to handle the following scenario. Suppose you have a large repo with lots of developers working on some of the same files. The file metadata is used to control the build process, which is lengthy due to the size of the project; the metadata is limited to file access and modification times.
How do you handle the file metadata when a developer pulls the official repo or merges his/her code with the official repo, and the official repo contains changes to the same files that he/she is working on? Obviously, you want to preserve the file changes, so a merge of the file contents will need to occur, but what should happen to the metadata? What if the metadata in the official repo points to a time earlier than the metadata for the same file on the developer's node? What if the opposite is true? There are several scenarios at play here, but I think the two described above are the main ones.

Can metastore handle such scenarios?

feature request: add option to ignore arbitrary file

metastore ignores ".git", which I think is obviously not enough.
The problem I encounter is that metastore unnecessarily records the metadata of the .metadata file itself.
I'd like to add an option "-x, --exclude=PAT" which accepts a pattern just like diff does.
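
A minimal sketch of how such an exclude check might look, using POSIX `fnmatch`. `is_excluded` is a hypothetical name, and the sketch assumes patterns are matched against the file's base name, which is how diff's `--exclude` behaves.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if name (a path's final component) matches
 * any of the shell-style patterns, 0 otherwise. */
int is_excluded(const char *name, const char *const *patterns, size_t npatterns)
{
    assert(name != NULL);
    assert(patterns != NULL || npatterns == 0);

    for (size_t i = 0; i < npatterns; i++) {
        /* fnmatch returns 0 when the pattern matches the string. */
        if (fnmatch(patterns[i], name, 0) == 0)
            return 1;
    }
    return 0;
}
```

With the patterns `.git`, `.metadata` and `*.o`, the names `.metadata` and `utils.o` are excluded while `metastore.c` is not.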

Cygwin support

When I try to build metastore on Cygwin, it shows the following error.

gcc  -g -Wall -pedantic -std=c99 -D_FILE_OFFSET_BITS=64 -O2  -o metaentry.o -c ./src/metaentry.c
./src/metaentry.c:35:24: fatal error: bsd/string.h: No such file or directory
 #include <bsd/string.h>
                        ^
compilation terminated.
Makefile:56: recipe for target 'metaentry.o' failed
make: *** [metaentry.o] Error 1

I think there could be a more generic way that uses string.h instead of bsd/string.h?

Improve metastore for git-related workflows

This issue is for discussing what needs to be improved in metastore to make git users happier.

  • Textual metadata format (go to issue #6)
  • Remove empty directories (go to issue #10)
  • More documentation with exemplary hooks? (go to issue #33)

Introduce configuration file

Configuration files will be optional and they will be read in the following order:

  • /etc/metastore.conf (system options)
  • $HOME/.metastore.conf (global options)
  • $CWD/.metastore.conf (local options)

The format will be most likely INI-like (key = value), but without any sections.
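A sketch of parsing one line of such a section-less "key = value" file. `parse_conf_line` and `copy_trimmed` are hypothetical names, and treating `#` as a comment marker is an assumption that the proposal does not specify.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Copy [begin, end) into dst with surrounding whitespace trimmed.
 * Returns 0 on success, -1 if the result does not fit. */
static int copy_trimmed(const char *begin, const char *end,
                        char *dst, size_t dstlen)
{
    while (begin < end && isspace((unsigned char)*begin))
        begin++;
    while (end > begin && isspace((unsigned char)end[-1]))
        end--;
    size_t n = (size_t)(end - begin);
    if (n + 1 > dstlen)
        return -1;
    memcpy(dst, begin, n);
    dst[n] = '\0';
    return 0;
}

/* Hypothetical helper: split one "key = value" line.
 * Returns 1 on success, 0 for blank or comment lines
 * (assumption: '#' starts a comment), -1 for malformed lines. */
int parse_conf_line(const char *line, char *key, size_t keylen,
                    char *val, size_t vallen)
{
    assert(line != NULL && key != NULL && val != NULL);

    const char *p = line;
    while (isspace((unsigned char)*p))
        p++;
    if (*p == '\0' || *p == '#')
        return 0;

    const char *eq = strchr(p, '=');
    if (eq == NULL)
        return -1;
    if (copy_trimmed(p, eq, key, keylen) != 0 || key[0] == '\0')
        return -1;
    if (copy_trimmed(eq + 1, eq + 1 + strlen(eq + 1), val, vallen) != 0)
        return -1;
    return 1;
}
```

Reading the three files in the listed order and letting later values overwrite earlier ones would give local options the final say.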

Options theoretically required to support current features:

  • verbosity = INT - verbosity level (0 by default)
  • mtime = BOOL - should mtime be considered when applying or diffing metadata? (no by default)
  • empty-dirs = BOOL - recreate missing empty directories (no by default)
  • git = BOOL - do not omit .git directories (no by default)
  • file = STR - metadata file (.metadata by default)

Options required to support future features:

  • format = STR - metadata file format (0 by default, v001t for new one)
  • format-convert = BOOL - should metadata file be converted to chosen format even if present file uses other one? (no by default)
  • exclude = STR - exclude dirs/files that match pattern (.git by default)
  • exclude-from = STR - exclude dirs/files that match any pattern in file
  • exclude-reset = BOOL - removes all excludes defined earlier using exclude when true
  • exclude-from-reset = BOOL - removes all excludes defined earlier using exclude-from when true (no-op when false)
  • remove-empty-dirs = BOOL - remove empty directories not present in applied metadata file (no by default)
  • work-on-parameters = STR - parameters that should be considered when diffing or applying metadata ("mox" by default)

Some of the options theoretically required to support current features would be better left out entirely. Once exclude is available, there will be no need for git, as keeping it would only complicate things (ignoring .git directories being excluded...). Similarly, work-on-parameters is much nicer than a specific mtime option.

Reading extended attributes values from metadata file is broken for non-textual values

While implementing and testing the new dump action, I noticed that although non-textual values in extended attributes were properly stored in the .metadata file, retrieving them from it was simply broken, i.e. anything beyond the first null byte was zeroed.

A small quantum of solace is the fact that metastore users apparently rarely use extended attributes, or at least rarely with non-textual values, because otherwise they would surely have reported such a crucial bug.
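
The report does not name the faulty code, so the following is only an illustration of the general pitfall. String functions treat the first null byte as the end of the data, and `strncpy` in particular zero-fills the rest of the destination, which produces exactly this kind of symptom. A length-based copy keeps every byte. `dup_xattr_value` is a hypothetical helper.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate an xattr value of known length.
 * memcpy copies all len bytes, including any after an embedded NUL.
 * Returns a malloc'd buffer the caller must free, or NULL on failure. */
unsigned char *dup_xattr_value(const unsigned char *val, size_t len)
{
    assert(val != NULL || len == 0);

    /* Allocate at least one byte so an empty value still gets a buffer. */
    unsigned char *copy = malloc(len > 0 ? len : 1);
    if (copy != NULL && len > 0)
        memcpy(copy, val, len);
    return copy;
}
```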

Becoming official metastore repository (blessed by David Härdeman)

Some of you feel unsure about metastore because it's no longer accessible from the site of David Härdeman, the original author and maintainer of this useful tool. David no longer works on metastore. My repository is an unofficial continuation, as I call it, but I have already contacted David about the possibility of ceding his maintainership of metastore and making my repository the official one. His response was positive. Such a thing should be done publicly, though, and it will be, obviously with the help of git. When that happens, the text about the unofficial continuation will be removed from the project description and this issue will be closed.

(It's not really an issue, rather kind of note for those curious about metastore.)

Let --quiet make metastore completely quiet

The --quiet option right now reduces, but does not eliminate, the output of messages. It can be useful in some contexts to make metastore completely silent. If two --quiet options are found on the command-line, only errors should be output, and adding a third --quiet should make it completely silent (a.k.a. "quiet").
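
One way the counting could work, sketched with `getopt_long`. The enum names and `quiet_level` are hypothetical, and `-q` as the short form is an assumption of this sketch.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical output levels, from most to least talkative. */
enum out_level { OUT_NORMAL, OUT_REDUCED, OUT_ERRORS_ONLY, OUT_SILENT };

/* Count -q/--quiet occurrences and map them to an output level.
 * Options other than -q/--quiet are ignored by this sketch. */
enum out_level quiet_level(int argc, char *argv[])
{
    static const struct option longopts[] = {
        { "quiet", no_argument, NULL, 'q' },
        { NULL, 0, NULL, 0 }
    };
    int count = 0;
    int c;

    assert(argv != NULL);
    optind = 1;  /* restart scanning from the first argument */
    opterr = 0;  /* do not print messages about unknown options */

    while ((c = getopt_long(argc, argv, "q", longopts, NULL)) != -1) {
        if (c == 'q')
            count++;
    }

    /* 0 = normal, 1 = reduced, 2 = errors only, 3 or more = silent. */
    if (count >= 3)
        return OUT_SILENT;
    return (enum out_level)count;
}
```

Each message would then carry its own level and be printed only if the current setting allows it.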

New release

Please make a new release of metastore soon. Having numbered releases is very useful for packaging a program.

Do not skip nanoseconds when applying mtime

Hello!

I'm using this tool with git to synchronise a set of files whose timestamps I'd like to preserve.

It seems that metastore writes the .metadata file with nanoseconds, but when applying it only restores the timestamp to whole seconds. As a result, the nanoseconds are lost when synchronising back and forth.

Could the tool be adjusted so that there's a choice over seconds vs. fractional seconds?
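
For reference, POSIX `utimensat` accepts a `struct timespec` and can therefore restore the fractional part. A minimal sketch, with `apply_mtime_ns` as a hypothetical helper rather than metastore's actual code:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>      /* AT_FDCWD, AT_SYMLINK_NOFOLLOW */
#include <sys/stat.h>   /* utimensat, UTIME_OMIT */
#include <time.h>

/* Hypothetical helper: set the mtime of path with nanosecond resolution.
 * The access time is left untouched and symlinks are not followed.
 * Returns 0 on success, -1 on failure with errno set. */
int apply_mtime_ns(const char *path, time_t sec, long nsec)
{
    assert(path != NULL);
    assert(nsec >= 0 && nsec < 1000000000L);

    struct timespec times[2];
    times[0].tv_sec = 0;
    times[0].tv_nsec = UTIME_OMIT;  /* leave atime as it is */
    times[1].tv_sec = sec;
    times[1].tv_nsec = nsec;

    return utimensat(AT_FDCWD, path, times, AT_SYMLINK_NOFOLLOW);
}
```

Whether the stored fraction survives also depends on the filesystem's timestamp granularity.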

Record attributes only for specified dirs/files

Hi, I'd like to propose adding a command-line option to metastore so that it records metadata only for the files and directories specified by the user. This would be very helpful for Git workflows, where the only files/dirs we want to track and eventually record in the Git repo are those under Git control. Currently the metadata of ALL files found in the current directory and its subdirs is recorded, which makes much of it superfluous. This feature is complementary to #8, but is more specifically relevant to Git.

FreeBSD support

While investigating how general the currently suggested fix for issue #21 is, I discovered that it would be pretty straightforward to port it to FreeBSD. That might be a nice enhancement.
