treeverse / lakefs
lakeFS - Data version control for your data lake | Git for data
Home Page: https://docs.lakefs.io
License: Apache License 2.0
At the very least:
Possibly required for release: these clients or servers will be involved in users' next upgrades.
Sig v2 doesn't ignore the port on the request when comparing it to the bare domain.
Note that it also panics; the panic should be removed or replaced with an error.
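One way to make the comparison port-insensitive, as a rough sketch only: strip the port from the request host before comparing. The function names stripPort and sameDomain are made up for illustration and are not taken from the lakeFS gateway code; malformed input is returned as-is instead of panicking.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// stripPort returns hostport without a trailing ":port", if one is present.
// Hypothetical helper, not part of lakeFS.
func stripPort(hostport string) string {
	host, _, err := net.SplitHostPort(hostport)
	if err != nil {
		// No port, or not in host:port form: use the value unchanged.
		return hostport
	}
	return host
}

// sameDomain reports whether the request host matches the bare domain,
// ignoring any port and letter case.
func sameDomain(requestHost, bareDomain string) bool {
	return strings.EqualFold(stripPort(requestHost), stripPort(bareDomain))
}

func main() {
	fmt.Println(sameDomain("s3.example.com:8000", "s3.example.com"))
	fmt.Println(sameDomain("other.example.com:8000", "s3.example.com"))
}
```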
Using go test -short will skip the gateway playback tests.
This enables a faster basic test run without downloading and replaying the playback data.
It's currently a standalone endpoint outside of Swagger and is therefore hidden.
Move that endpoint into Swagger and document it properly.
NO SUPPORT FOR
this should be added to the Deployment documentation as well.
Originally posted by @ozkatz in https://github.com/treeverse/lakeFS/pull/230/files
Retention uses single large queries. These can hurt DB performance during retention. Break queries up into smaller chunks to improve performance and let other transactions continue. (It is a sound transformation!)
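A rough sketch of the chunking idea, assuming the work can be expressed as a list of row IDs. The names chunk and processInBatches are hypothetical, and the apply callback stands in for one short DB transaction; none of this is the actual retention code.

```go
package main

import "fmt"

// chunk splits ids into consecutive batches of at most size elements.
// A size below 1 is treated as 1.
func chunk(ids []int64, size int) [][]int64 {
	if size < 1 {
		size = 1
	}
	var batches [][]int64
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		batches = append(batches, ids[start:end])
	}
	return batches
}

// processInBatches calls apply once per batch, so each call can run in its
// own short transaction. It stops at the first error and returns the number
// of ids processed before that error.
func processInBatches(ids []int64, size int, apply func(batch []int64) error) (int, error) {
	done := 0
	for _, batch := range chunk(ids, size) {
		if err := apply(batch); err != nil {
			return done, fmt.Errorf("batch starting at id %d: %w", batch[0], err)
		}
		done += len(batch)
	}
	return done, nil
}

func main() {
	ids := []int64{1, 2, 3, 4, 5, 6, 7}
	n, err := processInBatches(ids, 3, func(batch []int64) error {
		// Hypothetical stand-in for one short DB transaction.
		fmt.Println("expiring", batch)
		return nil
	})
	fmt.Println(n, err)
}
```

Other transactions get a chance to run between batches; the batch size would need tuning against the real DB.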
In a developer environment, allow the developer to keep the DB alive after the tests finish. Also allow running the tests multiple times on the same DB container.
The capability to roll back a branch's committed changes by applying the reverse changes,
optionally for a specific range of changes/commits.
SignatureDoesNotMatch due to unicode encoding in signature.
After #405, there are numerous "interesting" failure modes for the role used for retention batch tagging:
AccessDenied instead of NoSuchKey.
That makes reports useless both for users and for (future) object_dedup cleanups. Diagnose and report all of these (assuming of course that the running user has sufficient permissions...).
In S3 web interface, the path is part of the URL so the slash separator is preserved, making the path easy to copy (see photo).
In our UI, the path is given as a query param, so slashes are escaped.
Perhaps add a copy button for the path. We need to decide which path type to copy (S3 or lakeFS) and whether it will include the repo and branch names.
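To illustrate why the path is hard to copy from the URL, here is a small sketch in Go, using it only as a neutral language since the UI itself is not Go. The query parameter name "path" and the example path are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// asQueryParam shows how a path looks when carried as a query parameter.
// The parameter name "path" is an assumption for this sketch.
func asQueryParam(path string) string {
	v := url.Values{}
	v.Set("path", path)
	return v.Encode()
}

// copyablePath recovers the readable path from an encoded query string,
// which is roughly what a copy button would need to put on the clipboard.
func copyablePath(rawQuery string) (string, error) {
	v, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}
	return v.Get("path"), nil
}

func main() {
	q := asQueryParam("collections/shows/titles.csv")
	fmt.Println(q) // path=collections%2Fshows%2Ftitles.csv
	p, err := copyablePath(q)
	fmt.Println(p, err)
}
```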
BA: basic configuration and makefile that uses golangci-lint was added
The Architecture page in the docs currently shows a pre-alpha warning, which is no longer true.
In the web UI, in the commits tab, the commits are not sorted by time.
The command lakefs init should be renamed to lakefs setup, to keep consistency with the API.
I would suggest tracking skipped repos due to errors - if one of them failed, return a non-zero return code. Otherwise it'll get logged into the void and probably won't ever be noticed by anyone until the S3 bills start racking up...
Originally posted by @ozkatz in #309 (comment)
We include a full copy of swagger-ui in docs/assets/js. This increases size and reduces performance.
DiffUncommitted should support a prefix. This will let us get the diff by level.
The response should include a flag saying whether there was a change at the branch level and, under the prefix, return the diff by level.
It should take adds/deletes at the folder level into account.
Right now it's a huge doc - split it into steps.
Also add Kubernetes as an installation method.
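One possible reading of "diff by level", sketched under assumptions: given the changed paths, report only the entries directly under the prefix, collapsing anything deeper into its folder. diffByLevel is a hypothetical helper with "/" assumed as the delimiter; it is not the DiffUncommitted API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// diffByLevel collapses changed paths to the entries directly under prefix.
// Paths nested deeper are reported once as their containing folder, with a
// trailing "/". The result is sorted. Hypothetical helper, not lakeFS code.
func diffByLevel(changed []string, prefix string) []string {
	seen := map[string]bool{}
	for _, p := range changed {
		if !strings.HasPrefix(p, prefix) || len(p) == len(prefix) {
			continue
		}
		rest := p[len(prefix):]
		entry := p
		if i := strings.Index(rest, "/"); i >= 0 {
			entry = prefix + rest[:i+1]
		}
		seen[entry] = true
	}
	out := make([]string, 0, len(seen))
	for e := range seen {
		out = append(out, e)
	}
	sort.Strings(out)
	return out
}

func main() {
	changed := []string{"a/b/1.csv", "a/b/2.csv", "a/c.csv", "x/y.csv"}
	level := diffByLevel(changed, "a/")
	// The "was there a change under the prefix" flag is len(level) > 0.
	fmt.Println(len(level) > 0, level) // true [a/b/ a/c.csv]
}
```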
Postponed until S3 based retention is in-place.
On the community page. Click it, and you'll see :-)
Detected in #256. Policies are printed with lots of %v, which ends up throwing addresses at the user.
E.g. policies create:
ariels@ariels:~/Dev/lakeFS$ echo '{"id": "foo", "statement": [{"action": ["auth:*"], "effect": "Allow", "resource": "arn:::::"}]}' | ./lakectl auth policies create --policy-document -
Policy created successfully.
ID: 0xc00043c020
Creation Date: 2020-07-08 12:16:26 +0300 IDT
Statements:
+--------------+-------------------------------+-------------+--------------+--------------+---------+
| POLICY ID | CREATION DATE | STATEMENT # | RESOURCE | EFFECT | ACTIONS |
+--------------+-------------------------------+-------------+--------------+--------------+---------+
| 0xc0004d2590 | 1970-01-01 02:00:00 +0200 IST | 0 | 0xc00043c040 | 0xc00043c030 | auth:* |
+--------------+-------------------------------+-------------+--------------+--------------+---------+
E.g. policies list:
ariels@ariels:~/Dev/lakeFS$ ./lakectl auth policies list
+--------------+-------------------------------+-------------+--------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| POLICY ID | CREATION DATE | STATEMENT # | RESOURCE | EFFECT | ACTIONS |
+--------------+-------------------------------+-------------+--------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 0xc0004e3d60 | 2020-07-07 18:28:24 +0300 IDT | 0 | 0xc0004e3d80 | 0xc0004e3d70 | auth:* |
| 0xc0004e3d90 | 2020-07-07 18:28:24 +0300 IDT | 0 | 0xc0004e3dc0 | 0xc0004e3db0 | auth:CreateCredentials, auth:DeleteCredentials, auth:ListCredentials, auth:ReadCredentials |
| 0xc0004e3dd0 | 2020-07-07 18:28:24 +0300 IDT | 0 | 0xc0004e3df0 | 0xc0004e3de0 | fs:* |
| 0xc0004e3e00 | 2020-07-07 18:28:24 +0300 IDT | 0 | 0xc0004e3e20 | 0xc0004e3e10 | fs:List*, fs:Read* |
| 0xc0004e3e30 | 2020-07-07 18:28:24 +0300 IDT | 0 | 0xc0004e3e50 | 0xc0004e3e40 | fs:ListRepositories, fs:ReadRepository, fs:ReadCommit, fs:ListBranches, fs:ListObjects, fs:ReadObject, fs:WriteObject, fs:DeleteObject, fs:RevertBranch, fs:ReadBranch, fs:CreateBranch, fs:DeleteBranch, fs:CreateCommit |
| 0xc0004e3e60 | 2020-07-07 18:28:24 +0300 IDT | 0 | 0xc0004e3e80 | 0xc0004e3e70 | retention:* |
| 0xc0004e3e90 | 2020-07-07 18:28:24 +0300 IDT | 0 | 0xc0004e3eb0 | 0xc0004e3ea0 | retention:Get* |
| 0xc0004e3ec0 | 2020-07-08 12:16:26 +0300 IDT | 0 | 0xc0004e3ee0 | 0xc0004e3ed0 | auth:* |
+--------------+-------------------------------+-------------+--------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
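The output above is what %v produces for pointer values. A small sketch of the effect and one possible fix, assuming the printed fields are *string. The policyRow type and the str helper are hypothetical and not taken from lakectl.

```go
package main

import "fmt"

// policyRow is a hypothetical row type with pointer fields, used only to
// reproduce the symptom.
type policyRow struct {
	ID     *string
	Effect *string
}

// str dereferences p, returning "" for a nil pointer.
func str(p *string) string {
	if p == nil {
		return ""
	}
	return *p
}

func main() {
	id, effect := "foo", "Allow"
	row := policyRow{ID: &id, Effect: &effect}
	fmt.Printf("%v %v\n", row.ID, row.Effect)           // prints two pointer addresses
	fmt.Printf("%v %v\n", str(row.ID), str(row.Effect)) // prints: foo Allow
}
```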
Consider signing the binaries to provide a better download-and-execute experience on macOS.
https://goreleaser.com/customization/sign/
When trying to create a repository with a name of an existing repository, the following error is printed:
Error executing command: error creating repository: insert repository: ERROR: duplicate key value violates unique constraint "catalog_repositories_name_uindex" (SQLSTATE 23505)
This should be changed to a user-friendly error.
Example command to reproduce:
lakectl repo create lakefs://existing-repo s3://example-bucket
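A possible approach, sketched with stand-in types: detect the PostgreSQL unique_violation code (SQLSTATE 23505) and return a readable error instead. The sqlStater interface assumes the driver error exposes a SQLState() method; ErrRepoExists, friendly and fakePgError are hypothetical names for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// uniqueViolation is the PostgreSQL SQLSTATE code for unique_violation.
const uniqueViolation = "23505"

// ErrRepoExists is a hypothetical sentinel error for this sketch.
var ErrRepoExists = errors.New("repository already exists")

// sqlStater is satisfied by driver errors that expose a SQLSTATE code.
// Whether the driver in use does so is an assumption.
type sqlStater interface {
	error
	SQLState() string
}

// friendly converts a unique-violation driver error into a readable error
// that names the repository; any other error is returned unchanged.
func friendly(name string, err error) error {
	var se sqlStater
	if errors.As(err, &se) && se.SQLState() == uniqueViolation {
		return fmt.Errorf("%w: %s", ErrRepoExists, name)
	}
	return err
}

// fakePgError stands in for a real driver error in this demo.
type fakePgError struct{ code string }

func (e fakePgError) Error() string    { return "ERROR: duplicate key value (SQLSTATE " + e.code + ")" }
func (e fakePgError) SQLState() string { return e.code }

func main() {
	err := fmt.Errorf("insert repository: %w", fakePgError{code: "23505"})
	fmt.Println(friendly("existing-repo", err)) // repository already exists: existing-repo
}
```

Callers can still test for the condition with errors.Is(err, ErrRepoExists).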
The lakefs binary uses a YAML configuration file.
As a lakeFS user, I would like a command line utility to generate this config file interactively.
Upon running "lakefs config", the user should be asked to fill in the basic information for a lakefs server to run.
A minimal configuration file example can be found here in the docs.
Example of information to get from the user:
The output should be a yaml file containing the configuration. The user should be able to specify an output destination for the file. If not specified, it should be saved to the default location: $HOME/.lakefs.yaml (with override protection).
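The override protection could look roughly like this, as a sketch only: create the file exclusively unless the user explicitly asks to overwrite. writeConfig and runDemo are hypothetical names, the demo writes to a temporary directory instead of $HOME, and the file content is a placeholder, not a real lakeFS configuration.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// writeConfig writes data to path but refuses to replace an existing file
// unless overwrite is true.
func writeConfig(path string, data []byte, overwrite bool) error {
	flags := os.O_WRONLY | os.O_CREATE
	if overwrite {
		flags |= os.O_TRUNC
	} else {
		flags |= os.O_EXCL // fail if the file already exists
	}
	f, err := os.OpenFile(path, flags, 0o600)
	if err != nil {
		if errors.Is(err, fs.ErrExist) {
			return fmt.Errorf("%s already exists, refusing to overwrite", path)
		}
		return err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// runDemo writes the same file twice without overwrite in a temporary
// directory and returns the error from each attempt.
func runDemo() (first, second error) {
	dir, err := os.MkdirTemp("", "lakefs-config-demo")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, ".lakefs.yaml")
	content := []byte("# placeholder configuration\n")
	first = writeConfig(path, content, false)
	second = writeConfig(path, content, false)
	return first, second
}

func main() {
	first, second := runDemo()
	fmt.Println("first write error:", first)
	fmt.Println("second write error:", second)
}
```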
Sometimes the lakeFS system performs commits as a result of another user action. Examples include branch creation, repository creation, merge commits and import API commits.
In the case of branch and repository creation, an empty committer name is used (look for the constant CatalogerCommitter). This should be changed to the user that initiated the action.