Comments (6)
@wetneb What do you think of this proposal? Any preference for how the numeric range is divided? In addition to the two options above (10^3 and 2^5), we also have a PR which proposes 2^20.
from openrefine.
@tfmorris Hmm, a bit like Snowflake ids, I like it. Incidentally, we use TSIDs in DB2Rest. There's a thread-safe Java library: https://github.com/vladmihalcea/hypersistence-tsid
It also adds an adjustable node part. The node part could hold an OpenRefine version, or simply a 3-bit node ID and a 2-bit version, for 5 bits in the node part? Having the OpenRefine version encoded would help migration and sharing, no? We could quickly detect that a project came from version 3.8 and uplift it when it's shared with someone who opens it in version 4.0?
I'm a bit torn on this… intuitively I'd rather go in the direction of completely opaque ids that don't carry any particular information. Users might not be aware that the project ids contain this information, and it might constitute an unwelcome information leak in certain circumstances. But not a hill I would die on…
@wetneb The project metadata already includes both the creation time and the last updated time. This is intended to provide a hint to the user in case the metadata is gone or corrupted, i.e. "Your missing project is the one that you created on the afternoon of May 30." Currently we have no way of telling the user which projects are missing. (Of course, the best thing is not to lose the projects in the first place.) From a practical point of view, the current IDs are completely opaque.
Another option would be to use the newly defined UUID v7 from RFC 9562, but that would require increasing the field size from 64 to 128 bits, which breaks compatibility, so it's a non-starter until we have protocol & metadata versioning. One useful hint from that spec is that dividing fields on nibble boundaries makes them easier for humans to parse in hex format. We could also place the timestamp in the high-order 48 bits to match the UUID layout, for whatever that's worth.
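For illustration only, the 48-bit timestamp idea could be sketched in Java roughly as follows. This is not OpenRefine's implementation: the class `TimestampedProjectId` and its methods are hypothetical, and using the remaining 16 bits for random data is an assumption (it is simply what is left of a 64-bit field after the timestamp). Both fields fall on nibble boundaries, so the hex form splits cleanly into 12 timestamp digits and 4 random digits.

```java
import java.security.SecureRandom;
import java.time.Instant;

/** Hypothetical sketch: a 64-bit ID with a 48-bit millisecond timestamp in the high-order bits. */
final class TimestampedProjectId {
    // Assumption: the 16 bits left over after the timestamp hold random data (4 hex nibbles).
    private static final int RANDOM_BITS = 16;
    private static final long RANDOM_MASK = (1L << RANDOM_BITS) - 1;
    private static final long TIMESTAMP_MASK = (1L << 48) - 1;
    private static final SecureRandom RANDOM = new SecureRandom();

    private TimestampedProjectId() {}

    /** Packs the low 48 bits of epochMillis above the low 16 bits of randomBits. */
    static long pack(long epochMillis, int randomBits) {
        return ((epochMillis & TIMESTAMP_MASK) << RANDOM_BITS) | (randomBits & RANDOM_MASK);
    }

    /** Generates an ID for the current time with 16 random low-order bits. */
    static long generate() {
        return pack(System.currentTimeMillis(), RANDOM.nextInt(1 << RANDOM_BITS));
    }

    /** Recovers the creation-time hint from an ID (unsigned shift drops the random bits). */
    static Instant createdAt(long id) {
        return Instant.ofEpochMilli(id >>> RANDOM_BITS);
    }

    /** Formats the ID as 16 hex digits: 12 for the timestamp, then 4 for the random part. */
    static String toHex(long id) {
        return String.format("%016x", id);
    }
}
```

With only 16 random bits, two projects created in the same millisecond have a 1 in 65,536 chance of colliding, so a real implementation would still need to check for an existing ID before using a new one.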
@thadguidry That repo looks like a rip-off of https://github.com/f4b6a3/tsid-creator/, but we don't need sortable IDs - just a rough idea of time that we can convey to the user. We should have the OpenRefine version encoded in the metadata, but I don't think the project ID is the correct place for it.
@tfmorris Gotcha, agree. Btw, the first paragraph of the README says that it's not a "ripoff" but a "fork", maintained because the original repo's creator no longer wants to maintain it.
> The project metadata already includes both the creation time and the last updated time.
Yes, I am aware that we store those times in the project metadata, but what I am saying is that it feels somewhat quirky to also encode that in the project id itself.
> This is intended to provide a hint to the user in case the metadata is gone/corrupted, ie "Your missing project is the one that you created on the afternoon of May 30."
To provide something like this, I'd rather store the entire project metadata in a more corruption-resilient way independent from project serialization, for instance in a SQLite database. That would have the advantage of also being able to provide the user with not just the creation date, but also the project name and other metadata fields.