ahawker / ulid
Universally Unique Lexicographically Sortable Identifier (ULID) in Python 3
License: Apache License 2.0
Originated from Reddit Comment.
The README should contain more information, at the very least reaching parity with the README in ULID.
This package was written with type hints (PEP 484), so the build should run some static analysis checks.
We need to add validation for handling the max timestamp value, 2 ^ 48 - 1, 281474976710655. Spec notes are at https://github.com/ulid/spec#overflow-errors-when-parsing-base32-strings
Parsing of the t value in the following example should raise an exception.
>>> import ulid
>>> s = '7ZZZZZZZZZZZZZZZZZZZZZZZZZ'
>>> t = '8ZZZZZZZZZZZZZZZZZZZZZZZZZ'
>>> ulid.parse(s)
<ULID('7ZZZZZZZZZZZZZZZZZZZZZZZZZ')>
>>> ulid.parse(t)
<ULID('0ZZZZZZZZZZZZZZZZZZZZZZZZZ')>
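One way to approach the check, as a minimal sketch: decode the first 10 characters (the timestamp portion) and reject anything above 2 ^ 48 - 1. The names CROCKFORD_ALPHABET, MAX_TIMESTAMP and decode_timestamp are hypothetical and not part of the library; the real fix may well belong elsewhere in the parsing path.

```python
CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
MAX_TIMESTAMP = 2 ** 48 - 1  # 281474976710655, largest value that fits in 48 bits


def decode_timestamp(ulid_str: str) -> int:
    """Hypothetical helper: decode the 10-character timestamp prefix of a ULID string."""
    if len(ulid_str) != 26:
        raise ValueError("Expects 26 characters; got {}".format(len(ulid_str)))
    value = 0
    for char in ulid_str[:10].upper():
        # str.index raises ValueError for characters outside the alphabet.
        value = value * 32 + CROCKFORD_ALPHABET.index(char)
    # 10 characters carry 50 bits, so the top 2 bits must be zero.
    if value > MAX_TIMESTAMP:
        raise ValueError("Timestamp overflow: {} > {}".format(value, MAX_TIMESTAMP))
    return value
```

With a check like this, the s value above decodes to 281474976710655, while the t value raises ValueError instead of silently wrapping around.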
There are a number of tests based off the invalid_str_encoding fixture that are passing but the assert is being fulfilled by an incorrect code path.
Would it be possible to update the changelog with the more recent version enhancements? Specifically, I'm upgrading from 0.0.6 to 0.0.7 and was hoping to get some high-level context. I've looked through the commits, I just wanted to make sure I wasn't missing the forest for the trees on anything.
A number of the test fixtures that generate data are powered by os.urandom. This works fine until it generates a random sequence of bytes that starts with a leading zero byte. This will cause tests to fail because int.bit_length ignores leading zeros in its computation.
Example test failure: https://travis-ci.org/ahawker/ulid/jobs/294263189
All of the above is a side-effect of the fact that there is no validation logic for the timestamp portion of a ULID. It should never contain a zero leading byte since the minimum value is the Unix epoch.
Items to address this issue:
timestamp values upon creation
Example:
>>> import ulid
>>> data = b"\x00\xcdh\x95}\xd9\xb2Yp':y0\xe4\xce\xdc"
>>> ulid.from_bytes(data)
<ULID('00SNM9AZESP9CQ09STF4RE9KPW')>
>>> ulid.from_int(int.from_bytes(data, byteorder='big'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/hawker/src/github.com/ahawker/ulid/ulid/api.py", line 76, in from_int
raise ValueError('Expects integer to be 128 bits; got {} bytes'.format(length))
ValueError: Expects integer to be 128 bits; got 15 bytes
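The mismatch can be reproduced with the standard library alone. The sketch below assumes the goal is to keep all 16 bytes when round-tripping through an integer; ulid_bytes_from_int is a hypothetical helper and not the library's from_int.

```python
def ulid_bytes_from_int(value: int) -> bytes:
    """Hypothetical helper: encode an int as exactly 16 big-endian bytes."""
    if value < 0:
        raise ValueError("Expects a non-negative integer")
    if value.bit_length() > 128:
        raise ValueError("Expects integer to fit in 128 bits")
    # to_bytes pads with leading zero bytes; a length derived from bit_length() does not.
    return value.to_bytes(16, byteorder="big")


data = b"\x00\xcdh\x95}\xd9\xb2Yp':y0\xe4\xce\xdc"
value = int.from_bytes(data, byteorder="big")

# The leading zero byte contributes no bits, so a bit_length-based size is one byte short.
minimal_length = (value.bit_length() + 7) // 8

# Encoding with a fixed width restores the original 16 bytes.
restored = ulid_bytes_from_int(value)
```

Here minimal_length comes out as 15, matching the error message above, while restored is identical to data.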
Currently the requirements/base.txt requirements file (for runtime) contains dependencies that are only useful for a development/deployment environment. These should be broken out into a separate file.
The from_randomness function in ulid/api.py supports creating ULID instances with a randomness value from a given value. In addition to the currently supported types, it should also support Randomness and ULID types.
Randomness, a straight copy of all bytes should suffice.
ULID, a straight copy of the last 10 bytes should suffice.
I did some very basic work with pytest-benchmark during development. However, a more complete and robust set of performance tests for common API calls/flows should be written.
Completeness Criteria:
test_performance.py.
make test doesn't always run it.
make benchmark or some similar target to execute them.
Travis CI is dead for open source projects (free). Swap to Circle CI, GitHub Actions, or all to Appveyor.
All of the requirements txt files should be updated to freeze against a specific version.
ulid-py 1.1.0
The .hex attribute does not correctly pad to 32 characters. It skips the leading zero, giving a len-31 string (33 with the 0x).
import ulid
import binascii
u = ulid.from_randomness(0)
print(len(u.hex))
print(u.hex)
print(f"0x{binascii.hexlify(u.bytes).decode()}")
Out:
33
0x17b0c9d5b3b00000000000000000000
0x017b0c9d5b3b00000000000000000000
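For comparison, a fixed-width format specifier pads the value without going through binascii. This is only a sketch of one possible fix; hex128 is a hypothetical name, and the sample bytes are chosen to mirror the output above.

```python
def hex128(value: int) -> str:
    """Hypothetical helper: format a 128-bit integer as 0x plus 32 hex digits."""
    if not 0 <= value < 1 << 128:
        raise ValueError("Expects an unsigned 128-bit integer")
    # The 032x spec zero-pads to 32 digits, so leading zeros are kept.
    return "0x{:032x}".format(value)


# A 6-byte timestamp with a leading zero nibble, followed by 10 zero randomness bytes.
raw = bytes.fromhex("017b0c9d5b3b") + bytes(10)
number = int.from_bytes(raw, byteorder="big")

unpadded = hex(number)   # built-in hex() drops the leading zero
padded = hex128(number)  # fixed-width formatting keeps it
```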
Aim for 100% code coverage for the ulid/api.py module.
Report: https://codeclimate.com/github/ahawker/ulid/coverage/59501e131e7b440001015f3e
This issue should track related work for making Windows a first class citizen for this package.
travis* commands in Makefile into generalized ci* commands
ulid.timestamp().datetime returns a naive datetime object (lacking time zone information), yet the time is in UTC.
A naive datetime is ambiguous. Can the datetime be made aware by explicitly attaching the UTC time zone? The datetime module documentation gives reasons why aware datetimes are preferred for representing times in UTC.
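As a sketch of what an aware result could look like, using only the standard library: the helper names below are hypothetical, and how the library actually builds its datetime is not shown on this page.

```python
from datetime import datetime, timezone


def aware_utc_from_ms(timestamp_ms: int) -> datetime:
    """Hypothetical helper: milliseconds since the Unix epoch -> aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000.0, tz=timezone.utc)


def attach_utc(naive_utc: datetime) -> datetime:
    """Mark a naive datetime that is already in UTC as aware, without shifting it."""
    return naive_utc.replace(tzinfo=timezone.utc)


epoch = aware_utc_from_ms(0)
same_epoch = attach_utc(datetime(1970, 1, 1))
```

Until the library returns aware values, callers could apply something like attach_utc to the naive result themselves.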
Hi,
my project is using mypy --strict.
While importing ulid I'm getting a problem:
import ulid
MY_ULID = ulid.new()
error: Module has no attribute "new"
I found a workaround:
import ulid
MY_ULID = ulid.api.new()
But I'm sure the first way is a bit more preferable.
I investigated the problem.
The following modified content of __init__.py should fix it:
from .api import from_bytes, from_int, from_randomness, from_str, from_timestamp, from_uuid, new, parse
from .ulid import Randomness, Timestamp, ULID
__all__ = [
    # from .api
    'new', 'parse', 'from_bytes', 'from_int', 'from_str', 'from_uuid', 'from_timestamp', 'from_randomness',
    # from .ulid
    'Timestamp', 'Randomness', 'ULID',
]
__version__ = '0.0.14'
So I explicitly imported the items and explicitly listed them in __all__. This is some code duplication, but it does not look fatal to me.
Q1. Should I create a PR with these changes in __init__.py?
Q2. Should I create a PR to fix all mypy --strict errors for the whole ulid project? The fixes are going to be trivial from my experience. Here is the full list of mypy errors:
ulid\ulid.py:23: error: Function is missing a type annotation
ulid\ulid.py:26: error: Function is missing a type annotation
ulid\ulid.py:39: error: Function is missing a type annotation
ulid\ulid.py:52: error: Function is missing a type annotation
ulid\ulid.py:67: error: Function is missing a type annotation
ulid\ulid.py:82: error: Function is missing a type annotation
ulid\ulid.py:97: error: Function is missing a type annotation
ulid\ulid.py:112: error: Function is missing a return type annotation
ulid\ulid.py:115: error: Function is missing a return type annotation
ulid\ulid.py:118: error: Function is missing a return type annotation
ulid\ulid.py:121: error: Function is missing a return type annotation
ulid\ulid.py:124: error: Function is missing a return type annotation
ulid\ulid.py:127: error: Function is missing a return type annotation
ulid\ulid.py:275: error: Returning Any from function declared to return "Timestamp"
ulid\ulid.py:275: error: Call to untyped function "Timestamp" in typed context
ulid\ulid.py:284: error: Returning Any from function declared to return "Randomness"
ulid\ulid.py:284: error: Call to untyped function "Randomness" in typed context
ulid\api.py:47: error: Returning Any from function declared to return "ULID"
ulid\api.py:47: error: Call to untyped function "ULID" in typed context
ulid\api.py:104: error: Returning Any from function declared to return "ULID"
ulid\api.py:104: error: Call to untyped function "ULID" in typed context
ulid\api.py:124: error: Returning Any from function declared to return "ULID"
ulid\api.py:124: error: Call to untyped function "ULID" in typed context
ulid\api.py:137: error: Returning Any from function declared to return "ULID"
ulid\api.py:137: error: Call to untyped function "ULID" in typed context
ulid\api.py:149: error: Returning Any from function declared to return "ULID"
ulid\api.py:149: error: Call to untyped function "ULID" in typed context
ulid\api.py:198: error: Returning Any from function declared to return "ULID"
ulid\api.py:198: error: Call to untyped function "ULID" in typed context
ulid\api.py:244: error: Returning Any from function declared to return "ULID"
ulid\api.py:244: error: Call to untyped function "ULID" in typed context
Currently, the API exposes multiple methods for creating ulid.ULID instances from other data types. However, it does not support a "catch all" call that attempts to make the determination based on type and requires the caller to do that.
Let's imagine that a user of the library has read an input value from somewhere that they have a relatively high confidence is a ULID. However, they don't know the format in which it was stored. In order to support this mechanism, the user of the library needs to write the following code:
if isinstance(value, bytes):
    return ulid.from_bytes(value)
if isinstance(value, int):
    return ulid.from_int(value)
if isinstance(value, str):
    return ulid.from_str(value)
if isinstance(value, uuid.UUID):
    return ulid.from_uuid(value)
raise ValueError('Cannot create ULID from type {}'.format(value.__class__.__name__))
This is pretty verbose, especially since we could hide this logic inside the library in a separate API call. It will be slightly slower than calling the correct method directly, since we have to run the if/else tree every time and don't know the "hot path", but it should be helpful for this scenario.
Potential thoughts:
from_(value)
from_value(value)
from_obj(value)
from_unknown(value)
parse(value)
decode(value)
load(value)
Either fix the reported issue or explicitly add an ignore to silence the warning if it's "written as intended".
Issues:
hawker@mbp:~/src/github.com/ahawker/ulid|master⚡
⇒ make lint
************* Module ulid
W: 11, 0: Wildcard import api (wildcard-import)
W: 13, 0: Wildcard import ulid (wildcard-import)
************* Module ulid.api
C: 21, 0: Invalid constant name "TimestampPrimitive" (invalid-name)
C: 27, 0: Invalid constant name "RandomnessPrimitive" (invalid-name)
************* Module ulid.hints
C: 12, 0: Invalid constant name "Buffer" (invalid-name)
************* Module ulid.ulid
C:191, 8: Invalid variable name "ms" (invalid-name)
make: *** [lint] Error 20
The codebase is relatively well covered with comments and docstrings. We need to get the repository hooked up to an online documentation source, likely Read the Docs, and get the API documentation updating as part of the build/release process.
Using a service such as pyup, this repository should be monitored for changes in dependency versions.
Suppose you have
ulid_string = '01EYV88PB2Y212QSR0AJ2JX5T4'
How would you derive a ulid object from the string?
especially relevant with django-ulid
I've encountered the problem with this ERROR, but I can import ulid successfully. Somebody can help me?
Aim for 100% code coverage for the ulid/base32.py module.
Report: https://codeclimate.com/github/ahawker/ulid/coverage/59501e131e7b440001015f3f
Wondering if it's possible to reduce the size of the ID at the expense of fewer ULIDs per millisecond?
Starting with ulid 0.2.0, I get this error when I simply try to install the library
(venv) ➜ pip install ulid-py==1.0.0
Collecting ulid-py==1.0.0
Using cached https://files.pythonhosted.org/packages/3f/9e/deba154963e4eb00cd31b60f35329359dcbf8ad34a01371c10f32faf3867/ulid_py-1.0.0-py2.py3-none-any.whl
Installing collected packages: ulid-py
Found existing installation: ulid-py 0.1.0
Uninstalling ulid-py-0.1.0:
Successfully uninstalled ulid-py-0.1.0
Successfully installed ulid-py-1.0.0
(venv) ➜ python3 <<< "import ulid"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/lib/python3.8/site-packages/ulid/__init__.py", line 10, in <module>
from .api import default, microsecond, monotonic
ModuleNotFoundError: No module named 'ulid.api'
Also, when I download the package from PyPI and unpack it, there is no API folder inside.
To do range selection on time with ULIDs one needs to generate values with the lowest/highest possible randomness.
While this is doable with some effort, I feel it should be offered by the API. For example:
ulid.from_timestamp(timestamp, randomness=ulid.MIN_RANDOM)
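Until something like that exists, the bounds can be built by hand. The sketch below is an assumption about what such constants might look like: MIN_RANDOMNESS, MAX_RANDOMNESS and range_bounds are hypothetical names, relying only on the 6-byte timestamp and 10-byte randomness layout.

```python
from typing import Tuple

MIN_RANDOMNESS = bytes(10)        # ten 0x00 bytes, the lowest possible randomness
MAX_RANDOMNESS = b"\xff" * 10     # ten 0xff bytes, the highest possible randomness


def range_bounds(timestamp_ms: int) -> Tuple[bytes, bytes]:
    """Hypothetical helper: lowest and highest 16-byte ULID values for one millisecond."""
    if not 0 <= timestamp_ms < 1 << 48:
        raise ValueError("Expects a timestamp that fits in 48 bits")
    prefix = timestamp_ms.to_bytes(6, byteorder="big")
    return prefix + MIN_RANDOMNESS, prefix + MAX_RANDOMNESS


low, high = range_bounds(1_600_000_000_000)
```

The resulting 16-byte values could then be handed to ulid.from_bytes to obtain ULID instances for the lower and upper end of a range query.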
Running into a cryptic error when trying to deepcopy a ULID object:
>>> import ulid
>>> a = ulid.new()
>>> a
<ULID('01EAZF1038723PE2SS9BXRQC80')>
>>> import copy
>>> copy.deepcopy(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/ashu/.pyenv/versions/3.8.2/lib/python3.8/copy.py", line 173, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/Users/ashu/.pyenv/versions/3.8.2/lib/python3.8/copy.py", line 271, in _reconstruct
state = deepcopy(state, memo)
File "/Users/ashu/.pyenv/versions/3.8.2/lib/python3.8/copy.py", line 147, in deepcopy
y = copier(x, memo)
File "/Users/ashu/.pyenv/versions/3.8.2/lib/python3.8/copy.py", line 211, in _deepcopy_tuple
y = [deepcopy(a, memo) for a in x]
File "/Users/ashu/.pyenv/versions/3.8.2/lib/python3.8/copy.py", line 211, in <listcomp>
y = [deepcopy(a, memo) for a in x]
File "/Users/ashu/.pyenv/versions/3.8.2/lib/python3.8/copy.py", line 147, in deepcopy
y = copier(x, memo)
File "/Users/ashu/.pyenv/versions/3.8.2/lib/python3.8/copy.py", line 231, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/Users/ashu/.pyenv/versions/3.8.2/lib/python3.8/copy.py", line 162, in deepcopy
rv = reductor(4)
TypeError: cannot pickle 'memoryview' object
Any ideas?
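The traceback suggests the object keeps a memoryview in its state, which copy and pickle cannot handle by default. A generic workaround, sketched here with a hypothetical BufferHolder class that is not the library's own, is to expose the state as plain bytes.

```python
import copy
import pickle


class BufferHolder:
    """Hypothetical stand-in for an object that stores its data in a memoryview."""

    def __init__(self, buffer: bytes) -> None:
        self.memory = memoryview(buffer)

    def __getstate__(self) -> dict:
        # bytes can be pickled and deep-copied; memoryview cannot.
        return {"memory": self.memory.tobytes()}

    def __setstate__(self, state: dict) -> None:
        # Rebuild the memoryview over the copied bytes.
        self.memory = memoryview(state["memory"])


original = BufferHolder(b"\x01" * 16)
duplicate = copy.deepcopy(original)
restored = pickle.loads(pickle.dumps(original))
```

As a stopgap on the caller side, round-tripping through the bytes (for example ulid.from_bytes(a.bytes)) should give an independent copy without touching copy.deepcopy at all.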
Aim for 100% code coverage for the ulid/ulid.py module.
Report: https://codeclimate.com/github/ahawker/ulid/coverage/59501e131e7b440001015f41
There are many cases where a ValueError can be raised by any number of functions across most of the modules in this package.
I am relatively confident that all of the @pytest.raises(ValueError) calls are correct based on code coverage metrics. However, I was proven wrong today and had to address some of them with #61.
The scope of this task is to go through all tests that use @pytest.raises, capture the exception and perform an additional assertion of the exception message to confirm that we're hitting the exact code path expected.
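The idea can be sketched with the standard library's unittest, whose assertRaisesRegex checks the message as well as the type; pytest.raises offers the same through its match argument. The from_int function below is a hypothetical stand-in with two distinct ValueError paths, not the library's implementation.

```python
import unittest


def from_int(value: int) -> bytes:
    """Hypothetical stand-in with two different ValueError code paths."""
    if value < 0:
        raise ValueError("Expects positive integer")
    if value.bit_length() > 128:
        raise ValueError("Expects integer to be 128 bits")
    return value.to_bytes(16, byteorder="big")


class FromIntErrors(unittest.TestCase):
    def test_negative_value(self) -> None:
        # Checking only for ValueError would also pass on the size check;
        # matching the message pins the test to the intended path.
        with self.assertRaisesRegex(ValueError, "positive integer"):
            from_int(-1)

    def test_oversized_value(self) -> None:
        with self.assertRaisesRegex(ValueError, "128 bits"):
            from_int(1 << 128)
```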
The from_timestamp function in ulid/api.py supports creating ULID instances with a timestamp from a given value. In addition to the currently supported types, it should also support Timestamp and ULID types.
Timestamp, a straight copy of all bytes should suffice.
ULID, a straight copy of the first 6 bytes should suffice.
Here are some initial thoughts, but this is definitely an incomplete list of the changes necessary.
bytes to be configurable to str.
int.to_bytes() and int.from_bytes().
datetime.timestamp()
memoryview and buffer?
If you receive a ULID from some external source (e.g. a database) you might want to compute the next following ULID. This is useful for range-style queries where you are trying to retrieve every item after the aforementioned ULID. The library already does so internally to provide monotonic values, but it's not entirely clear how to get the monotonically "next" ULID, given another one.
Example:
prev = ulid.parse(some_str) # From external source
next = ... # ???
I was playing with ulid.create but I couldn't quite figure it out. It seems that bumping the randomness by one and, if that overflows, bumping the timestamp by one is what we want.
A ULID.next method would be really nice.
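Since the timestamp occupies the high 6 bytes and the randomness the low 10, treating the whole value as one 128-bit integer and adding one gives exactly that carry behaviour. A minimal sketch, where next_ulid_bytes is a hypothetical helper and not an existing API:

```python
def next_ulid_bytes(current: bytes) -> bytes:
    """Hypothetical helper: the 16-byte value that sorts immediately after current."""
    if len(current) != 16:
        raise ValueError("Expects 16 bytes; got {}".format(len(current)))
    value = int.from_bytes(current, byteorder="big") + 1
    # A carry out of the randomness bytes flows into the timestamp bytes automatically;
    # only the all-0xff value has no successor.
    if value >= 1 << 128:
        raise OverflowError("No ULID follows the maximum value")
    return value.to_bytes(16, byteorder="big")


after_zero = next_ulid_bytes(bytes(16))
after_full_randomness = next_ulid_bytes(bytes(6) + b"\xff" * 10)
```

Combined with calls that appear elsewhere on this page, usage might look like ulid.from_bytes(next_ulid_bytes(prev.bytes)).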
As of today, it is possible to input non-base32 characters, uU for example, into any of the api calls.
Doing this will cause the library to fail silently and perform an incorrect base32 decode on the string.
The API should provide a feedback mechanism that informs the caller of the bad input. The implementation of that feedback is still TBD (separate API call vs. exception vs. ??).
Considerations:
[mmarkk@asus home]$ python -m timeit -s 'import random' 'random.randbytes(8)'
5000000 loops, best of 5: 93.9 nsec per loop
[mmarkk@asus home]$ python -m timeit -s 'import os' 'os.urandom(8)'
1000000 loops, best of 5: 248 nsec per loop
I want to use a ULID as a primary key. How can I do that? I would like to know how to use ulid with the SQLAlchemy ORM. Please give an example.
Hi Andrew,
first of all, thanks for the amazing library, we've been using it a lot!
I have a question about how to handle the conversion of ULIDs that do not follow Crockford's Base32 standard.
We are using Lua to generate some GUIDs (https://github.com/Tieske/ulid.lua) and, for some reason, from time to time we get letters outside Crockford's Base32 alphabet.
While trying to fix this on our side (we're not sure how this is happening, to be honest), we realised that the Java and Python implementations silently correct this issue in different ways:
ULID.Value ulidValueFromString = ULID.parseULID("01BX73KC0TNH409RTFD1JXKmO0")
--> "01BX73KC0TNH409RTFD1JXKM00"
mO is silently converted into M0
In [1]: import ulid
In [2]: u = ulid.from_str('01BX73KC0TNH409RTFD1JXKmO0')
In [3]: u
Out[3]: <ULID('01BX73KC0TNH409RTFD1JXKQZ0')>
In [4]: u.str
Out[4]: '01BX73KC0TNH409RTFD1JXKQZ0'
mO is silently converted into QZ
Shouldn't the Python library behave like the Java one, as per Crockford's Base32 spec, converting L and I to 1 and O to 0, and only upper-casing lower case letters instead of changing them?
Thanks a lot in advance!
Eddie
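For reference, the normalization described above can be sketched in a few lines. This is an illustration of Crockford's decoding rules rather than the library's behaviour; normalize_crockford and the constants are hypothetical names.

```python
CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
# Crockford's rules: I and L are read as 1, O is read as 0.
LOOKALIKES = str.maketrans("ILO", "110")


def normalize_crockford(value: str) -> str:
    """Hypothetical helper: upper-case, map look-alike letters, reject everything else."""
    normalized = value.upper().translate(LOOKALIKES)
    invalid = sorted(set(normalized) - set(CROCKFORD_ALPHABET))
    if invalid:
        # U is excluded from the alphabet and has no substitute.
        raise ValueError("Non-base32 characters: {}".format("".join(invalid)))
    return normalized


fixed = normalize_crockford("01BX73KC0TNH409RTFD1JXKmO0")
```

Applied to the string from this report, this yields the same result as the Java implementation, and input containing U is rejected with an exception instead of being decoded incorrectly.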