ia0 / data-encoding
Efficient and customizable data-encoding functions in Rust
Home Page: https://data-encoding.rs/
License: MIT License
use data_encoding::Specification;

fn main() {
    let mut hex = {
        let mut spec = Specification::new();
        spec.symbols.push_str("0123456789abcdef");
        spec.encoding().unwrap()
    };
    hex.0 = (&[0xF6; 514][..]).into(); // created with non-ascii symbols
    let invalid_string = hex.encode(&[1, 2, 3, 4]);
    println!("created invalid string");
    println!("{:?}", invalid_string); // will panic
}
This does involve tweaking a field of the Encoding, but that's a public field, and not even doc(hidden). At the very least it should be doc(hidden).
(At a meta level, this code would probably benefit from some kind of internal #[repr(transparent)] pub struct Ascii(u8); type so that it is very clear that the invariants are being upheld.)
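A minimal sketch of such a wrapper (hypothetical; this type is not part of the crate, and the names are illustrative only):

```rust
/// Hypothetical invariant-carrying symbol type, as suggested above.
/// `#[repr(transparent)]` keeps the layout identical to `u8`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Ascii(u8);

impl Ascii {
    /// Only ASCII bytes can construct an `Ascii`, so the invariant is
    /// checked at the boundary instead of being assumed everywhere.
    pub fn new(byte: u8) -> Option<Ascii> {
        byte.is_ascii().then_some(Ascii(byte))
    }

    pub fn get(self) -> u8 {
        self.0
    }
}

fn main() {
    assert!(Ascii::new(b'a').is_some());
    assert!(Ascii::new(0xF6).is_none()); // the non-ASCII byte from the example above
}
```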
This crate is currently used by uutils/coreutils for programs like basenc, base32, and base64, and I'm looking for ways to optimize them. I've experimented with an AVX2 implementation of simple Base32 encoding, and the results were promising. To this end, I'm looking to write optimized SIMD implementations of encoding and decoding for individual format specifications. Is this something data-encoding can support, given its current API? Would such implementation (and possibly API) changes be welcome? Or should I write my own crate instead?
Would you consider putting the bits that require std behind a feature flag?
Thanks for a great crate.
This helps distributions to package the data-encoding crate. Thanks!
Sometimes it would be useful to encode directly into a std::fmt::Write, because it would simplify Display impls.
impl fmt::Display for SomeBytes {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        HEXLOWER.encode_into(&self.0, f)
    }
}
In other situations (e.g. network protocols), it's useful to write into a std::io::Write impl.
Would these use cases be something that could be added to data-encoding? Or is this already possible through a method I'm missing somehow, without using an intermediate buffer?
Hello, I just ran cargo-bloat on my project, as I'm working on reducing compile times and binary size can be a proxy for compile time, and found that data_encoding::Encoding::encode_mut is taking up 76.9kb, making it the third largest function in my entire project (which pulls in reqwest, tungstenite, etc.).
If code size isn't a concern for this project, I'm entirely fine with this being closed, but thought that you might want a heads up that this library is somewhat code-size heavy.
For example: I need to generate an MD5 digest as hex. rustc_serialize::hex generates the hex in lowercase (a-f). Will data-encoding add this?
Currently, decode and decode_mut always reject inputs of invalid length. It would be convenient to control this and permit decoding the longest valid prefix. This is currently possible to do.
For encodings that don't ignore characters, first truncate the input to the longest valid prefix, then call decode or decode_mut as usual:
fn truncate<'a>(base: &Encoding, input: &'a [u8]) -> &'a [u8] {
    match base.decode_len(input.len()) {
        Ok(_) => input,
        Err(e) => &input[..e.position],
    }
}
For encodings that ignore characters, using the following functions:
fn decode_prefix(base: &Encoding, input: &[u8]) -> Result<Vec<u8>, DecodeError> {
    let mut output = vec![0; base.decode_len(input.len()).unwrap()];
    let len = decode_prefix_mut(base, input, &mut output)?;
    output.truncate(len);
    Ok(output)
}

fn decode_prefix_mut(
    base: &Encoding,
    input: &[u8],
    output: &mut [u8],
) -> Result<usize, DecodeError> {
    match base.decode_mut(input, output) {
        Ok(len) => Ok(len),
        Err(DecodePartial { written, error, .. }) => {
            if error.kind == DecodeKind::Length {
                Ok(written)
            } else {
                Err(error)
            }
        }
    }
}
See #28 for existing user feedback.
Hi, first of all, thanks for this tremendous crate!
It's quite a common operation to send fixed-length base64(url)-encoded HMACs, tokens, etc. in HTTP cookies and URLs. It's also common to strip the padding characters away because they often cause problems (interfering with reserved characters in URLs and HTTP headers). Since the message is fixed-length, the padding isn't needed anyway.
It would be neat to add a helper method for decoding padding-stripped base64 for known-length messages. At the moment it's possible to manually add the padding using the output of encode_len as a target length, but this is problematic, especially if one strives to copy as little as possible: if you're passed an immutable slice, you have to copy it just to be able to add the padding.
You can subscribe to this issue to be notified when 2.5.0 is released (some time before 2023-11-26).
This issue lists breaking changes that could be worth doing. Since they would bump the major version, doing as many of them as possible simultaneously would be best.
static instead of const
Using const prevents users from taking long-term references, which is needed for Encoder. The constants should instead be statics.
const fn
This crate is mostly made of pure terminating functions (e.g. encode_len(), encode_mut(), decode_len(), and decode_mut()). Those functions should be const fn. This would replace the data-encoding-macro library, which currently works around the const fn limitations of Rust.
This is blocked by const fn support in Rust: meta tracking issue.
Encoding implementation
It should now be possible to make the Encoding implementation private and provide an unsafe const fn for data-encoding-macro to use.
MaybeUninit
Most output parameters are not read from, but they currently use &mut [u8], which requires them to be initialized. Those functions could instead take &mut [MaybeUninit<u8>].
This is mostly blocked by maybe_uninit_slice.
Even if not necessary, bumping to the latest stable might give access to features useful for future minor or patch versions. Strictly speaking, bumping the MSRV is not a breaking change (discussion), but it's nice to do during a major bump. Per the discussion in #77, it would be nice to keep the MSRV at the rustc in Debian oldstable; this should cover the most common distributions. Note also the trick of using syn = ">= 2, < 4" when multiple major versions of a dependency are supported.
Currently only the type-erased API with internal dispatch is exposed. This may prevent dead-code elimination from triggering when encodings are not statically known, resulting in a bigger binary. As an alternative, the internal polymorphic API could be exposed.
This might be blocked by good const generics support in Rust.
Hi, I'm the author of the binary_macros crate: https://github.com/golddranks/binary_macros
It's a simple wrapper around the conversions implemented in this crate. The use case is to enable using hexadecimal, base64, etc. in settings files and environment variables that are compiled in statically. I think there would be some value in merging the macros into this crate, since the discoverability of binary_macros is low. What do you think?
Would it make sense for this library to include base58 encoding and decoding?
Currently I am using the base58 crate, which works fine. However, since data-encoding already supports many useful encoders and decoders, adding base58 could be a great addition.
Also, base58 has not been updated for a couple of years, which is always somewhat worrisome, although it probably doesn't need much change anyway.
Required tasks:
- static instead of const for pre-defined encodings
- Make the Encoding implementation private
- _uninit() versions to support MaybeUninit outputs
- True and False (also useful to avoid parameter ordering issues)
- v3-preview support in data-encoding-macro
- v3-preview
Optional tasks:
When operating on fixed-length hashes like SRI, it's great to use stack arrays as buffers. But only constants and const fns can be used in array length expressions, which results in manual magic numbers like let buf = [0u8; 44] in code (the base64-encoded length of a SHA-256 hash). With const fns, we could instead write let buf = [0u8; BASE64.encode_len(32)]; to make this clear.
Hi. I need help with encoding bytes as RFC 8949. For example, a unit test expects the value 1903e8, while the default HEXLOWER returns 1900000000000003e8.
Would appreciate any help.
Today, calling encode_append always writes the padding into the chunk. It's not clear to me if there is a way, without buffering ahead of time, to encode without padding and then add the padding once at the end. Motivating use case:
fn basic_auth(username: &str, password: Option<&str>) -> String {
    let mut buf = "Basic ".to_string();
    BASE64.encode_append(username.as_bytes(), &mut buf);
    BASE64.encode_append(b":", &mut buf);
    if let Some(password) = password {
        BASE64.encode_append(password.as_bytes(), &mut buf);
    }
    buf
}
Today this creates an unexpected HTTP basic auth header, as padding is added three times. While the result is probably technically valid, I feel uncomfortable emitting this type of base64.
error message:
Compiling data-encoding-macro-internal v0.1.1 (file:///H:/work/rust_projects/forks/data-encoding/lib/macro/internal)
Compiling pest_derive v1.0.7
Compiling derive-error-chain v0.10.1
Compiling libflate v0.1.14
Compiling num-rational v0.1.42
error[E0599]: no variant named `Op` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:36:14
|
36 | Some(TokenTree::Op(x)) if x.op() == op => (),
| ^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error[E0599]: no variant named `Term` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:53:13
|
53 | TokenTree::Term(term) => term.as_str().to_string(),
| ^^^^^^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:52:13
|
52 | let key = match key {
| ^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: all local variables must have a statically known size
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:58:21
|
58 | None => panic!("expected value for {}", key),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: required by `std::fmt::ArgumentV1::new`
= note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:62:21
|
62 | let _ = map.insert(key, value);
| ^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:50:19
|
50 | let mut map = HashMap::new();
| ^^^^^^^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: required by `<std::collections::HashMap<K, V>>::new`
error[E0308]: mismatched types
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:64:5
|
49 | fn parse_map(mut tokens: IntoIter) -> HashMap<String, TokenTree> {
| -------------------------- expected `std::collections::HashMap<std::string::String, proc_macro::TokenTree>` because of return type
...
64 | map
| ^^^ expected struct `std::string::String`, found str
|
= note: expected type `std::collections::HashMap<std::string::String, _, _>`
found type `std::collections::HashMap<str, _, _>`
error[E0599]: no variant named `Term` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:150:9
|
150 | TokenTree::Term(term) if term.as_str() == "None" => return None,
| ^^^^^^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error[E0599]: no variant named `Term` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:173:9
|
173 | TokenTree::Term(term) => term,
| ^^^^^^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error[E0599]: no variant named `Term` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:196:9
|
196 | TokenTree::Term(term) if term.as_str() == msb => BitOrder::MostSignificantFirst,
| ^^^^^^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error[E0599]: no variant named `Term` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:197:9
|
197 | TokenTree::Term(term) if term.as_str() == lsb => BitOrder::LeastSignificantFirst,
| ^^^^^^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error: aborting due to 11 previous errors
Some errors occurred: E0277, E0308, E0599.
For more information about an error, try `rustc --explain E0277`.
Compiling enum_primitive v0.1.1
error: Could not compile `data-encoding-macro-internal`.
warning: build failed, waiting for other jobs to finish...
error: build failed
my rustup:
Default host: x86_64-pc-windows-gnu
installed toolchains
--------------------
nightly-2018-04-06-i686-pc-windows-gnu
nightly-2018-04-16-i686-pc-windows-gnu
nightly-2018-04-27-i686-pc-windows-gnu
nightly-2018-05-05-i686-pc-windows-gnu
nightly-2018-06-01-i686-pc-windows-gnu (default)
active toolchain
----------------
nightly-2018-06-01-i686-pc-windows-gnu (default)
rustc 1.28.0-nightly (1ffb32147 2018-05-31)
we can install the nightly version by:
rustup install nightly-2018-06-01-i686-pc-windows-gnu
rustup default nightly-2018-06-01-i686-pc-windows-gnu
error[E0432]: unresolved import `proc_macro::TokenNode`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:22:27
|
22 | use proc_macro::{Spacing, TokenNode, TokenStream, TokenTree, TokenTreeIter};
| ^^^^^^^^^ no `TokenNode` in the root. Did you mean to use `TokenTree`?
error[E0432]: unresolved import `proc_macro::TokenTreeIter`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:22:62
|
22 | use proc_macro::{Spacing, TokenNode, TokenStream, TokenTree, TokenTreeIter};
| ^^^^^^^^^^^^^ no `TokenTreeIter` in the root. Did you mean to use `TokenTree`?
Compiling libflate v0.1.14
error[E0574]: expected struct, variant or union type, found enum `TokenTree`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:34:14
|
34 | Some(TokenTree { span: _, kind: TokenNode::Op(x, Spacing::Alone) })
| ^^^^^^^^^ not a struct, variant or union type
error[E0432]: unresolved import `syntax::ast::SpannedIdent`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\pear_codegen-0.0.12\src\lib.rs:27:56
|
27 | use syntax::ast::{ItemKind, MetaItem, FnDecl, PatKind, SpannedIdent};
| ^^^^^^^^^^^^ no `SpannedIdent` in `ast`
Compiling flate2 v1.0.1
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:52:13
|
52 | let key = match key.kind {
| ^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: all local variables must have a statically known size
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:58:21
|
58 | None => panic!("expected value for {}", key),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: required by `std::fmt::ArgumentV1::new`
= note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:62:21
|
62 | let _ = map.insert(key, value);
| ^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:50:19
|
50 | let mut map = HashMap::new();
| ^^^^^^^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: required by `<std::collections::HashMap<K, V>>::new`
error: aborting due to 7 previous errors
Some errors occurred: E0277, E0432, E0574.
For more information about an error, try `rustc --explain E0277`.
error[E0609]: no field `identifier` on type `syntax::ast::PathSegment`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\pear_codegen-0.0.12\src\lib.rs:136:59
|
136 | let penultimate = path.segments[num_segs - 2].identifier.name.as_str();
| ^^^^^^^^^^ unknown field
|
= note: available fields are: `ident`, `parameters`
error[E0609]: no field `identifier` on type `syntax::ast::PathSegment`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\pear_codegen-0.0.12\src\lib.rs:143:48
|
143 | let last = path.segments[num_segs - 1].identifier.name.as_str();
| ^^^^^^^^^^ unknown field
|
= note: available fields are: `ident`, `parameters`
error[E0609]: no field `identifier` on type `syntax::ast::PathSegment`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\pear_codegen-0.0.12\src\lib.rs:152:40
|
152 | let first_ident = path.segments[0].identifier.name.as_str();
| ^^^^^^^^^^ unknown field
|
= note: available fields are: `ident`, `parameters`
error: aborting due to 4 previous errors
Some errors occurred: E0432, E0609.
For more information about an error, try `rustc --explain E0432`.
error: Could not compile `data-encoding-macro-internal`.
warning: build failed, waiting for other jobs to finish...
error: Could not compile `pear_codegen`.
warning: build failed, waiting for other jobs to finish...
error: build failed
While the data-encoding
library itself seems to compile fine, I just tried cargo install data-encoding-bin
on my win10 workstation, and got this output:
$ cargo install data-encoding-bin
Updating registry `https://github.com/rust-lang/crates.io-index`
Installing data-encoding-bin v0.1.0
Compiling data-encoding v2.0.0-rc.1
Compiling getopts v0.2.14
Compiling data-encoding-bin v0.1.0
error[E0433]: failed to resolve. Could not find `unix` in `os`
--> C:\Users\Joey\.cargo\registry\src\github.com-1ecc6299db9ec823\data-encoding-bin-0.1.0\src\main.rs:149:31
|
149 | unsafe { <File as std::os::unix::io::FromRawFd>::from_raw_fd(1) });
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Could not find `unix` in `os`
error[E0277]: the trait bound `std::io::Write: std::marker::Sized` is not satisfied
--> C:\Users\Joey\.cargo\registry\src\github.com-1ecc6299db9ec823\data-encoding-bin-0.1.0\src\main.rs:148:18
|
148 | output = Box::new(
| ^^^^^^^^ `std::io::Write` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `std::io::Write`
= note: required by `<std::boxed::Box<T>>::new`
error: aborting due to previous error(s)
error: failed to compile `data-encoding-bin v0.1.0`, intermediate artifacts can be found at `C:\Users\Joey\AppData\Local\Temp\cargo-install.UUs4JxW8xF3Y`
Caused by:
Could not compile `data-encoding-bin`.
To learn more, run the command again with --verbose.
Hello!
I ran into this crate while packaging Rust software for Debian, and there are some outdated dependencies, specifically:
The proc-macro-hack dependency can be bumped without any further changes, but the syn dependency bump needs some changes.
While we can technically upload outdated versions to Debian, we would like to avoid packaging old versions of crates, since we'd have to maintain them alongside the recent versions.
Releasing an update with the latest proc-macro-hack and syn would be greatly appreciated. Thanks!
We should build the documentation with warnings as errors:
RUSTDOCFLAGS=--deny=warnings cargo doc
In staktrace/mailparse#96 it was suggested to request that this package allow trailing bits in the BASE64_MIME codec. The assumptions behind this reasoning are:
Would this project be willing to accept a pull request that makes BASE64_MIME more liberal in its handling by setting check_trailing_bits=false
? If not, that's okay, I can go back to the mailparse project and see if they're willing to handle it.
Note, I'm not associated with the project, I just personally stumbled across this problem recently, so I figured I'd start with their suggestion first.
Could you please release a new version of data_encoding_macro? I'm interested in a version that depends on syn >= 1.0.
No need to feature-flag fmt::Display behind std, as it's in core.
impl core::fmt::Display for DecodeError {
    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
        write!(f, "{} at {}", self.kind, self.position)
    }
}
While trying to build for a common no_std target, thumbv6m-none-eabi, the build failed.
Changing data-encoding-macro-internal's Cargo.toml to use the alloc feature allowed for a clean build:
data-encoding = { version = "2.3", path = "../..", default-features = false, features = ["alloc"] }
thumbv6m-none-eabi should probably be added to the CI build list for next time around :)
Would supporting the variant of base32 that Nix uses be something you would consider?
It is like LeastSignificantFirst but reversed:
let mut spec = Specification::new();
spec.symbols.push_str("0123456789abcdfghijklmnpqrsvwxyz");
spec.bit_order = BitOrder::LeastSignificantFirst;
let encoding = spec.encoding().unwrap();
// `hex!` here is the byte-literal macro from the `hex-literal` crate.
let mut actual = encoding.encode(&hex!("0839 7037 8635 6bca 59b0 f4a3 2987 eb2e 6de4 3ae8"));
unsafe { actual.as_bytes_mut() }.reverse();
assert_eq!("x0xf8v9fxf3jk8zln1cwlsrmhqvp0f88", actual);
We don't need to support old stable compilers.
I noticed commit f8cb34b. Once you release this as a "patch", you will likely get bug reports that it breaks builds for people using Rust < 1.56.0, and you'll have to yank the version and re-release it as a major version, or revert it. This change unnecessarily raises the MSRV.
Hi,
We've been using data_encoding 2.0.0-rc.2 for some time. Are there any roadblocks to release 2.0.0?
Hey!
I was reading through data-encoding and it looks awesome. Thanks for your work.
I'm investigating whether or not I can use data-encoding-macro in my own project. It looks like a perfect fit, except for one issue.
I checked the data-encoding-macro-internal dependencies and saw that it depends on syn. syn is a fairly heavy dependency that I wouldn't want to pull into my tiny library.
After quickly scanning through data-encoding-macro and data-encoding-macro-internal's source code, I'm not seeing anything that couldn't be accomplished with regular macro_rules! declarative macros.
Am I missing something? Was there a specific reason that syn was chosen? If not, are you open to not using syn in data-encoding-macro?
Thanks!
The documentation currently manually mentions which feature needs to be enabled to use a function. This could be automated with doc_auto_cfg
.
In PR #35, data-encoding = { version = "2.3", default-features = false, features = ["alloc"] } was added to lib/macro/Cargo.toml and lib/macro/internal/Cargo.toml. That's OK for me.
But in the later commit 76e21dd, default-features = false, features = ["alloc"] was removed from lib/macro/internal/Cargo.toml, which I think causes the std feature of data-encoding to leak.
Solution:
lib/macro/internal/Cargo.toml
[package]
name = "data-encoding-macro-internal"
version = "0.1.9"
authors = ["Julien Cretin <[email protected]>"]
license = "MIT"
edition = "2018"
description = "Internal library for data-encoding-macro"
readme = "README.md"
repository = "https://github.com/ia0/data-encoding"
include = ["Cargo.toml", "LICENSE", "README.md", "src/lib.rs"]
[lib]
proc-macro = true
[dependencies]
- data-encoding = { version = "2.3", path = "../.." }
+ data-encoding = { version = "2.3", path = "../..", default-features = false, features = ['alloc'] }
syn = "1"
BTW, we could add resolver = "2" to the Cargo.toml of the downstream crate (only effective on Rust 1.51+) to fix the issue (Cargo's new feature resolver), but then we need to compile data-encoding twice.
encode is an incredibly useful utility in its own right: most Linux/UNIX systems do not come with base32 or hexadecimal encoders/decoders in their repositories, or if they do, they are often written in slow interpreted languages.
While it's entirely possible to clone the repository and run make encode, it's far easier for many users (especially on, e.g., Windows) to run cargo install data-encoding --example encode for a completely platform-independent way of installing it.
Unfortunately, without adding the file to the include array in Cargo.toml, it won't be included in the published crate for installation.
The following code assert()s unexpectedly. The solution so far is to strip the padding and use the decode_nopad_mut variant.
#[test]
fn test_b64_raw() {
    let a = [0u8; 8];
    let a64 = ::data_encoding::base64::encode(&a);
    assert_eq!("AAAAAAAAAAA=", a64);
    let mut b = [0u8; 8];
    ::data_encoding::base64::decode_mut(a64.as_bytes(), &mut b);
    assert_eq!(a, b);
}
When using spec.check_trailing_bits = false; with the base32 spec, I would expect any string of base32 characters to decode successfully, with any extra bits ignored. But this is not the case: for strings with a length congruent to 1, 3, or 6 (mod 8), a DecodeError is returned.
use data_encoding::{BASE32_NOPAD, DecodeError};

fn main() {
    println!("{:?}", decode(b"77777777"));
    println!("{:?}", decode(b"777777777"));
    println!("{:?}", decode(b"7777777777"));
}

fn decode(bytes: &[u8]) -> Result<Vec<u8>, DecodeError> {
    let mut spec = BASE32_NOPAD.specification();
    spec.check_trailing_bits = false;
    spec.encoding().unwrap().decode(bytes)
}
The above code produces the following output:
Ok([255, 255, 255, 255, 255])
Err(DecodeError { position: 8, kind: Length })
Ok([255, 255, 255, 255, 255, 255])
I understand that by the canonical definition in the RFC the second input is invalid base32, but so is the third, and I would expect them to be treated the same when it comes to check_trailing_bits. So for this example, I would expect the second line of the output to be identical to the first.
Is the current behavior intentional, or is it simply a matter of a length check being done first, without concern for check_trailing_bits?
For reference, Google Authenticator for Android follows my expectation in this regard, as does Authy for iOS (in my own testing). In the linked comment you can see that a "string of sixteen 7s ("7...7") and seventeen 7s both decode to the same byte array", which is not the case here.
We currently have custom unsafe functions to split input and output into chunks. Once slice_as_chunks is stable, we can use it.
I am trying to figure out how I'd encode into a String, but it doesn't seem to be possible without using unsafe. Is this intentional? Is there something I'm overlooking? Thanks.