ia0 / data-encoding
Efficient and customizable data-encoding functions in Rust
Home Page: https://data-encoding.rs/
License: MIT License
use data_encoding::Specification;

fn main() {
    let mut hex = {
        let mut spec = Specification::new();
        spec.symbols.push_str("0123456789abcdef");
        spec.encoding().unwrap()
    };
    hex.0 = (&[0xF6; 514][..]).into(); // created with non-ascii symbols
    let invalid_string = hex.encode(&[1, 2, 3, 4]);
    println!("created invalid string");
    println!("{:?}", invalid_string); // will panic
}
This does involve tweaking a field of the Encoding, but that's a public field, and not even doc(hidden). At the very least it should be doc(hidden).
(At a meta level, this code would probably benefit from some kind of internal #[repr(transparent)] pub struct Ascii(u8); type so that it is very clear that the invariants are being upheld.)
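A minimal sketch of such a wrapper (hypothetical; this type is not part of the crate, and the names are illustrative only):

```rust
/// Hypothetical invariant-carrying symbol type, as suggested above.
/// `#[repr(transparent)]` keeps the layout identical to `u8`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Ascii(u8);

impl Ascii {
    /// Only ASCII bytes can construct an `Ascii`, so the invariant is
    /// checked at the boundary instead of being assumed everywhere.
    pub fn new(byte: u8) -> Option<Ascii> {
        byte.is_ascii().then_some(Ascii(byte))
    }

    pub fn get(self) -> u8 {
        self.0
    }
}

fn main() {
    assert!(Ascii::new(b'a').is_some());
    assert!(Ascii::new(0xF6).is_none()); // the non-ASCII byte from the example above
}
```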
This crate is currently used by uutils/coreutils for programs like basenc, base32, and base64, and I'm looking for ways to optimize them. I've experimented with an AVX2 implementation of simple Base32 encoding, and the results were promising. To this end, I'm looking to write optimized SIMD implementations of encoding and decoding for individual format specifications. Is this something data-encoding can support, given its current API? Would such implementation (and possibly API) changes be welcome? Or should I write my own crate instead?
Would you consider putting the bits that require std behind a feature flag?
Thanks for a great crate.
This helps distributions to package the data-encoding crate. Thanks!
Sometimes it would be useful to encode directly into a std::fmt::Write, because it would simplify Display impls.
impl fmt::Display for SomeBytes {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        HEXLOWER.encode_into(&self.0, f)
    }
}
In other situations (e.g. network protocols), it's useful to write into a std::io::Write impl.
Would these use cases be something that could be added to data-encoding? Or is this already possible through a method I'm missing somehow, without using an intermediate buffer?
Hello, I just ran cargo-bloat on my project, as I'm working on reducing compile times and binary size can be a proxy for compile time, and found that data_encoding::Encoding::encode_mut is taking up 76.9kb, making it the third largest function in my entire project (which pulls in reqwest, tungstenite, etc.).
If code size isn't a concern for this project, I'm entirely fine with this being closed, but thought that you might want a heads up that this library is somewhat code-size heavy.
For example: I need to generate an MD5 digest as hex. rustc_serialize::hex generates the hex in lowercase (a-f). Will data-encoding add this?
Currently, decode and decode_mut always reject inputs of invalid length. It would be convenient to control this and permit decoding the longest valid prefix. This is currently possible to do.
For encodings that don't ignore characters, first truncate the input to the longest valid prefix, then call decode or decode_mut as usual:
fn truncate<'a>(base: &Encoding, input: &'a [u8]) -> &'a [u8] {
    match base.decode_len(input.len()) {
        Ok(_) => input,
        Err(e) => &input[..e.position],
    }
}
For encodings that ignore characters, using the following functions:
fn decode_prefix(base: &Encoding, input: &[u8]) -> Result<Vec<u8>, DecodeError> {
    let mut output = vec![0; base.decode_len(input.len()).unwrap()];
    let len = decode_prefix_mut(base, input, &mut output)?;
    output.truncate(len);
    Ok(output)
}

fn decode_prefix_mut(
    base: &Encoding,
    input: &[u8],
    output: &mut [u8],
) -> Result<usize, DecodeError> {
    match base.decode_mut(input, output) {
        Ok(len) => Ok(len),
        Err(DecodePartial { written, error, .. }) => {
            if error.kind == DecodeKind::Length {
                Ok(written)
            } else {
                Err(error)
            }
        }
    }
}
See #28 for existing user feedback.
Hi, first of all, thanks for this tremendous crate!
It's quite a common operation to send fixed-length base64(url)-encoded HMACs, tokens, etc. in HTTP cookies and URLs. It's also common to strip the padding characters away because they often cause problems (interfering with reserved characters in URLs and HTTP headers). Since the message is fixed-length, the padding isn't needed anyway.
It would be neat to add a helper method for decoding padding-stripped base64 for known-length messages. At the moment it's possible to manually add the padding using the output of encode_len as a target length, but this is problematic, especially if one strives to copy as little as possible: if you're passed an immutable slice, you have to copy it just to be able to add the padding.
You can subscribe to this issue to be notified when 2.5.0 is released (some time before 2023-11-26).
This issue lists breaking changes that could be worth doing. Since they would bump the major version, doing as many of them as possible simultaneously would be best.
static instead of const
Using const prevents users from taking long-term references, which is needed for Encoder. The constants should instead be statics.
const fn
This crate is mostly made of pure terminating functions (e.g. encode_len(), encode_mut(), decode_len(), and decode_mut()). Those functions should be const fn. This would replace the data-encoding-macro library, which currently works around the const fn limitations of Rust.
This is blocked by const fn support in Rust: meta tracking issue.
Encoding implementation
It should now be possible to make the Encoding implementation private and provide an unsafe const fn for data-encoding-macro to use.
MaybeUninit
Most output parameters are not read from, but they currently use &mut [u8], which requires them to be initialized. Those functions could instead take &mut [MaybeUninit<u8>].
This is mostly blocked by maybe_uninit_slice.
Even if not necessary, bumping to the latest stable might give access to features useful for future minor or patch versions. Strictly speaking, bumping the MSRV is not a breaking change (discussion), but it's nice to do during a major bump. Per the discussion in #77, it would be nice to keep the MSRV at the rustc in Debian oldstable; this should cover the most common distributions. Note also the trick of using syn = ">= 2, < 4" when multiple major versions of a dependency are supported.
Currently only the type-erased API with internal dispatch is exposed. This may prevent dead-code elimination from triggering when encodings are not statically known, resulting in a bigger binary. As an alternative, the internal polymorphic API could be exposed.
This might be blocked by good const generics support in Rust.
Hi, I'm the author of the binary_macros crate: https://github.com/golddranks/binary_macros
It's a simple wrapper around the conversions implemented in this crate. The use case is to enable using hexadecimal, base64, etc. in settings files and environment variables that are compiled in statically. I think there would be some value in merging the macros into this crate, since the discoverability of binary_macros is low. What do you think?
Would it make sense for this library to include base58 encoding and decoding?
Currently I am using the base58 crate, which works fine. However, since data-encoding already supports many useful encoders and decoders, adding base58 could be a great addition.
Also, base58 has not been updated for a couple of years, which is always somewhat worrisome, although it probably doesn't need much change anyway.
Required tasks:
- static instead of const for pre-defined encodings
- Make the Encoding implementation private
- _uninit() versions to support MaybeUninit outputs
- True and False (also useful to avoid parameter ordering issues)
- v3-preview support in data-encoding-macro
- v3-preview
Optional tasks:
When operating on fixed-length hashes like SRI, it's great to use stack arrays as buffers. But only constants and const fns can be used in array length expressions, which results in manual magic numbers like let buf = [0u8; 44] in code (the base64-encoded length of a SHA-256 hash). With const fns, we could instead write let buf = [0u8; BASE64.encode_len(32)]; to make this clear.
Hi. I need help with encoding bytes as RFC 8949. For example, a unit test expects the value 1903e8, while the default HEXLOWER returns 1900000000000003e8.
Would appreciate any help.
Today, calling encode_append always writes the padding into the chunk. It's not clear to me if there is a way, without buffering ahead of time, to encode without padding and then add the padding once at the end. Motivating use case:
fn basic_auth(username: &str, password: Option<&str>) -> String {
    let mut buf = "Basic ".to_string();
    BASE64.encode_append(username.as_bytes(), &mut buf);
    BASE64.encode_append(b":", &mut buf);
    if let Some(password) = password {
        BASE64.encode_append(password.as_bytes(), &mut buf);
    }
    buf
}
Today this creates an unexpected HTTP basic auth header, as padding is added three times. While the result is probably technically valid, I feel uncomfortable emitting this type of base64.
error message:
Compiling data-encoding-macro-internal v0.1.1 (file:///H:/work/rust_projects/forks/data-encoding/lib/macro/internal)
Compiling pest_derive v1.0.7
Compiling derive-error-chain v0.10.1
Compiling libflate v0.1.14
Compiling num-rational v0.1.42
error[E0599]: no variant named `Op` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:36:14
|
36 | Some(TokenTree::Op(x)) if x.op() == op => (),
| ^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error[E0599]: no variant named `Term` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:53:13
|
53 | TokenTree::Term(term) => term.as_str().to_string(),
| ^^^^^^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:52:13
|
52 | let key = match key {
| ^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: all local variables must have a statically known size
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:58:21
|
58 | None => panic!("expected value for {}", key),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: required by `std::fmt::ArgumentV1::new`
= note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:62:21
|
62 | let _ = map.insert(key, value);
| ^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:50:19
|
50 | let mut map = HashMap::new();
| ^^^^^^^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: required by `<std::collections::HashMap<K, V>>::new`
error[E0308]: mismatched types
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:64:5
|
49 | fn parse_map(mut tokens: IntoIter) -> HashMap<String, TokenTree> {
| -------------------------- expected `std::collections::HashMap<std::string::String, proc_macro::TokenTree>` because of return type
...
64 | map
| ^^^ expected struct `std::string::String`, found str
|
= note: expected type `std::collections::HashMap<std::string::String, _, _>`
found type `std::collections::HashMap<str, _, _>`
error[E0599]: no variant named `Term` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:150:9
|
150 | TokenTree::Term(term) if term.as_str() == "None" => return None,
| ^^^^^^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error[E0599]: no variant named `Term` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:173:9
|
173 | TokenTree::Term(term) => term,
| ^^^^^^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error[E0599]: no variant named `Term` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:196:9
|
196 | TokenTree::Term(term) if term.as_str() == msb => BitOrder::MostSignificantFirst,
| ^^^^^^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error[E0599]: no variant named `Term` found for type `proc_macro::TokenTree` in the current scope
--> H:\work\rust_projects\forks\data-encoding\lib\macro\internal\src\lib.rs:197:9
|
197 | TokenTree::Term(term) if term.as_str() == lsb => BitOrder::LeastSignificantFirst,
| ^^^^^^^^^^^^^^^^^^^^^ variant not found in `proc_macro::TokenTree`
error: aborting due to 11 previous errors
Some errors occurred: E0277, E0308, E0599.
For more information about an error, try `rustc --explain E0277`.
Compiling enum_primitive v0.1.1
error: Could not compile `data-encoding-macro-internal`.
warning: build failed, waiting for other jobs to finish...
error: build failed
my rustup:
Default host: x86_64-pc-windows-gnu
installed toolchains
--------------------
nightly-2018-04-06-i686-pc-windows-gnu
nightly-2018-04-16-i686-pc-windows-gnu
nightly-2018-04-27-i686-pc-windows-gnu
nightly-2018-05-05-i686-pc-windows-gnu
nightly-2018-06-01-i686-pc-windows-gnu (default)
active toolchain
----------------
nightly-2018-06-01-i686-pc-windows-gnu (default)
rustc 1.28.0-nightly (1ffb32147 2018-05-31)
we can install the nightly version by:
rustup install nightly-2018-06-01-i686-pc-windows-gnu
rustup default nightly-2018-06-01-i686-pc-windows-gnu
error[E0432]: unresolved import `proc_macro::TokenNode`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:22:27
|
22 | use proc_macro::{Spacing, TokenNode, TokenStream, TokenTree, TokenTreeIter};
| ^^^^^^^^^ no `TokenNode` in the root. Did you mean to use `TokenTree`?
error[E0432]: unresolved import `proc_macro::TokenTreeIter`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:22:62
|
22 | use proc_macro::{Spacing, TokenNode, TokenStream, TokenTree, TokenTreeIter};
| ^^^^^^^^^^^^^ no `TokenTreeIter` in the root. Did you mean to use `TokenTree`?
Compiling libflate v0.1.14
error[E0574]: expected struct, variant or union type, found enum `TokenTree`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:34:14
|
34 | Some(TokenTree { span: _, kind: TokenNode::Op(x, Spacing::Alone) })
| ^^^^^^^^^ not a struct, variant or union type
error[E0432]: unresolved import `syntax::ast::SpannedIdent`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\pear_codegen-0.0.12\src\lib.rs:27:56
|
27 | use syntax::ast::{ItemKind, MetaItem, FnDecl, PatKind, SpannedIdent};
| ^^^^^^^^^^^^ no `SpannedIdent` in `ast`
Compiling flate2 v1.0.1
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:52:13
|
52 | let key = match key.kind {
| ^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: all local variables must have a statically known size
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:58:21
|
58 | None => panic!("expected value for {}", key),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: required by `std::fmt::ArgumentV1::new`
= note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:62:21
|
62 | let _ = map.insert(key, value);
| ^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
error[E0277]: the trait bound `str: std::marker::Sized` is not satisfied
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\data-encoding-macro-internal-0.1.1\src\lib.rs:50:19
|
50 | let mut map = HashMap::new();
| ^^^^^^^^^^^^ `str` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `str`
= note: required by `<std::collections::HashMap<K, V>>::new`
error: aborting due to 7 previous errors
Some errors occurred: E0277, E0432, E0574.
For more information about an error, try `rustc --explain E0277`.
error[E0609]: no field `identifier` on type `syntax::ast::PathSegment`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\pear_codegen-0.0.12\src\lib.rs:136:59
|
136 | let penultimate = path.segments[num_segs - 2].identifier.name.as_str();
| ^^^^^^^^^^ unknown field
|
= note: available fields are: `ident`, `parameters`
error[E0609]: no field `identifier` on type `syntax::ast::PathSegment`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\pear_codegen-0.0.12\src\lib.rs:143:48
|
143 | let last = path.segments[num_segs - 1].identifier.name.as_str();
| ^^^^^^^^^^ unknown field
|
= note: available fields are: `ident`, `parameters`
error[E0609]: no field `identifier` on type `syntax::ast::PathSegment`
--> E:\rustdir\registry\src\mirrors.xx.com-27933024a10d4e91\pear_codegen-0.0.12\src\lib.rs:152:40
|
152 | let first_ident = path.segments[0].identifier.name.as_str();
| ^^^^^^^^^^ unknown field
|
= note: available fields are: `ident`, `parameters`
error: aborting due to 4 previous errors
Some errors occurred: E0432, E0609.
For more information about an error, try `rustc --explain E0432`.
error: Could not compile `data-encoding-macro-internal`.
warning: build failed, waiting for other jobs to finish...
error: Could not compile `pear_codegen`.
warning: build failed, waiting for other jobs to finish...
error: build failed
While the data-encoding
library itself seems to compile fine, I just tried cargo install data-encoding-bin
on my win10 workstation, and got this output:
$ cargo install data-encoding-bin
Updating registry `https://github.com/rust-lang/crates.io-index`
Installing data-encoding-bin v0.1.0
Compiling data-encoding v2.0.0-rc.1
Compiling getopts v0.2.14
Compiling data-encoding-bin v0.1.0
error[E0433]: failed to resolve. Could not find `unix` in `os`
--> C:\Users\Joey\.cargo\registry\src\github.com-1ecc6299db9ec823\data-encoding-bin-0.1.0\src\main.rs:149:31
|
149 | unsafe { <File as std::os::unix::io::FromRawFd>::from_raw_fd(1) });
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Could not find `unix` in `os`
error[E0277]: the trait bound `std::io::Write: std::marker::Sized` is not satisfied
--> C:\Users\Joey\.cargo\registry\src\github.com-1ecc6299db9ec823\data-encoding-bin-0.1.0\src\main.rs:148:18
|
148 | output = Box::new(
| ^^^^^^^^ `std::io::Write` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `std::io::Write`
= note: required by `<std::boxed::Box<T>>::new`
error: aborting due to previous error(s)
error: failed to compile `data-encoding-bin v0.1.0`, intermediate artifacts can be found at `C:\Users\Joey\AppData\Local\Temp\cargo-install.UUs4JxW8xF3Y`
Caused by:
Could not compile `data-encoding-bin`.
To learn more, run the command again with --verbose.
Hello!
I ran into this crate while packaging Rust software for Debian, and there are some outdated dependencies, specifically:
The proc-macro-hack dependency can be bumped without any further changes, but the syn dependency bump needs some changes.
While we can technically upload outdated versions to Debian, we would like to avoid packaging old versions of crates, since we'd have to maintain them alongside the recent versions.
Releasing an update with the latest proc-macro-hack and syn would be greatly appreciated. Thanks!
We should build the documentation with warnings as errors:
RUSTDOCFLAGS=--deny=warnings cargo doc
In staktrace/mailparse#96 it was suggested to request that this package allow trailing bits in the BASE64_MIME codec. The assumptions behind this reasoning are:
Would this project be willing to accept a pull request that makes BASE64_MIME more liberal in its handling by setting check_trailing_bits=false
? If not, that's okay, I can go back to the mailparse project and see if they're willing to handle it.
Note, I'm not associated with the project, I just personally stumbled across this problem recently, so I figured I'd start with their suggestion first.
Could you please release a new version of data_encoding_macro? I'm interested in a version that depends on syn >= 1.0.
No need to feature-flag fmt::Display behind std, as it's in core.
impl core::fmt::Display for DecodeError {
    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
        write!(f, "{} at {}", self.kind, self.position)
    }
}
While trying to build for a common no_std target, thumbv6m-none-eabi, the build failed.
Changing data-encoding-macro-internal's Cargo.toml to use the alloc feature allowed for a clean build:
data-encoding = { version = "2.3", path = "../..", default-features = false, features = ["alloc"] }
thumbv6m-none-eabi should probably be added to the CI build list for next time around :)
Would supporting the variant of base32 that Nix uses be something you would consider?
It is like LeastSignificantFirst but reversed:
let mut spec = Specification::new();
spec.symbols.push_str("0123456789abcdfghijklmnpqrsvwxyz");
spec.bit_order = BitOrder::LeastSignificantFirst;
let encoding = spec.encoding().unwrap();
// `hex!` here is the byte-literal macro from the `hex-literal` crate.
let mut actual = encoding.encode(&hex!("0839 7037 8635 6bca 59b0 f4a3 2987 eb2e 6de4 3ae8"));
unsafe { actual.as_bytes_mut() }.reverse();
assert_eq!("x0xf8v9fxf3jk8zln1cwlsrmhqvp0f88", actual);
We don't need to support old stable compilers.
I noticed commit f8cb34b. Once you release this as a "patch", you will likely get bug reports that it breaks builds for people using Rust < 1.56.0, and you'll have to yank the version and re-release it as a major version, or revert it. This change unnecessarily raises the MSRV.
Hi,
We've been using data_encoding 2.0.0-rc.2 for some time. Are there any roadblocks to release 2.0.0?
Hey!
I was reading through data-encoding and it looks awesome. Thanks for your work.
I'm investigating whether or not I can use data-encoding-macro in my own project. It looks like a perfect fit, except for one issue.
I checked the data-encoding-macro-internal dependencies and saw that it depends on syn. syn is a fairly heavy dependency that I wouldn't want to pull into my tiny library.
After quickly scanning through data-encoding-macro and data-encoding-macro-internal's source code, I'm not seeing anything that couldn't be accomplished with regular macro_rules! declarative macros.
Am I missing something? Was there a specific reason that syn was chosen? If not, are you open to not using syn in data-encoding-macro?
Thanks!
The documentation currently manually mentions which feature needs to be enabled to use a function. This could be automated with doc_auto_cfg
.
In PR #35, data-encoding = { version = "2.3", default-features = false, features = ["alloc"] } was added to lib/macro/Cargo.toml and lib/macro/internal/Cargo.toml. That's OK for me.
But in the later commit 76e21dd, default-features = false, features = ["alloc"] was removed from lib/macro/internal/Cargo.toml, which I think causes the std feature of data-encoding to leak.
Solution:
lib/macro/internal/Cargo.toml
[package]
name = "data-encoding-macro-internal"
version = "0.1.9"
authors = ["Julien Cretin <[email protected]>"]
license = "MIT"
edition = "2018"
description = "Internal library for data-encoding-macro"
readme = "README.md"
repository = "https://github.com/ia0/data-encoding"
include = ["Cargo.toml", "LICENSE", "README.md", "src/lib.rs"]
[lib]
proc-macro = true
[dependencies]
- data-encoding = { version = "2.3", path = "../.." }
+ data-encoding = { version = "2.3", path = "../..", default-features = false, features = ['alloc'] }
syn = "1"
BTW, we could add resolver = "2" to the Cargo.toml of the downstream crate (only effective on Rust 1.51+) to fix the issue (Cargo's new feature resolver), but then we need to compile data-encoding twice.
encode is an incredibly useful utility in its own right: most Linux/UNIX systems do not come with base32 or hexadecimal encoders/decoders in their repositories, or if they do, they are often written in slow interpreted languages.
While it's entirely possible to clone the repository and run make encode, it's far easier for many users (especially on, e.g., Windows) to run cargo install data-encoding --example encode for a completely platform-independent way of installing it.
Unfortunately, without adding the file to the include array in Cargo.toml, it won't be included in the published crate for installation.
The following code assert()s unexpectedly. The solution so far is to strip the padding and use the decode_nopad_mut variant.
#[test]
fn test_b64_raw() {
    let a = [0u8; 8];
    let a64 = ::data_encoding::base64::encode(&a);
    assert_eq!("AAAAAAAAAAA=", a64);
    let mut b = [0u8; 8];
    ::data_encoding::base64::decode_mut(a64.as_bytes(), &mut b);
    assert_eq!(a, b);
}
When using spec.check_trailing_bits = false; with the base32 spec, I would expect any string of base32 characters to decode successfully, with any extra bits ignored. But this is not the case: for strings with a length congruent to 1, 3, or 6 (mod 8), a DecodeError is returned.
use data_encoding::{BASE32_NOPAD, DecodeError};

fn main() {
    println!("{:?}", decode(b"77777777"));
    println!("{:?}", decode(b"777777777"));
    println!("{:?}", decode(b"7777777777"));
}

fn decode(bytes: &[u8]) -> Result<Vec<u8>, DecodeError> {
    let mut spec = BASE32_NOPAD.specification();
    spec.check_trailing_bits = false;
    spec.encoding().unwrap().decode(bytes)
}
The above code produces the following output:
Ok([255, 255, 255, 255, 255])
Err(DecodeError { position: 8, kind: Length })
Ok([255, 255, 255, 255, 255, 255])
I understand that by the canonical definition in the RFC the second input is invalid base32, but so is the third, and I would expect them to be treated the same when it comes to check_trailing_bits. So for this example, I would expect the second line of the output to be identical to the first.
Is the current behavior intentional, or is it simply a matter of a length check being done first, without concern for check_trailing_bits?
For reference, Google Authenticator for Android follows my expectation in this regard, as does Authy for iOS (in my own testing). In the linked comment you can see that a "string of sixteen 7s ("7...7") and seventeen 7s both decode to the same byte array", which is not the case here.
We currently have custom unsafe functions to split input and output into chunks. Once slice_as_chunks is stable, we can use it.
I am trying to figure out how I'd encode into a String, but it doesn't seem to be possible without using unsafe. Is this intentional? Is there something I'm overlooking? Thanks.