
hematite_nbt's Introduction


This repository contains the Hematite project's standalone nbt crate for working with Minecraft's Named Binary Tag (NBT) format.

This is not the only NBT-related crate available, but it has some notable features:

  • Full support for serializing and deserializing types via Serde. This means that you can read and write the NBT binary format of any struct annotated with the standard #[derive(Serialize, Deserialize)] traits (provided it actually has a valid NBT representation).

  • An API that attempts to differentiate between complete and partial NBT objects via nbt::Blob and nbt::Value. Only complete objects can be serialized.

  • Support for the TAG_Long_Array data introduced in Minecraft 1.12.

  • Support for the modified UTF-8 encoding used by the vanilla Minecraft client.
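To illustrate the last point: Java's "modified UTF-8" differs from standard UTF-8 in a few ways, most visibly by encoding the NUL character as the overlong two-byte sequence 0xC0 0x80, which a strict UTF-8 decoder rejects. A minimal std-only sketch (not this crate's actual decoder) that normalizes that case:

```rust
// Sketch: normalize modified UTF-8's overlong NUL (0xC0 0x80) before
// handing the bytes to a standard UTF-8 validator.
fn decode_modified_utf8(bytes: &[u8]) -> Result<String, std::string::FromUtf8Error> {
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == 0xC0 && bytes.get(i + 1) == Some(&0x80) {
            out.push(0x00); // modified UTF-8 encodes NUL as C0 80
            i += 2;
        } else {
            out.push(bytes[i]); // all other bytes pass through unchanged
            i += 1;
        }
    }
    String::from_utf8(out)
}

fn main() {
    let input = [b'a', 0xC0, 0x80, b'b'];
    assert_eq!(decode_modified_utf8(&input).unwrap(), "a\0b");
}
```

A full implementation would also handle CESU-8-style surrogate pairs for characters outside the Basic Multilingual Plane.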

License

Licensed under the terms of the MIT license.

hematite_nbt's People

Contributors

atheriel, bvssvni, caelunshun, ejmount, fenhl, freax13, iaiao, potpourri, samlich, schuwi, stackdoubleflow, williewillus


hematite_nbt's Issues

ByteArray is not serialized correctly

If you serialize a ByteArray and inspect the generated binary, it'll be serialized as a List (0x09) of Bytes (0x01), which is incorrect, since it should actually be a ByteArray (0x07).
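For reference, the two payloads really do differ on the wire (the tag id itself, 0x07 vs 0x09, is written in the enclosing compound's tag header). A sketch of both encodings, with hypothetical helper names, per the NBT spec:

```rust
// TAG_Byte_Array payload: big-endian i32 length, then the raw bytes.
fn write_byte_array(payload: &[i8], out: &mut Vec<u8>) {
    out.extend_from_slice(&(payload.len() as i32).to_be_bytes());
    out.extend(payload.iter().map(|&b| b as u8));
}

// TAG_List payload: element tag id (TAG_Byte = 0x01), big-endian i32
// length, then each element's payload.
fn write_list_of_bytes(payload: &[i8], out: &mut Vec<u8>) {
    out.push(0x01);
    out.extend_from_slice(&(payload.len() as i32).to_be_bytes());
    out.extend(payload.iter().map(|&b| b as u8));
}

fn main() {
    let mut a = Vec::new();
    write_byte_array(&[1, 2, 3], &mut a);
    assert_eq!(a, [0, 0, 0, 3, 1, 2, 3]);

    let mut l = Vec::new();
    write_list_of_bytes(&[1, 2, 3], &mut l);
    assert_eq!(l, [0x01, 0, 0, 0, 3, 1, 2, 3]);
}
```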

Support ordered NBT compounds

At the moment the nbt::Blob and nbt::Value::Compound types are backed by hashmaps, which do not preserve order. I haven't found any evidence that this violates the NBT specification, but nonetheless it is annoying in practice.

There are a couple of alternative maps we could use instead, including:

  1. The IndexMap type provided by the indexmap crate, also used by the serde_json crate to achieve a similar goal.
  2. The LinkedHashMap type as suggested in #40. This may require some additional effort to maintain Serde support.
  3. The BTreeMap type from the standard library.

Of these, (1) seems to be the closest to what users might expect, since it preserves order based on insertion.
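A std-only sketch of the trade-off with option (3): BTreeMap gives a deterministic order, but it is sorted key order rather than the insertion order that IndexMap would preserve:

```rust
use std::collections::BTreeMap;

fn main() {
    // Inserted in "file order", as the keys might appear in an NBT compound.
    let mut m = BTreeMap::new();
    m.insert("zPos", 1);
    m.insert("xPos", 2);

    // BTreeMap iterates in sorted key order, not insertion order.
    let keys: Vec<_> = m.keys().copied().collect();
    assert_eq!(keys, ["xPos", "zPos"]);
}
```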

Serde support not on crates.io

I'd like to depend on the current version of this crate in my crate, which I plan to publish on crates.io. Will the current version be published soon or should I add a git submodule?

The vanilla client will allow non-utf8 strings

I'm dealing with a player .dat file containing a book with non-utf8 content, generated by the vanilla client.

I don't know under what circumstances it happens, but apparently the vanilla client may emit non-UTF-8 data.

It'd be nice to somehow allow parsing this anyway.

Deserialization of Blobs using serde confuses types

Consider:

use nbt::{Blob, Value}; 

fn main() {
    let mut blob = Blob::new();
    blob.insert("test", Value::Int(123)).unwrap();

    let mut bytes = Vec::new();
    nbt::to_writer(&mut bytes, &blob, None).unwrap();
    println!("{:?}", bytes);

    let de_blob: Blob = nbt::from_reader(&bytes[..]).unwrap();
    println!("{:?}", de_blob);
}

Output:

[10, 0, 0, 3, 0, 4, 116, 101, 115, 116, 0, 0, 0, 123, 0]
Blob { title: "", content: {"test": Byte(123)} }

Serialization seems to work -- that 3 in the 4th position is TAG_Int as you'd expect, but it gets deserialized into a Value::Byte (using larger values will give Shorts and then Ints).

Using Blob::from_reader works, and gives Blob { title: "", content: {"test": Int(123)} }.

I think this is caused by using serde(untagged) for the Value enum, but I really don't understand serde well enough to know how to fix this.
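Walking the emitted bytes by hand confirms the serialized form is correct (assuming the standard Java-edition NBT layout: root compound tag, empty name, then each entry as tag id, big-endian u16 name length, name bytes, payload):

```rust
fn main() {
    // The bytes printed by the example above.
    let bytes: [u8; 15] = [10, 0, 0, 3, 0, 4, 116, 101, 115, 116, 0, 0, 0, 123, 0];

    assert_eq!(bytes[0], 0x0A); // TAG_Compound root
    assert_eq!(bytes[3], 0x03); // first entry is TAG_Int, as expected

    let name = std::str::from_utf8(&bytes[6..10]).unwrap();
    assert_eq!(name, "test");

    // Payload is a big-endian i32.
    let value = i32::from_be_bytes([bytes[10], bytes[11], bytes[12], bytes[13]]);
    assert_eq!(value, 123);
}
```

So the bug is confined to the deserialization path: the reader sees the TAG_Int payload but surfaces it as the smallest fitting integer type.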

`to_writer` and `from_reader` should not require `Sized`

The generic bounds for to_writer and from_reader do not have ?Sized so they implicitly require Sized.

If I want to use a DST (mainly dyn std::io::Write) in this case, it gets a bit ugly because I have to double-borrow (&mut &mut dyn std::io::Write).

From my cursory inspection, there's nothing that requires it to be Sized. If I'm mistaken please tell me.
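A sketch of the relaxed bound (hypothetical signature, not the crate's current API) showing why `?Sized` removes the double-borrow:

```rust
use std::io::{self, Write};

// With `W: ?Sized`, `W` may be `dyn Write`, so a `&mut dyn Write`
// can be passed directly instead of `&mut &mut dyn Write`.
fn to_writer<W: Write + ?Sized>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    writer.write_all(payload)
}

fn main() -> io::Result<()> {
    let mut buf: Vec<u8> = Vec::new();
    let writer: &mut dyn Write = &mut buf;
    to_writer(writer, b"abc")?; // no double-borrow needed
    assert_eq!(buf, b"abc");
    Ok(())
}
```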

Papercut: nbt-serde always writes List tags as opposed to ByteArray/IntArray tags

Currently, an &[i8] or &[i32] always results in nbt-serde writing a List tag. This is undesirable when trying to work with chunk files, as Minecraft only accepts IntArray/ByteArray. Another slightly unrelated bug is that serialize_bytes returns an error instead of writing a ByteArray tag. Fixing both of these issues would make nbt-serde usable with Minecraft files. However, while it should still be possible to write a List tag of i8/i32 in some way, IntArray/ByteArray should be the default.

Example of a papercut:

#[derive(Debug, Serialize, Deserialize)]
pub struct ChunkRoot {
	#[serde(rename="DataVersion")]
	pub version: i32,
	#[serde(rename="Level")]
	pub chunk: Chunk
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Chunk {
	#[serde(rename="xPos")]
	pub x: i32,
	#[serde(rename="zPos")]
	pub z: i32,
	#[serde(rename="LastUpdate")]
	pub last_update: i64,
	#[serde(rename="LightPopulated")]
	pub light_populated: bool,
	#[serde(rename="TerrainPopulated")]
	pub terrain_populated: bool,
	#[serde(rename="V")]
	pub v: i8,
	#[serde(rename="InhabitedTime")]
	pub inhabited_time: i64,
	#[serde(rename="Biomes")]
	pub biomes: Vec<i8>,
	#[serde(rename="HeightMap")]
	pub heightmap: Vec<i32>,
	#[serde(rename="Sections")]
	pub sections: Vec<Section>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Section {
	#[serde(rename="Y")]
	pub y: i8,
	#[serde(rename="Blocks")]
	pub blocks: Vec<i8>,
	#[serde(rename="Add")]
	pub add: Option<Vec<i8>>,
	#[serde(rename="Data")]
	pub data: Vec<i8>,
	#[serde(rename="BlockLight")]
	pub block_light: Vec<i8>,
	#[serde(rename="SkyLight")]
	pub sky_light: Vec<i8>
}

broken_chunk.zip

Inhomogeneous values of Value::List when reading chunk data

When loading nbt data from a chunk, various lists of numbers such as BlockStates are returned as inhomogeneous lists of various integer sizes. This seems to cause problems when trying to save these values back and the result causes minecraft to reset the chunk.

Reading the "specification" for NBT, it seems like inhomogeneous lists shouldn't even be representable, since the tag that indicates the type of the list's values is only given once for the entire list. I don't know if this is some undocumented feature or if it is what is causing the chunk resets, but it seemed weird to me.

I've attached an example of chunk nbt data that seems to have this problem.

original_bytes.nbt.txt

It's base64 encoded because github doesn't seem to want to let me upload a binary file. Decode with:

base64 -d original_bytes.nbt.txt > original_bytes.nbt

Here is code I used to check for inhomogeneous lists.

use nbt::Value;
use std::collections::HashMap;

pub fn check_value(value: &Value) {
    match value {
        Value::List(l) => check_list(l),
        Value::Compound(c) => check_compound(c),
        _ => {} // scalar and array tags have no nested values to check
    }
}

pub fn check_compound(c: &HashMap<String, Value>) {
    for (key, value) in c.iter() {
        println!("KEY: {} => {}", key, value.tag_name());
        check_value(value);
    }
}

fn check_list(l: &[Value]) {
    if let Some(first) = l.first() {
        let tag = first.id();
        for v in l.iter() {
            if v.id() != tag {
                panic!("NBT lists cannot be inhomogeneous");
            }
        }
    }
}

Bool deserialization errors for internally-tagged enum variants

I have a patch with a unit test demonstrating this issue (the test internal_variant_bool fails at deserialization): aramperes@0056f45

thread 'internal_variant_bool' panicked at 'NBT deserialization.: Serde("invalid type: integer `1`, expected a boolean")', src/libcore/result.rs:999:5

Context: the internally-tagged enum representation is a serde feature that selects the variant based on a field inside the struct itself. In the patch, I am serializing and deserializing the following "variants" (represented as JSON here):

// internal_variant_string
{
  "data": {
    "variant": "String",
    "data": "test"
  }
}

// internal_variant_bool
{
  "data": {
    "variant": "Bool",
    "data": true
  }
}

There seems to be an incompatibility between this serde feature and hematite_nbt's bool deserialization mechanism. The error message shows that it is happening at the serde-level, which makes me think the deserialize_bool in nbt::de is not executed in this particular test case.

An easy work-around is to use u8 instead of bool for the BoolVariant struct.

Cargo fmt, Clippy, deprecation

When compiling the project rustc emits a lot of warnings. Also the code format seems inconsistent with cargo fmt.
I would suggest running cargo fmt, cargo clippy and fixing the deprecated stuff.

I might get on this myself in the next few days. Just wanted to hear if there is any reason that speaks against this :)

Serializing Byte Arrays With NBT Serde

I'm using this crate to enable exporting to the Litematica schematic format, and I've figured everything out except how to create blob array data (I'm not sure of the proper name).

In the screenshot attached to the original issue (not reproduced here), the BlockStates field is what I'm interested in serializing to. PreviewImageData is the same type, but with integers rather than longs. Please note I'm not asking how these fields actually work, but rather how to serialize to their types from a Rust struct using the NBT Serde functionality.

Thank you!

Little-endian support?

The Minecraft: PE specification calls for little-endian nbt for slot data. Is it possible to add little-endian support to hematite_nbt?
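For context, the difference is only in integer byte order, which the standard library can express directly. A minimal sketch of what an endianness-parameterized reader would have to distinguish:

```rust
fn main() {
    // Java-edition NBT is big-endian; Bedrock/PE slot data is little-endian.
    // The same two bytes decode to different values depending on byte order.
    let raw = [0x34, 0x12];
    assert_eq!(i16::from_be_bytes(raw), 0x3412);
    assert_eq!(i16::from_le_bytes(raw), 0x1234);
}
```

Supporting this would likely mean threading an endianness choice (or a type parameter) through the encoder and decoder rather than hardcoding `to_be_bytes`/`from_be_bytes`.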

Improve top-level documentation

The README.md and the crate-level documentation are very sparse at the moment, and could use substantial improvement. I'd like to highlight serde support, since it will likely be a distinguishing feature of this NBT crate over others.

More information in Error

Knowing where an error in de/serialization happened (byte offset, nbt path) would be very neat.

I haven't looked into how hard this would be to implement yet.

Consolidate the serde and non-serde APIs

Among other things, this will require us to:

  • Implement Deserialize and Serialize for Blob objects.
  • Export one set of from_reader and to_writer functions depending on the feature flags.

Blob API is too terse

I can't tell if I'm missing something, but the API for nbt::Blob doesn't let you enumerate keys, or remove them, or even read its name. It's okay for writing output, but you can't use it for processing NBT data that you don't know the structure of. Is there any better way to read an NBT blob, maybe using the Value type?

Edit: managed to use Value by first reading a u8 tag type and a short-prefixed string.
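The workaround in the edit can be sketched as follows (hypothetical helper, not this crate's API), assuming the standard Java-edition layout of a named tag: one tag-id byte, a big-endian u16 name length, then that many name bytes:

```rust
use std::io::{self, Read};

// Read an NBT tag header: tag id, then a short-prefixed name.
fn read_tag_header<R: Read>(r: &mut R) -> io::Result<(u8, String)> {
    let mut tag = [0u8; 1];
    r.read_exact(&mut tag)?;
    let mut len = [0u8; 2];
    r.read_exact(&mut len)?;
    let mut name = vec![0u8; u16::from_be_bytes(len) as usize];
    r.read_exact(&mut name)?;
    Ok((tag[0], String::from_utf8_lossy(&name).into_owned()))
}

fn main() -> io::Result<()> {
    // 0x0A = TAG_Compound, name "hello" (length 5).
    let data: &[u8] = &[0x0A, 0x00, 0x05, b'h', b'e', b'l', b'l', b'o'];
    let (tag, name) = read_tag_header(&mut &data[..])?;
    assert_eq!((tag, name.as_str()), (0x0A, "hello"));
    Ok(())
}
```

After the header, the compound's body can be read as a Value, which does expose key enumeration.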

Support fixed-length arrays in #[derive(NbtFmt)], including for NBT byte/int arrays.

Fixed length arrays are a desirable feature to have for structs, including the ongoing work on getting hematite_server to support loading .mca files. Ideally, code like

#[derive(NbtFmt)]
struct WithFixedArray {
    bytes: [i8; 1024],
}

would generate a read/write implementation using a TAG_List with type TAG_Byte (as it would with a Vec<i8> currently), optionally using something like

#[derive(NbtFmt)]
struct WithFixedArray {
    #[nbt_byte_array]
    bytes: [i8; 1024],
}

to explicitly indicate that the implementation should use a TAG_ByteArray instead.

Some tags are parsed with incorrect types

I found that short, int, and long tags are not always parsed as their declared type, but as the minimal integer type that can represent the value. For example, an int 10 is parsed into a byte, while an int -2147483648 is parsed into an int. Besides, a list of ints will be parsed into an int array.

I don't understand why, but I think tags should not be parsed with incorrect types. I added a test to check how tags are parsed; this is the result:

Type of "byte" is TAG_Byte
Type of "short" is TAG_Byte, expected TAG_Short
Type of "int" is TAG_Byte, expected TAG_Int
Type of "long" is TAG_Byte, expected TAG_Long
Type of "float" is TAG_Float
Type of "double" is TAG_Float, expected TAG_Double
Type of "string" is TAG_String
Type of "byte array" is TAG_ByteArray
Type of "int array" is TAG_ByteArray, expected TAG_IntArray
Type of "long array" is TAG_ByteArray, expected TAG_LongArray
Type of "compound" is TAG_Compound
Type of "list of byte" is TAG_ByteArray, expected TAG_List
Type of "list of int" is TAG_ByteArray, expected TAG_List
Type of "list of long" is TAG_ByteArray, expected TAG_List

I'm sorry to say that 9 of the 14 tag types are parsed with an incorrect type.

This test can be found at my fork of this repo. The nbt file is at https://github.com/ToKiNoBug/hematite_nbt/blob/fix-type-parsing/tests/types.nbt.

Here is a copy of the test code:

#[test]
fn test_types() {
    let file = File::open("tests/types.nbt").unwrap();
    let nbt: nbt::Map<String, nbt::Value> = nbt::from_gzip_reader(file).unwrap();

    let type_lut = [
        ("byte", "TAG_Byte"),
        ("short", "TAG_Short"),
        ("int", "TAG_Int"),
        ("long", "TAG_Long"),
        ("float", "TAG_Float"),
        ("double", "TAG_Double"),
        ("string", "TAG_String"),
        ("byte array", "TAG_ByteArray"),
        ("int array", "TAG_IntArray"),
        ("long array", "TAG_LongArray"),
        ("compound", "TAG_Compound"),
        ("list of byte", "TAG_List"),
        ("list of int", "TAG_List"),
        ("list of long", "TAG_List"),
    ];

    let mut mismatch_counter = 0;
    for (key, expected_type) in type_lut {
        let tag: &nbt::Value = nbt.get(key).unwrap();
        if tag.tag_name() != expected_type {
            mismatch_counter += 1;
            eprintln!("Type of \"{}\" is {}, expected {}", key, tag.tag_name(), expected_type);
        } else {
            println!("Type of \"{}\" is {}", key, tag.tag_name());
        }
    }

    if mismatch_counter > 0 {
        panic!("{} types mismatched", mismatch_counter);
    }
}
