Giter Club home page Giter Club logo

stable-vec's People

Contributors

burjui avatar farhanpatwary avatar lukaskalbertodt avatar ratmice avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

stable-vec's Issues

Try out and offer different ways to store the "deleted" flag for elements

There are different ways to store whether or not an element is present or deleted:

  • (A) Vec<Option<T>>: the obvious, semantically "correct" version. Sadly, this can lead to a lot of wasted memory since Option<T> can be twice as large as T for some types (e.g. u32). However, for cases where Option<T> has the same size as T, it's pretty optimal, because we need no additional memory. Additionally, this works completely without unsafe code, so that's good.
  • (B) Vec<T> + BitVec + count: this is what this crate is currently using. In situations where Option<T> is larger than T, this is more or less the most space efficient. Iteration over the indices of existing elements is the fastest this way since that data is packed a lot closer than Vec<T>, we have a lot fewer cache misses and need to load fewer memory.

There are the following variants I can think of:

  • Combined Vec and BitVec plus count (variant of (B)): a simple optimization of the second version. The length, and probably also the capacity field of Vec and BitVec contain the same information and shouldn't be stored twice. This only improves the stack size of StablVec, but that's something.
  • Vec<T> with user specified dummy value (variant of (A)): this might be useful in a couple of situations. Option<T> can only get optimized if the dummy value of T is a binary 0.
  • compact: bool (variant of (B)): while the vector is still compact, we don't need to store any per element information. Since I expect stable vecs to be compact a lot of the time, this could be worth it.

All these variants need to be benchmarked and compared to one another. Then, it would be nice to let the user decide which one to use.

API additions

Things we probably want to add to the API:

  • a consuming iterator
  • insert_at
  • FromIterator impl
  • into_vec()
  • contains()
  • Debug impl
  • Extend impl
  • Default impl
  • retain
  • insert_into_hole(index, elem) -> Result<(), Elem>
  • io::Write impl for StableVec<u8>
  • From<Vec<T>> impl -> only possible with specialization. Instead from_vec was added.
  • Expose next_index or something like that to allow users to build iterators themselves

Things we might or might not want to add, mostly because the semantics are not clear yet:

  • truncate(new_len)
  • clear() (but what does it do exactly?)
  • first(), first_mut(), last(), last_mut() (but those would be O(n)!)
  • swap() (invalidates two indices... uhm maybe?)

Missing trait implementations

There are some traits missing for StableVec that could be implemented.

Important traits:

  • Hash
  • Ord
  • Write
  • RefUnwindSafe
  • UnwindSafe
  • PartialOrd

serdes Serialize and Deserialize could also be supported?

Convenience:

  • From<BTreeMap<usize, T>>
  • Extend<&'a T>
  • From<&'_ str>
  • From<BinaryHeap<T>>
  • From<Box<[T]>>
  • From<CString>
  • From<String>
  • From<Cow<'a, [T]>>
  • PartialEq<&'_ [B; N]>
  • PartialEq<&'_ mut [B]>
  • PartialEq<[B; N]>

`IterMut::remove_current` does not drop the value

IterMut::remove_current is broken. I don't know what my brain thought when writing this, but it's not great.

This method was used by StableVec::retain. This was changed in 0.2.2 (#20). Next, I will remove remove_current as it's unlikely to stay in this form. And changing semantics about it would require a breaking change anyway. So rather remove it now before anything bad happens.

Confusion with `push` and `insert` methods

There seem to be two ways to add an element to a StableVec:

  1. StableVec::push, which inserts an element in the next free slot? It is not entirely clear by the documentation what this is supposed to be?

  2. StableVec::insert, which inserts an element at the specified position, but does panic if the index is out of bounds (which makes it quite inconvenient, because you would have to always call StableVec::reserve_for(index) and then call StableVec::insert, to prevent panics).

I am a bit confused about the behavior of those two methods.

  • I expect a function called push to push an element to the back of the vec (like Vec::push)
  • and I do not expect insert to panic if the index is out of bounds (I see this method more like a HashMap::<usize, T>::insert)

I would suggest adding the following methods:

  • StableVec::push_back(&mut self, value: T), appends an element to the back of the StableVec, ignoring free slots
  • StableVec::pop_back(&mut self) -> Option<T>, removes the last element from a StableVec

And changing the StableVec::insert method to not panic if the capacity is exhausted and instead resize the vector.

Random notes: performance

I'm currently working on another performance-critical project that uses this crate. I'm inspecting generated assembly code and noted some possible improvements to this crate. Before I forget all of this, here is a random list of notes:

  • Using bitvec[index] is bad: because that crate uses the static TRUE/FALSE trick to in the [] operator we have a branch where there shouldn't be one. Use get instead or something like that
  • First indexing the bitvec and then the data vec does two bound checks. That's bad. Everywhere where we use has_element_at (and it returns true), we should use get_unchecked or so.
  • Setting or resetting a bit (done via & or | with a mask) is not optimized well. This should (probably be done) with the BTS and BTR instructions on x86. Have too look into this.

(This issue is rather a reminder to myself...)

Rename `exists()`

This is a bad name. It should be something like contains_index() or something like that...

Layout experiments

There are some things we could be experimenting with. They might lead to a performance improvement, but this needs to be checked.

  • A Vec<T> core with user-specified nil-value: the OptionCore is useful when T is NonNull, i.e. the 0-bit pattern is never used. But sometimes other bit patterns are not used, which could be used to indicate an empty slot.

  • compact: bool: store a boolean that is true if the stable vec is still compact. In this case, we don't have to store any additional information for slots (i.e. not bit vec). This could be useful in situation where no element is removed as this saves memory and could be faster. But it might also be slower in general because it adds complexity.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.