
decompressor-prototype's People

Contributors

flagxor, karlschimpf


decompressor-prototype's Issues

~std::vector causing runtime crashes?

For reasons I don't understand, I am getting runtime crashes (free of a bad pointer) when the std::vector destructor is called.

It occurs in vectors in the SymbolTable and CounterWriter classes, and it is very consistent.

Before changing the std::vector fields in these classes to (heap-allocated) pointers that are never deleted, the code consistently died on:

build/debug/bin/compress-int -i test/test-sources/toy.wasm -l 4

(any number greater than 3 for -l did this).
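
For illustration, here is a minimal sketch of the workaround described above, using hypothetical class and field names (not the actual SymbolTable/CounterWriter members): the by-value std::vector member becomes a heap-allocated pointer that is intentionally never deleted, so the vector's destructor never runs for it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the affected classes: the by-value std::vector
// member is replaced by a heap-allocated vector that is intentionally never
// deleted, so ~std::vector never runs for it.
class PatternCounts {
public:
  PatternCounts() : Counts(new std::vector<size_t>()) {}
  // No delete here: leaking the vector sidesteps the "free of a bad
  // pointer" crash observed when the destructor ran.
  ~PatternCounts() {}
  std::vector<size_t> *Counts;
};
```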

Reading Stack Height?

Hello, I'm experimenting with adding metering to wasm. Currently it is done through an AST transform. It is pretty simple: it reads the AST, counts the number of nodes for each branch, and appends a metering statement at each leaf (more here).

So, two questions:

  1. Would it even make sense to inject the metering statements using a filter? Is this too far out of scope?
  2. If so, how would you do it? I think it is currently impossible: you need to count the number of AST nodes, and I think you would have to be able to read the stack height.
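
For concreteness, here is a rough sketch of the AST transform described above, written in C++ with a purely hypothetical Node type and cost model (the actual metering tool is a separate project): each branch's nodes are counted and a metering statement charging that count is appended at each leaf.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical AST node; not the actual wasm or filter AST.
struct Node {
  bool isLeaf() const { return Kids.empty(); }
  std::vector<std::unique_ptr<Node>> Kids;
};

// Placeholder for building a "charge Cost units" metering statement.
std::unique_ptr<Node> makeMeterStatement(size_t Cost) {
  (void)Cost;
  return std::make_unique<Node>();
}

size_t countNodes(const Node *N) {
  size_t Count = 1;
  for (const auto &Kid : N->Kids)
    Count += countNodes(Kid.get());
  return Count;
}

void appendMeterAtLeaves(Node *N, size_t Cost) {
  if (N->isLeaf()) {
    N->Kids.push_back(makeMeterStatement(Cost));
    return;
  }
  for (auto &Kid : N->Kids)
    appendMeterAtLeaves(Kid.get(), Cost);
}

// For each branch: count its nodes, then append a metering statement
// (charging that count) at each leaf of the branch.
void meterBranch(Node *Branch) {
  appendMeterAtLeaves(Branch, countNodes(Branch));
}
```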

Different forms of define arguments.

Currently, all arguments to a define are "pass by expression", matching a structured form of a static macro call. To add more power, we would really like to allow many different forms of parameter passing (done at the Eval node). The proposed new parameter forms are:

(params) - The define gets no arguments.

(params n) - The define gets n values. These values are evaluated before they are passed to the define.

(exprs n) - The define gets n expressions that are only evaluated on demand, within the define.

(exprs.cached n) - The define gets n expressions, but the data for expressions are cached at the point of the call, rather than on demand. When reading, this allows the same data to be read multiple times, effectively allowing the insertion of multiple copies.

(cached) - The data for the macro (and not its arguments) is cached, based on the single argument value passed in.

(args E1 ... EN) - A mixed list of arguments E1 ... EN, each of which can be one of the previous argument specifiers.
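
For illustration only, here is a minimal sketch of how these proposed forms might be modeled when evaluating a call; the names and layout are hypothetical and do not reflect the existing Eval implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of the proposed parameter forms; the real Eval node
// and define representation would differ.
enum class ParamForm {
  None,        // (params)         - the define gets no arguments
  Values,      // (params n)       - n values, evaluated at the call site
  Exprs,       // (exprs n)        - n expressions, evaluated on demand
  ExprsCached, // (exprs.cached n) - n expressions, data cached at the call
  Cached,      // (cached)         - the macro's data is cached, keyed by
               //                    the single argument value passed in
  Args         // (args E1 ... EN) - a mixed list of the above specifiers
};

struct ParamSpec {
  ParamForm Form = ParamForm::None;
  size_t Count = 0;             // n, when the form takes a count
  std::vector<ParamSpec> Mixed; // element specifiers, for ParamForm::Args
};
```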

The integer compressor should compress multiple integer sequences first

The current compressor is somewhat complex because it treats singleton integer patterns the same way as multiple integer sequence patterns.

The problem is that singletons yield considerably less savings because they are only replaced by an abbreviation value. Hence, a singleton may "shrink" the width (slightly), but it doesn't remove values from the stream (as multiple integer sequence patterns do).

We should schedule multiple integer sequences first, and then choose which of the remaining singletons should be converted to a pattern. This does two things:

  1. It allows us to still encode single integers using abbreviations (sized based on frequency of use), and
     the assignment of Huffman encoding values can be merged with that of the multiple integer sequences.
  2. It simplifies the selection of multiple integer sequence patterns.
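
As a rough sketch of this two-stage selection (the types and the selection criterion are illustrative, not the current compressor's data structures):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical candidate pattern: an integer sequence and how often it
// occurs in the stream.
struct Pattern {
  std::vector<uint64_t> Values;
  size_t UseCount;
};

// Stage 1: take every multiple-integer sequence pattern (these actually
// remove values from the stream). Stage 2: of the remaining singletons,
// keep only those used often enough to be worth an abbreviation; their
// abbreviation/Huffman sizes can then be assigned together with the
// sequence patterns.
std::vector<Pattern> selectPatterns(const std::vector<Pattern> &Candidates,
                                    size_t SingletonUseThreshold) {
  std::vector<Pattern> Selected;
  for (const auto &P : Candidates)
    if (P.Values.size() > 1)
      Selected.push_back(P);
  for (const auto &P : Candidates)
    if (P.Values.size() == 1 && P.UseCount >= SingletonUseThreshold)
      Selected.push_back(P);
  return Selected;
}
```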

Need cleaner granularity on Symbol tables and headers

To generalize the decompressor reader, we need multiple copies of default (algorithm) symbol tables. Currently, this isn't possible.

Either we should add a "copy" symbol table method, or modify the "root" install methods to allow us to update the default copy.

Another approach is to realize that, for the decompressor, we only need the header built so that we can choose the algorithm to apply. Once the algorithm is known, we can load it on demand as needed. This solution removes the need for copying the default algorithm, since it will be loaded (on demand) for as many cases as it is needed.
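
A minimal sketch of the second (on-demand) approach, with hypothetical names; it assumes the header yields an algorithm name and says nothing about how the existing install methods would change.

```cpp
#include <map>
#include <memory>
#include <string>

class SymbolTable; // the algorithm symbol table from the existing code

// Hypothetical on-demand cache: read just enough of the header to name the
// algorithm, then build (or reuse) that algorithm's symbol table lazily.
// Because every use gets its table from here, no copy of the default table
// is needed.
class AlgorithmCache {
public:
  std::shared_ptr<SymbolTable> get(const std::string &AlgorithmName) {
    auto &Slot = Cache[AlgorithmName];
    if (!Slot)
      Slot = loadAlgorithm(AlgorithmName); // hypothetical loader
    return Slot;
  }

private:
  // Builds the symbol table for the named algorithm (left undefined here).
  std::shared_ptr<SymbolTable> loadAlgorithm(const std::string &AlgorithmName);
  std::map<std::string, std::shared_ptr<SymbolTable>> Cache;
};
```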

Simplifying Queue and Pipe data structures.

The original intent of the Queue data structure was to define a data structure that allowed simultaneous reading and writing.

The "Queue.FirstPage" field was used to automatically clean up when streaming. However, because there was only one first page for both reading and writing, the Pipe concept could not be implemented.

A better solution is to add a "read" first page and a "write" first page. Each of these (shared) pointers keeps track of the portion of the queue that is being used for reading or writing. Either page pointer may be null if the corresponding concept is not being used.

If there is both a "read" and a "write" first page, the "read" page should always be behind the "write" first page, unless "eof" has been frozen (in which case the remainder of the input can be read).

In addition, advancement of read cursors should be limited to pages that appear before the first write page (this only applies if the first write page is non-null). This guarantees that the input being read is no longer changing.
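
A sketch of the proposed split, with hypothetical field and method names, showing the two first-page pointers and the read-cursor limit described above:

```cpp
#include <cstddef>
#include <memory>

struct Page {
  size_t Index; // position of this page within the queue
  std::shared_ptr<Page> Next;
};

// Hypothetical simplification of Queue: separate "read" and "write" first
// pages, each tracking the portion of the queue its side still needs.
// Either pointer may be null when that side is not in use.
class Queue {
public:
  // A read cursor may only advance onto pages strictly before the first
  // write page (when one exists), so readers never see data that is still
  // being written. Once eof is frozen, the remainder may be read.
  bool canReadPage(const Page &P) const {
    if (EofFrozen || !FirstWritePage)
      return true;
    return P.Index < FirstWritePage->Index;
  }

private:
  std::shared_ptr<Page> FirstReadPage;
  std::shared_ptr<Page> FirstWritePage;
  bool EofFrozen = false;
};
```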
