Giter Club home page Giter Club logo

protobuf's Introduction

protobuf

My own implementation of Google's Protocol Buffers.

Changes in v0.3.1

  • encoding module became protobuf module.
  • Performance tests.
  • Bool.dump 2.2 times faster.
  • Varint 14% faster.
  • add_field chaining.
  • __hash__ 17% faster.

Changes in v0.3

  • README techniques added.
  • Hashes of message types.
  • Fixed: loading of missing required field doesn't raise ValueError.
  • Message load doesn't use StringIO for reading embedded messages and packed repeated fields anymore.
  • TypeMetadata! (read below)
  • Removed MarshalableCode (it's not protobuf's business).
  • Fixed: reading of Int32 values raises TypeError: 'str' object is not callable

Changes in v0.2

  • Fixed Int32 type name (was Int32Type).
  • Added validation of message type.
  • Unicode type.
  • Python code object type.
  • Fixed casting values to bool and from bool.

Using

Fow now, there is full protobuf encoding implementation, so you can use the encoding module with full compatibility with the standard implementation.

The encoding module is covered with tests, but you should understand that there are may be some unknown bugs. Thus, use this software at your own risk.

Do from encoding import * and you're ready to go.

Note: all names of message types are similar to described there. ;-)

Sample 1. Introduction

Assume you have the following definition:

message Test2 {
  string b = 2;
}

First, you should create the message type:

Test2 = MessageType()
Test2.add_field(2, 'b', String)

Then, create a message and fill it with the appropriate data:

msg = Test2()
msg.b = 'testing'

You can dump this now!

print msg.dumps() # This will dump into a string.
msg.dump(open('/tmp/message', 'wb')) # And this will dump into any write-like object.

You also can load this message with:

msg = Test2.load(open('/tmp/message', 'rb'))

or with:

msg = load(open('/tmp/message', 'rb'), Test2)

Simple enough. :)

Sample 2. Required field

To add a missing field you should pass an additional flags parameter to add_field like this:

Test2 = MessageType()
Test2.add_field(2, 'b', String, flags=Flags.REQUIRED)

If you'll not fill a required field, then ValueError will be raised during serialization.

Sample 3. Repeated field

Do like this:

Test2 = MessageType()
Test2.add_field(1, 'b', UVarint, flags=Flags.REPEATED)
msg = Test2()
msg.b = (1, 2, 3)

A value of repeated field can be any iterable object. The loaded value will always be list.

Sample 4. Packed repeated field

Test4 = MessageType()
Test4.add_field(4, 'd', UVarint, flags=Flags.PACKED_REPEATED)
msg = Test4()
msg.d = (3, 270, 86942)

Sample 5. Embedded messages

Consider the following definitions:

message Test1 {
  int32 a = 1;
}

and

message Test3 {
  required Test1 c = 3;
}

To create an embedded field, pass EmbeddedMessage as the type of field and fill it like this:

# Create the type.
Test1 = MessageType()
Test1.add_field(1, 'a', UVarint)
Test3 = MessageType()
Test3.add_field(3, 'c', EmbeddedMessage(Test1))
# Fill the message.
msg = Test3()
msg.c = Test1()
msg.c.a = 150

Data types

There are the following data types supported for now:

UVarint             # Unsigned integer.
Varint              # Signed integer.
Bool                # Boolean.
Fixed64             # 8-byte string.
UInt64              # C++'s 64-bit `unsigned long long`
Int64               # C++'s 64-bit `long long`
Float64             # C++'s `double`.
Fixed32             # 4-byte string.
UInt32              # C++'s 32-bit `unsigned int`.
Int32               # C++'s 32-bit `int`.
Float32             # C++'s `float`.
Bytes               # Pure bytes string.
Unicode             # Unicode string.
TypeMetadata        # Type that describes another type.

Some techniques

Streaming messages

The Protocol Buffer format is not self delimiting. But you can wrap you message type in EmbeddedMessage class and write/read it sequentially.

The other option is to use protobuf.EofWrapper that has a limit parameter in its constructor. The EofWrapper raises EOFError when the specified number of bytes is read.

Self-describing messages and TypeMetadata

There is no any description of the message type in a message itself. Therefore, if you want to send a self-described messages, you should send the a description of the message too.

I've implemented a tool for this... Look:

A, B, C = MessageType(), MessageType(), MessageType()
A.add_field(1, 'a', UVarint)
A.add_field(2, 'b', TypeMetadata, flags=Flags.REPEATED)     # <- Look here!
A.add_field(3, 'c', Bytes)
B.add_field(4, 'ololo', Float32)
B.add_field(5, 'c', TypeMetadata, flags=Flags.REPEATED)     # <- And here!
B.add_field(6, 'd', Bool, flags=Flags.PACKED_REPEATED)
C.add_field(7, 'ghjhdf', UVarint)
msg = A()
msg.a = 1
msg.b = [B, C]                                              # Assigning of types.
msg.c = 'ololo'
bytes = msg.dumps()
...
msg = A.loads(bytes)
msg2 = msg.b[0]()                                           # Creating a message of the loaded type.

You can send your bytes anywhere and you'll got your message type on the other side!

add_field chaining

add_field return the message type itself, thus you can do so:

MessageType().add_field(1, 'a', EmbeddedMessage(MessageType().add_field(1, 'a', UVarint)))

More info

See protobuf to see the API and run-tests modules to see more usage samples.

protobuf's People

Contributors

eigenein avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.