whtsky / bencoder.pyx

A fast bencode implementation in Cython

License: BSD 3-Clause "New" or "Revised" License

Python 64.23% Cython 35.77%

cython bencode bencoder bencoding bittorrent

bencoder.pyx's Issues

Can't bencode torrent file

Hi,
I have a torrent file decoded, and now I need to bencode it again after having modified a few things. However, I get an error:

Traceback (most recent call last):
  File "test.py", line 61, in <module>
    print (bencode(data['info']))
  File "bencoder.pyx", line 159, in bencoder.bencode (bencoder.c:4217)
  File "bencoder.pyx", line 100, in bencoder.encode (bencoder.c:2787)
  File "bencoder.pyx", line 141, in bencoder.encode_dict (bencoder.c:4084)
  File "bencoder.pyx", line 100, in bencoder.encode (bencoder.c:2787)
  File "bencoder.pyx", line 109, in bencoder.encode_int (bencoder.c:3015)
OverflowError: value too large to convert to int

Here is a pprint of the data I'm trying to encode:
https://gist.github.com/vinz243/9071201b4057f6f838e97936595572fd

Thanks in advance
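
The traceback ends in encode_int with "value too large to convert to int", which suggests the value was being converted to a fixed-width C integer. Bencode itself places no size limit on integers, so a pure-Python encoder can format them directly. A minimal sketch of that idea follows; `encode_int` here is a hypothetical stand-in written for illustration, not the library's own function.

```python
def encode_int(value: int) -> bytes:
    """Bencode an integer as b'i<digits>e' without a fixed-width limit."""
    # bool is a subclass of int in Python; reject it to avoid encoding True as 1.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("expected int, got %s" % type(value).__name__)
    # Python ints are arbitrary precision, so values well beyond the
    # 32-bit or 64-bit range are formatted without overflow.
    return b"i" + str(value).encode("ascii") + b"e"


# Example: a 3 GiB file length, which does not fit in a signed 32-bit int.
print(encode_int(3 * 1024 ** 3))
```

If the failing value in the gist is a large file length or timestamp, this is the kind of input that would trigger the overflow in a C-int code path.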

Missing build for Python 3.7

Not 100% sure, but at first glance it looks like you didn't build for Python 3.7.

I'm running macOS 10.13.2:

(tester--unpHGAN) ➜  tester pip install bencoder.pyx==1.2.1
Collecting bencoder.pyx==1.2.1
  Could not find a version that satisfies the requirement bencoder.pyx==1.2.1 (from versions: 1.0.0, 1.1.0, 1.1.1, 1.1.2, 1.1.3)
No matching distribution found for bencoder.pyx==1.2.1

Strings decoded are left as byte strings

I am uncertain if this is a bug or not, but I noticed that when decoding bencoded data, any decoded strings are left as byte strings even though they're not really binary data, but actual human-readable strings.

e.g.

bdecode(b'8:a string')
# b'a string'

If you wanted to use this as a string, you would also have to run it through decode('utf-8').

From what I understand, the bencode standard does not make a distinction between strings and binary data. Would it be a good idea to try to decode any byte strings to regular strings and, if that fails, leave them as-is on the assumption that they are actually binary data? I just know that having to append decode('utf-8') to every decoded dict key gets very repetitive.
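
The "try to decode, otherwise leave as bytes" idea can be applied after decoding, without changing the library. The sketch below is one possible post-processing helper; the name `decode_text` and its `encoding` parameter are assumptions made for illustration.

```python
def decode_text(obj, encoding="utf-8"):
    """Recursively turn bytes into str wherever they decode cleanly."""
    if isinstance(obj, bytes):
        try:
            return obj.decode(encoding)
        except UnicodeDecodeError:
            return obj  # not valid text in this encoding: keep the raw bytes
    if isinstance(obj, list):
        return [decode_text(item, encoding) for item in obj]
    if isinstance(obj, dict):
        # Keys and values are both processed.
        return {
            decode_text(key, encoding): decode_text(value, encoding)
            for key, value in obj.items()
        }
    return obj  # ints and anything else pass through unchanged


decoded = {b"name": b"a string", b"pieces": b"\xff\xfe\x00", b"length": 8}
print(decode_text(decoded))
```

One caveat of this heuristic: binary data that happens to be valid UTF-8 is converted as well, so the result cannot always be re-encoded to the original bytes. That risk may be a reason to keep such a conversion opt-in.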

Missing PyPI wheel for Python 3.5 Windows x64

bdecode variant without erroring on long string?

Doing some hacking on @boramalper's magnetico.

It relies on data present at the end of a bencoded bytestring. In order to use bencoder.pyx I had to patch out https://github.com/whtsky/bencoder.pyx/blob/master/bencoder.pyx#L96 and return both r and l.

See https://github.com/boramalper/magnetico/blob/master/magneticod/magneticod/bencode.py#L53 for the function for which I needed to add a replacement.

Would this project accept a PR adding a bdecode2 returning both r and l, and not erroring on excess data?

Thanks!
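
For illustration, the requested behaviour (return the decoded value together with the number of bytes consumed, and tolerate trailing data) can be sketched in pure Python. The name `bdecode_prefix` is hypothetical, and this is not the library's implementation or the proposed bdecode2.

```python
def bdecode_prefix(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos.

    Returns (value, end), where end is the index just past the value.
    Bytes after end are ignored rather than treated as an error.
    """
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)  # raises ValueError if unterminated
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        items, pos = [], pos + 1
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode_prefix(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        result, pos = {}, pos + 1
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode_prefix(data, pos)
            value, pos = bdecode_prefix(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        start = colon + 1
        end = start + int(data[pos:colon])
        if end > len(data):
            raise ValueError("string runs past end of data")
        return data[start:end], end
    raise ValueError("invalid bencode at offset %d" % pos)


value, consumed = bdecode_prefix(b"d3:fooi1eeEXTRA")
print(value, consumed)
```

The caller can then slice the input at the returned offset to reach whatever follows the bencoded value.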

Encoding slow?

Hi,

I've discovered that my pure Python implementation of bencode encode is faster than your Cython one. Depending on what is being encoded it can range from a few percent faster to several orders of magnitude faster. Not sure why that is the case, I would expect the Cython version to be faster in all cases.

Here's my bencode encode function.

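from operator import itemgetter


class EncodeError(Exception):
    """Raised when an object cannot be bencoded."""


def encode(obj):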
    """
    Encode data in to bencode, return bytes.

    The following objects may be encoded: int, bytes, list, dicts.

    Dict keys must be bytes, and unicode strings will be encoded in to
    utf-8.

    """
    binary = []
    append = binary.append

    def add_encode(obj):
        """Encode an object, appending bytes to `binary` list."""
        if isinstance(obj, bytes):
            append(b'%i:%b' % (len(obj), obj))
        elif isinstance(obj, memoryview):
            append(b'%i:%b' % (len(obj), obj.tobytes()))
        elif isinstance(obj, str):
            obj_bytes = obj.encode('utf-8')
            append(b"%i:%b" % (len(obj_bytes), obj_bytes))
        elif isinstance(obj, int):
            append(b"i%ie" % obj)
        elif isinstance(obj, (list, tuple)):
            append(b"l")
            for item in obj:
                add_encode(item)
            append(b'e')
        elif isinstance(obj, dict):
            append(b'd')
            try:
                for key, value in sorted(obj.items(), key=itemgetter(0)):
                    append(b"%i:%b" % (len(key), key))
                    add_encode(value)
            except TypeError:
                raise EncodeError('dict keys must be bytes')
            append(b'e')
        else:
            raise EncodeError(
                'value {!r} can not be encoded in Bencode'.format(obj)
            )
    add_encode(obj)
    return b''.join(binary)
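
A speed comparison like this is easier to reproduce with a shared timing harness. The sketch below uses the standard-library timeit module; `seconds_per_call` and `tiny_encode` are hypothetical names, and `tiny_encode` exists only so the example runs on its own. In practice one would pass the encode function above and the library's bencode with the same payload.

```python
import timeit


def seconds_per_call(func, payload, number=1000):
    """Return the best observed average time per call of func(payload)."""
    timer = timeit.Timer(lambda: func(payload))
    # repeat() runs the measurement several times; the minimum is the
    # least affected by other activity on the machine.
    return min(timer.repeat(repeat=5, number=number)) / number


def tiny_encode(obj):
    """Stand-in encoder used only to make this sketch runnable."""
    return b"i%de" % obj


print(seconds_per_call(tiny_encode, 42))
```

Results depend heavily on the payload shape (many small integers versus a few large byte strings), so it helps to state which payload a given timing refers to.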

Python 3.11 compatibility, #include "longintrepr.h" is invalid

https://docs.python.org/3.11/whatsnew/3.11.html documents that longintrepr.h can no longer be included, but just patching this out is insufficient to fix compilation failures.

bencoder.pyx fails to build on 3.11 due to this exact reason. Can this be fixed somehow?

> [builder 3/3] RUN pip wheel --no-cache --no-deps  bencoder.pyx:                                                        
#0 24.14   Building wheel for bencoder.pyx (pyproject.toml): started
#0 24.50   Building wheel for bencoder.pyx (pyproject.toml): finished with status 'error'
#0 24.51   error: subprocess-exited-with-error
#0 24.51   
#0 24.51   × Building wheel for bencoder.pyx (pyproject.toml) did not run successfully.
#0 24.51   │ exit code: 1
#0 24.51   ╰─> [14 lines of output]
#0 24.51       Compiling bencoder.pyx because it changed.
#0 24.51       [1/1] Cythonizing bencoder.pyx
#0 24.51       running bdist_wheel
#0 24.51       running build
#0 24.51       running build_ext
#0 24.51       building 'bencoder' extension
#0 24.51       creating build
#0 24.51       creating build/temp.linux-x86_64-cpython-311
#0 24.51       gcc -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -DTHREAD_STACK_SIZE=0x100000 -fPIC -I/usr/local/include/python3.11 -c bencoder.c -o build/temp.linux-x86_64-cpython-311/bencoder.o -O3
#0 24.51       bencoder.c:211:12: fatal error: longintrepr.h: No such file or directory
#0 24.51         211 |   #include "longintrepr.h"
#0 24.51             |            ^~~~~~~~~~~~~~~
#0 24.51       compilation terminated.
#0 24.51       error: command '/usr/bin/gcc' failed with exit code 1
#0 24.51       [end of output]
#0 24.51   
#0 24.51   note: This error originates from a subprocess, and is likely not a problem with pip.
#0 24.51   ERROR: Failed building wheel for bencoder.pyx

Source distribution for 2.0.1 is missing

Source distribution for 2.0.1 is missing from PyPI. There are only wheels published. Previous versions had both wheels and source distributions.

This means that if you try to install the package on a platform where wheels are not available, such as a Raspberry Pi (ARM), it will most likely fail because pip won't find a supported distribution.

Broken compilation on Windows (1.1.2)

This is caused by a stray line in bencoder.pyx.egg-info/SOURCES.TXT in the PyPI package:

/Users/whtsky/Documents/codes/bencoder.pyx/bencoder.c

Removing this allows the build to complete successfully.
