Comments (3)
Found this workaround:
 with libarchive.file_writer('bytes.tar', 'pax') as ar:
     content = b'bytes'
-    ar.add_file_from_memory('bytes.bin', len(content), content)
+    ar.add_file_from_memory('bytes.bin', len(content), [content])
Or, why not do exactly the same inside the function to fix the problem (untested):
--- a/libarchive/write.py
+++ b/libarchive/write.py
@@ -99,6 +99,10 @@ class ArchiveWrite(object):
             entry_set_perm(archive_entry_pointer, permission)
             write_header(archive_pointer, archive_entry_pointer)
+            # Make bytestrings work #68
+            if isinstance(entry_data, bytes):
+                entry_data = [entry_data]
+
             for chunk in entry_data:
                 if not chunk:
                     break
from python-libarchive-c.
AFAICT your "workaround" is in fact the one and only correct way to use this method.
The documentation is weak; it could probably read: "entry_data: binary content of entry as an iterable yielding bytes or bytearray objects."
As for the test case, it is badly broken and only "works" by accident, and it clearly sets a bad example.
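To illustrate why wrapping the bytestring in a list matters, here is a minimal Python 3 sketch. It does not call libarchive at all; it only shows what a `for chunk in entry_data` loop would see. The `iter_chunks` helper and its `chunk_size` parameter are hypothetical names made up for this example, not part of python-libarchive-c.

```python
def iter_chunks(data, chunk_size=4):
    """Yield successive bytes slices of `data` (hypothetical helper)."""
    view = memoryview(data)
    for start in range(0, len(view), chunk_size):
        yield bytes(view[start:start + chunk_size])


content = b'bytes'

# On Python 3, iterating over a bytes object yields integers, not bytes,
# so a loop over the bare bytestring never sees a writable chunk.
bare = list(content)

# Wrapping the bytestring in a list yields the whole payload as one chunk.
wrapped = list([content])

# Any iterable of bytes objects has the same shape, e.g. a generator
# that slices a larger payload into pieces.
sliced = list(iter_chunks(content, chunk_size=2))

print(bare)
print(wrapped)
print(sliced)
```

Under that reading, `[content]` and a generator such as `iter_chunks(content)` are both valid values for `entry_data`, while the bare `content` is not.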
I found out the hard way that if you feed a unicode string to entry_data (e.g., your_data being passed as [your_data]) you will get VERY strange output. Specifically, it'll look like doubly-encoded UTF-16: "A" (0x41 in ASCII) is 0x0041 in Unicode, and then it appears to get re-encoded as 0x00000041.
So, if your_data is unicode, .encode() it first. In Python 3 you can just check if it's an instance of str.
This also pops up if you are using the unicode_literals import on Python 2 and strings are involved.
But what caused me the most trouble is that, regardless of the import above, json.dumps() in Python 2.7 can return either a non-unicode string or a unicode one, depending on the options. In Python 3, json.dumps() returns str, and you'll have the same problem if you don't encode() it to bytes.
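A minimal Python 3 sketch of the "encode it first" advice, using json.dumps() output as the payload. The `to_entry_data` helper is a hypothetical name for this example, and UTF-8 is an assumed encoding. The libarchive call is left as a comment so the block runs without the library installed.

```python
import json


def to_entry_data(payload, encoding='utf-8'):
    """Return `payload` as a one-element list of bytes (hypothetical helper).

    str input is encoded first; bytes input is passed through unchanged.
    """
    if isinstance(payload, str):
        payload = payload.encode(encoding)
    return [payload]


# json.dumps() returns str on Python 3, so it has to be encoded.
text = json.dumps({'name': 'é'}, ensure_ascii=False)
entry_data = to_entry_data(text)

# Take the size from the encoded bytes, not from the str: the two differ
# as soon as the text contains non-ASCII characters.
size = len(entry_data[0])

# Assumed usage, mirroring the call shown earlier in this thread:
# ar.add_file_from_memory('data.json', size, entry_data)
print(len(text), size)
```

Computing the size after encoding also keeps the length argument consistent with the bytes that actually get written.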
The more I think about it, the more I wonder if this is simply a bug. For Python 3 at least, entry_data should ONLY be a list of bytes objects. I'm trying to think of why you'd want to let the library try to encode non-bytes data, given that it will fail badly in the effort and then blithely pass that broken data to the system libarchive.
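One way to read that suggestion is a strict check that rejects anything other than an iterable of bytes-like chunks, instead of trying to convert it. This is only a sketch of the idea, not code from python-libarchive-c; `checked_chunks` is a hypothetical name.

```python
def checked_chunks(entry_data):
    """Validate `entry_data` and return an iterator over its chunks.

    Hypothetical helper: raises TypeError for a bare str/bytes/bytearray
    and for any chunk that is not bytes or bytearray.
    """
    # A bare string or bytestring is almost certainly a caller mistake,
    # so reject it immediately rather than iterating over it.
    if isinstance(entry_data, (str, bytes, bytearray)):
        raise TypeError(
            'entry_data must be an iterable of bytes, not a bare %s'
            % type(entry_data).__name__
        )

    def generate():
        for chunk in entry_data:
            if not isinstance(chunk, (bytes, bytearray)):
                raise TypeError(
                    'entry_data chunks must be bytes or bytearray, got %s'
                    % type(chunk).__name__
                )
            yield chunk

    return generate()


chunks = list(checked_chunks([b'by', bytearray(b'tes')]))
print(chunks)
```

With a check like this, a unicode string fails with a TypeError instead of producing a silently corrupted archive.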