devsnd / tinytag

Python library for reading audio file metadata, duration of MP3, OGG, OPUS, MP4, M4A, FLAC, WMA, Wave, AIFF and a few more

License: MIT License

Python 100.00%

mp3 flac audio wav music ogg mp4 opus m4a wma

tinytag's Issues

ZeroDivision in Flac._determine_duration

I have a flac file that raises a ZeroDivisionError when its duration is parsed:

>>> from tinytag import TinyTag
>>> TinyTag.get('barcelona.flac', duration=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/til/git/hub/devsnd/tinytag/tinytag/tinytag.py", line 78, in get
    tag.load(tags=tags, duration=duration)
  File "/home/til/git/hub/devsnd/tinytag/tinytag/tinytag.py", line 434, in load
    self._determine_duration(self._filehandler)
  File "/home/til/git/hub/devsnd/tinytag/tinytag/tinytag.py", line 477, in _determine_duration
    self.bitrate = self.filesize/self.duration*8/1024
ZeroDivisionError: float division by zero

To recreate: I created the flac by transcoding it from mp3 with python-audiotranscode. The mp3 has a duration of about 2:45*; the flac file plays just fine (and completely), except the duration remains unknown in the player I've tried (audacious).

$ flac --version
flac 1.3.0
$ ffmpeg -version
ffmpeg version 2.2.2
[...]

As a fallback for when the duration remains unknown, I'd set duration and bitrate to None: it makes for fast failing calculations and is a better match for missing/unknown values than 0, NaN or Infinity. (The latter are proper float values and could be mistaken for the actual value, in which case they'd most probably be wrong.) Except I don't know if None values work well with what the rest of tinytag does, so I'm leaving the fix up to you. :)

* Might be interesting: both mp3 and flac stop playing after 2:45. Audacious shows a duration of 2:46 for the mp3, while tinytag computes a different mp3 duration of 172.31..., which is ~2:52.
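The None fallback suggested above could look roughly like the sketch below. `bitrate_kbps` is a hypothetical helper, not part of tinytag; it mirrors the `filesize / duration * 8 / 1024` expression from the traceback and returns None when the duration is missing or zero.

```python
def bitrate_kbps(filesize_bytes, duration_seconds):
    """Return the average bitrate, or None when the duration is unknown.

    Hypothetical helper: a falsy duration (None or 0) is treated as unknown,
    so callers fail fast on None instead of dividing by zero.
    """
    if not duration_seconds:
        return None
    return filesize_bytes / duration_seconds * 8 / 1024
```

Callers would then have to check for None before doing arithmetic on the result, which is the trade-off mentioned in the report.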

tinytag cannot read webm file metadata

I cannot read metadata when the file is in webm format.

Flac file with an ID3 header meets "Invalid flac header" Exception

test.zip

Parsing fails and raises an exception at this line.

test file is above.

Can't get cover art from mp3 files with ID3 v2.2

Hi,

I can't get the cover art of a few audio files.
The get_image() method does not detect it. I can provide the file by e-mail.
Otherwise I want to thank you for this library, very efficient and convenient. (I have made some benchmarks tests against others and this one is very fast)

Email sent to tomwallroth ...

README.md has a tiny spelling mistake

hi, when I read the list of possible attributes that I can get with TinyTag, I find a tiny spelling mistake. The detailed info is as follows:

tag.title # title of the sonf

I think the sonf should be song : )

Use correct character encoding when reading ID3 tags

ID3 tags are not always decoded correctly, see CherryMusic issue #536.

Since commit 1c53058, latin1 is used for decoding instead of ascii. If possible, tinytag should make an effort to determine the proper encoding to use, before falling back to a default. Bonus points for making the fallback encoding a parameter. Besides latin1, at least UTF-8 should be supported.
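One way to read the request above is "try UTF-8 first, then fall back to a configurable encoding". The sketch below assumes that interpretation; `decode_tag_text` and its `fallback_encoding` parameter are hypothetical names, not existing tinytag options.

```python
def decode_tag_text(raw, fallback_encoding='latin-1'):
    """Decode tag bytes as UTF-8, falling back to a caller-chosen encoding.

    UTF-8 is tried first because invalid UTF-8 is detected reliably, while
    latin-1 accepts any byte sequence and so only works as a last resort.
    """
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        return raw.decode(fallback_encoding)
```

A caller with GBK-tagged files could pass `fallback_encoding='gbk'`; with the default, the behaviour for non-UTF-8 input matches the current latin1 decoding.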

Example for saving album art to a local file

Is there an example I could use for saving album art to a local file? I'm doing this:

tag = TinyTag.get(filePath, image=True)
image_data = tag.get_image()

with open(filePath + '.jpg', "wb") as f:
    f.write(image_data)

But then I get a jpg file that can't be opened.
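When a written cover file will not open, it can help to check what the bytes actually are before choosing an extension. The sketch below is a hypothetical helper (not a tinytag API) that inspects the well-known JPEG and PNG signatures. Embedded art is not always JPEG, and data matching neither signature suggests the bytes were not extracted cleanly.

```python
JPEG_MAGIC = b'\xff\xd8\xff'
PNG_MAGIC = b'\x89PNG\r\n\x1a\n'


def guess_image_extension(image_data):
    """Return '.jpg' or '.png' based on the leading bytes, else None."""
    if image_data.startswith(JPEG_MAGIC):
        return '.jpg'
    if image_data.startswith(PNG_MAGIC):
        return '.png'
    return None  # unknown format, or the data does not start at the image
```

If this returns None for the bytes returned by `get_image()`, printing the first few bytes usually shows whether extra header data precedes the image.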

tinytag unable to process valid mp3

I was trying to use tinytag.TinyTag.get(filename), but it errored and said
tinytag.tinytag.TinyTagException: mp3 parsing failed.
Now I get that this means the mp3 is invalid, but the weird thing is:
It works with ffplay, VLC and WMP (and probably all music playing software)

File was downloaded using youtube_dl, not sure if that matters.

Traceback (most recent call last):
  File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 311, in _determine_duration
    frame_bitrate = ID3.bitrate_by_version_by_layer[mpeg_id][layer_id][br_id]
TypeError: 'NoneType' object is not subscriptable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 94, in get
    tag.load(tags=tags, duration=duration, image=image)
  File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 115, in load
    self._determine_duration(self._filehandler)
  File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 313, in _determine_duration
    raise TinyTagException('mp3 parsing failed')
tinytag.tinytag.TinyTagException: mp3 parsing failed

Flac handler can't ignore ID3 header

Some FLAC files also have an ID3 header. It can be ignored by skipping the ID3 header when this line fails.
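Skipping a leading ID3v2 tag is mechanical because its 10-byte header carries the tag size as a 4-byte synchsafe integer (7 usable bits per byte). The following is a minimal sketch under that assumption; `skip_id3v2_header` is a hypothetical name and not the fix that was merged into tinytag.

```python
import io


def skip_id3v2_header(fh):
    """Position fh just past a leading ID3v2 tag, if one is present.

    Returns the resulting offset. The file position is restored when no
    ID3v2 header is found.
    """
    start = fh.tell()
    header = fh.read(10)
    if len(header) == 10 and header[:3] == b'ID3':
        size = 0
        for byte in header[6:10]:
            size = (size << 7) | (byte & 0x7F)  # synchsafe: 7 bits per byte
        if header[5] & 0x10:  # ID3v2.4 footer flag adds another 10 bytes
            size += 10
        fh.seek(start + 10 + size)
    else:
        fh.seek(start)
    return fh.tell()


# Example: a 5-byte ID3v2 tag body followed by the FLAC marker.
demo = io.BytesIO(b'ID3\x04\x00\x00\x00\x00\x00\x05' + b'\x00' * 5 + b'fLaC')
skip_id3v2_header(demo)
```

After the call, the next four bytes can be compared against `b'fLaC'` as usual.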

REQUEST: Support for m4b files?

Any chance of adding support for m4b files?

thanks!

0.6.1 not on pypi

Seems the most recent version never made it to pypi.

pip install tinytag==0.6.1
Downloading/unpacking tinytag==0.6.1
  Could not find a version that satisfies the requirement tinytag==0.6.1 (from versions: 0.6.0)

Note, it can be worked around with pip install https://github.com/devsnd/tinytag/archive/0.6.1.tar.gz, but best to check your deploy steps to figure out why it isn't finding it from pypi.

Request: support for RIFF INFO tags

Since Windows Media Player only tags ripped Wave files with RIFF INFO tags instead of ID3 this could be useful
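For reference, RIFF INFO metadata lives in a `LIST` chunk whose type is `INFO`, with sub-chunks such as `INAM` (title) and `IART` (artist). The sketch below walks the chunks of a WAVE file and collects those sub-chunks. The function name, the returned dict shape and the latin-1 decoding are assumptions for illustration, not tinytag behaviour.

```python
import io
import struct


def read_riff_info(fh):
    """Return a dict of RIFF INFO sub-chunks, e.g. {'INAM': 'Title'}."""
    header = fh.read(12)
    if len(header) < 12 or header[:4] != b'RIFF' or header[8:12] != b'WAVE':
        return {}
    info = {}
    while True:
        chunk_header = fh.read(8)
        if len(chunk_header) < 8:
            break  # end of file or truncated chunk header
        chunk_id, size = struct.unpack('<4sI', chunk_header)
        data_start = fh.tell()
        if chunk_id == b'LIST' and fh.read(4) == b'INFO':
            end = data_start + size
            while fh.tell() + 8 <= end:
                sub_header = fh.read(8)
                if len(sub_header) < 8:
                    break
                sub_id, sub_size = struct.unpack('<4sI', sub_header)
                value = fh.read(sub_size).rstrip(b'\x00')
                info[sub_id.decode('latin-1')] = value.decode('latin-1')
                if sub_size % 2:
                    fh.seek(1, io.SEEK_CUR)  # chunks are padded to even size
        # Always jump to the next chunk; this moves forward on every pass.
        fh.seek(data_start + size + (size % 2))
    return info
```

Because every iteration consumes at least the 8-byte chunk header, the walk terminates even on files with zero-length chunks.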

Problem reading a wav file (struct.error)

I have an issue with reading a couple of wav files. They were ripped from a CD using dBpoweramp, but TinyTag can't read them. The metadata can be read by ffmpeg though.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "tinytag/tinytag.py", line 131, in get
    tag.load(tags=tags, duration=duration, image=image)
  File "tinytag/tinytag.py", line 145, in load
    self._parse_tag(self._filehandler)
  File "tinytag/tinytag.py", line 860, in _parse_tag
    self._determine_duration(fh)  # parse whole file to determine tags:(
  File "tinytag/tinytag.py", line 851, in _determine_duration
    id3._parse_id3v2(fh)
  File "tinytag/tinytag.py", line 596, in _parse_id3v2
    frame_size = self._parse_frame(fh, id3version=major)
  File "tinytag/tinytag.py", line 629, in _parse_frame
    frame = struct.unpack(binformat, frame_header_data)
struct.error: unpack requires a string argument of length 10

ZeroDivisionError when duration is 0

Got a ZeroDivisionError while processing this:

self.duration = xframes * ID3.samples_per_frame / float(self.samplerate)
self.bitrate = byte_count * 8 / self.duration

Traceback can be found here:

tag = TinyTag.get("music.mp3")

  File "/usr/lib/python3.6/site-packages/tinytag/tinytag.py", line 136, in load
    self._determine_duration(self._filehandler)
  File "/usr/lib/python3.6/site-packages/tinytag/tinytag.py", line 528, in _determine_duration
    self.bitrate = byte_count * 8 / self.duration
ZeroDivisionError: float division by zero

You may want to add some error handling to avoid compatibility problems. I can fix it for you if you want.

Duration is doubled in mp3 files

Just a thought: Could it be that duration needs to be divided by number of channels?
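Frame length in MPEG audio depends on the MPEG version and layer rather than on the channel count: Layer III frames carry 1152 samples in MPEG-1 but 576 in MPEG-2 and MPEG-2.5. A parser that assumes 1152 for every file would report roughly double the duration for low-sample-rate files. Whether that is the cause here is not confirmed by the report; the sketch below only illustrates the lookup, with hypothetical names.

```python
# Samples per frame by (MPEG version, layer).
SAMPLES_PER_FRAME = {
    ('MPEG1', 1): 384, ('MPEG1', 2): 1152, ('MPEG1', 3): 1152,
    ('MPEG2', 1): 384, ('MPEG2', 2): 1152, ('MPEG2', 3): 576,
    ('MPEG2.5', 1): 384, ('MPEG2.5', 2): 1152, ('MPEG2.5', 3): 576,
}


def duration_from_frames(frame_count, samplerate, version, layer):
    """Duration in seconds for a given number of MPEG audio frames."""
    samples = frame_count * SAMPLES_PER_FRAME[(version, layer)]
    return samples / float(samplerate)
```

With this table, 1000 Layer III frames last the same time at 44100 Hz (MPEG-1) and at 22050 Hz (MPEG-2), which matches the fact that frame duration is about 26 ms in both cases.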

get_image() fails on m4a

As far as I've checked, feature/mp4 has been merged back into master. get_image() fails on m4a files:

tag = TinyTag.get('sample.m4a', image=True)
image_data = tag.get_image()
if not image_data:
    print('oops! this file has a cover image :-/')

Hangs when parsing of a specific wav file (getting duration)

I have a particular wav file (I don't have the rights to share it, sorry), but when I load it with TinyTag the process hangs.

Digging into the issue, I see that the process is stuck in a loop scanning the file. I have a fix prepared and will share a branch for consideration.

self.samplerate is None when reading Ogg with Opus codec

Python output:

>>> from tinytag import TinyTag
>>> TinyTag.get("a.ogg")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/site-packages/tinytag/tinytag.py", line 94, in get
    tag.load(tags=tags, duration=duration, image=image)
  File "/usr/lib/python3.5/site-packages/tinytag/tinytag.py", line 115, in load
    self._determine_duration(self._filehandler)
  File "/usr/lib/python3.5/site-packages/tinytag/tinytag.py", line 480, in _determine_duration
    self.duration = self._max_samplenum / float(self.samplerate)
TypeError: float() argument must be a string or a number, not 'NoneType'

FFprobe output:

Input #0, ogg, from 'a.ogg':
  Duration: 00:05:19.43, start: 0.000000, bitrate: 177 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata: [...]

The file in question: music.zip
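For Ogg Opus, a stream sample rate is not needed to compute the duration: per RFC 7845, granule positions always count samples at 48 kHz, and the `OpusHead` identification packet carries a pre-skip value to subtract. The sketch below parses that packet and derives a duration. The function names are hypothetical, and obtaining the final granule position from the last Ogg page is left out.

```python
import struct

OPUS_GRANULE_RATE = 48000  # granule positions are always in 48 kHz units


def parse_opus_head(packet):
    """Parse the fixed part of an OpusHead packet, or return None."""
    if len(packet) < 19 or packet[:8] != b'OpusHead':
        return None
    version, channels, pre_skip, input_rate, gain, mapping = struct.unpack(
        '<BBHIhB', packet[8:19])
    return {
        'version': version,
        'channels': channels,
        'pre_skip': pre_skip,
        'input_samplerate': input_rate,  # informational only
        'output_gain': gain,
        'mapping_family': mapping,
    }


def opus_duration(final_granule_position, pre_skip):
    """Duration in seconds from the last page's granule position."""
    return (final_granule_position - pre_skip) / float(OPUS_GRANULE_RATE)
```

The `input_samplerate` field describes the original source and may differ from 48000, so it should not be used for the duration calculation.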

Invalid tags from imported m4a

Bona-fide .m4a files as imported from an idevice don't parse:

>>> TinyTag.get('~/Music/iPhone/9764342624706351712.m4a')
{'filesize': 15616586, 'album': None, 'albumartist': None, 'artist': None, 'audio_offset': None, 'bitrate': 256.0, 'channels': 2, 'comment': None, 'composer': None, 'disc': None, 'disc_total': None, 'duration': 411.0110101, 'genre': None, 'samplerate': 44100, 'title': None, 'track': None, 'track_total': None, 'year': None}

found (and fixed) a bug in the _set_field function

This is my first time using tinytag
I have a collection of 75000 mp3s
Turns out that one of the files had 1/4/4/4/4/4 stored in the "disc" tag
which made _set_field crash

If you change the lines from
current, total = str(value).split('/') #which of course only works if the value is xx/xx

to this:
splits = str(value).split('/')
current, total = splits[0], splits[1]

it will work every time
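Note that indexing `splits[1]` still raises an IndexError when the value contains no slash at all (for example a plain `3`). A slightly more defensive variant is sketched below; `parse_current_total` is a hypothetical helper, not the code that was merged.

```python
def parse_current_total(value):
    """Split 'current/total' values, tolerating missing or extra parts.

    '1/4/4/4/4/4' -> ('1', '4'), '3' -> ('3', None), '' -> (None, None)
    """
    parts = str(value).split('/')
    current = parts[0] or None
    total = parts[1] if len(parts) > 1 and parts[1] else None
    return current, total
```

Only the first two components are used, so trailing garbage such as the repeated `/4` in the report is ignored.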

Output garbled characters when ID3V1 contains Chinese characters.

I changed this line to return self._unpad(codecs.decode(x, 'gbk')) got the correct meta data.

tinytag/tinytag/tinytag.py

Line 607 in e9b301d

return self._unpad(codecs.decode(x, 'latin1'))

Add an option of encoding? or other solution?

duration of wav file is 0

It says the duration of the wav file is 0 even though it's not.

Album and track names longer than 30 characters are truncated

For instance, tag.album gives 'The Double EP: A Sea of Split ' rather than 'The Double EP: A Sea of Split Peas'. The album name is correctly determined by other programs such as easytag or rhythmbox, so I believe it to be correctly encoded in the file.

Edit: seems to occur for track names as well.
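A cutoff at exactly 30 characters is what the ID3v1 layout produces: the tag is a fixed 128-byte block with 30-byte title, artist, album and comment fields. Whether this file's values came from an ID3v1 tag is an assumption; the sketch below only shows why that format cannot hold longer names. `parse_id3v1` here is a standalone illustration, not tinytag's method.

```python
import struct


def parse_id3v1(block):
    """Parse a 128-byte ID3v1 block; return None if it is not a tag."""
    if len(block) != 128 or block[:3] != b'TAG':
        return None
    title, artist, album, year, comment, genre = struct.unpack(
        '>30s30s30s4s30sB', block[3:])

    def text(raw):
        # ID3v1 has no encoding marker; latin-1 is assumed here.
        return raw.decode('latin-1').rstrip('\x00')

    return {
        'title': text(title),
        'artist': text(artist),
        'album': text(album),
        'year': text(year),
        'comment': text(comment),
        'genre_id': genre,
    }
```

Files that also carry an ID3v2 tag can store the full-length name there, which would explain why other programs show it untruncated.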

Add an option to ignore encoding errors

Currently TinyTag fails if it can't decode a string. (line 664, _decode_string)

However when parsing existing data, it happens sometimes that someone wrote garbage into the tags data field. It should be possible to ignore the garbage.
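Python's built-in `bytes.decode` already supports this through its `errors` argument, so the option could be a thin pass-through. A minimal sketch, with `decode_lenient` as a hypothetical name:

```python
def decode_lenient(raw, encoding='utf-8', errors='replace'):
    """Decode bytes without raising on garbage.

    errors='replace' substitutes U+FFFD for undecodable bytes,
    errors='ignore' drops them, errors='strict' keeps the current behaviour.
    """
    return raw.decode(encoding, errors=errors)
```

`'replace'` keeps a visible marker where data was lost, which tends to be easier to debug than silently dropping bytes with `'ignore'`.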

Speed up TinyTag with MP3 processing

As you propose, we continue our discussion here.
@devsnd "Maybe we could add the following ability to TinyTag: If there are e.g. 5 consecutive frames with the same bitrate, we assume it's CBR and stop."

This is the way mp3info works with CBR MP3 files. And I have to change the way mp3info calculates the play time because neither TinyTag nor mp3info can calculate the correct time for my 22050 Hz mono MP3 files. Both give me about half the actual play time, though JPlayer shows the correct play time. That's why I started to look for another library to calculate the play time.
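The heuristic quoted above (stop once a run of consecutive frames shares one bitrate) can be sketched as follows. The function names and the default run length of 5 are illustrative assumptions; the input is any iterable of per-frame bitrates in kbit/s, so a generator that parses frames lazily would stop reading as soon as the run is found.

```python
def detect_cbr(frame_bitrates, required_run=5):
    """Return the bitrate if `required_run` consecutive frames agree, else None."""
    run_value, run_length = None, 0
    for bitrate in frame_bitrates:
        if bitrate == run_value:
            run_length += 1
        else:
            run_value, run_length = bitrate, 1
        if run_length >= required_run:
            return run_value  # assume CBR and stop consuming frames
    return None


def estimate_cbr_duration(audio_bytes, bitrate_kbps):
    """Duration in seconds for CBR audio; MP3 bitrates use 1000 bits per kbit."""
    return audio_bytes * 8 / (bitrate_kbps * 1000.0)
```

A VBR file that happens to start with several identical frames would be misclassified, so the run length trades speed against accuracy.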

Is it Possible to change Tags and Album art?

Can I edit the tags of a media file,
and also edit ID3v2 tags so that they show up in Windows Media Player?

Precedence with multiple tag headers

When a file has multiple sets of tags, say ID3 and FLAC, they are currently just merged, on a first-come-first-serve basis per tag (e.g. if ID3 comes first, has the artist tag set, then the artist tag from the FLAC header is ignored).
Imo, it would make more sense to just use data from one of them, deciding which one of them to use either on the file format (if FLAC, prefer FLAC metadata) or completeness (use the one that's got more complete information).
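The two selection strategies proposed above (prefer the container's native format, or prefer the most complete set) could be combined as in this sketch. `pick_tag_set` and the dict-of-dicts input are hypothetical; tinytag does not expose per-source tag sets in this form.

```python
def pick_tag_set(tag_sets, preferred=None):
    """Choose one tag set instead of merging them field by field.

    tag_sets maps a source name such as 'flac' or 'id3' to a dict of fields.
    If `preferred` is present it wins; otherwise the set with the most
    non-empty fields is returned.
    """
    if preferred in tag_sets:
        return tag_sets[preferred]

    def completeness(fields):
        return sum(1 for value in fields.values() if value not in (None, ''))

    return max(tag_sets.values(), key=completeness, default={})
```

For a FLAC file the caller would pass `preferred='flac'`, falling back to completeness only when the native tags are absent.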

This is mainly inspired by one file I found which had an ID3 header with absolutely useless information first, followed by the FLAC header with actual information; tinytag currently shows mostly the useless information, since that came first, but also includes useful information (see the test case I added in #56)

How to get the Codec ?

Hi,

I would like to extract the codec of my music files (mp3, FLAC, ...). The main objective is to compare all tracks with each other to find duplicates and keep the tracks with the best quality.

How could I proceed to get this info?

Thanks,

lbrth

Additional Metadata

It'd be nice to get some additional metadata about the file:
info.audio_offset
info.bitrate
info.duration
info.size
info.audio_size
info.comment
info.genre
info.sample_rate

see: http://www.hardcoded.net/docs/hsaudiotag/usage.html#available-attributes

Request: provide a method that returns all supported formats

I'm looking for my application to do a conditional like:

if not file.split(".")[-1].lower() in TinyTag.getSupportedTypes():

essentially, to check programmatically if a file extension is supported before trying to parse metadata. it would help for future changes...
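Until such a method exists, the check can be done on the application side. In the sketch below, `SUPPORTED_EXTENSIONS` is an assumption derived from the formats named in the project description (which also mentions "a few more"), and `is_supported` is a hypothetical helper, not a TinyTag method.

```python
import os

# Assumption: illustrative list based on the project description only.
SUPPORTED_EXTENSIONS = frozenset({
    '.mp3', '.ogg', '.opus', '.mp4', '.m4a', '.flac', '.wma', '.wav', '.aiff',
})


def is_supported(filename):
    """Case-insensitive extension check before attempting to parse."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in SUPPORTED_EXTENSIONS
```

Using `os.path.splitext` rather than `split(".")[-1]` avoids treating an extensionless name such as `README` as its own extension.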

Binary paths are not handled

It seems that the Python spec allows paths to be either strings or binary data. According to Stack Overflow, this is because on Linux filenames do not have an encoding but are just a bunch of bytes.

In my case, I have a path that is not UTF-8, and converting it to a UTF-8 string fails with encoding errors.

Using the os functions (ie. os.stat) to access the file through its binary name works. (like os.stat(b"some\x12nasty\x34file\xff"))

However TinyTag fails to access the file. This is caused by tinytag.py line 122:
filename = os.path.expanduser(str(filename)) # cast pathlib.Path to str

Given how paths work on Linux, casting to str is the wrong approach.
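A possible replacement for the `str()` cast is `os.fspath`, which converts `pathlib.Path` objects to `str` but passes `str` and `bytes` through unchanged; `os.path.expanduser` accepts both types as well. `normalize_path` below is a hypothetical wrapper showing the idea, not the change that was made in tinytag.

```python
import os
import pathlib


def normalize_path(filename):
    """Accept str, bytes or os.PathLike without forcing a text encoding."""
    return os.path.expanduser(os.fspath(filename))


# bytes stay bytes, Path objects become str
normalize_path(b'some\x12nasty\x34file\xff')
normalize_path(pathlib.PurePosixPath('music/track.mp3'))
```

The result can be passed directly to `open()` or `os.stat()`, both of which accept bytes paths.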

Invalid duration

Hello. I open this ticket which seems to correspond to #37.
I have an album with tracks reported as about twice as long as their true duration. I have sent an mp3 file by email.

implement VBR mp3 length estimation

The correct length of an mp3 file can only be determined if the whole file is parsed. This is very expensive.

Implement an estimation algorithm to speed up length detection.

WAV file does not print the channels

Hello, as described in the title, when using any .wav file I get the following:

{'filesize': 35799444, 'album': None, 'albumartist': None, 'artist': None, 'audio_offset': None, 'bitrate': 1378.125, 'channels': None, 'comment': None, 'disc': None, 'disc_total': None, 'duration': 202.94401360544217, 'genre': None, 'samplerate': 44100, 'title': 'frenemy rem 7', 'track': None, 'track_total': None, 'year': None, 'audio_offest': 112}

Is there any way to fix this?

TinyTag gives None for tags generated by Windows Media Player rip

I'm trying to get the tags of some Wave files ripped from a CD through Windows Media Player, but tinytag is giving None for all the tags. If I try to add tags to the files manually with Audacity, I get struct error: unpack requires a buffer of 10 bytes. Other Wave files converted from mp3 and tagged using Audacity work fine.

UnicodeDecodeError on my mp3

hi, when I do info = TinyTag.get('my.mp3') i get this error
I can provide mp3

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-2-015becc94593> in <module>()
----> 1 info = TinyTag.get('my.mp3')

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in get(cls, filename, tags, length)
     56         if filename.lower().endswith('.mp3'):
     57             with open(filename, 'rb') as af:
---> 58                 return ID3(af, tags=tags, length=length)
     59         elif filename.lower().endswith(('.oga', '.ogg')):
     60             with open(filename, 'rb') as af:

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in __init__(self, filehandler, tags, length)
    117     def __init__(self, filehandler, tags=True, length=True):
    118         TinyTag.__init__(self)
--> 119         self.load(filehandler, tags=tags, length=length)
    120
    121     def _determine_length(self, fh):

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in load(self, filehandler, tags, length)
     77         """
     78         if tags:
---> 79             self._parse_tag(filehandler)
     80             filehandler.seek(0)
     81         if length:

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_tag(self, fh)
    159
    160     def _parse_tag(self, fh):
--> 161         self._parse_id3v2(fh)
    162         if not self.has_all_tags():  # try to get more info using id3v1
    163             fh.seek(-128, 2)  # id3v1 occuppies the last 128 bytes

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_id3v2(self, fh)
    183             while parsed_size < size:
    184                 is_id3_v22 = major == 2
--> 185                 frame_size = self._parse_frame(fh, is_v22=is_id3_v22)
    186                 if frame_size == 0:
    187                     break

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_frame(self, fh, is_v22)
    219                     self._parse_track(content)
    220                 else:
--> 221                     self._set_field(fieldname, content, self._decode_string)
    222             return frame_size
    223         return 0

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _set_field(self, fieldname, bytestring, transfunc)
     88             return
     89         if transfunc:
---> 90             setattr(self, fieldname, transfunc(bytestring))
     91         else:
     92             setattr(self, fieldname, bytestring)

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _decode_string(self, b)
    228             return self._unpad(codecs.decode(b[1:], 'ISO-8859-1'))
    229         if b[0:3] == b'\x01\xff\xfe':
--> 230             return self._unpad(codecs.decode(b[3:], 'UTF-16'))
    231         return self._unpad(codecs.decode(b, 'ISO-8859-1'))
    232

/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_16.pyc in decode(input, errors)
     14
     15 def decode(input, errors='strict'):
---> 16     return codecs.utf_16_decode(input, errors, True)
     17
     18 class IncrementalEncoder(codecs.IncrementalEncoder):

UnicodeDecodeError: 'utf16' codec can't decode byte 0x00 in position 20: truncated data

wrong duration for long flac files

when decoding flac duration everything is fine down to the millisecond, as long as the file is short. files longer than a few minutes already show a much too low duration for the track.

Crash from reading non-ascii metadata (ID3v1)

Found this in my cherrymusic/error.log:

ERROR    [2014-10-24 12:16:17,214] : cherrypy.error.3074577548 : from line (201) at
        /home/cherrymusic/cherrymusic/cherrymusic/cherrypy/_cplogging.py
        --
        [24/Oct/2014:12:16:17] HTTP Traceback (most recent call last):
[...]
  File "/home/cherrymusic/cherrymusic/cherrymusic/cherrymusicserver/metainfo.py", line 55, in getSongInfo
    tag = TinyTag.get(filepath)
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 78, in get
    tag.load(tags=tags, duration=duration)
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 93, in load
    self._parse_tag(self._filehandler)
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 201, in _parse_tag
    self._parse_id3v1(fh)
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 231, in _parse_id3v1
    self._set_field('title', fh.read(30), transfunc=asciidecode)
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 104, in _set_field
    setattr(self, fieldname, transfunc(bytestring))
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 230, in <lambda>
    asciidecode = lambda x: self._unpad(codecs.decode(x, 'ASCII'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 22: ordinal not in range(128)

I don't have access to the live output of the server session right now, or I'd look up which file's responsible.

Anyway, I suppose you can recognize this from the occasional "error getting song metadata" in cherrymusic.

utf-8 error

I get this error with some unicode song titles. MP4Box was used to pack the m4a file
and the title is "Cold Water (feat. Justin Bieber & MØ)".
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 34: invalid start byte
Any help is appreciated.

Fails to load wav with ID3 with an invalid zero byte header

I have a wav file that failed to load with TinyTag. I think it was originally created with Audacity a while ago; however, I've created a small unit test to simulate the issue.

When I try to load this file I get the error

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tests/test.py", line 114, in get_info
    tag = TinyTag.get(filename)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 132, in get
    tag.load(tags=tags, duration=duration, image=image)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 146, in load
    self._parse_tag(self._filehandler)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 864, in _parse_tag
    self._determine_duration(fh)  # parse whole file to determine tags:(
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 855, in _determine_duration
    id3._parse_id3v2(fh)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 597, in _parse_id3v2
    frame_size = self._parse_frame(fh, id3version=major)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 633, in _parse_frame
    frame = struct.unpack(binformat, frame_header_data)
struct.error: unpack requires a buffer of 10 bytes

When I dump the frame_header_data values in the _parse_frame method I see ...

b'TRCK\x00\x00\x00\x03\x00\x00' b'TIT2\x00\x00\x00\x08\x00\x00' b'\x00'

Essentially I see this zero byte in the final header. Possibly an invalid ID3 tag header, however TinyTag does have the opportunity to handle such headers without failure.
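Handling a short or zeroed header without failing amounts to checking the length before calling `struct.unpack`. The sketch below is an assumption-laden illustration: `parse_frame_header` is a hypothetical name, the `'>4sIH'` format is not tinytag's `binformat`, and the size is read as a plain big-endian integer as in ID3v2.3 (ID3v2.4 uses synchsafe sizes).

```python
import struct

FRAME_HEADER_SIZE = 10  # 4-byte frame id, 4-byte size, 2-byte flags


def parse_frame_header(frame_header_data):
    """Return (frame_id, size, flags), or None for truncated data or padding."""
    if len(frame_header_data) < FRAME_HEADER_SIZE:
        return None  # e.g. the lone b'\x00' seen at the end of the tag
    frame_id, size, flags = struct.unpack(
        '>4sIH', frame_header_data[:FRAME_HEADER_SIZE])
    if frame_id == b'\x00\x00\x00\x00':
        return None  # zero padding after the last real frame
    return frame_id, size, flags
```

A caller would treat None as "no more frames" and stop parsing the tag instead of raising.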

I'll create a PR shortly demonstrating this with a unit test and a candidate fix.

Comment Fields

Would it be possible for tinytag to expose the ID3 "comment" field?

WARNING: Generating metadata for package tinytag produced metadata for project name unknown

Issue with album art data

I need to get cover art and other data for some of my music files. Seems like most of the tag parsing libraries only give me the cover art of MP3, but that's fine..

Anyways, I couldn't get my image to display because some of the bytes data is truncated, according to Pillow, at least. I even loaded the bytes with the PyQt library to display on the GUI, but it can't load from the data.

Then I loaded mutagen, which is a little more non-intuitive to use than tinytag. But anyways, I got bytes data from both the libraries and compared them on diffnow

I tried it with 2 files, and apparently tinytag does cut out some data from the beginning. I could get the bytes data from mutagen to display perfectly.

I have uploaded the reports from both analysis here: https://drive.google.com/open?id=1EZ7XMPoQsrEaeQmZZ3Z-nwK2XM66EVBZ
Please take a look... Here are some screenshots from the analysis

Here's the difference of the first analysis. The first line is TinyTag and the second line is Mutagen

Here's the second one. Same thing

Is this a bug? I hope this is fixed in a future version.
Thank you so much for the amazing library.

AssertionError: tinytag .dist-info directory not found

Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/cli/base_command.py", line 176, in main
    status = self.run(options, args)
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/commands/install.py", line 393, in run
    use_user_site=options.use_user_site,
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/req/__init__.py", line 57, in install_given_reqs
    **kwargs
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/req/req_install.py", line 919, in install
    use_user_site=use_user_site, pycompile=pycompile,
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/req/req_install.py", line 445, in move_wheel_files
    warn_script_location=warn_script_location,
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/wheel.py", line 391, in move_wheel_files
    assert info_dir, "%s .dist-info directory not found" % req
AssertionError: tinytag .dist-info directory not found

This is the full issue log. Any packages that I may be missing?

Tinytag throws UnicodeDecodeError:

When I try to get the tags of a valid OGG file with tinytag.TinyTag.get(path), I get this trace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 120, in get
    tag.load(tags=tags, duration=duration, image=image)
  File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 135, in load
    self._parse_tag(self._filehandler)
  File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 703, in _parse_tag
    self._parse_vorbis_comment(walker)
  File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 737, in _parse_vorbis_comment
    keyvalpair = codecs.decode(fh.read(length), 'UTF-8')
  File "/usr/lib/python3.4/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 10: invalid start byte

Python version: 3.4.0 CPython on linux

m4a/mp4 support

Hey @devsnd,

Can you add m4a / mp4 support to tiny?

Read images from ID3v2 Tags

The APIC field probably contains the cover image within an ID3 tag. See:
https://en.wikipedia.org/wiki/ID3#ID3v2_frame_specification_.28Version_2.3.29

It should be possible to read this information here: https://github.com/devsnd/tinytag/blob/master/tinytag/tinytag.py#L254

The best way to tackle this without a major performance hit is probably to introduce 2 private fields that include the offset and length of the image file within the MP3, a public field which tells the user whether such an image could be read and a method to get a bytestream of the image.
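The offset-and-length idea described above might be sketched like this. `EmbeddedImage` and its attributes are hypothetical names chosen for illustration; the point is only that tag parsing records where the image is, and the bytes are read on demand.

```python
import io


class EmbeddedImage:
    """Remembers where an image lives in the file instead of holding its bytes."""

    def __init__(self, offset, length):
        self.offset = offset
        self.length = length

    def read_from(self, fh):
        """Seek to the recorded position and return the raw image bytes."""
        fh.seek(self.offset)
        return fh.read(self.length)


# Example with an in-memory file: 6 header bytes, a 6-byte "image", then audio.
demo = io.BytesIO(b'header' + b'\xff\xd8\xffIMG' + b'audio')
image = EmbeddedImage(offset=6, length=6)
image.read_from(demo)
```

Parsing stays cheap because the APIC payload is skipped with a seek, and users who never ask for the image never pay for reading it.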

bitrate is in bps (not kbps) for VBR mp3 files

Hi,

Any VBR mp3 file I point tinytag at gives me a bitrate value which is about 1000x what audio software reports. I expect that this value is bps rather than kbps.

For example, for the file at (https antisol dot org slash brass dot mp3), tinytag gives me a bitrate value of 195206.785168, whereas various audio software reports the file as being 195kbps, including:

audacious
qmmp
vlc ('input bitrate' on 'statistics' screen lists ~195kbps, but codec info tab shows 128kbps)

I've seen this with all other VBR mp3 files I've tried, they all give a value ~1000x the value reported by audacious.

As a workaround I was able to do something like:

if bitrate > 320: bitrate = bitrate / 1099

to get a pretty good bitrate value. But it would be nice to see a more accurate value come from TinyTag.

Thanks!

ID3 improperly decodes.

I encountered this issue while parsing over some of my MP3 (I haven't tested this on other formats). The issue presents in all text fields.

{
 'audio_offset': 2058, 'track': '\x0311', 'year': '2000', 'filesize': 5382993, 
 'album': '\x03Relationship of Command', 'title': '\x03Nonâ\x80\x90Zero Possibility', 
 'samplerate': 44100, 'duration': 336.43710016824866, 'track_total': '11', 
 'bitrate': 128.0, 'artist': '\x03At the Driveâ\x80\x90In'
}

I receive \x03 (END OF TEXT) at the beginning of each string (this always happens, at least on the handful of albums I've tested), as well as a garbled decode of the hyphen. I believe the issue is related to ID3 making the decision on how to decode text: either ISO-8859-1 or UTF-16 depending on how the file reads.

I believe the best course of action would be an option keyword argument on TinyTag.get that allows the user to specify the preferred decoding and then either falling back onto a standard (ISO-8859-1, which is currently used, perhaps) or passing the buck back to the user for error handling.
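The leading `\x03` is consistent with the ID3v2 text-encoding marker: the first byte of a text frame selects the encoding (0 = ISO-8859-1, 1 = UTF-16 with BOM, 2 = UTF-16BE, 3 = UTF-8, the last two added in ID3v2.4). A decoder that honours that byte, with a configurable fallback as proposed above, could look like the sketch below. The names are hypothetical and this is not tinytag's `_decode_string`.

```python
ID3V2_TEXT_ENCODINGS = {
    0: 'latin-1',
    1: 'utf-16',     # byte order taken from the BOM
    2: 'utf-16-be',  # ID3v2.4 only
    3: 'utf-8',      # ID3v2.4 only
}


def decode_id3v2_text(content, fallback='latin-1'):
    """Decode a text frame body using its leading encoding byte."""
    if not content:
        return ''
    encoding = ID3V2_TEXT_ENCODINGS.get(content[0])
    if encoding is None:
        # No recognizable marker: decode everything with the fallback.
        text = content.decode(fallback, errors='replace')
    else:
        text = content[1:].decode(encoding, errors='replace')
    return text.rstrip('\x00')
```

With this, the bytes behind `'\x03At the Driveâ\x80\x90In'` decode to the expected title without the `latin-1` round trip used in the workaround.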

I haven't had the chance to test this change myself, yet. I'm currently dealing with the issue with this bandage hack:

def fixer(value, ignore=(AttributeError, UnicodeEncodeError), handle=None):
    '''Actual fixer function for fix_track

    ignore is a tuple of exceptions we should just discard.
    handle is an optional exception handler.
    '''
    try:
        value = value.encode('latin-1').decode('utf-8')
        # matching \x03 is frustrating
        # again, just a crutch to lean on
        if not value[0].isprintable():
            value = value[1:]
    except ignore as e:
        if handle:
            handle(e)
        else:
            pass
    finally:
        # we always end up here
        return value


def fix_track(
              track, 
              fixer=fixer, 
              fields=('artist', 'album', 'title', 'track', 'year', 'track_total'),
              int_convert=('track', 'year', 'track_total')
              ):
    '''Fix encoding issue encountered on some tracks.

    Accepts a track object and attempts to massage the data in our favor.
    * fixer is the function we want to run on this track to correct data
    * fields is the specific fields we'd like to attempt to correct
    * int_convert is a subset of fields that is data that should be integers
    '''
    for f in fields:
        value = getattr(track, f)
        if not value: 
            # value is likely None
            # we'll pass on this value
            # to avoid blowing up
            continue
        else:
            value = fixer(value)
        if f in int_convert:
            try:
                value = int(value)
            except ValueError:
                pass
        setattr(track, f, value)

    # TODO: need to make this mutable
    # for now, it's hardcoded as TinyTag
    # stores duration as a float
    track.duration = int(track.duration)

    # returning the track allows us
    # to be flexible in application
    # of this function
    return track

Expected: At the Drive-In - Relationship of Command - 11 - Non-Zero Possibility
Actual: At the Driveâ€�In - Relationship of Command - 11 - Nonâ€�Zero Possibility
After Fixing: At the Drive‐In - Relationship of Command - 11 - Non‐Zero Possibility

Read disc number / total number of discs?

Thanks for the great library. I'm using it to build a script to automatically file audio files in a correct location and it's been really helpful.

However, I have several multiple-disc albums where track numbers are duplicated on each disc. The actual disc number is stored in the disc number part of the track. However, tinytag doesn't seem to read this information. Any chance for adding this in a future release?

I could give it a shot myself, but I have no idea about FLAC / OGG / MP3 ID3 tag specifications and wouldn't know where to actually find this information. What sources did you use in order to implement tinytag?
