devsnd / tinytag
Python library for reading audio file metadata, duration of MP3, OGG, OPUS, MP4, M4A, FLAC, WMA, Wave, AIFF and a few more
License: MIT License
I have a flac file that raises a ZeroDivisionError when its duration is parsed:
>>> from tinytag import TinyTag
>>> TinyTag.get('barcelona.flac', duration=True)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/til/git/hub/devsnd/tinytag/tinytag/tinytag.py", line 78, in get
tag.load(tags=tags, duration=duration)
File "/home/til/git/hub/devsnd/tinytag/tinytag/tinytag.py", line 434, in load
self._determine_duration(self._filehandler)
File "/home/til/git/hub/devsnd/tinytag/tinytag/tinytag.py", line 477, in _determine_duration
self.bitrate = self.filesize/self.duration*8/1024
ZeroDivisionError: float division by zero
To recreate: I created the flac by transcoding it from mp3 with python-audiotranscode. The mp3 has a duration of about 2:45*; the flac file plays just fine (and completely), except the duration remains unknown in the player I've tried (audacious).
$ flac --version
flac 1.3.0
$ ffmpeg -version
ffmpeg version 2.2.2
[...]
As a fallback for when the duration remains unknown, I'd set duration and bitrate to None: it makes for fast-failing calculations and is a better match for missing/unknown values than 0, NaN or Infinity. (The latter are proper float values and could be mistaken for the actual value, in which case they'd most probably be wrong.) Except I don't know if None values work well with what the rest of tinytag does, so I'm leaving the fix up to you. :)
* Might be interesting: both mp3 and flac stop playing after 2:45. Audacious shows a duration of 2:46 for the mp3, while tinytag computes a different mp3 duration of 172.31..., which is ~2:52.
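A minimal sketch of the fallback suggested above, assuming a hypothetical helper named safe_bitrate (it is not part of tinytag). It reuses the expression from the traceback and returns None instead of dividing when the duration is missing or zero:

```python
def safe_bitrate(filesize, duration):
    """Return the average bitrate, or None if the duration is unknown.

    Hypothetical helper for illustration only. The arithmetic mirrors
    `filesize / duration * 8 / 1024` from the traceback above.
    """
    if not duration:  # covers both None and 0.0
        return None
    return filesize / duration * 8 / 1024
```

Callers would then have to check for None before doing further arithmetic, which is the fast-failing behaviour described above.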
I cannot read metadata when the file is in WebM format.
Hi,
I can't get the cover art of a few audio files.
The get_image() method does not detect it. I can provide the files by e-mail.
Otherwise I want to thank you for this library, very efficient and convenient. (I have made some benchmarks tests against others and this one is very fast)
Email sent to tomwallroth ...
hi, when I read the list of possible attributes that I can get with TinyTag, I find a tiny spelling mistake. The detailed info is as follows:
tag.title # title of the sonf
I think "sonf" should be "song" :)
ID3 tags are not always decoded correctly, see CherryMusic issue #536.
Since commit 1c53058, latin1 is used for decoding instead of ascii. If possible, tinytag should make an effort to determine the proper encoding to use before falling back to a default. Bonus points for making the fallback encoding a parameter. Besides latin1, at least UTF-8 should be supported.
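One way the request above could look, as a sketch only: the function name decode_tag_text and its parameters are hypothetical and not tinytag's API. It tries strict UTF-8 first and falls back to a configurable encoding. Since ID3v1 carries no encoding marker, this is a guess and not real detection:

```python
def decode_tag_text(raw, encodings=("utf-8",), fallback="latin1"):
    """Decode tag bytes, trying each candidate encoding strictly.

    Hypothetical helper. The default fallback, latin1, maps every byte
    value to a character, so the final decode cannot fail with it.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode(fallback)
```

Trying UTF-8 before latin1 matters because any byte string decodes as latin1, so the order cannot be reversed.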
Is there an example I could use for saving album art to a local file? I'm doing this:
tag = TinyTag.get(filePath, image=True)
image_data = tag.get_image()
with open(filePath + '.jpg', "wb") as f:
f.write(image_data)
But then I get a jpg file that can't be opened.
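One possible cause, offered as an assumption and not a diagnosis, is that the embedded cover is not a JPEG (for example a PNG), so the .jpg extension misleads the viewer. A sketch of choosing the extension from the leading bytes of the data, using a hypothetical helper guess_image_extension:

```python
def guess_image_extension(data):
    """Guess a file extension from the leading magic bytes of image data.

    Hypothetical helper for illustration; returns None when the data is
    missing or the format is not recognized.
    """
    if not data:
        return None
    if data.startswith(b"\xff\xd8\xff"):
        return ".jpg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return ".png"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return ".gif"
    return None


# Usage with the calls from the question above:
#   tag = TinyTag.get(filePath, image=True)
#   image_data = tag.get_image()
#   ext = guess_image_extension(image_data)
#   if ext is not None:
#       with open(filePath + ext, "wb") as f:
#           f.write(image_data)
```

If the helper returns None for data that is not empty, the bytes themselves are probably not a complete image, which would point at the parser and not at the file name.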
I was trying to use tinytag.TinyTag.get(filename), but it errored and said
tinytag.tinytag.TinyTagException: mp3 parsing failed
Now I get that this means the mp3 is invalid, but the weird thing is:
It works with ffplay, VLC and WMP (and probably all music playing software)
File was downloaded using youtube_dl, not sure if that matters.
Traceback (most recent call last):
File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 311, in _determine_duration
frame_bitrate = ID3.bitrate_by_version_by_layer[mpeg_id][layer_id][br_id]
TypeError: 'NoneType' object is not subscriptable
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 94, in get
tag.load(tags=tags, duration=duration, image=image)
File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 115, in load
self._determine_duration(self._filehandler)
File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 313, in _determine_duration
raise TinyTagException('mp3 parsing failed')
tinytag.tinytag.TinyTagException: mp3 parsing failed
Some FLAC files also have an ID3 header. It can be ignored by skipping the ID3 header when this line fails.
Any chance of adding support for m4b files?
thanks!
Seems the most recent version never made it to pypi.
pip install tinytag==0.6.1
Downloading/unpacking tinytag==0.6.1
Could not find a version that satisfies the requirement tinytag==0.6.1 (from versions: 0.6.0)
Note, it can be worked around with pip install https://github.com/devsnd/tinytag/archive/0.6.1.tar.gz, but best to check your deploy steps to figure out why it isn't finding it from pypi.
Since Windows Media Player only tags ripped Wave files with RIFF INFO tags instead of ID3 this could be useful
I have an issue with reading a couple of wav files. They were ripped from a CD using dBpoweramp, but TinyTag can't read them. The metadata can be read by ffmpeg though.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "tinytag/tinytag.py", line 131, in get
tag.load(tags=tags, duration=duration, image=image)
File "tinytag/tinytag.py", line 145, in load
self._parse_tag(self._filehandler)
File "tinytag/tinytag.py", line 860, in _parse_tag
self._determine_duration(fh) # parse whole file to determine tags:(
File "tinytag/tinytag.py", line 851, in _determine_duration
id3._parse_id3v2(fh)
File "tinytag/tinytag.py", line 596, in _parse_id3v2
frame_size = self._parse_frame(fh, id3version=major)
File "tinytag/tinytag.py", line 629, in _parse_frame
frame = struct.unpack(binformat, frame_header_data)
struct.error: unpack requires a string argument of length 10
Got a ZeroDivisionError while processing this:
self.duration = xframes * ID3.samples_per_frame / float(self.samplerate)
self.bitrate = byte_count * 8 / self.duration
Traceback can be found here:
tag = TinyTag.get("music.mp3")
File "/usr/lib/python3.6/site-packages/tinytag/tinytag.py", line 136, in load self._determine_duration(self._filehandler)
File "/usr/lib/python3.6/site-packages/tinytag/tinytag.py", line 528, in _determine_duration
self.bitrate = byte_count * 8 / self.duration
ZeroDivisionError: float division by zero
You may want to add some error handling to avoid compatibility problems; I can fix it for you if you want.
Just a thought: Could it be that duration needs to be divided by number of channels?
As far as I've checked, feature/mp4 has been merged back into master. get_image() fails on m4a files:
tag = TinyTag.get('sample.m4a', image=True)
image_data = tag.get_image()
if not image_data:
print('oops! this file has a cover image :-/')
I have a particular wav file (I don't have the rights to share it, sorry), but when I load it with TinyTag the process hangs.
Digging into the issue, I see that the process is stuck in a loop scanning the file. I have a fix prepared and will share a branch for consideration.
Python output:
>>> from tinytag import TinyTag
>>> TinyTag.get("a.ogg")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.5/site-packages/tinytag/tinytag.py", line 94, in get
tag.load(tags=tags, duration=duration, image=image)
File "/usr/lib/python3.5/site-packages/tinytag/tinytag.py", line 115, in load
self._determine_duration(self._filehandler)
File "/usr/lib/python3.5/site-packages/tinytag/tinytag.py", line 480, in _determine_duration
self.duration = self._max_samplenum / float(self.samplerate)
TypeError: float() argument must be a string or a number, not 'NoneType'
FFprobe output:
Input #0, ogg, from 'a.ogg':
Duration: 00:05:19.43, start: 0.000000, bitrate: 177 kb/s
Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
Metadata: [...]
The file in question: music.zip
Bona fide .m4a files as imported from an iDevice don't parse:
>>> TinyTag.get('~/Music/iPhone/9764342624706351712.m4a')
{'filesize': 15616586, 'album': None, 'albumartist': None, 'artist': None, 'audio_offset': None, 'bitrate': 256.0, 'channels': 2, 'comment': None, 'composer': None, 'disc': None, 'disc_total': None, 'duration': 411.0110101 'genre': None, 'samplerate': 44100, 'title': None, 'track': None, 'track_total': None, 'year': None}
This is my first time using tinytag
I have a collection of 75000 mp3s
Turns out that one of the files had 1/4/4/4/4/4 stored in the "disc" tag
which made _set_field crash
If you change the lines from
current, total = str(value).split('/') #which of course only works if the value is xx/xx
to this:
splits = str(value).split('/')
current, total = splits[0], splits[1]
it will work every time
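Note that the two-line replacement above would still raise an IndexError for a value with no slash at all (for example "3"), unless the caller checks for one first. A sketch that handles all three shapes, using a hypothetical helper parse_number_pair that is not tinytag's code:

```python
def parse_number_pair(value):
    """Split 'current/total' into a (current, total) tuple of strings.

    Hypothetical helper. Extra segments ('1/4/4/4') are ignored and a
    missing or empty part is returned as None.
    """
    parts = str(value).split("/")
    current = parts[0] or None
    total = parts[1] if len(parts) > 1 and parts[1] else None
    return current, total
```

With the value from the report, parse_number_pair("1/4/4/4/4/4") yields ("1", "4").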
After I changed this line (line 607 in e9b301d) to return self._unpad(codecs.decode(x, 'gbk')), I got the correct metadata.
Could an encoding option be added, or is there another solution?
It says the duration of a wav file is 0 even though it's not.
For instance, tag.album gives 'The Double EP: A Sea of Split ' rather than 'The Double EP: A Sea of Split Peas'. The album name is correctly determined by other programs such as easytag or Rhythmbox, so I believe it to be correctly encoded in the file.
Edit: seems to occur for track names as well.
Currently TinyTag fails if it can't decode a string. (line 664, _decode_string)
However when parsing existing data, it happens sometimes that someone wrote garbage into the tags data field. It should be possible to ignore the garbage.
As you propose, we continue our discussion here.
@devsnd "Maybe we could add the following ability to TinyTag: If there are e.g. 5 consecutive frames with the same bitrate, we assume it's CBR and stop."
This is the way mp3info works with CBR MP3 files. And I have to change the way mp3info calculates the play time because neither TinyTag nor mp3info can calculate the correct time for my 22050/mono MP3 files. Both give me about half the actual play time, though JPlayer shows the correct play time. That's why I started to look for another library to calculate the play time.
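The heuristic quoted above can be sketched as follows. Both function names are hypothetical, the list of per-frame bitrates is assumed to come from a frame parser that is not shown, and the duration formula assumes bitrates in kbit/s with 1 kbit = 1000 bits:

```python
def looks_like_cbr(frame_bitrates, run_length=5):
    """Return True once `run_length` consecutive frames share a bitrate."""
    run = 0
    previous = None
    for bitrate in frame_bitrates:
        run = run + 1 if bitrate == previous else 1
        previous = bitrate
        if run >= run_length:
            return True
    return False


def estimate_cbr_duration(audio_bytes, bitrate_kbps):
    """Estimate the duration in seconds of constant-bitrate audio data."""
    return audio_bytes * 8 / (bitrate_kbps * 1000)
```

The trade-off is speed against accuracy: a VBR file that happens to start with five equal frames would be misclassified, so the run length is a tunable guess.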
Can I edit the tags of a media file, and also edit ID3v2 tags so that they show up in Windows Media Player?
When a file has multiple sets of tags, say ID3 and FLAC, they are currently just merged, on a first-come-first-serve basis per tag (e.g. if ID3 comes first, has the artist tag set, then the artist tag from the FLAC header is ignored).
Imo, it would make more sense to just use data from one of them, deciding which one of them to use either on the file format (if FLAC, prefer FLAC metadata) or completeness (use the one that's got more complete information).
This is mainly inspired by one file I found which had an ID3 header with absolutely useless information first, followed by the FLAC header with actual information; tinytag currently shows mostly the useless information, since that came first, but also includes useful information (see the test case I added in #56)
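A sketch of the selection rule proposed above, assuming each tag set has already been parsed into a plain dict. The names completeness and pick_tag_source are hypothetical and do not describe how tinytag merges tags:

```python
def completeness(tags):
    """Count the fields that carry an actual value."""
    return sum(1 for value in tags.values() if value not in (None, ""))


def pick_tag_source(candidates, preferred=None):
    """Choose one tag set by name instead of merging several.

    `candidates` maps a source name (e.g. 'id3', 'flac') to a dict of
    fields. The preferred source wins if it has any data; otherwise the
    most complete set is used.
    """
    if preferred in candidates and completeness(candidates[preferred]) > 0:
        return preferred
    return max(candidates, key=lambda name: completeness(candidates[name]))
```

For a FLAC file, passing preferred="flac" would implement the format-based rule, while leaving it unset falls back to the completeness rule.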
Hi,
I would like to extract the codec of the music files (mp3, FLAC, ...). The main objective is to compare all tracks with each other to find duplicates and keep the tracks with the best quality.
How could I proceed to get this info?
Thanks,
lbrth
It'd be nice to get some additional metadata about the file:
info.audio_offset
info.bitrate
info.duration
info.size
info.audio_size
info.comment
info.genre
info.sample_rate
see: http://www.hardcoded.net/docs/hsaudiotag/usage.html#available-attributes
I'm looking for my application to do a conditional like:
if not file.split(".")[-1].lower() in TinyTag.getSupportedTypes():
essentially, to check programmatically if a file extension is supported before trying to parse metadata. it would help for future changes...
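Until something like this exists in the library, a caller-side check could look like the sketch below. The extension set is an assumption derived from the formats named in the project description, and is_supported here is a local hypothetical helper, not a tinytag function:

```python
import os

# Assumed list, based on the formats named in the project description.
SUPPORTED_EXTENSIONS = frozenset(
    {".mp3", ".ogg", ".opus", ".mp4", ".m4a", ".flac", ".wma", ".wav", ".aiff"}
)


def is_supported(filename):
    """Return True if the file extension is in the assumed supported set."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in SUPPORTED_EXTENSIONS
```

Using os.path.splitext instead of file.split(".")[-1] avoids treating a name without any dot as its own extension.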
It seems that the Python spec allows paths to be either strings or binary data. According to Stack Overflow, this is because on Linux filenames do not have an encoding but are just a bunch of bytes.
In my case, I have a path that is not UTF-8, and converting it to a UTF-8 string fails with encoding errors.
Using the os functions (e.g. os.stat) to access the file through its binary name works (like os.stat(b"some\x12nasty\x34file\xff")).
However TinyTag fails to access the file. This is caused by tinytag.py line 122:
filename = os.path.expanduser(str(filename)) # cast pathlib.Path to str
In light of how paths work on Linux, casting to str is obviously the wrong way.
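A sketch of an alternative to the str() cast, assuming Python 3.6 or newer for os.fspath. The helper name normalize_path is hypothetical. It keeps bytes paths as bytes while still converting path-like objects:

```python
import os


def normalize_path(filename):
    """Expand '~' without forcing the path into a str.

    os.fspath returns str or bytes unchanged and converts os.PathLike
    objects (such as pathlib.Path); os.path.expanduser accepts both types.
    """
    return os.path.expanduser(os.fspath(filename))
```

The result can be passed to open() or os.stat() as is, since both accept bytes paths.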
Hello. I'm opening this ticket, which seems to be related to #37.
I have an album whose tracks are reported as about twice as long as their true duration. I have sent an mp3 file by email.
The correct length of an mp3 file can only be estimated if the whole file is parsed. This is very expensive.
Implement an estimation algorithm to speed up length detection.
Hello, as I describe when using any .wav file I get the following:
{'filesize': 35799444, 'album': None, 'albumartist': None, 'artist': None, 'audio_offset': None, 'bitrate': 1378.125, 'channels': None, 'comment': None, 'disc': None, 'disc_total': None, 'duration': 202.94401360544217, 'genre': None, 'samplerate': 44100, 'title': 'frenemy rem 7', 'track': None, 'track_total': None, 'year': None, 'audio_offest': 112}
Is there any way to fix this?
I'm trying to get the tags of some Wave files ripped from a CD through Windows Media Player, but tinytag is giving None for all the tags. If I try to add tags to the files manually with Audacity, I get struct error: unpack requires a buffer of 10 bytes. Other Wave files converted from mp3 and tagged using Audacity work fine.
Hi, when I do info = TinyTag.get('my.mp3') I get this error. I can provide the mp3.
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-2-015becc94593> in <module>()
----> 1 info = TinyTag.get('my.mp3')
/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in get(cls, filename, tags, length)
56 if filename.lower().endswith('.mp3'):
57 with open(filename, 'rb') as af:
---> 58 return ID3(af, tags=tags, length=length)
59 elif filename.lower().endswith(('.oga', '.ogg')):
60 with open(filename, 'rb') as af:
/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in __init__(self, filehandler, tags, length)
117 def __init__(self, filehandler, tags=True, length=True):
118 TinyTag.__init__(self)
--> 119 self.load(filehandler, tags=tags, length=length)
120
121 def _determine_length(self, fh):
/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in load(self, filehandler, tags, length)
77 """
78 if tags:
---> 79 self._parse_tag(filehandler)
80 filehandler.seek(0)
81 if length:
/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_tag(self, fh)
159
160 def _parse_tag(self, fh):
--> 161 self._parse_id3v2(fh)
162 if not self.has_all_tags(): # try to get more info using id3v1
163 fh.seek(-128, 2) # id3v1 occuppies the last 128 bytes
/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_id3v2(self, fh)
183 while parsed_size < size:
184 is_id3_v22 = major == 2
--> 185 frame_size = self._parse_frame(fh, is_v22=is_id3_v22)
186 if frame_size == 0:
187 break
/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_frame(self, fh, is_v22)
219 self._parse_track(content)
220 else:
--> 221 self._set_field(fieldname, content, self._decode_string)
222 return frame_size
223 return 0
/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _set_field(self, fieldname, bytestring, transfunc)
88 return
89 if transfunc:
---> 90 setattr(self, fieldname, transfunc(bytestring))
91 else:
92 setattr(self, fieldname, bytestring)
/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _decode_string(self, b)
228 return self._unpad(codecs.decode(b[1:], 'ISO-8859-1'))
229 if b[0:3] == b'\x01\xff\xfe':
--> 230 return self._unpad(codecs.decode(b[3:], 'UTF-16'))
231 return self._unpad(codecs.decode(b, 'ISO-8859-1'))
232
/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_16.pyc in decode(input, errors)
14
15 def decode(input, errors='strict'):
---> 16 return codecs.utf_16_decode(input, errors, True)
17
18 class IncrementalEncoder(codecs.IncrementalEncoder):
UnicodeDecodeError: 'utf16' codec can't decode byte 0x00 in position 20: truncated data
when decoding flac duration everything is fine down to the millisecond, as long as the file is short. files longer than a few minutes already show a much too low duration for the track.
Found this in my cherrymusic/error.log:
ERROR [2014-10-24 12:16:17,214] : cherrypy.error.3074577548 : from line (201) at
/home/cherrymusic/cherrymusic/cherrymusic/cherrypy/_cplogging.py
--
[24/Oct/2014:12:16:17] HTTP Traceback (most recent call last):
[...]
File "/home/cherrymusic/cherrymusic/cherrymusic/cherrymusicserver/metainfo.py", line 55, in getSongInfo
tag = TinyTag.get(filepath)
File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 78, in get
tag.load(tags=tags, duration=duration)
File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 93, in load
self._parse_tag(self._filehandler)
File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 201, in _parse_tag
self._parse_id3v1(fh)
File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 231, in _parse_id3v1
self._set_field('title', fh.read(30), transfunc=asciidecode)
File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 104, in _set_field
setattr(self, fieldname, transfunc(bytestring))
File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 230, in <lambda>
asciidecode = lambda x: self._unpad(codecs.decode(x, 'ASCII'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 22: ordinal not in range(128)
I don't have access to the live output of the server session right now, or I'd look up which file's responsible.
Anyway, I suppose you can recognize this from the occasional "error getting song metadata" in cherrymusic.
I get this error with some unicode song titles. MP4Box was used to pack the m4a file, and the title is "Cold Water (feat. Justin Bieber & MØ)".
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 34: invalid start byte
Any help is appreciated.
I have a wav file that failed to load with TinyTag. I think it was originally created with Audacity a while ago; I've created a small unit test to simulate the issue.
When I try to load this file I get the error
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/nose/case.py", line 198, in runTest
self.test(*self.arg)
File "/Users/ian/projects/opensource/tinytag/tinytag/tests/test.py", line 114, in get_info
tag = TinyTag.get(filename)
File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 132, in get
tag.load(tags=tags, duration=duration, image=image)
File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 146, in load
self._parse_tag(self._filehandler)
File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 864, in _parse_tag
self._determine_duration(fh) # parse whole file to determine tags:(
File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 855, in _determine_duration
id3._parse_id3v2(fh)
File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 597, in _parse_id3v2
frame_size = self._parse_frame(fh, id3version=major)
File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 633, in _parse_frame
frame = struct.unpack(binformat, frame_header_data)
struct.error: unpack requires a buffer of 10 bytes
When I dump the frame_header_data values in the _parse_frame method I see ...
b'TRCK\x00\x00\x00\x03\x00\x00' b'TIT2\x00\x00\x00\x08\x00\x00' b'\x00'
Essentially I see this zero byte in the final header. Possibly an invalid ID3 tag header, however TinyTag does have the opportunity to handle such headers without failure.
I'll create a PR shortly demonstrating this with a unit test and a candidate fix.
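The kind of guard described above can be sketched like this. It is not the candidate fix from the PR. The function iter_frames is hypothetical, the header layout (4-byte id, 4-byte big-endian size, 2 flag bytes) assumes ID3v2.3, and the frame contents in the demo are made up to match the sizes in the dumped headers:

```python
import io
import struct

# Assumed ID3v2.3 frame header: id, size, flags = 10 bytes in total.
FRAME_HEADER = struct.Struct(">4sI2s")


def iter_frames(fh):
    """Yield (frame_id, content) pairs, stopping at a short or empty header."""
    while True:
        header = fh.read(FRAME_HEADER.size)
        if len(header) < FRAME_HEADER.size:
            return  # truncated header or padding: stop instead of raising
        frame_id, size, _flags = FRAME_HEADER.unpack(header)
        if size == 0:
            return
        yield frame_id, fh.read(size)


# Demo data: the two headers from the dump, made-up contents, one stray byte.
sample = (
    b"TRCK\x00\x00\x00\x03\x00\x00" + b"\x001\x00"
    + b"TIT2\x00\x00\x00\x08\x00\x00" + b"\x00frenemy"
    + b"\x00"
)
frame_ids = [frame_id for frame_id, _ in iter_frames(io.BytesIO(sample))]
```

Checking the length before calling unpack is what turns the trailing zero byte from a struct.error into a clean end of parsing.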
Would it be possible for tinytag to expose the ID3 "comment" field?
I need to get cover art and other data for some of my music files. Seems like most of the tag parsing libraries only give me the cover art of MP3, but that's fine..
Anyways, I couldn't get my image to display because some of the bytes data is truncated, according to Pillow, at least. I even loaded the bytes with the PyQt library to display on the GUI, but it can't load from the data.
Then I loaded mutagen, which is a little more non-intuitive to use than tinytag. But anyways, I got bytes data from both the libraries and compared them on diffnow
I tried it with 2 files, and apparently tinytag does cut out some data from the beginning. I could get the bytes data from mutagen to display perfectly.
I have uploaded the reports from both analysis here: https://drive.google.com/open?id=1EZ7XMPoQsrEaeQmZZ3Z-nwK2XM66EVBZ
Please take a look... Here are some screenshots from the analysis
Here's the difference of the first analysis. The first line is TinyTag and the second line is Mutagen
Here's the second one. Same thing
Is this a bug? I hope this is fixed in a future version.
Thank you so much for the amazing library.
Exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/cli/base_command.py", line 176, in main
status = self.run(options, args)
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/commands/install.py", line 393, in run
use_user_site=options.use_user_site,
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/req/__init__.py", line 57, in install_given_reqs
**kwargs
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/req/req_install.py", line 919, in install
use_user_site=use_user_site, pycompile=pycompile,
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/req/req_install.py", line 445, in move_wheel_files
warn_script_location=warn_script_location,
File "/usr/local/lib/python3.5/dist-packages/pip/_internal/wheel.py", line 391, in move_wheel_files
assert info_dir, "%s .dist-info directory not found" % req
AssertionError: tinytag .dist-info directory not found
This is the full issue log. Any packages that I may be missing?
When I try to get the tags of a valid OGG file with tinytag.TinyTag.get(path), I get this trace:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 120, in get
tag.load(tags=tags, duration=duration, image=image)
File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 135, in load
self._parse_tag(self._filehandler)
File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 703, in _parse_tag
self._parse_vorbis_comment(walker)
File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 737, in _parse_vorbis_comment
keyvalpair = codecs.decode(fh.read(length), 'UTF-8')
File "/usr/lib/python3.4/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 10: invalid start byte
Python version: 3.4.0 CPython
on linux
Hey @devsnd,
Can you add m4a / mp4 support to tiny?
The APIC field probably contains the cover image within an ID3 tag. See:
https://en.wikipedia.org/wiki/ID3#ID3v2_frame_specification_.28Version_2.3.29
It should be possible to read this information here: https://github.com/devsnd/tinytag/blob/master/tinytag/tinytag.py#L254
The best way to tackle this without a major performance hit is probably to introduce 2 private fields that include the offset and length of the image file within the MP3, a public field which tells the user whether such an image could be read and a method to get a bytestream of the image.
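The design described above can be sketched as follows. The class and member names are hypothetical and only illustrate the idea of remembering where the image lives and reading it on demand:

```python
import io


class LazyImage:
    """Remember where an embedded image is located and read it on demand."""

    def __init__(self):
        self._image_offset = None  # private: byte offset of the image data
        self._image_length = None  # private: length of the image data

    @property
    def has_image(self):
        """Public flag telling the user whether an image was found."""
        return self._image_offset is not None

    def remember_image(self, offset, length):
        """Called by the parser when it encounters the image frame."""
        self._image_offset = offset
        self._image_length = length

    def read_image(self, fh):
        """Return the image bytes from an open binary file, or None."""
        if not self.has_image:
            return None
        fh.seek(self._image_offset)
        return fh.read(self._image_length)
```

Because parsing only records two integers, files whose cover art is never requested cost no extra reads or memory.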
Hi,
Any VBR mp3 file I point tinytag at gives me a bitrate value which is about 1000x what audio software reports. I expect that this value is bps rather than kbps.
For example, for the file at (https antisol dot org slash brass dot mp3), tinytag gives me a bitrate value of 195206.785168, whereas various audio software reports the file as being 195kbps, including:
I've seen this with all other VBR mp3 files I've tried, they all give a value ~1000x the value reported by audacious.
As a workaround I was able to do something like:
if bitrate > 320: bitrate = bitrate / 1099
to get a pretty good bitrate value. But it would be nice to see a more accurate value come from TinyTag.
Thanks!
I encountered this issue while parsing over some of my MP3 (I haven't tested this on other formats). The issue presents in all text fields.
{
'audio_offset': 2058, 'track': '\x0311', 'year': '2000', 'filesize': 5382993,
'album': '\x03Relationship of Command', 'title': '\x03Nonâ\x80\x90Zero Possibility',
'samplerate': 44100, 'duration': 336.43710016824866, 'track_total': '11',
'bitrate': 128.0, 'artist': '\x03At the Driveâ\x80\x90In'
}
I receive \x03 END OF TEXT at the beginning of each string data point (always happens, at least on the handful of albums I've tested on) as well as a garbled decode on -. I believe the issue is related to ID3 making the decision on how to decode text: either ISO-8859-1 or UTF-16 depending on how the file reads.
I believe the best course of action would be an optional keyword argument on TinyTag.get that allows the user to specify the preferred decoding and then either falling back onto a standard (ISO-8859-1, which is currently used, perhaps) or passing the buck back to the user for error handling.
I haven't had the chance to test this change myself, yet. I'm currently dealing with the issue with this bandage hack:
def fixer(value, ignore=(AttributeError, UnicodeEncodeError), handle=None):
    '''Actual fixer function for fix_track
    ignore is a tuple of exceptions we should just discard.
    handle is an optional exception handler.
    '''
    try:
        value = value.encode('latin-1').decode('utf-8')
        # matching \x03 is frustrating
        # again, just a crutch to lean on
        if not value[0].isprintable():
            value = value[1:]
    except ignore as e:
        if handle:
            handle(e)
        else:
            pass
    finally:
        # we always end up here
        return value

def fix_track(
    track,
    fixer=fixer,
    fields=('artist', 'album', 'title', 'track', 'year', 'track_total'),
    int_convert=('track', 'year', 'track_total')
):
    '''Fix encoding issue encountered on some tracks.
    Accepts a track object and attempts to massage the data in our favor.
    * fixer is the function we want to run on this track to correct data
    * fields in the specific fields we'd like to attempt to correct
    * int_convert is a subset of fields that is data that should be integers
    '''
    for f in fields:
        value = getattr(track, f)
        if not value:
            # value is likely None
            # we'll pass on this value
            # to avoid blowing up
            continue
        else:
            value = fixer(value)
        if f in int_convert:
            try:
                value = int(value)
            except ValueError:
                pass
        setattr(track, f, value)
    # TODO: need to make this mutable
    # for now, it's hardcoded as TinyTag
    # stores duration as a float
    track.duration = int(track.duration)
    # returning the track allows us
    # to be flexible in application
    # of this function
    return track
Expected: At the Drive-In - Relationship of Command - 11 - Non-Zero Possibility
Actual: At the Drive�In - Relationship of Command - 11 - Non�Zero Possibility
After Fixing: At the Drive‐In - Relationship of Command - 11 - Non‐Zero Possibility
Thanks for the great library. I'm using it to build a script to automatically file audio files in a correct location and it's been really helpful.
However, I have several multiple-disc albums where track numbers are duplicated on each disc. The actual disc number is stored in the disc number part of the track. However, tinytag doesn't seem to read this information. Any chance for adding this in a future release?
I could give it a shot myself, but I have no idea about FLAC / OGG / MP3 ID3 tag specifications and wouldn't know where to actually find this information. What sources did you use in order to implement tinytag?