tinytag's Introduction

tinytag

tinytag is a Python library for reading audio file metadata.

Install

python3 -m pip install tinytag

Features

  • Read tags, images and properties of audio files
  • Supported formats:
    • MP3 / MP2 / MP1 (ID3 v1, v1.1, v2.2, v2.3+)
    • M4A (AAC / ALAC)
    • WAVE / WAV
    • OGG (FLAC / Opus / Speex / Vorbis)
    • FLAC
    • WMA
    • AIFF / AIFF-C
  • Same API for all formats
  • Pure Python, no dependencies
  • Supports Python 3.7 or higher
  • High test coverage
  • A few hundred lines of code (just include it in your project!)

Usage

tinytag only provides the minimum needed for reading metadata, and presents it in a simple format. It can determine track number, total tracks, title, artist, album, year, duration and more.

from tinytag import TinyTag
tag = TinyTag.get('/some/music.mp3')
print(f'This track is by {tag.artist}.')
print(f'It is {tag.duration:.2f} seconds long.')

Alternatively you can use tinytag directly on the command line:

$ python -m tinytag --format json /some/music.mp3
> {"filename": "/some/music.mp3", "filesize": 30212227, "album": "Album", "albumartist": "Artist", "artist": "Artist", "audio_offset": null, "bitrate": 256, "channels": 2, "comment": null, "composer": null, "disc": "1", "disc_total": null, "duration": 10, "genre": null, "samplerate": 44100, "title": "Title", "track": "5", "track_total": null, "year": "2012"}

Check python -m tinytag --help for all CLI options, for example other output formats.

Support for changing/writing metadata will not be added; use another library for this.

Supported Files

To receive a tuple of file extensions tinytag supports, use the SUPPORTED_FILE_EXTENSIONS constant:

TinyTag.SUPPORTED_FILE_EXTENSIONS

Alternatively, check if a file is supported by providing its path:

is_supported = TinyTag.is_supported('/some/music.mp3')
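When scanning a directory, the extension tuple can be used to pre-filter candidates before any file is opened. The following is a minimal sketch and not part of tinytag: `filter_supported` is a hypothetical helper, and it assumes the tuple holds lowercase extensions with a leading dot. The sample tuple is made up for illustration; in real code you would pass `TinyTag.SUPPORTED_FILE_EXTENSIONS` instead.

```python
from pathlib import Path


def filter_supported(paths, supported_extensions):
    """Return the paths whose suffix appears in supported_extensions."""
    supported = {ext.lower() for ext in supported_extensions}
    return [p for p in map(Path, paths) if p.suffix.lower() in supported]


# Made-up sample; in real code pass TinyTag.SUPPORTED_FILE_EXTENSIONS instead.
sample_extensions = ('.mp3', '.flac', '.ogg')
candidates = ['a/song.MP3', 'b/cover.jpg', 'c/track.flac']
print(filter_supported(candidates, sample_extensions))
```

Matching on the extension alone is only a cheap first pass; a file with a supported extension can still fail to parse.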

Attributes

List of common attributes tinytag provides:

tag.album         # album as string
tag.albumartist   # album artist as string
tag.artist        # artist name as string
tag.bitdepth      # bit depth for lossless audio
tag.bitrate       # bitrate in kBits/s
tag.comment       # file comment as string
tag.disc          # disc number
tag.disc_total    # the total number of discs
tag.duration      # duration of the song in seconds
tag.filesize      # file size in bytes
tag.genre         # genre as string
tag.samplerate    # samples per second
tag.title         # title of the song
tag.track         # track number
tag.track_total   # total number of tracks
tag.year          # year or date as string

For non-common fields and fields specific to certain file formats, use extra:

tag.extra         # a dict of additional data

The following standard extra field names are used when file formats provide relevant data:

other_artists     # additional artists as list
other_genres      # additional genres as list

bpm
composer
conductor
copyright
director
encoded_by
encoder_settings
initial_key
isrc
language
lyricist
lyrics
media
publisher
set_subtitle
url

Any other extra field names are not guaranteed to be consistent across audio formats.

Additionally, you can also get images from ID3 tags. To receive any available image, prioritizing the front cover:

from tinytag import TinyTag, TagImage

tag: TinyTag = TinyTag.get('/some/music.mp3', image=True)
image: TagImage | None = tag.images.any

if image is not None:
    data: bytes = image.data
    name: str = image.name
    description: str = image.description

If you need to receive an image of a specific kind, including its description, use images:

tag.images        # available embedded images

The following common images are available:

front_cover
back_cover
leaflet
media
other

The following less common images are provided in an extra dict when present:

icon
other_icon
lead_artist
artist
conductor
band
composer
lyricist
recording_location
during_recording
during_performance
video
bright_colored_fish
illustration
band_logo
publisher_logo
unknown

The following image attributes are available:

data           # image data as bytes
name           # image name/kind as string
mime_type      # image MIME type as string
description    # image description as string

To receive a common image, e.g. front_cover:

from tinytag import TinyTag, TagImage, TagImages

tag: TinyTag = TinyTag.get('/some/music.ogg')
images: TagImages = tag.images
front_cover_images: list[TagImage] = images.front_cover

if front_cover_images:
    image: TagImage = front_cover_images[0]  # Use first image
    data: bytes = image.data
    description: str = image.description

To receive an extra image, e.g. bright_colored_fish:

fish_images = tag.images.extra.get('bright_colored_fish')

if fish_images:
    image = fish_images[0]  # Use first image
    data = image.data
    description = image.description
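Once an image object is available, writing it to disk mostly comes down to choosing a file suffix. The sketch below assumes only the `data` and `mime_type` attributes listed above. `save_image` and the small MIME mapping are hypothetical helpers written for this example, and the stand-in object replaces a real image so the snippet does not need an audio file.

```python
from pathlib import Path
from types import SimpleNamespace

# Small, deliberately incomplete mapping chosen for this sketch.
EXTENSION_BY_MIME_TYPE = {
    'image/jpeg': '.jpg',
    'image/png': '.png',
    'image/gif': '.gif',
}


def save_image(image, target_stem):
    """Write image.data to target_stem plus a suffix derived from image.mime_type."""
    suffix = EXTENSION_BY_MIME_TYPE.get(image.mime_type, '.bin')
    target = Path(str(target_stem) + suffix)
    target.write_bytes(image.data)
    return target


# Stand-in with the same attribute names as a real image object.
fake_image = SimpleNamespace(data=b'\x89PNG\r\n\x1a\n', mime_type='image/png')
```

With a real file, something like `save_image(tag.images.any, '/some/cover')` would be the intended call, after checking that the image is not None.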

Encoding

To open files using a specific encoding, use the encoding parameter. Note that this parameter is only used for formats where the encoding isn't explicitly specified.

TinyTag.get('a_file_with_gbk_encoding.mp3', encoding='gbk')
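To see why the choice of encoding matters, consider text bytes that were written in GBK but carry no encoding marker. The snippet below uses only Python's built-in codecs and does not call tinytag; the sample string is an arbitrary illustration.

```python
raw = '周杰伦'.encode('gbk')  # bytes as a GBK-tagged file might store them

wrong = raw.decode('latin-1')  # never fails, but yields unreadable characters
right = raw.decode('gbk')      # the intended text

print(repr(wrong))
print(right)
```

Decoding as latin-1 cannot raise an error because every byte maps to some character, which is why a wrong default tends to go unnoticed until the text is displayed.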

File-like Objects

To use a file-like object (e.g. BytesIO) instead of a file path, pass a file_obj keyword argument:

TinyTag.get(file_obj=your_file_obj)
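As a rough illustration, the sketch below builds a tiny WAV file in memory with the standard library and hands it to tinytag as a file-like object. `make_silent_wav` and `read_duration` are hypothetical helper names; only `TinyTag.get(file_obj=...)` and the `duration` attribute come from this README. tinytag is imported inside `read_duration` so that the WAV helper can be tried on its own.

```python
import io
import wave


def make_silent_wav(seconds=1, samplerate=8000):
    """Build a small mono 16-bit WAV file in memory using only the standard library."""
    buffer = io.BytesIO()
    with wave.open(buffer, 'wb') as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(samplerate)
        writer.writeframes(b'\x00\x00' * samplerate * seconds)
    buffer.seek(0)  # rewind so the next reader starts at the RIFF header
    return buffer


def read_duration(file_obj):
    """Parse a file-like object with tinytag and return its duration in seconds."""
    from tinytag import TinyTag  # imported here so make_silent_wav works without tinytag
    return TinyTag.get(file_obj=file_obj).duration
```

Calling `read_duration(make_silent_wav(seconds=2))` should report a duration of about two seconds, assuming tinytag accepts the generated file.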

Exceptions

TinyTagException        # Base class for exceptions
ParseError              # Parsing an audio file failed
UnsupportedFormatError  # File format is not supported

Changelog

2.0.0 (Unreleased)

  • BREAKING: Store 'disc', 'disc_total', 'track' and 'track_total' values as int instead of str
  • BREAKING: TinyTagException no longer inherits LookupError
  • BREAKING: TinyTag subclasses are now private
  • BREAKING: Remove function to use custom audio file samples in tests
  • BREAKING: Remove support for Python 2
  • Mark 'ignore_errors' parameter for TinyTag.get() as obsolete
  • Mark 'audio_offset' attribute as obsolete
  • Deprecate 'composer' attribute in favor of 'extra.composer'
  • Deprecate 'get_image()' method in favor of 'images.any' property
  • Provide access to custom metadata fields through the 'extra' dict
  • Provide access to all available images
  • Add more standard 'extra' fields
  • FLAC: Apply ID3 tags after Vorbis
  • OGG/WMA: set missing 'channels' field
  • WMA: set missing 'extra.copyright' field
  • WMA: raise exception if file is invalid
  • Add type hints to codebase
  • Various optimizations

1.10.1 (2023-10-26)

  • Update 'extra' fields with data from other tags #188
  • ID3: Add missing 'extra.copyright' field

1.10.0 (2023-10-18)

  • Add support for OGG FLAC format #182
  • Add support for OGG Speex format #181
  • Wave: support image loading
  • Add support for file-like objects (BytesIO) #178
  • Add list of supported file extensions #177
  • Fix deprecations related to setuptools #176
  • Fix pathlib support in TinyTag.is_supported()
  • Only remove zero bytes at the end of strings
  • Stricter conditions in while loops
  • OGG: Add stricter magic byte matching for OGG files
  • Compatibility with Python 3.4 and 3.5 is no longer tested

1.9.0 (2023-04-23)

  • Add bitdepth attribute for lossless audio #157
  • Add recognition of Audible formats #163 (thanks to snowskeleton)
  • Add .m4v to list of supported file extensions #142
  • Aiff: Implement replacement for Python's aifc module #164
  • ID3: Only check for language in COMM and USLT frames #147
  • ID3: Read the correct number of bytes from Xing header #154
  • ID3: Add support for ID3v2.4 TDRC frame #156 (thanks to Uninen)
  • M4A: Add description fields #168 (thanks to snowskeleton)
  • RIFF: Handle tags containing extra zero-byte #141
  • Vorbis: Parse OGG cover art #144 (thanks to Pseurae)
  • Vorbis: Support standard disctotal/tracktotal comments #171
  • Wave: Add proper support for padded IFF chunks

1.8.1 (2022-03-12) [still mathiascode-edition]

  • MP3 ID3: Set correct file position if tag reading is disabled #119 (thanks to mathiascode)
  • MP3: Fix incorrect calculation of duration for VBR encoded MP3s #128 (thanks to mathiascode)

1.8.0 (2022-03-05) [mathiascode-edition]

  • Add support for ALAC audio files #130 (thanks to mathiascode)
  • AIFF: Fixed bitrate calculation for certain files #129 (thanks to mathiascode)
  • MP3: Do not round MP3 bitrates #131 (thanks to mathiascode)
  • MP3 ID3: Support any language in COMM and USLT frames #135 (thanks to mathiascode)
  • Performance: Don't use regex when parsing genre #136 (thanks to mathiascode)
  • Disable tag parsing for all formats when requested #137 (thanks to mathiascode)
  • M4A: Fix invalid bitrates in certain files #132 (thanks to mathiascode)
  • WAV: Fix metadata parsing for certain files #133 (thanks to mathiascode)

1.7.0 (2021-12-14)

  • fixed rare occasion of ID3v2 tags missing their first character, #106
  • allow overriding the default encoding of ID3 tags (e.g. TinyTag.get(..., encoding='gbk'))
  • fixed calculation of bitrate for very short mp3 files, #99
  • utf-8 support for AIFF files, #123
  • fixed image parsing for id3v2 with images containing utf-16LE descriptions, #117
  • fixed ID3v1 tags overwriting ID3v2 tags, #121
  • Set correct file position if tag reading is disabled for ID3 (thanks to mathiascode)

1.6.0 (2021-08-28) [aw-edition]

  • fixed handling of non-latin encoding types for images (thanks to aw-was-here)
  • added support for ISRC data, available in extra['isrc'] field (thanks to aw-was-here)
  • added support for AIFF/AIFF-C (thanks to aw-was-here)
  • fixed import deprecation warnings (thanks to idotobi)
  • fixed exception for TinyTag misuse being different in different python versions (thanks to idotobi)
  • added support for ID3 initial key tonality hint, available in extra['initial_key']
  • added support for ID3 unsynchronized lyrics, available in extra['lyrics']
  • added extra field, which may contain additional metadata not available in all file formats

1.5.0 (2020-11-05)

  • fixed data type to always return str for disc, disc_total, track, track_total #97 (thanks to kostalski)
  • fixed package install being reported as UNKNOWN for some python/pip variations #90 (thanks to russpoutine)
  • Added automatic detection for certain MP4 file headers

1.4.0 (2020-04-23)

  • detecting file types based on their magic header bytes, #85
  • fixed opus duration being wrong for files with lower sample rate #81
  • implemented support for binary paths #72
  • always cast mp3 bitrates to int, so that CBR and VBR output behaves the same
  • made __str__ deterministic and use json as output format

1.3.0 (2020-03-09)

  • added option to ignore encoding errors ignore_errors #73
  • Improved text decoding for many malformed files

1.2.2 (2019-04-13)

  • Improved stability when reading corrupted mp3 files

1.2.1 (2019-04-13)

  • fixed wav files not correctly reporting the number of channels #61

1.2.0 (2019-04-13)

  • using setup.cfg instead of setup.py (thanks to scivision)
  • added support for calling TinyTag.get with pathlib.Path (thanks to scivision)
  • added appveyor windows test CI (thanks to scivision)
  • using pytest instead of nosetest (thanks to scivision)

1.1.0 (2019-04-13)

  • added new field "composer" (Thanks to Phil Borman)

1.0.1 (2019-04-13)

  • fixed ID3 loading for files with corrupt header (thanks to Ian Homer)
  • fixed parsing of duration in wav file (thanks to Ian Homer)

1.0.0 (2018-12-12)

  • added comment field
  • added wav-riff format support
  • use MP4 parser for m4b files
  • added simple cli tool
  • fix parsing of FLAC files with ID3 header (thanks to minus7)
  • added method TinyTag.is_supported(filename)

0.19.0 (2018-02-11)

  • fixed corrupted images for some mp3s (#45)

0.18.0 (2017-04-29)

  • fixed wrong bitrate and crash when parsing xing header

0.17.0 (2016-10-02)

  • supporting ID3v2.2 images

0.16.0 (2016-08-06)

  • MP4 cover image support

0.15.2 (2016-08-06)

  • fixed crash for malformed MP4 files (#34)

0.15.0 (2016-08-06)

  • fixed decoding of UTF-16LE ID3v2 Tags, improved overall stability

0.14.0 (2016-06-05)

  • MP4/M4A and Opus support

tinytag's People

Contributors

aw-was-here, bobotig, candh, devsnd, egbertw, evandrolg, glogiotatidis, honghe, ianhomer, idotobi, kianmeng, kostalski, mathiascode, minus7, pseurae, rizumu, russpoutine, sampas, snowskeleton, tilboerner, timgates42, tomtier, uninen

tinytag's Issues

wrong duration for long flac files

When decoding the duration of a FLAC file, everything is accurate down to the millisecond as long as the file is short. Files longer than a few minutes already show a duration that is much too low for the track.

Fails to load wav with ID3 with an invalid zero byte header

I have a wav file that failed to load with TinyTag. I think it was originally created with Audacity a while ago. I've created a small unit test to simulate the issue.

When I try to load this file I get the error

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tests/test.py", line 114, in get_info
    tag = TinyTag.get(filename)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 132, in get
    tag.load(tags=tags, duration=duration, image=image)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 146, in load
    self._parse_tag(self._filehandler)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 864, in _parse_tag
    self._determine_duration(fh)  # parse whole file to determine tags:(
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 855, in _determine_duration
    id3._parse_id3v2(fh)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 597, in _parse_id3v2
    frame_size = self._parse_frame(fh, id3version=major)
  File "/Users/ian/projects/opensource/tinytag/tinytag/tinytag.py", line 633, in _parse_frame
    frame = struct.unpack(binformat, frame_header_data)
struct.error: unpack requires a buffer of 10 bytes

When I dump the frame_header_data values in the _parse_frame method I see ...

b'TRCK\x00\x00\x00\x03\x00\x00'
b'TIT2\x00\x00\x00\x08\x00\x00'
b'\x00'

Essentially, I see this zero byte in the final header. It is possibly an invalid ID3 tag header, but TinyTag has the opportunity to handle such headers without failing.

I'll create a PR shortly demonstrating this with a unit test and a candidate fix.
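One way to tolerate such a trailing byte is to check the header length before unpacking. The sketch below is not tinytag's code and not the fix from the PR mentioned above. `parse_frame_header` is a hypothetical function, and it assumes the ID3v2.3 layout of a 4-byte frame id, a plain 32-bit big-endian size and 2 flag bytes.

```python
import struct

FRAME_HEADER = struct.Struct('>4sIH')  # frame id, size, flags: 10 bytes in ID3v2.3


def parse_frame_header(header_bytes):
    """Return (frame_id, size, flags), or None if the header is truncated or padding."""
    if len(header_bytes) < FRAME_HEADER.size:
        return None  # too short: stop instead of letting struct.unpack raise
    frame_id, size, flags = FRAME_HEADER.unpack(header_bytes[:FRAME_HEADER.size])
    if frame_id == b'\x00\x00\x00\x00':
        return None  # an all-zero id is padding, not a frame
    return frame_id, size, flags


# The three header dumps quoted in the report.
dumped = [
    b'TRCK\x00\x00\x00\x03\x00\x00',
    b'TIT2\x00\x00\x00\x08\x00\x00',
    b'\x00',
]
for header in dumped:
    print(parse_frame_header(header))
```

A caller would treat `None` as the end of the frame list rather than as an error.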

Use correct character encoding when reading ID3 tags

ID3 tags are not always decoded correctly, see CherryMusic issue #536.

Since commit 1c53058, latin1 is used for decoding instead of ascii. If possible, tinytag should make an effort to determine the proper encoding to use before falling back to a default. Bonus points for making the fallback encoding a parameter. Besides latin1, at least UTF-8 should be supported.
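For ID3v2 text frames, the first byte of the frame body announces the text encoding, so a decoder can consult it before falling back. The sketch below illustrates that idea with a configurable fallback. `decode_text_frame` is a hypothetical function, not tinytag's implementation, and it assumes the first byte is always an encoding marker.

```python
# Encoding byte values defined for ID3v2 text frames (2 and 3 were added in ID3v2.4).
ID3_TEXT_ENCODINGS = {
    0: 'latin-1',
    1: 'utf-16',     # with byte order mark
    2: 'utf-16-be',  # without byte order mark
    3: 'utf-8',
}


def decode_text_frame(payload, fallback='latin-1'):
    """Decode a text frame body whose first byte announces the encoding."""
    if not payload:
        return ''
    encoding = ID3_TEXT_ENCODINGS.get(payload[0], fallback)
    text = payload[1:].decode(encoding, errors='replace')
    return text.rstrip('\x00')  # drop trailing terminators only


print(decode_text_frame(b'\x03' + 'MØ'.encode('utf-8')))
print(decode_text_frame(b'\x00Caf\xe9'))
```

ID3v1 tags have no such marker, so for them a caller-supplied default is the only option.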

TinyTag gives None for tags generated by Windows Media Player rip

I'm trying to get the tags of some Wave files ripped from a CD through Windows Media Player, but tinytag is giving None for all the tags. If I try to add tags to the files manually with Audacity, I get struct error: unpack requires a buffer of 10 bytes. Other Wave files converted from mp3 and tagged using Audacity work fine.

self.samplerate is None when reading Ogg with Opus codec

Python output:

>>> from tinytag import TinyTag
>>> TinyTag.get("a.ogg")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/site-packages/tinytag/tinytag.py", line 94, in get
    tag.load(tags=tags, duration=duration, image=image)
  File "/usr/lib/python3.5/site-packages/tinytag/tinytag.py", line 115, in load
    self._determine_duration(self._filehandler)
  File "/usr/lib/python3.5/site-packages/tinytag/tinytag.py", line 480, in _determine_duration
    self.duration = self._max_samplenum / float(self.samplerate)
TypeError: float() argument must be a string or a number, not 'NoneType'

FFprobe output:

Input #0, ogg, from 'a.ogg':
  Duration: 00:05:19.43, start: 0.000000, bitrate: 177 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata: [...]

The file in question: music.zip
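For context, RFC 7845 defines Ogg Opus granule positions in units of 48 kHz samples regardless of the original input rate, so a duration can be derived without a sample rate stored on the tag object. The sketch below shows that arithmetic only. `opus_duration` is a hypothetical function, and nothing here describes how tinytag resolved this report.

```python
OPUS_GRANULE_RATE = 48000  # Ogg Opus granule positions always count 48 kHz samples


def opus_duration(last_granule_position, pre_skip=0):
    """Duration in seconds from the final granule position and the header's pre-skip."""
    samples = max(last_granule_position - pre_skip, 0)
    return samples / OPUS_GRANULE_RATE


print(opus_duration(48000 * 319 + 312, pre_skip=312))
```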

Add an option to ignore encoding errors

Currently TinyTag fails if it can't decode a string. (line 664, _decode_string)

However when parsing existing data, it happens sometimes that someone wrote garbage into the tags data field. It should be possible to ignore the garbage.
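Python's codecs already support this through the `errors` argument of `bytes.decode`. The sketch below shows the idea with a hypothetical `decode_text` helper; the `ignore_errors` parameter name mirrors the request and is not taken from tinytag's code.

```python
def decode_text(raw, encoding='utf-8', ignore_errors=False):
    """Decode raw tag bytes; with ignore_errors=True undecodable bytes are dropped."""
    return raw.decode(encoding, errors='ignore' if ignore_errors else 'strict')


garbled = b'Artist \x96 Title'  # 0x96 is not a valid UTF-8 start byte

print(repr(decode_text(garbled, ignore_errors=True)))
```

Passing `errors='replace'` instead would keep a visible U+FFFD marker where the bad byte was, which can help when tracking down damaged files.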

0.6.1 not on pypi

Seems the most recent version never made it to pypi.

pip install tinytag==0.6.1
Downloading/unpacking tinytag==0.6.1
  Could not find a version that satisfies the requirement tinytag==0.6.1 (from versions: 0.6.0)

Note, it can be worked around with pip install https://github.com/devsnd/tinytag/archive/0.6.1.tar.gz, but best to check your deploy steps to figure out why it isn't finding it from pypi.

Can't get cover art from mp3 files with ID3 v2.2

Hi,

I can't get the cover art of a few audio files.
The get_image() method does not detect it. I can provide the file by e-mail.
Otherwise I want to thank you for this library, which is very efficient and convenient. (I ran some benchmarks against other libraries and this one is very fast.)

Email sent to tomwallroth ...

Precedence with multiple tag headers

When a file has multiple sets of tags, say ID3 and FLAC, they are currently just merged, on a first-come-first-serve basis per tag (e.g. if ID3 comes first, has the artist tag set, then the artist tag from the FLAC header is ignored).
Imo, it would make more sense to just use data from one of them, deciding which one of them to use either on the file format (if FLAC, prefer FLAC metadata) or completeness (use the one that's got more complete information).

This is mainly inspired by one file I found which had an ID3 header with absolutely useless information first, followed by the FLAC header with actual information; tinytag currently shows mostly the useless information, since that came first, but also includes useful information (see the test case I added in #56)
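The two strategies discussed here can be compared on plain dictionaries. The sketch below is only an illustration of the trade-off: both function names and the sample tag values are made up, and neither function is tinytag's actual merging logic.

```python
def merge_first_wins(*tag_sets):
    """Per-field merge: the first non-empty value seen for a field is kept."""
    merged = {}
    for tags in tag_sets:
        for field, value in tags.items():
            if value and not merged.get(field):
                merged[field] = value
    return merged


def pick_most_complete(*tag_sets):
    """Whole-set choice: use only the tag set with the most non-empty fields."""
    return max(tag_sets, key=lambda tags: sum(1 for v in tags.values() if v))


id3 = {'artist': 'Unknown', 'title': None}
flac = {'artist': 'Real Artist', 'title': 'Real Title', 'album': 'Real Album'}

print(merge_first_wins(id3, flac))
print(pick_most_complete(id3, flac))
```

In this sample the per-field merge keeps the useless artist value from the first header, while the completeness rule returns the second header unchanged.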

Tinytag throws UnicodeDecodeError:

When I try to get the tags of a valid OGG file with tinytag.TinyTag.get(path), I get this trace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 120, in get
    tag.load(tags=tags, duration=duration, image=image)
  File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 135, in load
    self._parse_tag(self._filehandler)
  File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 703, in _parse_tag
    self._parse_vorbis_comment(walker)
  File "/usr/local/lib/python3.4/dist-packages/tinytag/tinytag.py", line 737, in _parse_vorbis_comment
    keyvalpair = codecs.decode(fh.read(length), 'UTF-8')
  File "/usr/lib/python3.4/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 10: invalid start byte

Python version: 3.4.0 CPython on linux

Request: provide a method that returns all supported formats

I'm looking for my application to do a conditional like:

if not file.split(".")[-1].lower() in TinyTag.getSupportedTypes():

essentially, to check programmatically if a file extension is supported before trying to parse metadata. it would help for future changes...

Example for saving album art to a local file

Is there an example I could use for saving album art to a local file? I'm doing this:

tag = TinyTag.get(filePath, image=True)
image_data = tag.get_image()

with open(filePath + '.jpg', "wb") as f:
    f.write(image_data)

But then I get a jpg file that can't be opened.
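Embedded cover art is not always JPEG, and the report does not say which format the file contains, so a first diagnostic step could be to inspect the leading bytes instead of assuming a `.jpg` suffix. The sketch below is a hypothetical helper covering only three common formats.

```python
MAGIC_NUMBERS = (
    (b'\xff\xd8\xff', '.jpg'),
    (b'\x89PNG\r\n\x1a\n', '.png'),
    (b'GIF87a', '.gif'),
    (b'GIF89a', '.gif'),
)


def guess_image_suffix(data):
    """Return a file suffix based on the leading bytes, or None if unrecognized."""
    if not data:
        return None  # no image data at all
    for magic, suffix in MAGIC_NUMBERS:
        if data.startswith(magic):
            return suffix
    return None


print(guess_image_suffix(b'\x89PNG\r\n\x1a\n' + b'rest of file'))
```

If the function returns None for data that should be an image, the bytes themselves may be damaged or start at the wrong offset, which is a separate problem from the file name.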

Issue with album art data

I need to get cover art and other data for some of my music files. Seems like most of the tag parsing libraries only give me the cover art of MP3, but that's fine..

Anyways, I couldn't get my image to display because some of the bytes data is truncated, according to Pillow, at least. I even loaded the bytes with the PyQt library to display on the GUI, but it can't load from the data.

Then I loaded mutagen, which is a little more non-intuitive to use than tinytag. But anyways, I got bytes data from both the libraries and compared them on diffnow

I tried it with 2 files, and apparently tinytag does cut out some data from the beginning. I could get the bytes data from mutagen to display perfectly.

I have uploaded the reports from both analysis here: https://drive.google.com/open?id=1EZ7XMPoQsrEaeQmZZ3Z-nwK2XM66EVBZ
Please take a look... Here are some screenshots from the analysis

Here's the difference of the first analysis. The first line is TinyTag and the second line is Mutagen
screen shot 2018-02-10 at 10 46 06 pm

Here's the second one. Same thing

screen shot 2018-02-10 at 10 46 22 pm

Is this a bug? I hope this is fixed in a future version.
Thank you so much for the amazing library.

AssertionError: tinytag .dist-info directory not found

Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/cli/base_command.py", line 176, in main
    status = self.run(options, args)
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/commands/install.py", line 393, in run
    use_user_site=options.use_user_site,
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/req/__init__.py", line 57, in install_given_reqs
    **kwargs
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/req/req_install.py", line 919, in install
    use_user_site=use_user_site, pycompile=pycompile,
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/req/req_install.py", line 445, in move_wheel_files
    warn_script_location=warn_script_location,
  File "/usr/local/lib/python3.5/dist-packages/pip/_internal/wheel.py", line 391, in move_wheel_files
    assert info_dir, "%s .dist-info directory not found" % req
AssertionError: tinytag .dist-info directory not found

This is the full issue log. Any packages that I may be missing?

found (and fixed) a bug in the _set_field function

This is my first time using tinytag
I have a collection of 75000 mp3s
Turns out that one of the files had 1/4/4/4/4/4 stored in the "disc" tag
which made _set_field crash

If you change the lines from
current, total = str(value).split('/') #which of course only works if the value is xx/xx

to this:
splits = str(value).split('/')
current, total = splits[0], splits[1]

it will work every time
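Note that indexing `splits[1]` still raises an IndexError when the value contains no slash at all, for example a plain `5`. A variant that handles both cases could look like the sketch below; `split_position` is a hypothetical helper and not the code that ended up in tinytag.

```python
def split_position(value):
    """Split 'current/total' into a pair; missing parts become None, extras are ignored."""
    parts = [part.strip() for part in str(value).split('/')]
    current = parts[0] or None
    total = parts[1] if len(parts) > 1 and parts[1] else None
    return current, total


for sample in ('1/4/4/4/4/4', '3/12', '5'):
    print(sample, split_position(sample))
```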

Read disc number / total number of discs?

Thanks for the great library. I'm using it to build a script to automatically file audio files in a correct location and it's been really helpful.

However, I have several multiple-disc albums where track numbers are duplicated on each disc. The actual disc number is stored in the disc number part of the track. However, tinytag doesn't seem to read this information. Any chance for adding this in a future release?

I could give it a shot myself, but I have no idea about FLAC / OGG / MP3 ID3 tag specifications and wouldn't know where to actually find this information. What sources did you use in order to implement tinytag?

implement VBR mp3 length estimation

The correct length of an mp3 file can only be determined if the whole file is parsed. This is very expensive.

Implement an estimation algorithm to speed up length detection.

Speed up TinyTag with MP3 processing

As you propose, we continue our discussion here.
@devsnd "Maybe we could add the following ability to TinyTag: If there are e.g. 5 consecutive frames with the same bitrate, we assume it's CBR and stop."

This is the way mp3info works with CBR MP3 files. I also have to change the way mp3info calculates the play time, because neither TinyTag nor mp3info can calculate the correct time for my 22050/mono MP3 files. Both give me about half the actual play time, though JPlayer shows the correct play time. That's why I started to look for another library to calculate the play time.
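The heuristic quoted above can be sketched in a few lines. Both functions below are hypothetical illustrations rather than tinytag or mp3info code, and they assume the caller already has the per-frame bitrates and the size of the audio data in bytes.

```python
def looks_constant(frame_bitrates, required_run=5):
    """True once required_run consecutive frames share one bitrate."""
    run_length = 0
    previous = None
    for bitrate in frame_bitrates:
        run_length = run_length + 1 if bitrate == previous else 1
        previous = bitrate
        if run_length >= required_run:
            return True  # stop early, the remaining frames are never inspected
    return False


def cbr_duration(audio_bytes, bitrate_kbps):
    """Seconds of audio for a constant bitrate given in kilobits per second."""
    return audio_bytes * 8 / (bitrate_kbps * 1000)


print(looks_constant([128, 128, 128, 128, 128, 160]))
print(cbr_duration(160000, 128))
```

The shortcut trades accuracy for speed: a VBR file that happens to start with several identical frames would be misclassified as CBR.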

Read images from ID3v2 Tags

The APIC field probably contains the cover image within an ID3 tag. See:
https://en.wikipedia.org/wiki/ID3#ID3v2_frame_specification_.28Version_2.3.29

It should be possible to read this information here: https://github.com/devsnd/tinytag/blob/master/tinytag/tinytag.py#L254

The best way to tackle this without a major performance hit is probably to introduce 2 private fields that include the offset and length of the image file within the MP3, a public field which tells the user whether such an image could be read and a method to get a bytestream of the image.
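The proposal can be sketched as a small object that stores only the offset and length while parsing and reads the bytes on request. The class name, its attributes and the fake file contents below are all made up for illustration; this is not how tinytag stores images.

```python
import io


class LazyImage:
    """Remembers where an embedded image lives and reads it only on request."""

    def __init__(self, offset, length):
        self._offset = offset
        self._length = length

    @property
    def available(self):
        """Tell the caller whether an image was found during parsing."""
        return self._length > 0

    def read(self, file_obj):
        """Return the image bytes from a seekable binary file object."""
        file_obj.seek(self._offset)
        return file_obj.read(self._length)


# Fake file: 6 header bytes, a 14-byte "image", then audio data.
fake_file = io.BytesIO(b'HEADER' + b'\xff\xd8\xffimage-bytes' + b'AUDIO')
cover = LazyImage(offset=6, length=14)
print(cover.available, cover.read(fake_file))
```

Parsing then only costs two integers per file, and the image bytes are read only for callers that actually ask for them.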

WAV file does not print the channels

Hello, as described in the title, when using any .wav file I get the following:

{'filesize': 35799444, 'album': None, 'albumartist': None, 'artist': None, 'audio_offset': None, 'bitrate': 1378.125, 'channels': None, 'comment': None, 'disc': None, 'disc_total': None, 'duration': 202.94401360544217, 'genre': None, 'samplerate': 44100, 'title': 'frenemy rem 7', 'track': None, 'track_total': None, 'year': None, 'audio_offest': 112}

Is there any way to fix this?
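The changelog above lists a fix for this in 1.2.1 (#61). On an affected version, the channel count of an uncompressed PCM WAV file could be cross-checked with Python's standard `wave` module, as in the sketch below. `wav_channels` is a hypothetical helper and does not use tinytag.

```python
import io
import wave


def wav_channels(file_or_path):
    """Return the channel count stored in a PCM WAV file's fmt chunk."""
    with wave.open(file_or_path, 'rb') as reader:
        return reader.getnchannels()


# Build a tiny stereo file in memory to try the helper without a real file.
sample = io.BytesIO()
with wave.open(sample, 'wb') as writer:
    writer.setnchannels(2)
    writer.setsampwidth(2)
    writer.setframerate(44100)
    writer.writeframes(b'\x00' * 8)
sample.seek(0)
print(wav_channels(sample))
```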

How to get the Codec ?

Hi,

I would like to extract the codec of music files (mp3, FLAC, ...). The main objective is to compare all tracks with each other to find duplicates and keep the tracks with the best quality.

How could I proceed to get this info?

Thanks,

lbrth

tinytag unable to process valid mp3

I was trying to use tinytag.TinyTag.get(filename), but it errored and said
tinytag.tinytag.TinyTagException: mp3 parsing failed.
Now I get that this means the mp3 is invalid, but the weird thing is:
It works with ffplay, VLC and WMP (and probably all music playing software)

File was downloaded using youtube_dl, not sure if that matters.

Traceback (most recent call last):
  File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 311, in _determine_duration
    frame_bitrate = ID3.bitrate_by_version_by_layer[mpeg_id][layer_id][br_id]
TypeError: 'NoneType' object is not subscriptable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 94, in get
    tag.load(tags=tags, duration=duration, image=image)
  File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 115, in load
    self._determine_duration(self._filehandler)
  File "D:\Program_Files\Python\lib\site-packages\tinytag\tinytag.py", line 313, in _determine_duration
    raise TinyTagException('mp3 parsing failed')
tinytag.tinytag.TinyTagException: mp3 parsing failed

Problem reading a wav file (struct.error)

I have an issue with reading a couple of wav files. They were ripped from a CD using dBpoweramp, but TinyTag can't read them. The metadata can be read by ffmpeg though.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "tinytag/tinytag.py", line 131, in get
    tag.load(tags=tags, duration=duration, image=image)
  File "tinytag/tinytag.py", line 145, in load
    self._parse_tag(self._filehandler)
  File "tinytag/tinytag.py", line 860, in _parse_tag
    self._determine_duration(fh)  # parse whole file to determine tags:(
  File "tinytag/tinytag.py", line 851, in _determine_duration
    id3._parse_id3v2(fh)
  File "tinytag/tinytag.py", line 596, in _parse_id3v2
    frame_size = self._parse_frame(fh, id3version=major)
  File "tinytag/tinytag.py", line 629, in _parse_frame
    frame = struct.unpack(binformat, frame_header_data)
struct.error: unpack requires a string argument of length 10

get_image() fails on m4a

as far as i've checked the feature/mp4 has been merged back into master. the get_image() fails on m4a files:

tag = TinyTag.get('sample.m4a', image=True)
image_data = tag.get_image()
if not image_data:
    print('oops! this file has a cover image :-/')

Crash from reading non-ascii metadata (ID3v1)

Found this in my cherrymusic/error.log:

ERROR    [2014-10-24 12:16:17,214] : cherrypy.error.3074577548 : from line (201) at
        /home/cherrymusic/cherrymusic/cherrymusic/cherrypy/_cplogging.py
        --
        [24/Oct/2014:12:16:17] HTTP Traceback (most recent call last):
[...]
  File "/home/cherrymusic/cherrymusic/cherrymusic/cherrymusicserver/metainfo.py", line 55, in getSongInfo
    tag = TinyTag.get(filepath)
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 78, in get
    tag.load(tags=tags, duration=duration)
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 93, in load
    self._parse_tag(self._filehandler)
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 201, in _parse_tag
    self._parse_id3v1(fh)
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 231, in _parse_id3v1
    self._set_field('title', fh.read(30), transfunc=asciidecode)
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 104, in _set_field
    setattr(self, fieldname, transfunc(bytestring))
  File "/home/cherrymusic/cherrymusic/cherrymusic/tinytag/tinytag.py", line 230, in <lambda>
    asciidecode = lambda x: self._unpad(codecs.decode(x, 'ASCII'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 22: ordinal not in range(128)

I don't have access to the live output of the server session right now, or I'd look up which file's responsible.

Anyway, I suppose you can recognize this from the occasional "error getting song metadata" in cherrymusic.

Hangs when parsing of a specific wav file (getting duration)

I have a particular wav file (I don't have the rights to share it, sorry), but when I load it with TinyTag the process hangs.

Digging into the issue, I see that the process is stuck in a loop while scanning the file. I have a fix prepared and will share a branch for consideration.

Invalid duration

Hello. I am opening this ticket, which seems to correspond to #37.
I have an album whose tracks are reported as about twice as long as their true duration. I sent an mp3 file by email.

utf-8 error

I get this error with some unicode song titles. MP4Box was used to pack the m4a file
and the title is "Cold Water (feat. Justin Bieber & MØ)".
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 34: invalid start byte
Any help is appreciated.

ZeroDivision in Flac._determine_duration

I have a flac file that raises a ZeroDivisionError when its duration is parsed:

>>> from tinytag import TinyTag
>>> TinyTag.get('barcelona.flac', duration=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/til/git/hub/devsnd/tinytag/tinytag/tinytag.py", line 78, in get
    tag.load(tags=tags, duration=duration)
  File "/home/til/git/hub/devsnd/tinytag/tinytag/tinytag.py", line 434, in load
    self._determine_duration(self._filehandler)
  File "/home/til/git/hub/devsnd/tinytag/tinytag/tinytag.py", line 477, in _determine_duration
    self.bitrate = self.filesize/self.duration*8/1024
ZeroDivisionError: float division by zero

To recreate: I created the flac by transcoding it from mp3 with python-audiotranscode. The mp3 has a duration of about 2:45*; the flac file plays just fine (and completely), except the duration remains unknown in the player I've tried (audacious).

$ flac --version
flac 1.3.0
$ ffmpeg -version
ffmpeg version 2.2.2
[...]

As a fallback for when the duration remains unknown, I'd set duration and bitrate to None: it makes for fast failing calculations and is a better match for missing/unknown values than 0, NaN or Infinity. (The latter are proper float values and could be mistaken for the actual value, in which case they'd most probably be wrong.) Except I don't know if None values work well with what the rest of tinytag does, so I'm leaving the fix up to you. :)
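
A minimal sketch of the None fallback suggested above. The function name `duration_and_bitrate` and its parameters are hypothetical and not part of tinytag; only the bitrate formula is copied from the traceback. It assumes the duration is unknown because the file reports a total sample count of 0, which the FLAC format permits for streams of unknown length. Whether that applies to this particular file is not confirmed.

```python
from typing import Optional, Tuple


def duration_and_bitrate(
    filesize: int, total_samples: int, samplerate: int
) -> Tuple[Optional[float], Optional[float]]:
    """Return (duration in seconds, bitrate), using None for unknown values."""
    if total_samples <= 0 or samplerate <= 0:
        # Nothing reliable to compute from, so report "unknown" explicitly
        # rather than 0, NaN or Infinity.
        return None, None
    duration = total_samples / samplerate
    bitrate = filesize / duration * 8 / 1024  # same formula as in the traceback
    return duration, bitrate
```

Callers then have to handle None explicitly, which fails fast instead of propagating a plausible-looking but wrong number.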


* Might be interesting: both mp3 and flac stop playing after 2:45. Audacious shows a duration of 2:46 for the mp3, while tinytag computes a different mp3 duration of 172.31..., which is ~2:52.

ZeroDivisionError when duration is 0

Got a ZeroDivisionError while processing this:

self.duration = xframes * ID3.samples_per_frame / float(self.samplerate)
self.bitrate = byte_count * 8 / self.duration

Traceback can be found here:

tag = TinyTag.get("music.mp3")

  File "/usr/lib/python3.6/site-packages/tinytag/tinytag.py", line 136, in load
    self._determine_duration(self._filehandler)
  File "/usr/lib/python3.6/site-packages/tinytag/tinytag.py", line 528, in _determine_duration
    self.bitrate = byte_count * 8 / self.duration
ZeroDivisionError: float division by zero

You may want to add some error handling to avoid compatibility problems. I can fix it for you if you want.

bitrate is in bps (not kbps) for VBR mp3 files

Hi,

Any VBR mp3 file I point tinytag at gives me a bitrate value which is about 1000x what audio software reports. I expect that this value is bps rather than kbps.

For example, for the file at (https antisol dot org slash brass dot mp3), tinytag gives me a bitrate value of 195206.785168, whereas various audio software reports the file as being 195kbps, including:

  • audacious
  • qmmp
  • vlc ('input bitrate' on 'statistics' screen lists ~195kbps, but codec info tab shows 128kbps)

I've seen this with all other VBR mp3 files I've tried; they all give a value ~1000x the value reported by audacious.

As a workaround I was able to do something like:

if bitrate > 320: bitrate = bitrate / 1000

to get a pretty good bitrate value. But it would be nice to see a more accurate value come from TinyTag.

Thanks!
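
For reference, a small sketch of the unit conversion involved. The function name `bitrate_kbps` is hypothetical and not tinytag API, and it assumes 1 kbps = 1000 bits per second. Other code quoted in these reports divides by 1024 instead, so the exact divisor tinytag should use is not settled here.

```python
def bitrate_kbps(byte_count: int, duration_seconds: float) -> float:
    """Average bitrate in kbit/s for byte_count bytes of audio data."""
    bits_per_second = byte_count * 8 / duration_seconds
    return bits_per_second / 1000


# The value reported above, interpreted as bits per second:
print(round(195206.785168 / 1000))  # 195
```

With this conversion, the value tinytag returned matches the ~195kbps shown by the audio players.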

README.md has a tiny spelling mistake

Hi, while reading the list of attributes that TinyTag provides, I found a tiny spelling mistake. The details are as follows:

tag.title # title of the sonf

I think the sonf should be song : )

Album and track names longer than 30 characters are truncated

For instance, tag.album gives 'The Double EP: A Sea of Split ' rather than 'The Double EP: A Sea of Split Peas'. The album name is correctly determined by other programs such as easytag or Rhythmbox, so I believe it to be correctly encoded in the file.

Edit: seems to occur for track names as well.
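
One possible explanation, not confirmed by the report: the ID3v1 tag stores title, artist and album in fixed 30-byte fields, so a value read from ID3v1 instead of ID3v2 is cut at exactly 30 characters. The sketch below parses a hand-built ID3v1 tag to show the effect. `parse_id3v1` and `make_field` are hypothetical helpers, not tinytag functions, and the title and artist values are made up.

```python
import struct

# ID3v1 layout (128 bytes): 'TAG', title[30], artist[30], album[30],
# year[4], comment[30], genre[1]
ID3V1_FORMAT = '<3s30s30s30s4s30sB'


def parse_id3v1(tag: bytes) -> dict:
    """Parse a 128-byte ID3v1 tag into a dict of text fields."""
    if len(tag) != 128 or not tag.startswith(b'TAG'):
        raise ValueError('not an ID3v1 tag')
    _, title, artist, album, year, _comment, _genre = struct.unpack(ID3V1_FORMAT, tag)

    def clean(raw: bytes) -> str:
        # Fields are padded with NUL bytes; keep everything before the first one.
        return raw.split(b'\x00', 1)[0].decode('ISO-8859-1')

    return {'title': clean(title), 'artist': clean(artist),
            'album': clean(album), 'year': clean(year)}


def make_field(text: str, size: int) -> bytes:
    # A tag writer has to cut the text to the field size; this is where data is lost.
    return text.encode('ISO-8859-1')[:size].ljust(size, b'\x00')


tag = (b'TAG' + make_field('Some Title', 30) + make_field('Some Artist', 30)
       + make_field('The Double EP: A Sea of Split Peas', 30)
       + make_field('2000', 4) + make_field('', 30) + b'\xff')
print(repr(parse_id3v1(tag)['album']))  # 'The Double EP: A Sea of Split '
```

If this is the cause, preferring the ID3v2 value whenever both tags are present would return the full name.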

ID3 improperly decodes.

I encountered this issue while parsing some of my MP3s (I haven't tested this on other formats). The issue is present in all text fields.

{
 'audio_offset': 2058, 'track': '\x0311', 'year': '2000', 'filesize': 5382993, 
 'album': '\x03Relationship of Command', 'title': '\x03Nonâ\x80\x90Zero Possibility', 
 'samplerate': 44100, 'duration': 336.43710016824866, 'track_total': '11', 
 'bitrate': 128.0, 'artist': '\x03At the Driveâ\x80\x90In'
}

I receive \x03 END OF TEXT at the beginning of each string data point (it always happens, at least on the handful of albums I've tested) as well as a garbled decode of the hyphen. I believe the issue is related to how ID3 decides to decode text: either ISO-8859-1 or UTF-16, depending on how the file reads.

I believe the best course of action would be an optional keyword argument on TinyTag.get that allows the user to specify the preferred decoding, and then either falling back onto a standard (ISO-8859-1, which is currently used, perhaps) or passing the buck back to the user for error handling.

I haven't had the chance to test this change myself, yet. I'm currently dealing with the issue with this bandage hack:

def fixer(value, ignore=(AttributeError, UnicodeEncodeError), handle=None):
    '''Actual fixer function for fix_track

    ignore is a tuple of exceptions we should just discard.
    handle is an optional exception handler.
    '''
    try:
        value = value.encode('latin-1').decode('utf-8')
        # matching \x03 is frustrating
        # again, just a crutch to lean on
        if not value[0].isprintable():
            value = value[1:]
    except ignore as e:
        if handle:
            handle(e)
        else:
            pass
    finally:
        # we always end up here
        return value


def fix_track(
              track, 
              fixer=fixer, 
              fields=('artist', 'album', 'title', 'track', 'year', 'track_total'),
              int_convert=('track', 'year', 'track_total')
              ):
    '''Fix encoding issue encountered on some tracks.

    Accepts a track object and attempts to massage the data in our favor.
    * fixer is the function we want to run on this track to correct data
    * fields is the specific fields we'd like to attempt to correct
    * int_convert is a subset of fields that is data that should be integers
    '''
    for f in fields:
        value = getattr(track, f)
        if not value: 
            # value is likely None
            # we'll pass on this value
            # to avoid blowing up
            continue
        else:
            value = fixer(value)
        if f in int_convert:
            try:
                value = int(value)
            except ValueError:
                pass
        setattr(track, f, value)

    # TODO: need to make this mutable
    # for now, it's hardcoded as TinyTag
    # stores duration as a float
    track.duration = int(track.duration)

    # returning the track allows us
    # to be flexible in application
    # of this function
    return track

Expected: At the Drive-In - Relationship of Command - 11 - Non-Zero Possibility
Actual: At the Drive�In - Relationship of Command - 11 - Non�Zero Possibility
After Fixing: At the Drive‐In - Relationship of Command - 11 - Non‐Zero Possibility
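
A note on the leading \x03: in ID3v2 text frames the first byte of the frame body names the text encoding, and ID3v2.4 defines the value 3 as UTF-8. That matches the output above, where the remaining bytes decode correctly as UTF-8. The sketch below decodes a text frame body on that basis. `decode_id3v2_text` and `ID3V2_TEXT_ENCODINGS` are hypothetical names, not tinytag API, and the fallback for an unknown encoding byte is an assumption.

```python
# Text encoding byte at the start of an ID3v2 text frame body.
ID3V2_TEXT_ENCODINGS = {
    0: 'ISO-8859-1',
    1: 'UTF-16',     # with byte order mark
    2: 'UTF-16-BE',  # without byte order mark (ID3v2.4)
    3: 'UTF-8',      # ID3v2.4
}


def decode_id3v2_text(frame_body: bytes) -> str:
    """Decode the body of an ID3v2 text frame, honouring its encoding byte."""
    if not frame_body:
        return ''
    encoding = ID3V2_TEXT_ENCODINGS.get(frame_body[0])
    if encoding is None:
        # Assumption: no recognised encoding byte, treat everything as ISO-8859-1.
        return frame_body.decode('ISO-8859-1').rstrip('\x00')
    return frame_body[1:].decode(encoding, errors='replace').rstrip('\x00')


print(decode_id3v2_text(b'\x03At the Drive\xe2\x80\x90In'))  # At the Drive‐In
```

Handled this way, the encoding byte is consumed instead of leaking into the string, and the U+2010 hyphen survives without a latin-1 round trip.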

Comment Fields

Would it be possible for tinytag to expose the ID3 "comment" field?

Invalid tags from imported m4a

Bona fide .m4a files imported from an iDevice don't parse; every tag comes back as None:

>>> TinyTag.get('~/Music/iPhone/9764342624706351712.m4a')
{'filesize': 15616586, 'album': None, 'albumartist': None, 'artist': None, 'audio_offset': None, 'bitrate': 256.0, 'channels': 2, 'comment': None, 'composer': None, 'disc': None, 'disc_total': None, 'duration': 411.0110101, 'genre': None, 'samplerate': 44100, 'title': None, 'track': None, 'track_total': None, 'year': None}

Binary paths are not handled

It seems that the Python spec allows paths to be either strings or binary data. According to Stack Overflow, this is because on Linux filenames do not have an encoding but are just a bunch of bytes.

In my case, I have a path that is not UTF-8, and converting it to a UTF-8 string does not work because the conversion fails with encoding errors.

Using the os functions (e.g. os.stat) to access the file through its binary name works (like os.stat(b"some\x12nasty\x34file\xff")).

However TinyTag fails to access the file. This is caused by tinytag.py line 122:
filename = os.path.expanduser(str(filename)) # cast pathlib.Path to str

In light of how paths work on Linux, casting to str is the wrong approach.
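
A sketch of one way to avoid the str() cast, using only standard library calls. The function name `normalize_path` is hypothetical; it is not how tinytag resolves this. os.fspath accepts str, bytes and path-like objects such as pathlib.Path and returns str or bytes unchanged, and os.path.expanduser works on both types.

```python
import os
import pathlib


def normalize_path(filename):
    """Expand '~' while keeping bytes paths as bytes and str paths as str."""
    # os.fspath converts pathlib.Path to str but leaves bytes untouched,
    # so a non-UTF-8 filename is never decoded.
    path = os.fspath(filename)
    return os.path.expanduser(path)


print(normalize_path(pathlib.Path('music.mp3')))       # music.mp3
print(normalize_path(b'some\x12nasty\x34file\xff'))    # bytes are preserved
```

The resulting value can be passed straight to open() or os.stat(), both of which accept bytes paths.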

UnicodeDecodeError on my mp3

Hi, when I run info = TinyTag.get('my.mp3') I get the error below.
I can provide the mp3.

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-2-015becc94593> in <module>()
----> 1 info = TinyTag.get('my.mp3')

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in get(cls, filename, tags, length)
     56         if filename.lower().endswith('.mp3'):
     57             with open(filename, 'rb') as af:
---> 58                 return ID3(af, tags=tags, length=length)
     59         elif filename.lower().endswith(('.oga', '.ogg')):
     60             with open(filename, 'rb') as af:

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in __init__(self, filehandler, tags, length)
    117     def __init__(self, filehandler, tags=True, length=True):
    118         TinyTag.__init__(self)
--> 119         self.load(filehandler, tags=tags, length=length)
    120
    121     def _determine_length(self, fh):

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in load(self, filehandler, tags, length)
     77         """
     78         if tags:
---> 79             self._parse_tag(filehandler)
     80             filehandler.seek(0)
     81         if length:

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_tag(self, fh)
    159
    160     def _parse_tag(self, fh):
--> 161         self._parse_id3v2(fh)
    162         if not self.has_all_tags():  # try to get more info using id3v1
    163             fh.seek(-128, 2)  # id3v1 occuppies the last 128 bytes

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_id3v2(self, fh)
    183             while parsed_size < size:
    184                 is_id3_v22 = major == 2
--> 185                 frame_size = self._parse_frame(fh, is_v22=is_id3_v22)
    186                 if frame_size == 0:
    187                     break

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _parse_frame(self, fh, is_v22)
    219                     self._parse_track(content)
    220                 else:
--> 221                     self._set_field(fieldname, content, self._decode_string)
    222             return frame_size
    223         return 0

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _set_field(self, fieldname, bytestring, transfunc)
     88             return
     89         if transfunc:
---> 90             setattr(self, fieldname, transfunc(bytestring))
     91         else:
     92             setattr(self, fieldname, bytestring)

/Users/ib/Documents/Python/tst/tinytag/tinytag/__init__.py in _decode_string(self, b)
    228             return self._unpad(codecs.decode(b[1:], 'ISO-8859-1'))
    229         if b[0:3] == b'\x01\xff\xfe':
--> 230             return self._unpad(codecs.decode(b[3:], 'UTF-16'))
    231         return self._unpad(codecs.decode(b, 'ISO-8859-1'))
    232

/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_16.pyc in decode(input, errors)
     14
     15 def decode(input, errors='strict'):
---> 16     return codecs.utf_16_decode(input, errors, True)
     17
     18 class IncrementalEncoder(codecs.IncrementalEncoder):

UnicodeDecodeError: 'utf16' codec can't decode byte 0x00 in position 20: truncated data
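
The "truncated data" message means the UTF-16 decoder ran out of input in the middle of a code unit, which happens for example when the text has an odd number of bytes. A tolerant decoder could look like the sketch below. `decode_utf16_lenient` is a hypothetical helper, not the fix applied in tinytag, and unlike the code in the traceback it expects the byte order mark to still be present in the data.

```python
def decode_utf16_lenient(data: bytes) -> str:
    """Decode UTF-16 text (with BOM), tolerating a dangling trailing byte."""
    if len(data) % 2:
        # UTF-16 code units are 2 bytes wide; drop an incomplete final unit.
        data = data[:-1]
    # errors='replace' substitutes U+FFFD for anything else that is malformed
    # instead of raising UnicodeDecodeError.
    return data.decode('UTF-16', errors='replace')


print(decode_utf16_lenient(b'\xff\xfeA\x00B\x00\x00'))  # AB
```

This trades strictness for robustness: a damaged tag yields a slightly lossy string rather than aborting the whole parse.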
