
nptdms's Introduction

npTDMS


npTDMS is a cross-platform Python package for reading and writing TDMS files as produced by LabVIEW, and is built on top of the numpy package. Data is read from TDMS files as numpy arrays, and npTDMS also allows writing numpy arrays to TDMS files.

TDMS files are structured in a hierarchy of groups and channels. A TDMS file can contain multiple groups, which may each contain multiple channels. A file, group and channel may all have properties associated with them, but only channels have array data.

Typical usage when reading a TDMS file might look like:

from nptdms import TdmsFile

tdms_file = TdmsFile.read("path_to_file.tdms")
group = tdms_file['group name']
channel = group['channel name']
channel_data = channel[:]
channel_properties = channel.properties
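
If the group and channel names aren't known in advance, the hierarchy can also be walked programmatically with the groups() and channels() methods, for example:

for group in tdms_file.groups():
    print(group.name)
    for channel in group.channels():
        print(channel.name)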

The TdmsFile.read method reads all data into memory immediately. When you are working with large TDMS files or don't need to read all channel data, you can instead use TdmsFile.open. This is more memory efficient but accessing data can be slower:

with TdmsFile.open("path_to_file.tdms") as tdms_file:
    group = tdms_file['group name']
    channel = group['channel name']
    channel_data = channel[:]
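
With TdmsFile.open, indexing a channel reads only the requested values from disk. For example, to read just the first 100 values of a channel:

with TdmsFile.open("path_to_file.tdms") as tdms_file:
    channel = tdms_file['group name']['channel name']
    first_values = channel[:100]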

npTDMS also has rudimentary support for writing TDMS files. Using npTDMS to write a TDMS file looks like:

from nptdms import TdmsWriter, ChannelObject
import numpy

with TdmsWriter("path_to_file.tdms") as tdms_writer:
    data_array = numpy.linspace(0, 1, 10)
    channel = ChannelObject('group name', 'channel name', data_array)
    tdms_writer.write_segment([channel])
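
Properties can also be written at the file, group and channel level using the RootObject and GroupObject classes, for example:

from nptdms import TdmsWriter, RootObject, GroupObject, ChannelObject
import numpy

root = RootObject(properties={"description": "example file"})
group = GroupObject("group name", properties={"rate": 10.0})
channel = ChannelObject("group name", "channel name", numpy.linspace(0, 1, 10))

with TdmsWriter("path_to_file.tdms") as tdms_writer:
    tdms_writer.write_segment([root, group, channel])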

For more detailed documentation on reading and writing TDMS files, see the npTDMS documentation.

Installation

npTDMS is available from the Python Package Index, so the easiest way to install it is by running:

pip install npTDMS

There are optional features available that require additional dependencies. These are hdf for hdf export, pandas for pandas DataFrame export, and thermocouple_scaling for using thermocouple scalings. You can specify these extra features when installing npTDMS to also install the dependencies they require:

pip install npTDMS[hdf,pandas,thermocouple_scaling]

Alternatively, after downloading the source code you can extract it and change into the new directory, then run:

python setup.py install

Source code lives at https://github.com/adamreeve/npTDMS and any issues can be reported at https://github.com/adamreeve/npTDMS/issues. Documentation is available at http://nptdms.readthedocs.io.

Limitations

This module doesn't support TDMS files with XML headers or with extended precision floating point data.

Contributors/Thanks

Thanks to Floris van Vugt who wrote the pyTDMS module, which helped when writing this module.

Thanks to Tony Perkins, Ruben De Smet, Martin Hochwallner and Peter Duncan for contributing support for converting to Pandas DataFrames.

Thanks to nmgeek and jshridha for implementing support for DAQmx raw data files.


nptdms's Issues

Plans for working on XML support

Hi,
I'm currently working with TDM files created with LabVIEW and trying to move from DIAdem to Python for data analysis purposes.
However, it seems the files created by both LabVIEW and DIAdem are in my case most definitely XML files.

Are there any plans to add support for that?

Translator which translates DAQmxRawData files to legacy TDMS files

I think we are close to wrapping up #24, and I see that if you want to read one of these raw-formatted files into MATLAB you first have to fire up LabVIEW and translate the raw file into the legacy format.

I think MATLAB users would like to have a command line utility which runs outside of LabVIEW, so I suggest we include a conversion program that installs in a 'bin' folder, like tdmsinfo.

How about

    tdmsconvert infile.tdms outfile.tdms
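
A minimal sketch of what such a converter could look like built on npTDMS itself (the tdmsconvert name is hypothetical, and this reads the whole file into memory rather than streaming):

import sys
from nptdms import TdmsFile, TdmsWriter, ChannelObject

def convert(infile, outfile):
    tdms_file = TdmsFile.read(infile)
    with TdmsWriter(outfile) as writer:
        for group in tdms_file.groups():
            for channel in group.channels():
                # Re-write each channel's data and properties in the legacy format
                writer.write_segment([ChannelObject(
                    group.name, channel.name, channel[:],
                    properties=channel.properties)])

if __name__ == "__main__":
    convert(sys.argv[1], sys.argv[2])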

tdmsfile.groups() returning empty list

Hi Adam,
tdmsfile.groups() returns []
tdmsinfo.main() stops too early with -P, and with -D it shows strange floating point data for wf_start_offset
tdms_file.object('high', 'MEMS 1') works fine
start file -P
/
properties:
description: T&M Signal Logger
datetime: 2012-01-02 10:20:31.549620
author: 10.127.10.228
version: V2.3.7
lastdatetime: 2012-01-02 10:30:30.549620
name: 3404911229_00005723_03433203_1_228.tdms
campaign_timestamp: 2011-11-23 16:40:28.549620
seq_id: 5723
***
end file -P
***
*****start file -D
DEBUG:nptdms.tdms:Reading segment at 0
DEBUG:nptdms.tdms:Property kTocDAQmxRawData is False
DEBUG:nptdms.tdms:Property kTocNewObjList is True
DEBUG:nptdms.tdms:Property kTocInterleavedData is False
DEBUG:nptdms.tdms:Property kTocRawData is False
DEBUG:nptdms.tdms:Property kTocMetaData is True
DEBUG:nptdms.tdms:Property kTocBigEndian is False
DEBUG:nptdms.tdms:Reading metadata at 28
DEBUG:nptdms.tdms:Reading metadata for object /
DEBUG:nptdms.tdms:Object has no data in this segment
DEBUG:nptdms.tdms:Reading 0 properties
DEBUG:nptdms.tdms:Reading segment at 45
DEBUG:nptdms.tdms:Property kTocDAQmxRawData is False
DEBUG:nptdms.tdms:Property kTocNewObjList is True
DEBUG:nptdms.tdms:Property kTocInterleavedData is False
DEBUG:nptdms.tdms:Property kTocRawData is False
DEBUG:nptdms.tdms:Property kTocMetaData is True
DEBUG:nptdms.tdms:Property kTocBigEndian is False
DEBUG:nptdms.tdms:Reading metadata at 73
DEBUG:nptdms.tdms:Reading metadata for object /
DEBUG:nptdms.tdms:Object has no data in this segment
DEBUG:nptdms.tdms:Reading 1 properties
DEBUG:nptdms.tdms:Property description: T&M Signal Logger
DEBUG:nptdms.tdms:Reading segment at 130
DEBUG:nptdms.tdms:Property kTocDAQmxRawData is False
DEBUG:nptdms.tdms:Property kTocNewObjList is True
DEBUG:nptdms.tdms:Property kTocInterleavedData is False
DEBUG:nptdms.tdms:Property kTocRawData is False
DEBUG:nptdms.tdms:Property kTocMetaData is True
DEBUG:nptdms.tdms:Property kTocBigEndian is False
DEBUG:nptdms.tdms:Reading metadata at 158
DEBUG:nptdms.tdms:Reading metadata for object /
DEBUG:nptdms.tdms:Object has no data in this segment
DEBUG:nptdms.tdms:Reading 1 properties
DEBUG:nptdms.tdms:Property datetime: 2012-01-02 10:20:31.549620
DEBUG:nptdms.tdms:Reading segment at 207
DEBUG:nptdms.tdms:Property kTocDAQmxRawData is False
DEBUG:nptdms.tdms:Property kTocNewObjList is True
DEBUG:nptdms.tdms:Property kTocInterleavedData is False
DEBUG:nptdms.tdms:Property kTocRawData is False
DEBUG:nptdms.tdms:Property kTocMetaData is True
DEBUG:nptdms.tdms:Property kTocBigEndian is False
DEBUG:nptdms.tdms:Reading metadata at 235
DEBUG:nptdms.tdms:Reading metadata for object /
DEBUG:nptdms.tdms:Object has no data in this segment
DEBUG:nptdms.tdms:Reading 1 properties
DEBUG:nptdms.tdms:Property author: 10.127.10.228
DEBUG:nptdms.tdms:Reading segment at 283
DEBUG:nptdms.tdms:Property kTocDAQmxRawData is False
DEBUG:nptdms.tdms:Property kTocNewObjList is True
DEBUG:nptdms.tdms:Property kTocInterleavedData is False
DEBUG:nptdms.tdms:Property kTocRawData is False
DEBUG:nptdms.tdms:Property kTocMetaData is True
DEBUG:nptdms.tdms:Property kTocBigEndian is False
DEBUG:nptdms.tdms:Reading metadata at 311
DEBUG:nptdms.tdms:Reading metadata for object /
DEBUG:nptdms.tdms:Object has no data in this segment
DEBUG:nptdms.tdms:Reading 1 properties
DEBUG:nptdms.tdms:Property version: V2.3.7
DEBUG:nptdms.tdms:Reading segment at 353
DEBUG:nptdms.tdms:Property kTocDAQmxRawData is False
DEBUG:nptdms.tdms:Property kTocNewObjList is True
DEBUG:nptdms.tdms:Property kTocInterleavedData is False
DEBUG:nptdms.tdms:Property kTocRawData is True
DEBUG:nptdms.tdms:Property kTocMetaData is True
DEBUG:nptdms.tdms:Property kTocBigEndian is False
DEBUG:nptdms.tdms:Reading metadata at 381
DEBUG:nptdms.tdms:Reading metadata for object /'high'/'Signal 0'
DEBUG:nptdms.tdms:Object data type: tdsTypeSingleFloat
DEBUG:nptdms.tdms:Object number of values in segment: 5120
DEBUG:nptdms.tdms:Reading 16 properties
DEBUG:nptdms.tdms:Property wf_time_pref: absolute
DEBUG:nptdms.tdms:Property wf_start_time: 2012-01-02 10:20:32.549620
DEBUG:nptdms.tdms:Property wf_start_offset: 1.683945893e-314
DEBUG:nptdms.tdms:Property wf_increment: 0.0001953125
DEBUG:nptdms.tdms:Property wf_samples: 5120
DEBUG:nptdms.tdms:Property Corrzero: 0.0
DEBUG:nptdms.tdms:Property DownSampleFactorHigh: 1
DEBUG:nptdms.tdms:Property DownSampleFactorLow: 1
DEBUG:nptdms.tdms:Property Enabled: 1
DEBUG:nptdms.tdms:Property Gain: 1.0
DEBUG:nptdms.tdms:Property NI_ChannelName: Signal 0
DEBUG:nptdms.tdms:Property NI_UnitDescription:
DEBUG:nptdms.tdms:Property Offset: 0.0
DEBUG:nptdms.tdms:Property SensorID: 0
DEBUG:nptdms.tdms:Property wf_xname: Signal 0
DEBUG:nptdms.tdms:Property wf_xunit_string: s
DEBUG:nptdms.tdms:Reading metadata for object /'high'/'System Temp'
DEBUG:nptdms.tdms:Object data type: tdsTypeSingleFloat
DEBUG:nptdms.tdms:Object number of values in segment: 1
DEBUG:nptdms.tdms:Reading 16 properties
DEBUG:nptdms.tdms:Property wf_time_pref: absolute
DEBUG:nptdms.tdms:Property wf_start_time: 2012-01-02 10:20:32.549620
DEBUG:nptdms.tdms:Property wf_start_offset: 1.683945893e-314
DEBUG:nptdms.tdms:Property wf_increment: 1.0
DEBUG:nptdms.tdms:Property wf_samples: 1
DEBUG:nptdms.tdms:Property Corrzero: 0.0
DEBUG:nptdms.tdms:Property DownSampleFactorHigh: 5120
DEBUG:nptdms.tdms:Property DownSampleFactorLow: 5120
DEBUG:nptdms.tdms:Property Enabled: 1
DEBUG:nptdms.tdms:Property Gain: 1.0
DEBUG:nptdms.tdms:Property NI_ChannelName: System Temp
DEBUG:nptdms.tdms:Property NI_UnitDescription:
DEBUG:nptdms.tdms:Property Offset: 0.0
DEBUG:nptdms.tdms:Property SensorID: 3
DEBUG:nptdms.tdms:Property wf_xname: System Temp
DEBUG:nptdms.tdms:Property wf_xunit_string: s
***
**************end file -D

Unable to open file

Loading a TDMS file generated with NI-DAQmx with the following

from nptdms import TdmsFile
tdms_file = TdmsFile("example.tdms")

Gives the following error

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 148, in __init__
    self._read_segments(tdms_file)
  File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 160, in _read_segments
    previous_segment)
  File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 346, in read_metadata
    segment_obj._read_metadata(f)
  File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 655, in _read_metadata
    self.data_type.length * self.dimension)
TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'

However, the file can be easily read using C and NI's DLL.

Read String Data

I have a TDMS file that apparently has string data in it since I am getting the following error:

  File "~\GitHub\npTDMS\nptdms\tdms.py", line 652, in _read_metadata
    self.tdms_object.data_type.name)
ValueError: Unsupported data type: tdsTypeString

It would be fantastic if you could add support for this case, or help point me in the right direction to add that functionality. The question I have is: how do you know how many characters are in each string? If I knew this, I could probably implement the request.

Thanks!
Justin
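
For reference, as I understand the TDMS format spec, a string channel's raw data is stored as a table of uint32 end offsets (one per string) followed by the concatenated UTF-8 string bytes, so each length comes from the difference between consecutive offsets. A minimal decoding sketch under that assumption:

import struct

def read_string_values(raw, count):
    # uint32 little-endian end offset for each of the count string values
    offsets = struct.unpack("<%dI" % count, raw[:4 * count])
    data = raw[4 * count:]
    values, start = [], 0
    for end in offsets:
        values.append(data[start:end].decode("utf-8"))
        start = end
    return values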

Time track incorrect due to failure to read offset

I have a tdms file which I know has a time offset for some channels, because I can view it in LabVIEW. However, when I get the time with the code below it returns an incorrect time.

tdms_obj = tdms_file.object(GROUP, channel)
t = tdms_obj.time_track()

When I looked into it more deeply, tdms_obj.property('wf_start_offset') returns 0 for all channels, which is incorrect. Since some of the channels have an offset, when you plot all of the data things don't line up as they should. I am using relative times throughout.
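
As a workaround until the offset is read correctly, the relative time track can be rebuilt manually. A sketch, where known_offset stands in for the offset you expect the channel to have (it is not read from the file here, since that is what fails):

import numpy as np

increment = tdms_obj.property('wf_increment')
known_offset = 1.5  # hypothetical: the offset LabVIEW shows for this channel
t = known_offset + increment * np.arange(len(tdms_obj.data))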

nptdms crashes python.exe as soon as TdmsFile() is used

Hi. I have been using nptdms off and on for a few months and really enjoying it, the team has done great work here.

Unfortunately today I noticed I cannot run it anymore to save my life... any time I try to load a tdms file in it crashes python.exe.

There's very little for me to go on with this, and I've even started from square 1 with a simple:
from nptdms import TdmsFile
tdms_file = TdmsFile(r"D:\Temp\In\test.tdms")

just to see how it works, and it immediately crashes. I've tried with several files I know worked in the past. What steps can I take to find the root of this issue?

I'm running numpy 1.10.4, Windows 7 64-bit, and python3 (64bit)

Thanks and Regards,
Jeff

Booleans are 1 byte long

Here's the patch. It includes the previous one, but I'm too lazy to fix that.

diff --git a/nptdms/tdms.py b/nptdms/tdms.py
index a3d27a1..73b479e 100644
--- a/nptdms/tdms.py
+++ b/nptdms/tdms.py
@@ -52,7 +52,7 @@ tdsDataTypes = dict(enumerate((
 tdsDataTypes.update({
     0x19: DataType('tdsTypeSingleFloatWithUnit', None, 4, None),
     0x20: DataType('tdsTypeString', None, None, None),
-    0x21: DataType('tdsTypeBoolean', 'b', 8, np.bool8),
+    0x21: DataType('tdsTypeBoolean', 'b', 1, np.bool8),
     0x44: DataType('tdsTypeTimeStamp', 'Qq', 16, None),
     0xFFFFFFFF: DataType('tdsTypeDAQmxRawData', None, None, None)
 })
@@ -421,6 +421,7 @@ class TdmsObject(object):
         elif self.raw_data_index == 0x00000000:
             pass
         else:
+            self.has_data = True
             index_length = self.raw_data_index

             # Read the data type

read only a number of samples?

Hi!
Would it be possible to read a certain number of samples instead of all?
I did this for a different file format by using open, seek and read to load only the amount needed, instead of np.fromfile. This sped up operation on large files (>1GB each). Now I am handling the same file sizes with TDMS and was wondering if this could be applied here as well (headers would still need to be read each time). Thanks!
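
For what it's worth, recent npTDMS versions support exactly this through TdmsFile.open, which reads only metadata up front and lets you read a range of samples on demand, for example:

from nptdms import TdmsFile

with TdmsFile.open("large_file.tdms") as tdms_file:
    channel = tdms_file['group name']['channel name']
    # Read 10000 samples starting at sample 50000
    chunk = channel.read_data(offset=50000, length=10000)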

The kTocRawData flag is misleading

For most of the LabVIEW generated files I have, all segments have the kTocRawData flag set, but some of the segments contain no data at all. Therefore I suggest adding a check like the following to the code:


--- tdms.py~    2012-02-27 21:50:22.000000000 +0100
+++ tdms.py     2012-05-09 18:33:22.999671001 +0200
@@ -285,6 +285,8 @@
         data_size = sum([
                 o.data_size
                 for o in self.ordered_objects])
+        if data_size <= 0:
+            return
         total_data_size = self.next_segment_offset - self.raw_data_offset
         if total_data_size % data_size != 0:
             raise ValueError("Data size %d is not a multiple of the "

multiprocessor, multi-thread and nptdms

Hi nptdms,

I've recently started working with your code.

I am working with .tdms files which are 300 GB in size - something like 161 billion elements in the data.

With a file this large, doing simple things like finding the min and max across that many elements takes some time. I am wondering how I would fare accessing the memmapped TdmsFile object via multiple threads or on multiple processors. Is this something you would recommend?

Thanks,

Nick
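
One pattern that avoids sharing the memmapped object between processes is to have each worker open its own file. A sketch (Python 3, with placeholder file, group and channel names):

from multiprocessing import Pool
from nptdms import TdmsFile

def min_max(path):
    # Each worker builds its own TdmsFile, so nothing needs to be shared
    tdms_file = TdmsFile(path, memmap_dir="./tmp")
    data = tdms_file.object('group', 'channel 1').data
    return data.min(), data.max()

if __name__ == "__main__":
    paths = ["file_%d.tdms" % i for i in range(4)]  # placeholder names
    with Pool() as pool:
        print(pool.map(min_max, paths))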

Use OrderedDict() instead of {}

May I suggest using OrderedDict() instead of {} throughout the code? The order of the objects (and properties) would then stay the same as in the TDMS file, which at least for my application would be much less confusing.

OrderedDict is available in Python 2.7 and up. For older Python versions there is:
http://pypi.python.org/pypi/ordereddict/1.1

Thanks.

lazy loading of raw data

Is it possible to implement "lazy" loading of the TDMS (segment) raw data? I imagine reading just the metadata first (potentially using the tdms_index file if present) and only loading a channel's raw data when channel_data is called. Any thoughts?

Accessing attributes

First off, this is a great package. It's been a big help to me.

In writing a TDMS file from LabVIEW, it's natural to add in some 'attributes' in addition to the channels and groups - details of an experiment, for example. However, these don't exist like a standard group/channel in your framework, and so it took me a little while to work out that I could get to them with:

atts = {}
for name, value in TDMS_data.object('Attributes').properties.items():
    atts[name] = value

Perhaps it would be a good idea to have a wrapper around this so a list of attributes can be generated more easily for the newcomer?
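
In the meantime, since properties is already a mapping, the loop above can be collapsed to a one-liner:

atts = dict(TDMS_data.object('Attributes').properties)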

Pickling a memmap'd tdms

Hey nptdms,

Thanks for your response to the previous questions.

I ended up being able to successfully open 24 tdms files, 12 of which were ~200MB each and 12 of which were ~300GB each; and all that using multiprocessing on 24 Xeon cores... speeding the opening process up quite a bit.

I assumed that dumping and loading TdmsFile objects via pickle would be faster than initializing them from the TDMS files. Pickling worked on the 200MB files, and loading the pickles was indeed faster than initializing from a TDMS file.

Pickling the 300GB TdmsFile objects, however, causes the pickle module to raise a MemoryError. I am assuming that this is due to the use of memmapping.

Any suggestions for how I can get around this and pickle a memmapped TdmsFile object?

nt

Getting all metadata

@avstenit, you can create an issue here if you have questions about nptdms. What problem are you having exactly?

"Unsupported data type to read, tdsTypeString." & unreadable TDMS files

Hey,

we're using TDMS files for our measurement data. They are read fine in Origin, Excel and MATLAB, but unfortunately npTDMS can't open them. :/

I created a repository with a sample tdms file: https://github.com/HaMF/tdmsissues

Upon loading, the error "Unsupported data type to read, tdsTypeString." pops up a couple of times, and the channel data has only one entry (as opposed to e.g. 102 for the above dataset); the value, however, is read correctly from the TDMS file. (First commit, https://github.com/HaMF/tdmsissues/commit/9c0e3da7bd0c8ef6adeeb936ac16fdc7f49e1850, in the repository.)

After defragmenting the tdms file with LabVIEW 2013 (latest commit, https://github.com/HaMF/tdmsissues/commit/5f2e40c81ffa3b60f368e70d305d86724133ba1c, in the repo above), npTDMS can't read the file anymore, failing with "OverflowError: date value out of range".

This is a fresh install of npTDMS 0.6.1 with Python 2.7.

Can you help track down this problem? I unfortunately don't have any knowledge of the TDMS file structure yet, and the measurement program has been growing for some years (read: it's a monster), but if necessary I can try to get some insight...

Cheers,
Hannes

EDIT: Sorry if that's already in the "not supported" list, as mentioned, I haven't looked into the TDMS file structure.

timestamp

My tdms files consist of an explicit timestamp channel (not implicit sampling information in waveforms).
As the NI timestamp is incompatible with Python types, npTDMS has to convert it, but I don't know why the basic datetime type is used instead of numpy datetime64. Consequently, a timestamp channel in npTDMS is not an array but a list, and TdmsWriter is not able to save it.

tdms = TdmsFile('vibration.tdms')
with TdmsWriter(outfile) as tdms_writer:
    tdms_writer.write_segment(tdms.group_channels('RMS'))

The code stops at the object <TdmsObject with path /'RMS'/'Timestamp'> with a list inside it:
{'_data': [datetime.datetime(2017, 6, 4, 14, 4, 28, 966494, tzinfo=),
datetime.datetime(2017, 6, 4, 14, 4, 29, 153948, tzinfo=), ...

    286
    287         try:
--> 288             array.tofile(file)
    289         except (TypeError, IOError, UnsupportedOperation):
    290             # tostring actually returns bytes

AttributeError: 'list' object has no attribute 'tofile'
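
A workaround sketch until npTDMS returns timestamps as numpy arrays: convert the list of datetime objects to a datetime64 array before writing (tzinfo is stripped first, since numpy rejects timezone-aware datetimes; the group and channel names match the example above):

import numpy as np

timestamps = tdms.object('RMS', 'Timestamp').data
timestamps64 = np.array([t.replace(tzinfo=None) for t in timestamps],
                        dtype='datetime64[us]')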

ValueError: Data size is not a multiple of the chunk size

I'm trying to open a 4.4GB tdms file, and I get the following error from tdmsinfo:

Traceback (most recent call last):
  File "/usr/bin/tdmsinfo", line 11, in <module>
    sys.exit(main())
  File "/usr/lib/python3.5/site-packages/nptdms/tdmsinfo.py", line 26, in main
    tdmsfile = tdms.TdmsFile(args.tdms_file)
  File "/usr/lib/python3.5/site-packages/nptdms/tdms.py", line 153, in __init__
    self._read_segments(tdms_file)
  File "/usr/lib/python3.5/site-packages/nptdms/tdms.py", line 166, in _read_segments
    tdms_file, self.objects, previous_segment)
  File "/usr/lib/python3.5/site-packages/nptdms/tdms.py", line 456, in read_metadata
    self.calculate_chunks()
  File "/usr/lib/python3.5/site-packages/nptdms/tdms.py", line 485, in calculate_chunks
    "chunk size %d" % (total_data_size, data_size))
ValueError: Data size 4435200000 is not a multiple of the chunk size 4000256

I saw that there are closed issues with the same error, but I just installed v0.7.1, so I am using the latest version.
The error appears on multiple files. Those files can be read with a LabVIEW program, so they are valid; LabVIEW also didn't crash while writing the files.

I'm on 64-bit Arch Linux with Python 3.5 and 32 GB of RAM.

Error: Data size is not a multiple of the chunk size

Hello,

I get an exception while reading our tdms file (a different one is read properly). You can get the file here: http://db.tt/B2s066i7

In [4]: f = nptdms.TdmsFile('74mbar_L2.tdms')

ValueError                                Traceback (most recent call last)
<ipython-input-4> in <module>()
----> 1 f = nptdms.TdmsFile('74mbar_L2.tdms')

/sw/python/2.7.4/lib/python2.7/site-packages/nptdms/tdms.pyc in __init__(self, file, memmap_dir)
    146         # Is path to a file
    147         with open(file, 'rb') as tdms_file:
--> 148             self._read_segments(tdms_file)
    149
    150     def _read_segments(self, tdms_file):

/sw/python/2.7.4/lib/python2.7/site-packages/nptdms/tdms.pyc in _read_segments(self, tdms_file)
    158                 break
    159             segment.read_metadata(tdms_file, self.objects,
--> 160                                   previous_segment)
    161
    162             self.segments.append(segment)

/sw/python/2.7.4/lib/python2.7/site-packages/nptdms/tdms.pyc in read_metadata(self, f, objects, previous_segment)
    362             obj._previous_segment_object = segment_obj
    363
--> 364         self.calculate_chunks()
    365
    366     def calculate_chunks(self):

/sw/python/2.7.4/lib/python2.7/site-packages/nptdms/tdms.pyc in calculate_chunks(self)
    389         if total_data_size % data_size != 0:
    390             raise ValueError("Data size %d is not a multiple of the "
--> 391                              "chunk size %d" % (total_data_size, data_size))
    392         else:
    393             self.num_chunks = total_data_size // data_size

ValueError: Data size 360000 is not a multiple of the chunk size 714000

Can this be somehow corrected?

Thank you in advance,
Jakub

Writing tdms files

Hi,

I'm trying to write tdms files that are copies of a loaded file, but with subsets of the loaded file data.
I figure this is the solution to the problem I raised in Issue 67, which I've closed. Essentially I am working with large files and need to make them smaller in order to work with them effectively.

So this is where I am at...

my loaded (300GB) tdms file has group "group" and channel "channel 1."

At the Python interpreter:

tdms_file = TdmsFile("bigFile.tdms", memmap_dir="./tmp")
root_object = tdms_file.object()
group_object = tdms_file.object('group')
channel_object = tdms_file.object("group", "channel 1")
new_channel = ChannelObject("group", "channel 1", channel_object.data[0:200], properties={})
with TdmsWriter("littleFile.tdms") as tdms_writer:
    tdms_writer.write_segment([root_object, group_object, new_channel])
Gives:
error: 'l' format requires -2147483648 <= number <= 2147483647

At the Python interpreter:
channel_object.data
Gives:
memmap([-2, 8, 0, ..., 12, 0, 0], dtype=int16)

At the Python interpreter:
new_channel.data[0:200]
looks pretty good:
memmap([ -2, 8, 0, 6, 6, -2, -2, -2, -2, 12, -2, 6, 6,
-2, -4, 2, 10, -4, 8, -6, 4, 4, 6, 6, 4, -2,
4, 0, 0, 0, -4, 0, 0, 2, 0, 4, 2, -4, -10,
-10, -12, -10, -8, -10, -16, -10, -4, -8, -2, 2, -8, -6,
0, -4, -6, 6, -6, 4, -6, -8, 6, -8, -4, -4, 0,
2, 0, 0, 2, 6, -2, -4, 0, 6, 8, -4, 12, 2,
8, 4, -2, -6, 6, 6, 2, -2, 8, -2, 6, 2, 2,
10, -2, 0, -2, 4, 0, -4, -10, 4, 4, 2, 2, 2,
-6, -10, -4, -4, -2, -10, -4, -2, -4, 4, -8, -4, 2,
-10, -4, 4, 2, -6, 4, 8, 0, 2, 10, 6, 6, 10,
4, 6, 6, -2, 10, 10, 14, 2, 16, 10, 10, 8, 8,
4, 12, 14, 6, 6, 4, 10, 12, 8, 10, 2, 6, 4,
4, 0, 2, 8, -2, 8, 0, 6, 6, 2, -2, 10, -10,
8, 4, 2, 0, 12, -6, -4, 2, 2, 8, 0, -4, 0,
8, 2, 0, 10, 4, 10, 4, 6, 6, 8, 2, 8, 4,
6, 10, 8, 10, 6], dtype=int16)

So it looks like the memmap values are not being passed into the TdmsWriter correctly?

Regards,

nt
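
A workaround worth trying (a sketch, not a confirmed fix): copy the memmapped slice into an ordinary in-memory array before constructing the ChannelObject, so the writer sees a plain ndarray rather than a memmap:

import numpy as np

# Force a copy from the memmap into a regular in-memory array
data_slice = np.array(channel_object.data[0:200])
new_channel = ChannelObject("group", "channel 1", data_slice, properties={})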

Larger TDMS file return errors

There seems to be an upper limit on how large a TDMS file I can extract information from with npTDMS.

I have several TDMS files, which were all generated from the same test system. They can all be opened using the Excel Importer, and the extraction through Excel returns no errors. The files of size 20.7 MB and 39 MB open fine, but the file of size 54 MB returns an error.

Are there some settings that I can change or suggestions about how to handle the larger files with npTDMS?

tdmsinfo bug.

I've recently tried to run tdmsinfo on a file created by LabVIEW and it crashed. If you look at the following little patch, you'll see why: if a channel is defined but no data is written to it, there is no data type defined.

diff --git a/nptdms/tdmsinfo.py b/nptdms/tdmsinfo.py
index 63e5b0b..60bb7b6 100644
--- a/nptdms/tdmsinfo.py
+++ b/nptdms/tdmsinfo.py
@@ -45,7 +45,8 @@ def main():
             display("%s" % channel.path, level)
             if args.properties:
                 level = 3
-                display("data type: %s" % channel.data_type.name, level)
+                if channel.data_type != None:
+                    display("data type: %s" % channel.data_type.name, level)
                 display_properties(channel, level)

And yes, I should probably learn to use GitHub properly, but I'm too old :-)

Speeding things up using C/C++?

Hi!

Some background: I'm working with a 270MB tdms file for debugging my programs. It takes about 4 seconds for my Python script to do all the work we need on that file. That's already a factor of 25 faster than the previous implementation.

I'm thinking about reimplementing this library (or parts of it, mostly the loading/reading of the file) in C or C++. At first it will be an experiment (am I able to do this? Will it speed things up?).

Would you accept a C/C++ reimplementation, given that the API will stay exactly the same?

Version numbering

Hi Adam,

Some of my team members are using version 0.6.2, but when I compare tdms.py the copies are all different.
Would it be possible to add a patch level so that we can see the difference immediately?
I suggest setting the version number now to 0.6.3.0000 or similar, as a starting point for synching.

Thanks,

Bob

TdmsObject.has_data is not correctly updated.

has_data can and sometimes will change from segment to segment and therefore must be updated whenever metadata is read. I suggest simply doing the following:

diff --git a/nptdms/tdms.py b/nptdms/tdms.py
index a3d27a1..4121537 100644
--- a/nptdms/tdms.py
+++ b/nptdms/tdms.py
@@ -421,6 +421,7 @@ class TdmsObject(object):
         elif self.raw_data_index == 0x00000000:
             pass
         else:
+            self.has_data = True
             index_length = self.raw_data_index

             # Read the data type

Thank you for npTDMS.

Writing to TDMS Files?

There is exactly one Python module right now, cTDMS (which uses the NI DLL), that can write TDMS files. Would it be possible to implement writing TDMS files in npTDMS? Or is there some constraint (technical or otherwise) that I'm missing?

I have no problem implementing it myself, if there is no issue in doing it. I just don't want to spend a lot of effort and then bang up against some insurmountable roadblock that I don't know about. Any pointers?

TdmsFile.object('xxx') cannot access root object

This is a little nit, as it can be worked around by direct dictionary access, but just as a 'minor' functionality bug it should be noted that access to the root tdms object is impossible through the class interface.

Example:

tdms_root = f.objects['/'] - works
tdms_root = f.object('') - doesn't work (nor other variants to try and access '/')

This was important on my project, as I needed the properties of the root object. Either way, this is pretty trivial. Thank you very much for this library; it has saved me a ton of time!

crash on nbsp

The library is unable to read a tdms file when there is a non-breaking space (0xff) in a property value. I got the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 196: invalid start byte

This bug does not occur with @rubdos' C++ implementation.
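
A lenient decoding fallback along these lines would avoid the crash (a sketch of the idea, not the library's current behaviour):

def decode_property_value(raw_bytes):
    # Try strict UTF-8 first; latin-1 maps every byte (including 0xff),
    # so the fallback can never raise
    try:
        return raw_bytes.decode('utf-8')
    except UnicodeDecodeError:
        return raw_bytes.decode('latin-1')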

tdms.TdmsFile("path_to_file.tdms") fails.

This seems to be due to an error at line 97 of version 0.1.1 (object has no attribute 'file').

97 with open(self.file, 'rb') as tdms_file:

should read:

97 with open(file, 'rb') as tdms_file:

Sorry for my poor bug reporting skills here.

David

Add Support for 'tdsTypeDAQmxRawData'

Hi guys,

I tried loading a TDMS file generated using DAQmx file streaming; the data is stored as tdsTypeDAQmxRawData.

This throws the ValueError: Unsupported data type: tdsTypeDAQmxRawData exception. Looking at the source there is a section:

tdsDataTypes.update({
    ...
    0xFFFFFFFF: DataType('tdsTypeDAQmxRawData', None, None, None)
})

Would it be easy to add support for the data? pyTDMS also throws an exception when parsing this file so I wonder if there is something non-trivial about it.

Scale Data not being applied

Hello,

I have been using npTDMS for a while now, and I just realized that if I have only one scaling entry in the metadata of a channel it does not get applied to the raw data. I saw in the code that a 1 index is used, while LabVIEW starts at zero.

Thank you,

Charles

tdmsinfo error

Hello,
npTDMS is a great project and I am using it frequently. Under Linux I use tdmsinfo to read data file properties, and I have also made my own script to write the properties I need into a dat file. However, I have found that with my larger files tdmsinfo always reports an error as follows:

Traceback (most recent call last):
  File "/home/atedalv/.local/bin/tdmsinfo", line 11, in <module>
    sys.exit(main())
  File "/home/atedalv/.local/lib/python2.7/site-packages/nptdms/tdmsinfo.py", line 29, in main
    root = tdmsfile.object()
  File "/home/atedalv/.local/lib/python2.7/site-packages/nptdms/tdms.py", line 155, in object
    raise KeyError("Invalid object path: %s" % object_path)
KeyError: 'Invalid object path: /'

For an identical file but with less data (smaller acquisition time) everything works fine. This is the same whether npTDMS is installed using pip or pip3, with sudo or --user. Could you please help me with this issue?

Refactor Segment metadata reader

@adamreeve and @jshridha, Maybe this is too low level for an issue but I'm not sure where else to log it.
I've been working on the pull request for #24 and feel like the segment metadata reading code is ripe for refactoring. There are essentially three types of metadata:

  • Legacy metadata (not raw format)
  • DAQmx raw format metadata (the stuff I am working on for #24)
  • Segment metadata with only properties (other data repeats from previous segment)

I envision a property-only metadata class as the base class, maybe a base class for the other two formats covering the parts they have in common, and then two derived classes, one for the legacy format and the other for the raw format.

This would refactor the _TdmsSegmentObject._read_metadata method's big if-then-else structure (which is effectively a case statement on the raw_data_index value).

I'm proposing only refactoring ... no change in functionality. Ideally we would have more sample TDMS files to test the new DAQmx raw data format. Because of the limited test data I expect that over time we'll get a couple bug reports and have to apply some updates. This refactoring will make that process easier for everyone.
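
A minimal sketch of the proposed hierarchy (all class names hypothetical):

class PropertyOnlySegmentObject(object):
    """Reads only properties; raw data layout repeats a previous segment."""

class RawDataSegmentObject(PropertyOnlySegmentObject):
    """Base for the two raw data formats, covering their common parts."""

class LegacySegmentObject(RawDataSegmentObject):
    """Standard (non-DAQmx) raw data metadata."""

class DaqmxSegmentObject(RawDataSegmentObject):
    """DAQmx raw format metadata (#24)."""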

What do you think?

IOError from np.fromfile() when running tests

Having an issue running the test suite - this is running in a venv with numpy 1.9.2 and pandas 0.16.1 installed.


Error
Traceback (most recent call last):
  File "C:\Users\pduncan\AppData\Local\Continuum\Anaconda\lib\unittest\case.py", line 329, in run
    testMethod()
  File "C:\Users\pduncan\PycharmProjects\npTDMS\nptdms\test\tdms_test.py", line 178, in test_data_read
    tdmsData = test_file.load()
  File "C:\Users\pduncan\PycharmProjects\npTDMS\nptdms\test\tdms_test.py", line 77, in load
    return tdms.TdmsFile(self.file, *args, **kwargs)
  File "C:\Users\pduncan\PycharmProjects\npTDMS\nptdms\tdms.py", line 149, in __init__
    self._read_segments(file)
  File "C:\Users\pduncan\PycharmProjects\npTDMS\nptdms\tdms.py", line 183, in _read_segments
    segment.read_raw_data(tdms_file)
  File "C:\Users\pduncan\PycharmProjects\npTDMS\nptdms\tdms.py", line 537, in read_raw_data
    obj._read_values(f, endianness))
  File "C:\Users\pduncan\PycharmProjects\npTDMS\nptdms\tdms.py", line 859, in _read_values
    return np.fromfile(file, dtype=dtype, count=self.number_values)
IOError: first argument must be an open file
-------------------- >> begin captured logging << --------------------
nptdms.tdms: DEBUG: Reading segment at 0
nptdms.tdms: DEBUG: Property kTocDAQmxRawData is False
nptdms.tdms: DEBUG: Property kTocNewObjList is True
nptdms.tdms: DEBUG: Property kTocBigEndian is False
nptdms.tdms: DEBUG: Property kTocInterleavedData is False
nptdms.tdms: DEBUG: Property kTocMetaData is True
nptdms.tdms: DEBUG: Property kTocRawData is True
nptdms.tdms: DEBUG: Reading metadata at 28
nptdms.tdms: DEBUG: Creating a new segment object
nptdms.tdms: DEBUG: Reading metadata for object /'Group'
nptdms.tdms: DEBUG: Object has no data in this segment
nptdms.tdms: DEBUG: Reading 2 properties
nptdms.tdms: DEBUG: Property prop (tdsTypeString): value
nptdms.tdms: DEBUG: Property num (tdsTypeI32): 10
nptdms.tdms: DEBUG: Creating a new segment object
nptdms.tdms: DEBUG: Reading metadata for object /'Group'/'Channel1'
nptdms.tdms: DEBUG: Object data type: tdsTypeI32
nptdms.tdms: DEBUG: Object number of values in segment: 2
nptdms.tdms: DEBUG: Reading 0 properties
nptdms.tdms: DEBUG: Creating a new segment object
nptdms.tdms: DEBUG: Reading metadata for object /'Group'/'Channel2'
nptdms.tdms: DEBUG: Object data type: tdsTypeI32
nptdms.tdms: DEBUG: Object number of values in segment: 2
nptdms.tdms: DEBUG: Reading 2 properties
nptdms.tdms: DEBUG: Property wf_start_offset (tdsTypeDoubleFloat): 0.0
nptdms.tdms: DEBUG: Property wf_increment (tdsTypeDoubleFloat): 0.1
nptdms.tdms: INFO: Read metadata: Took 3.00894216235 ms
nptdms.tdms: INFO: Allocate space: Took 0.060221352437 ms
nptdms.tdms: DEBUG: Reading 16 bytes of data at 241 in 1 chunks
nptdms.tdms: DEBUG: Data is contiguous
nptdms.tdms: INFO: Read data: Took 0.273121545464 ms
--------------------- >> end captured logging << ---------------------

Use numpy memmapped arrays for large datasets

I have many TDMS files that will not fit into RAM. Instead of using np.fromfile, use np.memmap and access data like channel.data[:] for all of the values or channel.data[:100] when getting slices (this would be similar to the h5py interface).
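
A sketch of the idea on the numpy side, assuming the absolute offset of a contiguous data block and its dtype and sample count are already known from the metadata (the values below are placeholders):

import numpy as np

raw_data_offset = 4096   # placeholder: taken from segment metadata
num_samples = 1000000    # placeholder: total samples for the channel

data = np.memmap("big_file.tdms", dtype=np.int16, mode="r",
                 offset=raw_data_offset, shape=(num_samples,))
print(data[:100])  # only these values are actually read from disk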

Precision lost in polynomial scaling

The polynomial coefficients are recast as plain floats (in enumerate??). I revised the code in scaling.py for PolynomialScaling.scaling to

scaled_data = np.polynomial.polynomial.polyval(data, self.coefficients)

and my data is now full precision (and the same as the output from the proprietary NI conversion software).
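
For reference, np.polynomial.polynomial.polyval takes coefficients in increasing order of power, so c[0] is the constant term:

import numpy as np

coefficients = [1.0, 2.0, 0.5]            # example: 1 + 2x + 0.5x^2
data = np.array([0.0, 1.0, 2.0])
scaled = np.polynomial.polynomial.polyval(data, coefficients)
# scaled == [1.0, 3.5, 7.0]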

nptdms.TdmsFile crashes when opening large file

I have a very large TDMS file (220 MB) that I am trying to open using nptdms; when I run a script which only includes the following lines, Python (or IPython) crashes most spectacularly.

import nptdms
inputFileNameString = "HAC-20141017-093246.tdms"
tdmsFile = nptdms.TdmsFile(inputFileNameString)

I can open smaller files without a problem using npTDMS, and can also open large files in other applications (more specifically, convertTDMS.m, available from the MathWorks MATLAB user site at http://www.mathworks.com/matlabcentral/fileexchange/44206-converttdms--v10-). The resulting errors are listed below. Please let me know if this is a problem that can be fixed.

I am running Python 2.7.3, 32 bit, on a Windows 7 machine.

Traceback (most recent call last):
  File "dataplot.py", line 5, in <module>
    myDataFrame = myFunc2.loadTDMSDataFrame(inputFileNameString)
  File "C:\Folder\myFunc2.py", line 41, in loadTDMSDataFrame
    tdmsFile = nptdms.TdmsFile(inputFileNameString)
  File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 148, in __init__
    self._read_segments(tdms_file)
  File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 160, in _read_segments
    previous_segment)
  File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 369, in read_metadata
    segment_obj._read_metadata(f)
  File "C:\Python27\lib\site-packages\nptdms\tdms.py", line 683, in _read_metadata
    log.debug("Reading %d properties" % num_properties)
MemoryError

Problem with "as_frame"

Hi,
This script used to work,

#!/usr/bin/env python
from nptdms import TdmsFile
import matplotlib.pyplot as plt
from scipy import signal
from numpy import linspace

tdms_file = TdmsFile('cRIO.tdms')

frame=tdms_file.as_dataframe()
print frame.columns

after a while, and switching to Ubuntu 16.04 (probably unrelated), I get this error:

./plot_tdms.py
WARNING:nptdms.tdms:Last segment of file has unknown size, not attempting to read it
Traceback (most recent call last):
  File "./plot_tdms.py", line 9, in <module>
    frame = tdms_file.as_dataframe()
  File "/usr/local/lib/python2.7/dist-packages/nptdms/tdms.py", line 225, in as_dataframe
    temp.append((key, pd.Series(data=value.data, index=index)))
  File "/usr/local/lib/python2.7/dist-packages/nptdms/tdms.py", line 716, in data
    scale = scaling.get_scaling(self)
  File "/usr/local/lib/python2.7/dist-packages/nptdms/scaling.py", line 29, in get_scaling
    return next(s for s in scalings if s is not None)
  File "/usr/local/lib/python2.7/dist-packages/nptdms/scaling.py", line 29, in <genexpr>
    return next(s for s in scalings if s is not None)
  File "/usr/local/lib/python2.7/dist-packages/nptdms/scaling.py", line 27, in <genexpr>
    scalings = (_get_object_scaling(o) for o in _tdms_hierarchy(channel))
  File "/usr/local/lib/python2.7/dist-packages/nptdms/scaling.py", line 51, in _get_object_scaling
    log.warning("Unsupported scale type: %s", scale_type)
NameError: global name 'log' is not defined

My data file indeed contains the unsupported scale type ("Strain").
I don't think that should generate a NameError though?
If I change "log" to "logging" in scaling.py, the warning is simply printed instead.

Any suggestions?
Thanks
//Erik

Zero channel data size but non-zero data length based on segment offset

I've got some valid data that I can open in LabVIEW but can't open using npTDMS.
I get the error message "ValueError: Zero channel data size but non-zero data length based on segment offset"

I've tried doing some troubleshooting but nothing seems to be clarifying the situation yet.
These are the two files I've been trying:

File Size       total_data_size
86,090 KB       88018944
5,604,870 KB    5738590208

Java Version of npTDMS

Hi Adam,

I sent an email to your old university email address, but thought I'd also attempt to contact you here. I apologize if neither is your preferred method of communication -- just let me know what is and I'll happily use it.

I am considering the development of a Java version of npTDMS, and I wanted to know if you would be able to provide some insight and advice if I do (or perhaps even contribute).

Feature Request - Add properties to TdmsObject for group and channel

First, great library. Thank you very much. Can you please add properties for channel and group to TdmsObject? It would be useful to get those conveniently when iterating over all objects. Currently I have ugly code to get those from the full tdms path. If an object is a group only and not a channel, perhaps channel could be None. Thanks!
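
For reference, the "ugly code" can at least be kept small. A sketch of splitting a TDMS path (this naive version doesn't handle names containing '/' or escaped quotes):

def split_path(path):
    # "/'group'/'channel'" -> ("group", "channel"); ("group", None) for a group
    parts = [p.strip("'") for p in path.split("/") if p]
    group = parts[0] if parts else None
    channel = parts[1] if len(parts) > 1 else None
    return group, channel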

Licensing

Hi,

Small but important thing I noticed when adding the LGPLv3 license to my TDMS C++ library.
You have included the LGPL additional permissions in the file LICENSE.txt.

To quote "How to use GNU licenses for your own software" by the FSF:

If you are releasing your program under the LGPL, you should also include the text version of the LGPL, usually in a file called COPYING.LESSER. Please note that, since the LGPL is a set of additional permissions on top of the GPL, it's important to include both licenses so users have all the materials they need to understand their rights.

I suggest doing mv LICENSE.txt COPYING.LESSER and wget https://www.gnu.org/licenses/gpl.txt -O COPYING and committing that.

Option for getting channel data as dataframe

You can get the full tdms file as a Pandas DataFrame, but you can't do as_dataframe on a channel (which you get by TdmsFile.object('Group', 'Channel')).

Is there any interest in making that work? I need this pretty soon. If this isn't implemented yet, I'd give it a go.
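
In the meantime a single channel can be wrapped manually (a sketch using the object API from this issue; the file name is a placeholder):

import pandas as pd
from nptdms import TdmsFile

tdms_file = TdmsFile('example.tdms')  # placeholder file name
channel = tdms_file.object('Group', 'Channel')
df = pd.DataFrame({channel.path: channel.data})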
