hpc4cmb / tidas
TIDAS (TImestream DAta Storage)
License: Other
Under two different OS X toolchains (clang++ / mpich from conda, and gcc-mp / mpi from MacPorts), the Python MPI unit tests produce a segmentation fault.
Currently the Python unit tests are ad-hoc. The C++ unit tests use gtest; the Python unit tests should use the built-in unittest package, with a custom runner invoked by the tidas.test() method.
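For reference, a minimal sketch of what a unittest-based tidas.test() entry point could look like. The test case and all names below are placeholders for illustration, not the actual tidas test suite:

```python
import unittest

# Tests written against the standard unittest package, collected into a
# suite and executed by a runner that a tidas.test() helper could wrap.
# PlaceholderTest is illustrative only.

class PlaceholderTest(unittest.TestCase):
    def test_trivial(self):
        self.assertEqual(1 + 1, 2)

def test(verbosity=1):
    """Shape of a tidas.test() entry point built on unittest."""
    loader = unittest.TestLoader()
    suite = loader.loadTestsFromTestCase(PlaceholderTest)
    runner = unittest.TextTestRunner(verbosity=verbosity)
    return runner.run(suite)

result = test()
print(result.wasSuccessful())
```

In the real package, loader.discover() on the installed tests directory would replace the explicit loadTestsFromTestCase call.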
Before release, we need:
We should enable Travis CI integration. There are relatively few dependencies, so this should be easy.
The C and Python bindings do not yet have all the functionality of the C++ interface. This should be fixed eventually.
Currently the selection of the backend format (e.g. HDF5) and all options to that backend (compression, chunk size, etc.) are set at the volume level and applied to every group within the volume. This is not ideal, since different kinds of data will require different options. Each object already stores all of its backend options in its location (backend_path object). We should devise a way to set these on a per-object basis. One possibility that would be easy to implement is a selection string specifying which objects receive a particular configuration.
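The selection-string idea could be as simple as glob matching on object paths. A sketch, where the option names and object paths are invented for the example:

```python
import fnmatch

# Apply one set of backend options (e.g. HDF5 compression / chunk size) to
# every object whose path in the volume matches a glob-style pattern.

def apply_backend_options(object_paths, pattern, options, current=None):
    """Return a dict mapping each matching object path to its backend options."""
    current = dict(current or {})
    for path in object_paths:
        if fnmatch.fnmatch(path, pattern):
            current[path] = dict(options)
    return current

paths = ["/det/group_a", "/det/group_b", "/housekeeping/group_c"]
opts = apply_backend_options(
    paths, "/det/*", {"compression": "gzip", "chunksize": 4096}
)
print(sorted(opts))
```

A real implementation would attach the resolved options to each object's backend_path rather than returning a dict, but the matching logic would be the same.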
Installed tidas from GitHub with autoconf, MPI disabled, and the default prefix.
Running Ubuntu 16.04, Python 2.7.12, NumPy 1.12.0.
When trying to import tidas from Python 2.7:
Traceback (most recent call last):
  File "./demo_telescope.py", line 29, in <module>
    import tidas
  File "/usr/local/lib/python2.7/dist-packages/tidas/__init__.py", line 17, in <module>
    from .ctidas import (
  File "/usr/local/lib/python2.7/dist-packages/tidas/ctidas.py", line 153, in <module>
    npu8 = wrapped_ndptr(dtype=np.uint8, ndim=1, flags="C_CONTIGUOUS")
  File "/usr/local/lib/python2.7/dist-packages/tidas/ctidas.py", line 146, in wrapped_ndptr
    base = npc.ndpointer(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/numpy/ctypeslib.py", line 288, in ndpointer
    num = _num_fromflags(flags)
  File "/usr/local/lib/python2.7/dist-packages/numpy/ctypeslib.py", line 163, in _num_fromflags
    num += flagdict[val]
KeyError: u''
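As a reproduction aid, the failing call can be isolated from tidas entirely. On a healthy NumPy install the same ndpointer arguments succeed; the traceback above shows them raising KeyError when an empty flag value reaches _num_fromflags. This snippet is for bisecting the environment, not a fix:

```python
import numpy as np
import numpy.ctypeslib as npc

# Standalone use of ndpointer with the same arguments as the failing call
# in ctidas.py line 153.
npu8 = npc.ndpointer(dtype=np.uint8, ndim=1, flags="C_CONTIGUOUS")

# from_param accepts a matching C-contiguous array and rejects others.
arr = np.zeros(4, dtype=np.uint8)
print(npu8.from_param(arr) is not None)
```

If this snippet also fails, the problem is in the NumPy/Python installation rather than in tidas itself.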
The docs use Sphinx, and the C++ code is first passed through Doxygen before importing the results using the "breathe" plugin for Sphinx. For some reason, using the doxygenclass directive from breathe emits both the C++ class docs AND a poorly formatted copy of the corresponding Python class. See for example the source here:
https://github.com/hpc4cmb/tidas/blob/master/docs/sphinx/group.rst
and the generated output here:
http://hpc4cmb.github.io/tidas/group.html
Looking at the HTML source, it seems the div ID for the constructor of the C++ class is getting mixed up with the Python class. This makes the output ugly and should be fixed.
Although the read and broadcast of metadata seem to work, the MPI gather and replay of transactions appears to have a problem.
The SQLite index is so slow that it is a blocking factor. Looking through the code, there are several key mistakes:
If the MPI compilers are not found, the configure check assigns MPICC and MPICXX to the serial compilers rather than disabling MPI.
Currently the read_times() and write_times() methods force I/O of the entire timestamp vector. This should be changed to support partial I/O, just like normal fields.
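A sketch of what partial timestamp I/O could look like, mirroring how normal fields support reading a sub-range. The (offset, n) signature is an assumption about the extended API, and a plain list stands in for the backend storage:

```python
# Hypothetical partial-I/O interface for timestamps.

class TimesStub:
    def __init__(self, nsamp):
        self._times = [0.0] * nsamp

    def write_times(self, data, offset=0):
        """Write len(data) timestamps starting at sample offset."""
        self._times[offset:offset + len(data)] = data

    def read_times(self, offset=0, n=None):
        """Read n timestamps starting at offset (all remaining if n is None)."""
        end = None if n is None else offset + n
        return self._times[offset:end]

ts = TimesStub(10)
ts.write_times([1.0, 2.0, 3.0], offset=4)
print(ts.read_times(offset=4, n=3))
```

Defaulting offset and n keeps the current whole-vector behavior available as read_times() with no arguments.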
The serial volume class uses an SQLite DB as its metadata store. When adding new objects to the volume, this DB is updated. Since we can't predict what the user is doing, each object insertion is a discrete SQL transaction. This takes a fraction of a second, but when first creating a volume and adding thousands of objects, this can be slow.
Note that this does not impact the MPI volume, since in that case metadata operations are stored locally in memory (very fast) and replayed to the main SQLite DB during a metadata sync using a single transaction (also very fast).
One solution is to also use an in-memory metadata store even in the serial case with explicit sync to disk.
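The in-memory approach can be sketched with the sqlite3 module directly: batch all insertions into an in-memory database, then sync to disk in one transaction using the backup API (Python 3.7+). The table layout here is invented for the example:

```python
import os
import sqlite3
import tempfile

# Build the metadata index in memory; no per-insert disk transaction cost.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE objects (path TEXT PRIMARY KEY, kind TEXT)")

# Thousands of object insertions, all within a single in-memory transaction.
with mem:
    mem.executemany(
        "INSERT INTO objects VALUES (?, ?)",
        [("/block_%04d" % i, "block") for i in range(1000)],
    )

# Explicit sync: replay the whole in-memory DB to disk in one shot.
path = os.path.join(tempfile.mkdtemp(), "volume_index.db")
disk = sqlite3.connect(path)
with disk:
    mem.backup(disk)

count = disk.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
print(count)
```

This mirrors what the MPI volume already does: fast local metadata operations, replayed to the on-disk SQLite DB in a single transaction at sync time.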
Create C bindings, needed for the Python and Fortran bindings.