Nanocubes: an in-memory data structure for spatiotemporal data cubes
Nanocubes are a fast datastructure for in-memory data cubes developed at the Information Visualization department at AT&T Labs โ Research. Visualizations powered by nanocubes can be used to explore datasets with billions of elements at interactive rates in a web browser, and in some cases nanocubes uses sufficiently little memory that you can run a nanocube in a modern-day laptop.
Releases
Number | Description |
---|---|
2.1 | Javascript front-end, CSV Loading, Bug Fixes |
2.0 | New feature-rich querying API |
1.0 | Original release with a simple querying API |
Compiling latest release
Prerequisites The nanocubes server is written in C++ 11 (gcc >=4.7.2). You'll need a recent version of boost (>=1.48) and the GNU build system.
Run the following commands to compile nanocubes on your linux/mac system.
$ wget https://github.com/laurolins/nanocube/archive/2.1.zip
$ unzip 2.1.zip
$ cd nanocube-2.1
$ ./bootstrap
$ ./configure
$ make
If a recent version of gcc is not the default, you can run configure
with the specfic recent version of gcc in your system. For example
$ CXX=g++-4.8 ./configure
If your system has tcmalloc installed, we suggest linking nanocubes with it. For example
$ CXX=g++-4.8 ./configure LIBS=<path-to-tcmalloc>/libtcmalloc_minimal.a
Loading a CSV file into a nanocube
This procedure assumes your machine is not running anything on ports 8000 and 29512
-
Installing Pandas in a separate python env (Optional)
$ wget http://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.11.4.tar.gz $ tar xfz virtualenv-1.11.4.tar.gz $ python virtualenv-1.11.4/virtualenv.py myPy # activate the virtualenv, type "deactivate" to disable the env $ source myPy/bin/activate $ pip install argparse numpy pandas
-
Start a web server in the "web" directory and send it to background
$ cd web $ python -m SimpleHTTPServer &
-
Run the script and pipe it to the nanocubes server using the included example dataset (Chicago Crime)
$ cd ../scripts $ python csv2Nanocube.py --catcol='Primary Type' crime50k.csv | NANOCUBE_BIN=../src ../src/ncserve --rf=100000 --threads=100
This should be it. Point your browser to http://localhost:8000 for the viewer
Parameters for csv2Nanocube.py
--catcol='Column names of categorical variable'
--latcol='Column names of latitude'
--loncol='Column names of longitude'
--countcol='Column names of the count'
--timecol='Column names of the time variable'
--timebinsize='time bin size in seconds(s) minutes(m) hours(h) days(D)
--sep='Delimiter of Columns'
e.g. 1D/30m/60s'
Further Details
For a better understanding on how to ingest data into nanocubes and how to query nanocubes follow this link. For larger datasets or if you want more flexibility on ingesting/querying data using nanocubes the CSV loading method illustrated above might not be the most efficient way to go.
Asking for help
IRC: #nanocubes channel on irc.freenode.net
mailing list: http://mailman.nanocubes.net/mailman/listinfo/nanocubes-discuss_mailman.nanocubes.net