old-dask-examples's Introduction

Dask examples

This project is no longer maintained. See the examples documentation instead.

Collection of dask examples

Binder

Some of these notebooks are live-runnable thanks to Binder.

How to update

Anyone can update the notebooks on Binder at any time: go to http://mybinder.org/, enter dask/dask-examples as the repository, select requirements.txt, and press Make.

old-dask-examples's People

Contributors

adamchainz, chdoig, cowlicks, glemaitre, jcrist, michhar, mrocklin

old-dask-examples's Issues

Archived?

Should this repo be archived as it is deprecated?

ImportError in Binder timeseries example

With the Python 2 interpreter, the Cumulative Sum, datetime resampling, and plotting section of the time-series-binder notebook raises the following ImportError when run in Binder:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-5-0791d5c4aecd> in <module>()
----> 1 df.A.cumsum().resample('1w', how='mean').compute().plot()

/home/main/anaconda2/lib/python2.7/site-packages/dask/dataframe/core.pyc in resample(self, rule, how, closed, label)
    980     @derived_from(pd.Series)
    981     def resample(self, rule, how=None, closed=None, label=None):
--> 982         from .tseries.resample import _resample
    983         return _resample(self, rule, how=how, closed=closed, label=label)
    984 

ImportError: No module named tseries.resample

Extend nyctaxi live binder example

Could someone do some data science on the NYC Taxi dataset that we can host on dask-examples with Binder? This notebook could use some love: https://github.com/blaze/dask-examples/blob/master/nyctaxi-2013.ipynb. This work could include the following:

  1. Think about analyses to do (or borrow analyses from previous work, of which there is quite a lot)
  2. Create visualizations (perhaps with bokeh)
  3. Think about how these analyses can be more fun for interactive users to play with (e.g. what are good examples where people can tweak parameters and get novel results)
  4. Work with anaconda cluster folk to add more datasets (like fare-2013 or the new datasets for 2014/2015)
  5. Compare differences between the old and new data

dask-array-basics

Hello,

Just wanted to let you know that while running the dask-array-basics.ipynb, I got the following error:

$ ipython
Python 2.7.11 (default, Dec 15 2015, 16:46:19) 
Type "copyright", "credits" or "license" for more information.

IPython 4.0.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import dask.array as da

In [2]: x = da.fromfunction(lambda i, j, k: i + j + k**2, chunks=(5, 512, 512), shape=(100, 2048, 2048), dtype='f8')

In [3]: x.to_hdf5('myfile.hdf5', '/x', compression='lzf', shuffle=True)

In [4]: import h5py

In [5]: f = h5py.File('myfile.hdf5')

In [6]: dset = f['/x']

In [7]: dset
Out[7]: <HDF5 dataset "x": shape (100, 2048, 2048), type "<f8">

In [8]: dset.chunks
Out[8]: (5, 512, 512)

In [9]: import dask.array as da

In [10]: x = da.from_array(dset, chunks=(5, 512, 512))

In [11]: x
Out[11]: dask.array<from-ar..., shape=(100, 2048, 2048), dtype=float64, chunksize=(5, 512, 512)>

In [12]: a = x[0, :5, :5]

In [13]: b = x[:, :5, :5]

In [14]: c = a - b.mean()

In [15]: c.compute()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-15-0c5d552d1b01> in <module>()
----> 1 c.compute()

/usr/local/lib/python2.7/dist-packages/dask/base.pyc in compute(self, **kwargs)
     32 
     33     def compute(self, **kwargs):
---> 34         return compute(self, **kwargs)[0]
     35 
     36     @classmethod

/usr/local/lib/python2.7/dist-packages/dask/base.pyc in compute(*args, **kwargs)
    100                 for opt, val in groups.items()])
    101     keys = [arg._keys() for arg in args]
--> 102     results = get(dsk, keys, **kwargs)
    103     return tuple(a._finalize(a, r) for a, r in zip(args, results))
    104 

/usr/local/lib/python2.7/dist-packages/dask/threaded.pyc in get(dsk, result, cache, num_workers, **kwargs)
     55     results = get_async(pool.apply_async, len(pool._pool), dsk, result,
     56                         cache=cache, queue=queue, get_id=_thread_get_id,
---> 57                         **kwargs)
     58 
     59     return results

/usr/local/lib/python2.7/dist-packages/dask/async.pyc in get_async(apply_async, num_workers, dsk, result, cache, queue, get_id, raise_on_exception, rerun_exceptions_locally, callbacks, **kwargs)
    481                 _execute_task(task, data)  # Re-execute locally
    482             else:
--> 483                 raise(remote_exception(res, tb))
    484         state['cache'][key] = res
    485         finish_task(dsk, key, state, results, keyorder.get)

AttributeError: 'thread._local' object has no attribute 'astype'

Traceback
---------
  File "/usr/local/lib/python2.7/dist-packages/dask/async.py", line 263, in execute_task
    result = _execute_task(task, data)
  File "/usr/local/lib/python2.7/dist-packages/dask/async.py", line 244, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/usr/local/lib/python2.7/dist-packages/dask/async.py", line 245, in _execute_task
    return func(*args2)
  File "/usr/local/lib/python2.7/dist-packages/dask/array/core.py", line 48, in getarray
    c = a[b]
  File "/usr/lib/python2.7/dist-packages/h5py/_hl/dataset.py", line 367, in __getitem__
    if self._local.astype is not None:

My system:

$ uname -a
Linux lealpc 3.13.0-74-generic #118-Ubuntu SMP Thu Dec 17 22:52:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
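
One plausible reading of the traceback above is that the h5py dataset keeps per-thread state in `self._local`, sets `astype` on it only in the thread that created the dataset, and is then read from a dask worker thread where that attribute was never set. This is an assumption drawn from the error message, not a confirmed diagnosis. The sketch below reproduces the same failure mode with the standard library only; `Holder` and `read_astype` are hypothetical names used for illustration.

```python
import threading


class Holder:
    """Hypothetical stand-in for an object that stores per-thread state."""

    def __init__(self):
        self._local = threading.local()
        # This attribute exists only in the thread that runs __init__.
        self._local.astype = None


def read_astype(holder, out):
    """Try to read the thread-local attribute and record what happened."""
    try:
        out.append(("ok", holder._local.astype))
    except AttributeError as exc:
        out.append(("error", str(exc)))


holder = Holder()

# Reading in the creating thread works.
main_result = []
read_astype(holder, main_result)

# Reading in a different thread fails: thread-local attributes are not shared.
worker_result = []
worker = threading.Thread(target=read_astype, args=(holder, worker_result))
worker.start()
worker.join()

print(main_result)
print(worker_result)
```

If this is indeed the cause, running the computation in a single thread or upgrading h5py may avoid the error, though neither was verified against the versions reported here.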

No module named castra

Evaluating the following notebooks, either in Binder or locally on a desktop, fails with an error stating that no module named castra was found:

  • "Time Series and Dask DataFrame"
  • "NYC Taxi 2013"
