Comments (5)
For a performance comparison with "pure" pymongo, see:
In [234]: %time df_retrieved = pd.DataFrame(list(db.ticks.find()))
CPU times: user 39.6 s, sys: 27.1 s, total: 1min 6s
Wall time: 1min 21s
In [236]: df_retrieved
Out[236]:
             Ask      Bid   Spread  Volume                       _id
0        0.88922  0.88796  0.00126       1  567c324fcc9915206eb18cc8
1        0.88914  0.88805  0.00109       1  567c324fcc9915206eb18cc9
2        0.88910  0.88809  0.00101       1  567c324fcc9915206eb18cca
3        0.88908  0.88811  0.00097       1  567c324fcc9915206eb18ccb
4        0.88887  0.88808  0.00079       1  567c324fcc9915206eb18ccc
...          ...      ...      ...     ...                       ...
1913358  0.87589  0.87525  0.00064       1  567c32b1cc9915206ecebed6
1913359  0.87589  0.87527  0.00062       1  567c32b1cc9915206ecebed7
1913360  0.87588  0.87531  0.00057       1  567c32b1cc9915206ecebed8
1913361  0.87574  0.87531  0.00043       1  567c32b1cc9915206ecebed9
1913362  0.87574  0.87531  0.00043       1  567c32b1cc9915206ecebeda
[1913363 rows x 5 columns]
from arctic.
So we should use it to store more tick data per record, as a pandas DataFrame, right?
from arctic.
Let's use the same file for benchmarking: https://drive.google.com/file/d/0B8iUtWjZOTqla3ZZTC1FS0pkZXc/view?usp=sharing
See also pydata/pandas-datareader#153
I wonder whether they (the manahl Arctic dev team) should use Monary instead of pymongo.
https://github.com/ksuarz/monary https://monary.readthedocs.org/
Read this: https://pypi.python.org/pypi/Monary/0.4.0.post2
"It is possible to get (much) more speed from the query if we bypass the PyMongo
driver. To demonstrate this, I've developed *monary*, a simple C library and
accompanying Python wrapper which make use of MongoDB C driver."
see https://bitbucket.org/djcbeach/monary/issues/19/use-pandas-series-dataframe-and-panel-with
from arctic.
I think there's quite a lot of overlap between what Monary does and Arctic.
Monary makes it fast to marshal primitive types (NumPy ints, floats, etc.) into and out of MongoDB. We do something similar, except that we also compress and batch on the client side. A lot of the win (in network and disk I/O terms) comes from financial data being highly compressible. Because we batch in the client, we end up performing few pymongo operations relative to the number of ticks/rows.
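To make the batching and compression point concrete, here is a minimal sketch using only the standard library. It is not Arctic's actual storage format: `pack_batch`, `unpack_batch`, `BATCH_SIZE`, and the synthetic random-walk prices are all made up for illustration, and `zlib` stands in for whichever codec the library really uses.

```python
import random
import struct
import zlib


def pack_batch(prices):
    """Pack a list of floats into one zlib-compressed binary blob (hypothetical helper)."""
    raw = struct.pack("<%dd" % len(prices), *prices)  # little-endian float64 values
    return zlib.compress(raw)


def unpack_batch(blob):
    """Inverse of pack_batch: decompress and decode back to a list of floats."""
    raw = zlib.decompress(blob)
    return list(struct.unpack("<%dd" % (len(raw) // 8), raw))


# Synthetic tick prices: a small random walk around 0.889 (assumed data, not the thread's file).
random.seed(0)
price = 0.889
ticks = []
for _ in range(100_000):
    price += random.choice((-1, 0, 0, 1)) * 0.00001
    ticks.append(round(price, 5))

# One blob per batch means one database write per batch instead of one per tick.
BATCH_SIZE = 10_000
batches = [pack_batch(ticks[i:i + BATCH_SIZE]) for i in range(0, len(ticks), BATCH_SIZE)]

raw_bytes = 8 * len(ticks)
compressed_bytes = sum(len(blob) for blob in batches)
restored = [p for blob in batches for p in unpack_batch(blob)]

print("writes needed:", len(batches), "instead of", len(ticks))
print("bytes:", compressed_bytes, "compressed vs", raw_bytes, "raw")
```

The exact compression ratio depends on the data and the codec, so the sketch prints the sizes rather than promising a number.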
For profiling, perhaps try %prun in IPython.
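Outside IPython, roughly the same information can be collected with the standard-library `cProfile` and `pstats` modules. The sketch below is only an illustration: `build_rows` is a made-up stand-in workload and `profile_call` is a hypothetical helper, not something from Arctic or this thread.

```python
import cProfile
import io
import pstats


def build_rows(n):
    """Stand-in workload: build n tick-like dicts one at a time."""
    return [{"Ask": 0.889 + i * 1e-7, "Bid": 0.888 + i * 1e-7} for i in range(n)]


def profile_call(func, *args, top=5):
    """Run func(*args) under cProfile; return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)  # slowest call paths first
    return result, buffer.getvalue()


rows, report = profile_call(build_rows, 50_000)
print(report)
```

In an IPython session the equivalent is to prefix the slow statement with `%prun`, the same way `%time` is used earlier in this thread.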
from arctic.
Thanks for your comments. I made a mistake: I should not insert rows into Arctic one at a time, but write them in batches. Happy new year. XD
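A minimal sketch of the difference, assuming the rows arrive as an iterable of dicts. `batched` and `write_in_batches` are hypothetical helpers, and the `write` callable here is a fake that only records what it receives, so the example runs without MongoDB.

```python
from itertools import islice


def batched(rows, batch_size):
    """Yield successive lists of at most batch_size rows from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def write_in_batches(rows, write, batch_size=10_000):
    """Call write(batch) once per batch and return the number of calls made."""
    calls = 0
    for batch in batched(rows, batch_size):
        write(batch)
        calls += 1
    return calls


# Demo with a fake writer: 25,000 rows become 3 writes instead of 25,000.
ticks = [{"Ask": 0.88922, "Bid": 0.88796, "Volume": 1} for _ in range(25_000)]
received = []
n_calls = write_in_batches(ticks, received.append, batch_size=10_000)
print(n_calls, [len(batch) for batch in received])  # 3 [10000, 10000, 5000]
```

With Arctic, `write` would presumably be a small wrapper that turns each batch into a DataFrame and passes it to the library's `write(symbol, data)` call. That wiring is an untested assumption here, not something shown in the thread.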
from arctic.