Trying out Rust's DataFusion, compare to Apache Spark.
Here is the full blog post
Thank you for the informative article, "DataFusion courtesy of Rust, vs Spark. Performance and other thoughts and code."
The article does not mention the source of the data, but I assume it's divvy-tripdata. Could you please elaborate on what data were used in your tests?
I've run the same Rust example, using the current DataFusion version, 32.0.0. The data files I used span from 202004-divvy-tripdata.csv through 202309-divvy-tripdata.csv, i.e. 42 files.
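For anyone who wants to check the file range, here is a minimal sketch that enumerates the monthly file names and counts them. The helper name `divvy_files` is made up for illustration and is not part of the original example; it only assumes the `YYYYMM-divvy-tripdata.csv` naming shown above.

```rust
/// Build the monthly Divvy file names from `start` through `end`, inclusive.
/// Both bounds are (year, month) pairs. Hypothetical helper for illustration.
fn divvy_files(start: (u32, u32), end: (u32, u32)) -> Vec<String> {
    let (mut year, mut month) = start;
    let mut files = Vec::new();
    // Tuples compare lexicographically: year first, then month.
    while (year, month) <= end {
        files.push(format!("{year}{month:02}-divvy-tripdata.csv"));
        if month == 12 {
            year += 1;
            month = 1;
        } else {
            month += 1;
        }
    }
    files
}

fn main() {
    let files = divvy_files((2020, 4), (2023, 9));
    println!("{} files", files.len());
    println!("first: {}", files.first().unwrap());
    println!("last:  {}", files.last().unwrap());
}
```

April 2020 through September 2023 is 9 + 12 + 12 + 9 months, which gives the 42 files.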
I get a total run time of just over half a second (on a MacBook Pro M2):
...
...
| 57 | 2022.0 | 9.0 | 9.0 | Honore St & Division St |
+------+--------+-------+------+-----------------------------------------------------+
Elapsed: 670.41ms
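In case it helps when comparing numbers, this is roughly how a wall-clock figure like the one above can be taken in Rust. It is only a sketch using the standard library: `timed` is a hypothetical helper, the closure is a stand-in workload rather than the DataFusion query, and I am assuming the example prints the elapsed time from a `std::time::Instant`.

```rust
use std::time::{Duration, Instant};

/// Run `work` once and return its result together with the wall-clock time taken.
/// Hypothetical helper for illustration.
fn timed<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let result = work();
    (result, start.elapsed())
}

fn main() {
    // Stand-in workload; in the real test this would be the query over the CSV files.
    let (sum, elapsed) = timed(|| (1..=1_000_000u64).sum::<u64>());
    println!("sum = {sum}");
    // `{:.2?}` prints a Duration with two decimals and an automatic unit.
    println!("Elapsed: {:.2?}", elapsed);
}
```

Note that a measurement like this covers only what runs inside the closure, so whether file discovery and schema inference are included depends on where the timer starts.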
For comparison, on the same laptop, I get a time of around 3 seconds from Spark.
Hence, I'm very surprised by the 180 second runtime that you got. Perhaps it is related to the earlier version of DataFusion that was used in your example. Would you mind running with the latest version of DataFusion?