Japan RSV transmission
When reading a file into a DataFrame (for example with read_csv) there is a parameter called low_memory, and it's set to True by default. As far as I understand, its function is to decide the minimal data type required to fit the values of each column, which seems to be for memory optimization purposes. To detect the correct data type we would need to consider all values in a column, which is not optimal for a big DataFrame for two reasons, I guess: memory and data loading time. My assumption is that pandas is optimizing both, and that's why this parameter is True by default. I didn't dig into the implementation of the optimized version or how it detects data types (maybe it reads some chunk of the data and takes the minimal requirement from that).
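For reference, the pandas documentation describes low_memory as processing the file in chunks internally, which lowers memory use while parsing but can lead to mixed type inference, and it suggests either setting it to False or specifying dtype. A minimal sketch of both options follows; the column names id and value and the sample rows are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample: one 8-digit id and one 16-digit id.
csv_text = "id,value\n10000001,1.5\n1000000000000001,2.5\n"

# Option 1: low_memory=False, so types are inferred from whole columns
# instead of from internal chunks.
df_inferred = pd.read_csv(io.StringIO(csv_text), low_memory=False)

# Option 2: state the types explicitly, so nothing has to be inferred.
df_pinned = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"id": "int64", "value": "float64"},
)

print(df_inferred.dtypes)
print(df_pinned.dtypes)
```

Specifying dtype is usually the safer of the two for key columns, because the result no longer depends on what the parser happens to see.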
The problem is that it sometimes gives unexpected results. I once spent a week running heavy calculations on chunks of data, hoping I could reassemble the results using an index that was definitely unique. But I missed one specific detail: the index had 8 digits at the beginning of the data and later grew to 16 digits (it was taken from a database whose primary key differed between versions). While reading the chunks I was actually getting only the first 8 digits of the 16-digit index, since low_memory was set to True by default and pandas didn't check all the index values. I ended up with calculations that I had no hope of reassembling and merging back into the original data.
I told such a long and dramatic story because the low_memory option is very strange: nobody takes it seriously, but it becomes critical in some cases.
So please consider this case and add a warning about it to dovpanda.
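One defensive pattern for the chunked workflow in the story is to pin the key column to a string type before any heavy computation, then check that the keys survived the read unchanged. The sketch below assumes a CSV source with a key column named id; the data and the helper read_in_chunks are hypothetical, and it does not try to reproduce the exact truncation described above.

```python
import io

import pandas as pd

# Hypothetical data: ids start at 8 digits and later grow to 16 digits.
rows = [f"{10_000_000 + i},{i}" for i in range(3)]
rows += [f"{1_000_000_000_000_000 + i},{i}" for i in range(3)]
csv_text = "id,value\n" + "\n".join(rows) + "\n"


def read_in_chunks(text, chunksize=2):
    """Read the CSV chunk by chunk with the key column pinned to str."""
    # With dtype fixed up front, every chunk gets the same type for "id",
    # no matter which values happen to fall into that chunk.
    reader = pd.read_csv(io.StringIO(text), dtype={"id": str}, chunksize=chunksize)
    return pd.concat(reader, ignore_index=True)


df = read_in_chunks(csv_text)

# Sanity checks before doing a week of computation on the chunks:
# the keys must match the raw file exactly and must stay unique.
ids_in_file = [line.split(",")[0] for line in csv_text.splitlines()[1:]]
keys_intact = df["id"].tolist() == ids_in_file
print("keys intact:", keys_intact, "| unique:", df["id"].is_unique)
```

Reading keys as strings costs some memory, but it means the key written back out is exactly the key that was read in.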