Comments (4)
Here is how to optimize the current storage code, though this still doesn't solve the rapid growth of the sheer number of cubes:

```python
np.save(cache_path, np.packbits(np.asarray(polycubes, dtype=np.int8), axis=-1), allow_pickle=False)
```
notes:
- it uses `np.packbits`
- `allow_pickle` is `True` by default, but we don't have object arrays, so we don't need pickles
- the `np.load` call should also pass `allow_pickle=False`. Its output should then be processed with `np.unpackbits()` with `axis=-1`, and you'll also have to undo the effect of packing a size that is not a multiple of 8; possibly with the `count` parameter, or by manually cropping the trailing zeros.
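A minimal round-trip sketch of the points above (the array contents and the in-memory buffer are illustrative stand-ins; the real code writes `polycubes` to `cache_path`):

```python
import io

import numpy as np

# Illustrative data: rows of 20 bits, deliberately not a multiple of 8,
# so unpacking needs an explicit `count` to drop the padding bits.
polycubes = np.random.randint(0, 2, size=(10, 20), dtype=np.int8)

packed = np.packbits(polycubes, axis=-1)   # (10, 20) bits -> (10, 3) bytes
buf = io.BytesIO()                         # stand-in for the cache file
np.save(buf, packed, allow_pickle=False)   # no object arrays, no pickle

buf.seek(0)
loaded = np.load(buf, allow_pickle=False)
# count=20 undoes the zero padding that packbits added to fill the last byte.
restored = np.unpackbits(loaded, axis=-1, count=polycubes.shape[-1]).astype(np.int8)

assert np.array_equal(polycubes, restored)
```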
About storing the cubes
Note:
Here I've ignored
- lossless compression
- information theory (no idea whether it would be useful at all; the idea would be to encode the polycubes more efficiently so they take less space to store, but I have no idea how)
For n=16 (the current record):
The theoretical minimum to store 50 billion DIFFERENT THINGS (not even polycubes, but at least their ids):
To uniquely identify each of the 50 billion things, we need at least ceil(log2(50e9)) = 36 bits per id. Needed storage space:
~ 50e9 * 36 bits = 225 GB
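The arithmetic for n=16 can be checked directly:

```python
import math

# 36 bits uniquely identify up to 2**36 ≈ 6.9e10 items, which covers 5e10.
bits_per_id = math.ceil(math.log2(50e9))
total_gb = 50e9 * bits_per_id / 8 / 1e9

assert bits_per_id == 36
assert total_gb == 225.0
```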
For n=20:
Let's say we want to make progress until n=20, assuming the number of polycubes grows by a factor of ~7 with each n:

```python
n_cubes = 50e9 * 7**4  # approx. count at n=20
n_bits = np.log2(n_cubes)
need_bytes = n_cubes * n_bits / 8
need_bytes / 1e12  # terabytes
```

~ 700 TB
For n=30:

```python
n_cubes = 50e9 * 7**14  # approx. count at n=30
n_bits = np.log2(n_cubes)
need_bytes = n_cubes * n_bits / 8
need_bytes / 1e21  # zettabytes
```

~ 317 ZB
This number is on the scale of the whole internet.
- So it's safe to assume that nobody will ever want to store all the cubes for large n.
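Under the same assumptions (about 50e9 shapes at n=16, growing by roughly 7x per step), the estimates generalize to any n; here is a small helper (the function name and defaults are mine, not from the project):

```python
import numpy as np

def storage_estimate_bytes(n, base_count=50e9, base_n=16, growth=7.0):
    """Rough lower bound on bytes needed to store one unique id per polycube."""
    n_cubes = base_count * growth ** (n - base_n)  # extrapolated count at n
    n_bits = np.log2(n_cubes)                      # bits per unique id
    return n_cubes * n_bits / 8

print(f"n=20: ~{storage_estimate_bytes(20) / 1e12:.0f} TB")  # roughly 700 TB
print(f"n=30: ~{storage_estimate_bytes(30) / 1e21:.0f} ZB")  # roughly 317 ZB
```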
And that's not to mention that you have to store all of that in RAM before you actually write it, meaning that if it remains uncompressed, your computer (or program) will crash once n gets too big. Even then it's not a matter of "if", it's a matter of "when". Compressing it will only make it crash earlier and may make it run slower.
I'm not against the idea of compression; these are just things to consider.
> you have to store all of that in ram before you actually write it
You don't actually have to store it all in RAM at the same time.
Polycubes can be processed and counted separately from each other (it will just take longer), and the work can be distributed across multiple machines.
See an example algorithm I wrote here: mikepound/opencubes#7 (comment) (maybe an even better approach exists).
There is also a link to the paper which describes useful ideas for reaching n=16.
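To illustrate why everything need not sit in RAM at once: if each shape is reduced to a canonical encoding, the encodings can be partitioned by hash, each partition deduplicated on its own (even on a different machine), and the per-partition counts summed. This is a toy sketch with short strings standing in for canonical polycube encodings, not the algorithm from the linked comment:

```python
from collections import defaultdict

def count_unique_distributed(items, n_partitions=4):
    """Toy sketch: dedup each hash partition independently, then sum counts.

    Identical items always hash to the same partition, so no
    cross-partition deduplication is ever needed.
    """
    partitions = defaultdict(set)
    for item in items:  # could be streamed from disk instead of held in RAM
        partitions[hash(item) % n_partitions].add(item)
    # each partition could live on a different machine; only counts are merged
    return sum(len(bucket) for bucket in partitions.values())

items = ["a", "b", "a", "c", "b", "d"]
assert count_unique_distributed(items) == 4
```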