mikepound / cubes
This code calculates all the variations of 3D polycubes for any size (time permitting!)
License: MIT License
I would like to request the addition of a permissive license to the project. Currently, the project does not have a clear licensing statement, which may restrict its usage and redistribution. By adding a permissive license, we aim to provide a legal framework that allows users to freely use, modify, and distribute the project with minimal restrictions.
Change of view
Instead of thinking about the cubes as a structure, think of them as a graph of nodes and edges (group theory?). Each node is the center of a cube, and each edge connects it to a bordering cube. In the following ideas only the nodes are used to identify a cube, and in the translation-fixed idea the edges are the possible paths.
Rotation fixed
For rotation-fixed storage, one idea is to store the locations in polar coordinates, because the radii stay the same without needing to know the rotation. What matters is only which point you use as the center; otherwise it gets pretty complicated to get unique radii for different cubes. The radii always describe the same formation of cubes independent of rotation, and so yield the same hash as well.
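A 2-D sketch of the radii idea (my own illustration, not code from the repo; it uses the centroid as the "center" point the paragraph above asks about):

```python
import numpy as np

def radius_multiset(coords):
    # sorted squared distances of each cell from the shape's centroid;
    # this multiset is unchanged by any rotation of the coordinate set
    coords = np.asarray(coords, dtype=float)
    center = coords.mean(axis=0)
    d2 = ((coords - center) ** 2).sum(axis=1)
    # round to dodge floating-point noise before comparing multisets
    return tuple(np.sort(np.round(d2, 9)))
```

Whether the radii alone uniquely identify a shape is exactly the open question raised above; this only demonstrates the rotation invariance.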
Translation fixed
For translation-fixed storage it would work like a turtle game. Start the turtle at some corner cube and tell it the moves to the next cube. Importantly, cubes that are neighbours of only one other cube are used as the start or the end, or, if neither is possible, some predefined path is used. This way it does not matter where the combination of cubes is located, because the path is always the same, and so is its hash.
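A 2-D sketch of the turtle idea (my illustration; for simplicity the start rule here is the lexicographically smallest cell rather than the single-neighbour rule described above):

```python
def path_signature(cells):
    # cells: set of (x, y) squares. A deterministic DFS from a canonical
    # start cell, exploring neighbours in fixed N/E/S/W order, emits a move
    # string; translated copies of a shape produce identical strings.
    # (Rotation is NOT handled by this sketch.)
    moves = {(0, 1): 'n', (1, 0): 'e', (0, -1): 's', (-1, 0): 'w'}
    back = {'n': 's', 'e': 'w', 's': 'n', 'w': 'e'}
    start = min(cells)          # canonical, translation-covariant start
    seen = {start}
    out = []

    def dfs(p):
        for d, m in moves.items():
            q = (p[0] + d[0], p[1] + d[1])
            if q in cells and q not in seen:
                seen.add(q)
                out.append(m)
                dfs(q)
                out.append(back[m])  # backtrack marker keeps branching shapes unique

    dfs(start)
    return ''.join(out)
```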
I knew a trick from my futile attempt to solve the challenge in Matt Parker's net folding video. (i.e. 'Can the Same Net Fold into Two Shapes?')
With that trick, I was able to find f(13) = 138457022 in less than 90 minutes. What's nice is that it doesn't have a big memory footprint and it could be distributed between multiple CPUs if needed. As of July 15th, I think the program can solve f(17) in about 8^4 * 90 minutes = 256 days... so maybe under a year? Currently, I'm planning on making it go slightly faster, and then documenting it, and then parallelizing it.
Here's the code:
https://github.com/hidny/MikeTsCubeCode/blob/main/src/NumPolyShapeSolve/DFSPolyCubeCounter.java
resource: https://numba.pydata.org/numba-doc/latest/user/5minguide.html
from numba import njit
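For example (my sketch, not code from the repo: the popcount kernel is a stand-in workload, and the try/except falls back to plain Python when numba is not installed):

```python
# numba's njit compiles a numeric hot loop to machine code on first call
try:
    from numba import njit
except ImportError:          # numba absent: use a no-op decorator instead
    def njit(f):
        return f

@njit
def popcount(x):
    # count set bits in an integer mask (e.g. one packed layer of a shape)
    n = 0
    while x:
        x &= x - 1           # clear the lowest set bit
        n += 1
    return n
```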
I am probably saying the same thing as others, just without the proper terms, but to save on lookup time:
Take the binary representation of every transformation and sum them; that sum becomes the object's hash.
When evaluating a new object, do the same, and that hash should match any rotation with just one lookup?
Probably a way to make a tree of these as well but I imagine the hashing is effectively doing that anyway.
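A 2-D sketch of the idea (my illustration; it sums Python's hash of each transformation's bytes in place of the raw binary integers, which keeps the key small):

```python
import numpy as np

def rotation_sum_key(shape):
    # sum a per-variant hash over all 8 rotations/reflections of a 2-D
    # shape; the set of variants is the same whichever orientation you
    # start from, so the sum matches any rotated duplicate in one lookup
    total = 0
    for a in (shape, np.fliplr(shape)):
        for k in range(4):
            v = np.rot90(a, k)
            total += hash(v.tobytes() + bytes(v.shape))
    return total
```

Note that Python's `hash` of bytes is only stable within one process, so this key is fine for an in-memory dictionary but not for files, and (as the post wonders) sums can in principle collide, so colliding entries still need a full comparison.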
https://www.desmos.com/calculator/fea4uymhix
According to this graph, file sizes will get ridiculously large even by the 12th iteration. Perhaps the storage format should be optimized?
I've done some profiling runs to see where the most time is spent. In the main (mikepound) branch, 8% of the runtime for n=9 is spent within the crop_cube() function. In my merged branch, containing both my PR and @bertie2's PR, it's 18% of runtime, because the biggest offender (rle()) is called less often and executes faster.
It's the next logical step to attempt a speedup. I think the best path here is, rather than cropping naively, to track where the new cube was added and crop only where we know we added 0s. This saves us from iterating over every cube in a side before deciding to trim it. This won't be easy, and expand_cube() probably won't end up with as clean an implementation.
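A sketch of what the tracked crop could look like (my illustration, not the repo's actual crop_cube()/expand_cube(); it assumes the input array is already tightly cropped and that the new voxel's position is given in the padded array's coordinates):

```python
import numpy as np

def add_and_crop(cube, pos):
    # pad by one voxel on every side, set the newly added voxel, then trim
    # only the padded faces the new voxel does not touch -- since the input
    # was tightly cropped, those are the only slices that can be all zero,
    # so no full scan of every boundary slice is needed
    padded = np.pad(cube, 1)
    padded[pos] = 1
    slices = []
    for axis, c in enumerate(pos):
        n = padded.shape[axis]
        lo = 0 if c == 0 else 1          # keep the low pad face only if the voxel sits on it
        hi = n if c == n - 1 else n - 1  # likewise for the high pad face
        slices.append(slice(lo, hi))
    return padded[tuple(slices)]
```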
I made a version in rust: https://gitlab.com/dzamlo/polycubes
It's quick and dirty code.
On my laptop, I can compute up to n=13. For n=14, the process consumes too much memory and is killed.
You need a language in which writing down any point requires you to also write down every point which could correspond to that under taxicab rotation/reflection. This means every time you write down a point, you also write down eleven other points (in three dimensions). Testing whether any two points are equal is equivalent to a set comparison (the two sets contain all the same points). Testing whether any sets of points are equal extends from this.
There's your maximal language of k-ominoes. Enjoy.
For an n-polycube:
All n-polycubes can be generated in a space with dimensions (n, n, n).
Most n-polycubes exist in more compact spaces.
Idea: Partition space into cells of dimension x, y, z.
Constraints:
x, y, z are all greater than or equal to 1.
Cells must be large enough to accommodate the n-polycube: x*y*z >= n
Cells should not be larger than needed: x+y+z = n+2
Permutations of x,y,z yield redundant cells.
Python function for generating unique set of cells:
```python
from itertools import combinations_with_replacement

def find_unique_cells(n):
    """
    Finds the smallest set of cells to be populated by a polycube of size n.

    Parameters:
        n (integer): the size of the polycubes.

    Returns:
        Ordered list of tuples defining the dimensions for cells.
    """
    cells = set()
    for combo in combinations_with_replacement(range(1, n + 1), 3):
        x, y, z = sorted(combo, reverse=True)  # sort in reverse order
        if x * y * z >= n and x + y + z == n + 2:
            cells.add((x, y, z))
    return sorted(
        cells, key=lambda group: group[0], reverse=True
    )  # sort by the largest dimension
```
Examples:
n = 1 | Cells: [(1, 1, 1)]
n = 2 | Cells: [(2, 1, 1)]
n = 3 | Cells: [(3, 1, 1), (2, 2, 1)]
n = 4 | Cells: [(4, 1, 1), (3, 2, 1), (2, 2, 2)]
n = 5 | Cells: [(5, 1, 1), (4, 2, 1), (3, 3, 1), (3, 2, 2)]
n = 6 | Cells: [(6, 1, 1), (5, 2, 1), (4, 3, 1), (4, 2, 2), (3, 3, 2)]
n = 7 | Cells: [(7, 1, 1), (6, 2, 1), (5, 2, 2), (5, 3, 1), (4, 3, 2), (4, 4, 1), (3, 3, 3)]
n = 8 | Cells: [(8, 1, 1), (7, 2, 1), (6, 2, 2), (6, 3, 1), (5, 3, 2), (5, 4, 1), (4, 4, 2), (4, 3, 3)]
n = 9 | Cells: [(9, 1, 1), (8, 2, 1), (7, 3, 1), (7, 2, 2), (6, 3, 2), (6, 4, 1), (5, 3, 3), (5, 4, 2), (5, 5, 1), (4, 4, 3)]
n = 10 | Cells: [(10, 1, 1), (9, 2, 1), (8, 2, 2), (8, 3, 1), (7, 3, 2), (7, 4, 1), (6, 3, 3), (6, 4, 2), (6, 5, 1), (5, 5, 2), (5, 4, 3), (4, 4, 4)]
Consider n = 4. Instead of a space of total size 4*4*4 = 64 we reduce it to (4*1*1)+(3*2*1)+(2*2*2) = 18
Cells come in 4 categories: (x=y=z), (x>y=z), (x=y>z), (x>y>z)
The Isometric or Cubic Cell: (x=y=z)
The Tetragonal Cells: (x>y=z) & (x=y>z)
The Orthorhombic Cell: (x>y>z)
Each cell category will only permit some rotations as cell dimension/shape must be preserved.
Effectively you don't want to rotate your n-polycube out of its cell.
This should reduce the number of calculations of rotations and results to look up.
Things to consider further:
Would this make the crop_cube(cube) function redundant? I thought of starting at (1, 1, 1), but that will lead to some issues, for instance not generating the 7-polycube of octahedral shape.
Should you start in the middle of the x,y-plane? In the middle of a cell?
It's also possible to generate shapes by applying symmetry operations on n-polycubes in cells. The 7-polycube of octahedral shape could be constructed from something like the (2, 2, 1) cell:
[[1 0]
[1 1]]
But I'm not sure how this would work with ensuring all possible shapes get generated.
Any comments as to why any of this shouldn't work or improve runtime and clever ideas to explore further are welcome.
My attempt to represent the shapes in a rotation invariant way is inspired by Local Binary Patterns (LBP).
LBP is used in machine vision to extract descriptors from an image. It is quite simple in principle (if you are not familiar with it, maybe read the article first!).
Say we have a shape like the one in the image (green blocks):
We then add 2 levels of zero padding around the shape, which looks like this:
Then we can start calculating the feature matrix. We start from the second cell of the second row (cell (1, 1), see the figure below) so that there are no edges next to it. The value for cell (1, 1) is calculated based on its 8 neighboring values as well as its own value. Each cell value is a 9-bit integer (0-511).
Here the cell (1, 1) is highlighted in orange:
Now we have a 9-bit value for the first cell: 00001_0000, which is 16 in decimal. We can then move to the next cell. This is repeated for all cells except cells that are on the edge (so skipping the first and last columns as well as the first and last rows).
Here are all the cell values for the example shape:
Then, the feature vector of this shape is:
[[ 16 24 28 12 4 0]
[ 32 33 51 27 14 4]
[ 64 192 496 425 263 2]
[ 0 0 112 201 390 256]
[ 0 0 96 129 258 0]
[ 0 0 64 128 256 0]]
Finally, we sum all the cell values in the feature matrix to get the fingerprint of that shape. This fingerprint is the same for all rotations of the same shape.
Here are all the variants of the example shape, along with their feature matrices and their sums (the fingerprints).
Matrix:
[[ 16 24 28 12 4 0]
[ 32 33 51 27 14 4]
[ 64 192 496 425 263 2]
[ 0 0 112 201 390 256]
[ 0 0 96 129 258 0]
[ 0 0 64 128 256 0]]
Sum: 3577
Matrix:
[[ 0 16 8 4 0 0]
[ 16 56 29 30 12 4]
[ 48 105 167 291 3 2]
[112 201 454 448 384 256]
[ 96 129 258 0 0 0]
[ 64 128 256 0 0 0]]
Sum: 3577
Matrix:
[[ 0 16 8 4 0 0]
[ 0 48 9 6 0 0]
[ 16 120 141 262 0 0]
[ 32 113 155 286 12 4]
[ 64 224 417 291 3 2]
[ 0 64 192 448 384 256]]
Sum: 3577
Matrix:
[[ 0 0 0 16 8 4]
[ 0 0 0 48 9 6]
[ 16 24 28 124 141 262]
[ 32 33 51 107 135 258]
[ 64 192 480 449 386 256]
[ 0 0 64 128 256 0]]
Sum: 3577
As you see, all variants have the same fingerprint 3577!
I have tested this with maybe tens of various 2-D shapes. So far so good. Check my Notebook for all the tests.
I have not yet proven nor disproven that this will work. However, I am planning to do that later.
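The scheme above can be sketched in a few lines (my reconstruction, not the notebook's code; the exact bit ordering inside the 3×3 window is a guess, but it does not affect the sum). For a 7-cell shape it reproduces the 3577 from the example:

```python
import numpy as np

def fingerprint(shape):
    # shape: 2-D 0/1 array of the polyomino
    padded = np.pad(np.asarray(shape), 2)   # two levels of zero padding
    h, w = padded.shape
    weights = 2 ** np.arange(9)             # one bit per 3x3 window position
    total = 0
    # every cell except the outermost ring gets a 9-bit code from its
    # 3x3 neighbourhood; the fingerprint is the sum of all codes
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = padded[r - 1:r + 2, c - 1:c + 2].ravel()
            total += int(window @ weights)
    return total
```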
Already posted this on the YT vid but I'll repeat it here anyways.
For fixing orientation, you first want to determine the longest or shortest section of the shape; it doesn't matter which way it's rotated at this point, you just set that section as the central axis. Next you find the opposite (so if you set the longest as central, you now want the shortest) and set that as the next axis to orient around. Now you use the basic √(A² + B²)
to find the shortest or longest hypotenuse between those two lengths and set that as "up" or "down". You should have a fixed orientation at this point; if not, some more calculations based on the remaining length are needed, but you should be able to work those out yourself.
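The first step (set the longest section as the central axis) can be sketched by reordering axes by bounding-box extent (my illustration; ties in extent are exactly the cases where the extra calculations mentioned above are still needed):

```python
import numpy as np

def orient_by_extent(coords):
    # coords: (k, 3) array-like of voxel coordinates; reorder the axes so
    # the longest bounding-box side comes first and the shortest last, then
    # shift to the origin so translation does not matter
    coords = np.asarray(coords)
    extents = coords.max(axis=0) - coords.min(axis=0)
    order = np.argsort(-extents, kind='stable')   # longest axis first
    out = coords[:, order]
    return out - out.min(axis=0)
```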
As for the probable number of unique shapes, start with W * H * D of the total area available for orienting in, and then apply the formula for the number of permutations of a sequence of length N, like 2 = 2 (XY, YX) and 3 = 6 (XYZ, XZY, YXZ, YZX, ZXY, ZYX); that will give you a suitable number of shapes to potentially store.
Finally, you can speed up the search by keeping a list of offsets into that pre-allocated array: each offset marks the start of the shapes containing the same number of cubes, along with the length of that section. From there I have no further ideas for optimal searching.
In the function render_shapes, the name PLT has not been defined. I assume it should be some graphical object (probably matplotlib's pyplot), but I cannot work out what.
Can this be fixed, please?
The idea is to use the projections of the polycube on each of the three axes to come up with a set of orientation rules.
For each axis: project the polycube onto the other two axes with np.any() and compute the sum of the projection. This gives a three-tiered hierarchy to orient into a canonical rotation.
Degeneracy still needs to be taken into account. Even in the best case, there would still need to be 4 calculations of the run encoding and 4x memory use. In the case where two pairs of faces are degenerate, 8, and in the case where all three are degenerate, 24.
Combine this strategy with storing the degenerate orientations in the hash table instead of computing the different orientations when testing a new possible polycube, and I think this would be a reasonable way to reduce the processing load while keeping memory use low.
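A sketch of the per-axis keys (my reading of the idea, using extent plus the np.any() projection's cell count; the run encoding itself is omitted, and tied keys are the degenerate cases discussed above):

```python
import numpy as np

def canonical_axis_order(cube):
    # rank each axis of a 3-D boolean array by (occupied extent along it,
    # cell count of its np.any() shadow); sorting these keys picks a
    # canonical axis order whenever the keys are all distinct
    keys = []
    for axis in range(3):
        shadow = cube.any(axis=axis)          # 2-D projection onto the other axes
        keys.append((cube.shape[axis], int(shadow.sum()), axis))
    return [k[2] for k in sorted(keys)]
```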
python cubes.py 3
  File "<stdin>", line 1
    python cubes.py 3
           ^^^^^
SyntaxError: invalid syntax
(This error appears when the shell command python cubes.py 3 is typed inside the Python interpreter; it should be run from the terminal instead.)
```python
import numpy as np

m = np.array([[1, 1, 0],
              [0, 1, 0],
              [0, 1, 0]])

stored_shapes = {}

def generate_super_shape(array, counter=1, shapes=None):
    # collect flipped/rotated versions of the input shape,
    # then of the input rotated 90 degrees
    if shapes is None:
        shapes = []
    shapes.append(array)
    shapes.append(np.fliplr(array))
    shapes.append(np.flipud(array))
    shapes.append(np.rot90(array, 2))
    if counter == 2:
        return shapes
    return generate_super_shape(np.rot90(array), counter=counter + 1, shapes=shapes)

def is_shape_stored(array, hashmap):
    res = generate_super_shape(array)
    super_shape_1 = np.array([res[0], res[1], res[2], res[3]])
    super_shape_2 = np.array([res[4], res[5], res[6], res[7]])
    final_super_shape = np.array([super_shape_1, super_shape_2])
    oned_final_super_shape = tuple(final_super_shape.ravel().tolist())
    if oned_final_super_shape in hashmap:
        return True
    hashmap[oned_final_super_shape] = 1
    return False
```
The idea is to spend time generating the superposed shape, composed of all possible rotations, only once, and then store it in a hashtable for O(1) lookup. The next time you generate a shape, to check whether a rotated counterpart has already been stored, you just generate its superposed shape and look it up in the hashtable. In theory the superposed shape should be the same even if I use a rotated version of a shape I have already seen, since the superposed version overlays all the rotations into one (combining all the 2D arrays into one flattened array).
The issue is that I'm not sure how to make the arrays always align correctly: the superposed shape will be in a different order depending on the input shape, which means looking it up in the hash table will give a false result because the superposed shape will be different.
Not sure if this works at all and if it's even faster, just had an idea.
Drawing Example:
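One way around the alignment problem described above (my suggestion, not something proposed in this thread) is to replace the order-dependent stacked array with an order-independent key, e.g. the lexicographic minimum over all eight variants:

```python
import numpy as np

def canonical_key(array):
    # pick the lexicographically smallest encoding over all 8
    # rotations/reflections of a 2-D shape; the result no longer depends
    # on which orientation of the shape you started from
    variants = []
    for a in (array, np.fliplr(array)):
        for k in range(4):
            v = np.rot90(a, k)
            # include the shape so (2,3) and (3,2) grids cannot collide
            variants.append(v.shape + tuple(int(x) for x in v.ravel()))
    return min(variants)
```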
I'd like to try to port it over to Go, to see how it compares in speed.
Consider a list of directions instead of a representation of the polycubes inside an n-dimensional array. Start with 2D for simplicity. For 2D we will just go north (n), south (s), east (e), west (w).

Size | Direction Sets | Notes
---|---|---
1 | no movement allowed |
2 | n | e, w, s are rotations
3 | (n: n) (n: e) | we can only start from previous terms
4 | (n,n: n) (n,n: e) (n,e: n) (n,e: s) (n,e+n or n,n: s,e)?? | how to define the "T" tetromino splitting or backtracking?
Using a logic like this it feels like we could develop a set of rules for traversal. We might be able to determine that certain "turns" would simply be unallowed (overlap previous blocks, produce rotations, etc).
To elevate this to 3D, we would only have to add up and down.
Size | Direction Sets | Notes
---|---|---
1 | no movement allowed |
2 | n | e, w, s, u, d are all "rotations"
3 | (n: n) (n: e) | -
4 | (n,n: n) (n,n: e) (n,e: n) (n,e: s) (n,e: u) (n,e+n or n,n: s,e) (n,e+u or n,e: w,u) (n,w: u or n,u: e)?? | the last one is a mirror of (n,e: u) and not derived from the previous set
We'd have to go a few more terms to try to find more "rules" when adding on to previous sets.
While this could easily be hashed, ideally we would not have to hash anything and just have a set of rules that are followed until completion, giving us our number of shapes possible.
You didn't say where to put the n>11 files, so I will be dumping them in this issue.
https://drive.google.com/file/d/1oKcpgGCj4CCuYqJyWKglPqRbO8WNjaB5/view?usp=sharing
The count agrees with Kevin Gong's numbers exactly.
Please note this was generated by my own parallelized version of your code at https://github.com/bertie2/cubes/tree/parallelism,
so I may have introduced some weird bug; however, since the numbers match Kevin's, I'm pretty confident.
Also, these files are going to get big as time goes on, so we might want to look for a sane way to host them.
In the meantime I'm throwing more cores at the issue and looking at porting over to C#, so will see you at n=13.
I've only tested the method I've come up with in 2 dimensions, but it seems to work for the small extents to which I've tested it, though I myself have little coding knowledge and don't know how to write a programme to test it further.
The method is as follows: for each square in the polyform, you generate a list which relates its position to all other squares in the polyform (by a method I will describe shortly). Then you add together all of the lists generated for the squares, e.g. (1,3,7) + (0,2,3) = (1,5,10), and that summed list will be unique to the shape from which it was generated. This holds through any translation or rotation, because it depends on the relative positions of the squares and not the position of the shape (and, in the manner in which I've done it, separate lists are generated for shapes which are right-handed and left-handed).
The method for generating the list for each square is this: you move outwards one space from each face of the square you are checking. In these four adjacent squares, you count how many of the four spaces are part of the shape and make that sum (anywhere from 0 to 4) the first number in the list. Then you move clockwise and check the four spaces at the corners of the square, and make that total the second number in the list. After that you check all of the spaces two spaces away from the faces of the square, then you keep rotating the spaces checked clockwise until you reach a position you've checked before, at which point you move even further out from the faces of the square. You continue checking these successively larger rings until you've accounted for all of the squares in the shape, by which point you have generated a list describing the relative position of the square to all other squares in the shape, e.g. (1,0,1) (the example is from an end square of a three-long, one-high polyform).
I add everything at the end so that it doesn't matter what order you do it in, and I check four spaces at a time because then you get the same numbers no matter the rotation of the object. I've also attempted to attach an image to visually explain the square-checking sequence.
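A simplified variant in the same spirit (my illustration, not the exact ring scheme above: it sums squared pairwise distances, which is also translation- and rotation-invariant, but unlike the handed scheme above it is reflection-invariant too, so it would merge left- and right-handed shapes):

```python
from itertools import combinations

def distance_signature(cells):
    # cells: iterable of (x, y) squares; the sorted multiset of squared
    # distances between every pair of occupied squares depends only on the
    # relative positions, not on where or how the shape is oriented
    cells = list(cells)
    sig = sorted((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
                 for a, b in combinations(cells, 2))
    return tuple(sig)
```

Like the summed lists above, such a signature is not guaranteed to be unique to a shape, so matching signatures would still need a direct comparison.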
Expanding on some great work done by other contributors, I created a few optimisations for the Rust implementations.
Note: when compiling, I found that AMD processors were more efficient than Intel, although I only had two systems to test: my personal PC with an AMD R7 5700X (8C/16T), overclocked, and an Intel Xeon Silver 4210 (10C/20T). That may be down to processor clock speed.
There were a few more contributors, such as @hidny, who inspired ideas.
My guides are at pbj4/polycubes#1
Main thread where I posted some timings while going through the compile optimisations - mikepound/opencubes#52
Hope this helps
Tony