eve-ning / glcm-cupy


GLCM in CUDA

License: MIT License

Python 100.00%

cuda cupy glcm python computer-vision feature-engineering

glcm-cupy's Introduction

I am Data Magician Eve-ning! 🪄

glcm-cupy's People

Contributors

Stargazers

Watchers

Forkers

jcfaracco akn0717 anthony-jpg

glcm-cupy's Issues

Add support to tox.

Is it worth adding to this project?

https://tox.wiki/en/latest/

Reduction of original image size

Hello, really great work implementing this. One question:

I understand the image shrinks because the sliding window cannot be centred on pixels near the edges. Wouldn't there be a way around that by padding the outside of the original image with an artificial value, i.e. a number not used in the grayscale binning range? Let's say 0 is the padding value, so the scale of the image would be 1-16 instead of 0-15. The GLCM rows and columns for value 0 would then have to be removed before the Haralick features are calculated.
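The padding idea above can be sketched in plain NumPy. This is only an illustration of the proposal, not how glcm-cupy works: `glcm_east` is a hypothetical helper that counts east-neighbour pairs in a single window, and the window radius of 1 is an arbitrary choice.

```python
import numpy as np


def glcm_east(window, levels):
    """Count (pixel, east neighbour) co-occurrences in one window.

    Hypothetical helper for illustration; not part of glcm-cupy.
    """
    glcm = np.zeros((levels, levels), dtype=np.int64)
    i = window[:, :-1].ravel()  # each pixel that has an east neighbour
    j = window[:, 1:].ravel()   # the corresponding east neighbours
    np.add.at(glcm, (i, j), 1)  # unbuffered add, so repeated pairs accumulate
    return glcm


rng = np.random.default_rng(0)
im = rng.integers(0, 16, size=(6, 6))  # 16 grey levels stored as 0..15

radius = 1
shifted = im + 1  # move to 1..16 so that 0 is free to mark padding
padded = np.pad(shifted, radius, mode="constant", constant_values=0)

# 3x3 window centred on the original top-left pixel, which now sits at (1, 1)
window = padded[0:3, 0:3]

glcm = glcm_east(window, levels=17)  # 17 levels: padding value + 16 grey levels
glcm_valid = glcm[1:, 1:]            # drop the row and column of the padding value
```

One consequence worth noting: windows near the border contain fewer valid pairs than interior windows, so any normalisation would have to divide by `glcm_valid.sum()` rather than by a fixed pair count.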

Low Priority: Removal of view_as_windows from CUCIM if redundant

Currently, in glcm_cupy/glcm/glcm.py, we have

from cucim.skimage.util.shape import \
    view_as_windows as view_as_windows_cucim

from glcm_cupy.utils import view_as_windows_cp

...

if USE_CUCIM:
    ij = view_as_windows_cucim(
        im_chn, (self._diameter, self._diameter)
    )
else:
    ij = view_as_windows_cp(im_chn, (self._diameter, self._diameter))

This branch may be redundant if view_as_windows_cucim does not provide any meaningful speedup over view_as_windows_cp.

view_as_windows_cp is an adaptation of skimage.util.view_as_windows, with cp.ndarray substituted for np.ndarray.
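For context, a window view of this kind is usually built with a stride trick rather than by copying data. The sketch below shows the idea for a 2D array in NumPy; `view_as_windows_2d` is a hypothetical name, and the actual view_as_windows_cp may differ. CuPy also has an `as_strided` under `cupy.lib.stride_tricks`, but treating it as a drop-in replacement here is an assumption.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def view_as_windows_2d(arr, window_shape):
    """Return a read-only (H-wh+1, W-ww+1, wh, ww) view of all windows.

    Hypothetical sketch; no pixel data is copied, only strides are reused.
    """
    wh, ww = window_shape
    h, w = arr.shape
    out_shape = (h - wh + 1, w - ww + 1, wh, ww)
    # Stepping to the next window and stepping inside a window
    # both move by the array's own row/column strides.
    out_strides = arr.strides + arr.strides
    return as_strided(arr, shape=out_shape, strides=out_strides, writeable=False)


im = np.arange(30).reshape(5, 6)
windows = view_as_windows_2d(im, (3, 3))
```

Because both variants would only create a view, a benchmark comparing them would mostly measure overhead around the call, which is consistent with the suspicion that one of them is redundant.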

IndexError: tuple index out of range in glcm_cupy/glcm/glcm.py

Hello,

I am facing this error. Error details and my usage are shown below:

g = GLCM(
  File "/anaconda/envs/cucim/lib/python3.9/site-packages/glcm_cupy/glcm_base.py", line 107, in run
    self.progress = tqdm(total=self.glcm_cells(im),
  File "/anaconda/envs/cucim/lib/python3.9/site-packages/glcm_cupy/glcm/glcm.py", line 82, in glcm_cells
    return np.prod(self.glcm_shape(im[..., 0])) *
  File "/anaconda/envs/cucim/lib/python3.9/site-packages/glcm_cupy/glcm/glcm.py", line 90, in glcm_shape
    im_chn.shape[1] - 2 * self.step_size - 2 * self.radius)
IndexError: tuple index out of range

def get_glcm_gpu(patch):
    ar = np.asarray(patch)
    # g = GLCM(directions=(Direction.EAST, Direction.SOUTH_EAST, Direction.SOUTH, 
    #             Direction.SOUTH_WEST), bin_from=256, bin_to=16).run(patch)
    g = GLCM(
            directions=(Direction.EAST, Direction.SOUTH_EAST),
            bin_from=256, bin_to=16).run(ar)

    return g

patch is simply a grayscale image of shape (50, 50). Please help.
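Reading the traceback, a likely cause is the missing channel axis: glcm_cells indexes `im[..., 0]`, which turns a 2D (50, 50) array into a 1D one, so `im_chn.shape[1]` no longer exists. The sketch below reproduces that in NumPy alone and shows one possible workaround, adding a trailing channel axis before calling `run`. This is an interpretation of the traceback, not a confirmed diagnosis.

```python
import numpy as np

patch = np.zeros((50, 50), dtype=np.uint8)  # grayscale, no channel axis

# What the library appears to do internally: take the first channel.
im_chn = patch[..., 0]  # on a 2D input this is 1D, shape (50,)

try:
    im_chn.shape[1]
    reproduced = False
except IndexError:
    reproduced = True  # same "tuple index out of range" as in the report

# Possible workaround: give the image an explicit channel axis, (H, W, 1).
patch_3d = patch[..., np.newaxis]
first_channel = patch_3d[..., 0]  # back to (50, 50), so shape[1] exists
```

With this change, the call in the report would become `.run(ar[..., np.newaxis])`.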

Disable/enable features on demand

If you calculate GLCM today, it returns all 6 implemented features.

Most researchers or developers don't need every feature, and if you have a large set of images to process, computing all of them can mean huge memory consumption.

In my case I have a dataset of 2 GB and if I multiply by 6 (number of features), my code needs 8 GB at least to run. (Ok, I know... there are several workarounds for it 😃, but I'm imagining a real time scenario).

This feature would also be interesting because we could compare the GLCM speedup against scikit-image for each property separately.

Fix tqdm cupy ndarray compatibility

Currently the tqdm progress bar is broken when CuPy is used.

Solution

The main cause is that glcm_cells returned a cp.ndarray, which tqdm rejected.

Since the value does not need to be an ndarray (we're expecting an int), we cast it to int.
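A minimal sketch of the cast, using NumPy so that it runs without a GPU. `total_cells` is a hypothetical stand-in for glcm_cells, and the shape is made up; with CuPy the product would be a 0-d cp.ndarray instead of a NumPy scalar, and the same `int(...)` call is assumed to convert it.

```python
import numpy as np


def total_cells(shape) -> int:
    """Number of cells to process, as a plain Python int.

    Hypothetical stand-in for glcm_cells; the real method differs.
    """
    # np.prod returns an array-library scalar; int() strips that wrapper
    # so a progress bar receives an ordinary integer total.
    return int(np.prod(np.asarray(shape)))


n = total_cells((46, 46, 3))
```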

Implement CI/CD

I am currently trying to implement CI/CD. However, CUDA is a problem on GitHub Actions: CuPy fails to import on the runner.

https://github.com/Eve-ning/glcm-cupy/runs/7160528149?check_suite_focus=true

E   ================================================================
E   Failed to import CuPy.
E   
E   If you installed CuPy via wheels (cupy-cudaXXX or cupy-rocm-X-X), make sure that the package matches with the version of CUDA or ROCm installed.
E   
E   On Linux, you may need to set LD_LIBRARY_PATH environment variable depending on how you installed CUDA/ROCm.
E   On Windows, try setting CUDA_PATH environment variable.
E   
E   Check the Installation Guide for details:
E     https://docs.cupy.dev/en/latest/install.html
E   
E   Original error:
E     ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
E   ================================================================

See: #24
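One common workaround, which is not necessarily what this project chose, is to detect whether a usable GPU exists and skip the GPU tests when it does not. The sketch below assumes a hypothetical helper name, `gpu_available`; it relies only on importing CuPy and on `cupy.cuda.runtime.getDeviceCount()`.

```python
def gpu_available() -> bool:
    """Return True only if CuPy imports and reports at least one device.

    Hypothetical helper; any failure (missing libcuda, no driver,
    CuPy not installed) is treated as "no GPU".
    """
    try:
        import cupy as cp

        return cp.cuda.runtime.getDeviceCount() > 0
    except Exception:
        return False


HAS_GPU = gpu_available()
```

In a pytest suite, `HAS_GPU` could then feed `pytest.mark.skipif(not HAS_GPU, reason="no CUDA device")`, so hosted runners without CUDA still run the CPU-only tests.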

Hash-Comparison GLCM Integration Test

Hotfix for many issues

normalize_features should be normalized_features as they both exist
glcm and glcm_cross should explicitly call signatures. E.g. fn(explicit_arg=arg)
test__from_windows() fails and is redundant
test_from_2d_image fails due to missing 3rd dimension
test_image_tiff is redundant
GLCM._binner is now binner in utils

Replacing view_as_windows_np in glcm_cross breaks assertion

See glcm_cross.py make_windows. By uncommenting and using view_as_windows_cp, the unit tests fail for some reason.

The inputs and outputs of the two functions are the same
This only fails if diameter = input shape (only 1 GLCM window-to-window comparison)
The difference in output is significant, roughly 0 to 0.05 on most features.

Implement dim for multi-image batch processing

Currently we take (H, W, C) as input; we may expand this to (B, H, W, C).

B will act similarly to C, except that it will be the dimension for separate images instead of channels.

I believe this is the best order of dimensions, as it's similar to PyTorch's syntax.

PyTorch uses (N, Cout, Hout, Wout). However, I feel that putting C in the 2nd dimension doesn't make sense compared to how conventional image loaders order the dimensions: (H, W, C) or (W, H, C).
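Since B is meant to behave like C, one way to prototype this is to fold the batch axis into the channel axis, run the existing per-channel routine once, and unfold afterwards. The sketch below is an assumption about how that could look: `per_channel_stub` is a made-up stand-in for a routine that maps (H, W, C) to (H', W', C, F), and `run_batched` is a hypothetical wrapper.

```python
import numpy as np


def per_channel_stub(im):
    """Stand-in for a per-channel routine: (H, W, C) -> (H-2, W-2, C, 2).

    Made up for illustration; each channel is processed independently.
    """
    core = im[1:-1, 1:-1, :].astype(float)
    return np.stack([core, core ** 2], axis=-1)


def run_batched(ims, fn):
    """Apply a (H, W, C) routine to a (B, H, W, C) batch in one call."""
    b, h, w, c = ims.shape
    # Fold B into C: (B, H, W, C) -> (H, W, B, C) -> (H, W, B*C)
    folded = ims.transpose(1, 2, 0, 3).reshape(h, w, b * c)
    out = fn(folded)  # (H', W', B*C, ...)
    h2, w2 = out.shape[:2]
    # Unfold and move the batch axis back to the front.
    out = out.reshape(h2, w2, b, c, *out.shape[3:])
    return np.moveaxis(out, 2, 0)  # (B, H', W', C, ...)


rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(3, 8, 9, 2))
features = run_batched(batch, per_channel_stub)
```

This only works if the routine treats channels fully independently; anything that mixes channels would also mix images.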

Allow processing data directly from cupy input.

The example below does not work because the library expects NumPy data as input. The code should also accept CuPy data, especially for workflows that handle data directly on the GPU.

import cupy as cp
from PIL import Image

# Here, we load in the array
# We keep every 4th pixel along each axis (1/16 of the pixels), as the full image would take too long

ar = cp.asarray(Image.open("../../data/image.jpg"))[::4,::4]

# We may use the class variant to run GLCM
from glcm_cupy import GLCM, Direction

g = GLCM(
    directions=(Direction.EAST, Direction.SOUTH_EAST),
    bin_from=256, bin_to=16).run(ar)

# Alternatively, use the function variant
from glcm_cupy import glcm

g = glcm(ar, bin_from=256, bin_to=16)

# We yield the features using constants defined in conf
from glcm_cupy.conf import CONTRAST, CORRELATION, ASM

print(g[..., CONTRAST])
print(g[..., CORRELATION])
print(g[..., ASM])

# Alternatively, since these constants are simply integers
print(g[..., 0])
print(g[..., 1])
print(g[..., 2])

It raises a TypeError:

TypeError: `arr_in` must be a numpy ndarray
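Until CuPy input is supported, a stopgap is to copy the array to the host before calling the library, at the cost of a device-to-host transfer. The sketch below uses a hypothetical helper, `to_host`, that duck-types on the `.get()` method that `cupy.ndarray` provides, so it runs without CuPy installed; `FakeDeviceArray` exists only to exercise it here.

```python
import numpy as np


def to_host(arr):
    """Return a NumPy array for either a NumPy or a CuPy-like input.

    Hypothetical helper; relies on cupy.ndarray exposing .get(),
    which copies the data to host memory.
    """
    if isinstance(arr, np.ndarray):
        return arr
    get = getattr(arr, "get", None)
    if callable(get):
        return np.asarray(get())
    return np.asarray(arr)


class FakeDeviceArray:
    """Minimal stand-in for a device array, for demonstration only."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def get(self):
        return self._data.copy()


host = to_host(FakeDeviceArray([[1, 2], [3, 4]]))
```

In the example above, this would mean calling `.run(to_host(ar))`. It sidesteps the TypeError but does not address the request itself, which is to avoid the round trip through host memory.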