pwoznicki / autoradiomics Goto Github PK

View Code? Open in Web Editor NEW

32.0 3.0 7.0 19.02 MB

The easiest tool for experimenting with radiomics features.

License: Apache License 2.0

Dockerfile 0.28% Python 97.53% Makefile 0.67% Jinja 0.48% Starlark 1.05%

medical-image-processing medical-image-computing radiomics

autoradiomics's People

Contributors

Stargazers

Watchers

Forkers

xiaoxiu0906 yjseol94 stephano41 zy20030535 harel-coffee radreports laqua-stack

autoradiomics's Issues

Add code generation for segmentation with nnUNet

Example code from Supplementary S1 does not work.

Hello,

I just noticed that your small example in the Supplement referred to as Figure S1 does not work as the Inferrer function has been modified in the meantime.

I was just wondering how we can evaluate the best-trained model on the test data. as this might not be included in the WORC example.

Best,
Jonas

bug: Installing statsmodel 0.13.2 (fixed dependency) fails on python 3.11.

Trying to install statsmodels==0.13.2 fails on python 3.11.

This is likely related to statsmodels/statsmodels#8868.

Error log

```shell Collecting statsmodels==0.13.2 (from autorad==0.2.6) Downloading statsmodels-0.13.2.tar.gz (17.9 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.9/17.9 MB 67.7 MB/s eta 0:00:00 Installing build dependencies ... done Getting requirements to build wheel ... error error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [151 lines of output]
:19: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
:218: DeprecationWarning:

    `numpy.distutils` is deprecated since NumPy 1.23.0, as a result
    of the deprecation of `distutils` itself. It will be removed for
    Python >= 3.12. For older Python versions it will remain present.
    It is recommended to use `setuptools < 60.0` for those Python versions.
    For more details, see:
      https://numpy.org/devdocs/reference/distutils_status_migration.html
  
  
  
  Error compiling Cython file:
  ------------------------------------------------------------
  ...
      # smooth
      for i in range(n):
          prev = (n+i-1) % n
  
          # s[t-m] = xhat[prev, 2+m-1]
          yhat[i] = (xhat[prev, 0] * xhat[prev, 1]**phi) + xhat[prev, 2+m-1]
                                                         ^
  ------------------------------------------------------------
  
  statsmodels/tsa/exponential_smoothing/_ets_smooth.pyx:156:55: Cannot assign type 'npy_float64 complex' to 'float64_t' (alias of 'double')
  
  Error compiling Cython file:
  ------------------------------------------------------------
  ...
  
          # s[t-m] = xhat[prev, 2+m-1]
          yhat[i] = (xhat[prev, 0] * xhat[prev, 1]**phi) + xhat[prev, 2+m-1]
          # l_t = a * (y_t - s_t-m) + (1-a) * (l_t-1 * b_t-1**phi)
          xhat[i, 0] = (alpha * (y[i] - xhat[prev, 2+m-1])
                        + (1 - alpha) * (xhat[prev, 0] * xhat[prev, 1]**phi))
                        ^
  ------------------------------------------------------------
  
  statsmodels/tsa/exponential_smoothing/_ets_smooth.pyx:159:22: Cannot assign type 'npy_float64 complex' to 'float64_t' (alias of 'double')
  
  Error compiling Cython file:
  ------------------------------------------------------------
  ...
          # l_t = a * (y_t - s_t-m) + (1-a) * (l_t-1 * b_t-1**phi)
          xhat[i, 0] = (alpha * (y[i] - xhat[prev, 2+m-1])
                        + (1 - alpha) * (xhat[prev, 0] * xhat[prev, 1]**phi))
          # b_t = (b*) * (l_t / l_t-1) + (1 - (b*)) * b_t-1**phi
          xhat[i, 1] = (beta_star * (xhat[i, 0] / xhat[prev, 0])
                        + (1 - beta_star) * xhat[prev, 1]**phi)
                        ^
  ------------------------------------------------------------
  
  statsmodels/tsa/exponential_smoothing/_ets_smooth.pyx:162:22: Cannot assign type 'npy_float64 complex' to 'float64_t' (alias of 'double')
  
  Error compiling Cython file:
  ------------------------------------------------------------
  ...
      # smooth
      for i in range(n):
          prev = (n+i-1) % n
  
          # s[t-m] = xhat[prev, 2+m-1]
          yhat[i] = (xhat[prev, 0] * xhat[prev, 1]**phi) * xhat[prev, 2+m-1]
                                                         ^
  ------------------------------------------------------------
  
  statsmodels/tsa/exponential_smoothing/_ets_smooth.pyx:191:55: Cannot assign type 'npy_float64 complex' to 'float64_t' (alias of 'double')
  
  Error compiling Cython file:
  ------------------------------------------------------------
  ...
  
          # s[t-m] = xhat[prev, 2+m-1]
          yhat[i] = (xhat[prev, 0] * xhat[prev, 1]**phi) * xhat[prev, 2+m-1]
          # l_t = a * (y_t / s_t-m) + (1-a) * (l_t-1 * b_t-1**phi)
          xhat[i, 0] = (alpha * (y[i] / xhat[prev, 2+m-1])
                        + (1 - alpha) * (xhat[prev, 0] * xhat[prev, 1]**phi))
                        ^
  ------------------------------------------------------------
  
  statsmodels/tsa/exponential_smoothing/_ets_smooth.pyx:194:22: Cannot assign type 'npy_float64 complex' to 'float64_t' (alias of 'double')
  
  Error compiling Cython file:
  ------------------------------------------------------------
  ...
          # l_t = a * (y_t / s_t-m) + (1-a) * (l_t-1 * b_t-1**phi)
          xhat[i, 0] = (alpha * (y[i] / xhat[prev, 2+m-1])
                        + (1 - alpha) * (xhat[prev, 0] * xhat[prev, 1]**phi))
          # b_t = (b*) * (l_t / l_t-1) + (1 - (b*)) * b_t-1**phi
          xhat[i, 1] = (beta_star * (xhat[i, 0] / xhat[prev, 0])
                        + (1 - beta_star) * xhat[prev, 1]**phi)
                        ^
  ------------------------------------------------------------
  
  statsmodels/tsa/exponential_smoothing/_ets_smooth.pyx:197:22: Cannot assign type 'npy_float64 complex' to 'float64_t' (alias of 'double')
  Compiling statsmodels/tsa/_stl.pyx because it changed.
  Compiling statsmodels/tsa/holtwinters/_exponential_smoothers.pyx because it changed.
  Compiling statsmodels/tsa/exponential_smoothing/_ets_smooth.pyx because it changed.
  Compiling statsmodels/tsa/_innovations.pyx because it changed.
  Compiling statsmodels/tsa/regime_switching/_hamilton_filter.pyx because it changed.
  Compiling statsmodels/tsa/regime_switching/_kim_smoother.pyx because it changed.
  Compiling statsmodels/tsa/innovations/_arma_innovations.pyx because it changed.
  Compiling statsmodels/nonparametric/linbin.pyx because it changed.
  Compiling statsmodels/robust/_qn.pyx because it changed.
  Compiling statsmodels/nonparametric/_smoothers_lowess.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_initialization.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_representation.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_kalman_filter.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_filters/_conventional.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_filters/_inversions.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_filters/_univariate.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_filters/_univariate_diffuse.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_kalman_smoother.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_smoothers/_alternative.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_smoothers/_classical.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_smoothers/_conventional.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_smoothers/_univariate.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_smoothers/_univariate_diffuse.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_simulation_smoother.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_cfa_simulation_smoother.pyx because it changed.
  Compiling statsmodels/tsa/statespace/_tools.pyx because it changed.
  [ 1/26] Cythonizing statsmodels/nonparametric/_smoothers_lowess.pyx
  [ 2/26] Cythonizing statsmodels/nonparametric/linbin.pyx
  [ 3/26] Cythonizing statsmodels/robust/_qn.pyx
  [ 4/26] Cythonizing statsmodels/tsa/_innovations.pyx
  [ 5/26] Cythonizing statsmodels/tsa/_stl.pyx
  [ 6/26] Cythonizing statsmodels/tsa/exponential_smoothing/_ets_smooth.pyx
  Traceback (most recent call last):
    File "/home/flaqua/.conda/envs/umm-nieren/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
      main()
    File "/home/flaqua/.conda/envs/umm-nieren/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/home/flaqua/.conda/envs/umm-nieren/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
      return hook(config_settings)
             ^^^^^^^^^^^^^^^^^^^^^
    File "/tmp/pip-build-env-_wel6lhq/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
      return self._get_build_requires(config_settings, requirements=['wheel'])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/tmp/pip-build-env-_wel6lhq/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
      self.run_setup()
    File "/tmp/pip-build-env-_wel6lhq/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 487, in run_setup
      super().run_setup(setup_script=setup_script)
    File "/tmp/pip-build-env-_wel6lhq/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 311, in run_setup
      exec(code, locals())
    File "<string>", line 344, in <module>
    File "/tmp/pip-build-env-_wel6lhq/overlay/lib/python3.11/site-packages/Cython/Build/Dependencies.py", line 1154, in cythonize
      cythonize_one(*args)
    File "/tmp/pip-build-env-_wel6lhq/overlay/lib/python3.11/site-packages/Cython/Build/Dependencies.py", line 1321, in cythonize_one
      raise CompileError(None, pyx_file)
  Cython.Compiler.Errors.CompileError: statsmodels/tsa/exponential_smoothing/_ets_smooth.pyx
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.



</p>
</details> 

/

doc: Document pyradiomics wrapping (IBSI compliance)

As @fedorov pointed out it does not become immediately clear, that we wrap autoradiomics around pyradiomics

A PR would close #34 aswell.

Add tutorial

doc: Currently installing on Python 3.11 does not work from pip/source with the given dependencies.

We should clearly state that currently only Python 3.10 is supported.

Add docs

BUG: Time and memory inefficient concating in pandas on every case.

In the feature extraction, we concat a pd.DataFrame for every case. AFAIK this construction of a pd.DataFrame leads to a new memory allocation (and copying) every time, which is highly memory inefficient. Especially, when parallelized on many CPUs, combined with the already memory intensive forking in joblib this can lead to OOM-Events (and is slow of course). Wouldn't it be more convenient to return only the feature set, that is currently processed.

AutoRadiomics/autorad/feature_extraction/extractor.py

Lines 109 to 115 in e475893

 try: 

 feature_vector = self.extractor.execute(image_path, mask_path) 

 except ValueError: 

 log.error(f"Error extracting features for case {id_}") 

 raise ValueError(f"Error extracting features for case {id_}") 

 # copy the all the metadata for the case 

 feature_series = pd.concat([case, pd.Series(feature_vector)])

These are subsequently collected in results anyways:

AutoRadiomics/autorad/feature_extraction/extractor.py

Lines 135 to 144 in e475893

 def get_features_parallel(self, num_threads: int) -> pd.DataFrame: 

 df = self.dataset.get_df() 

 try: 

 with Parallel(n_jobs=num_threads) as parallel: 

 results = parallel( 

 delayed(self._get_features_for_single_case)(df_row) 

 for _, df_row in df.iterrows() 

 ) 

 feature_df = pd.concat(results, axis=1).T 

 return feature_df

bug: Installing autorad from pypy fails due to an issue with scipy==1.9 dependency.

Installing autorad from pypy fails with:

Collecting scipy==1.9 (from autorad==0.2.6)
  Using cached scipy-1.9.0.tar.gz (42.0 MB)
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'error'

  error: subprocess-exited-with-error
  
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [67 lines of output]
      The Meson build system
      Version: 0.62.2
      Source dir: /tmp/pip-install-l7sjy5uk/scipy_7ef57d7b7c744ee4b01c593bebc81ccd
      Build dir: /tmp/pip-install-l7sjy5uk/scipy_7ef57d7b7c744ee4b01c593bebc81ccd/.mesonpy-mwri95hi/build
      Build type: native build
      Project name: SciPy
      Project version: 1.9.0
      C compiler for the host machine: cc (gcc 11.4.0 "cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
      C linker for the host machine: cc ld.bfd 2.38
      C++ compiler for the host machine: c++ (gcc 11.4.0 "c++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
      C++ linker for the host machine: c++ ld.bfd 2.38
      Host machine cpu family: x86_64
      Host machine cpu: x86_64
      Compiler for C supports arguments -Wno-unused-but-set-variable: YES
      Library m found: YES
      
      ../../meson.build:41:0: ERROR: Unknown compiler(s): [['gfortran'], ['flang'], ['nvfortran'], ['pgfortran'], ['ifort'], ['g95']]
      The following exception(s) were encountered:
      Running "gfortran --version" gave "[Errno 2] No such file or directory: 'gfortran'"
      Running "gfortran -V" gave "[Errno 2] No such file or directory: 'gfortran'"
      Running "flang --version" gave "[Errno 2] No such file or directory: 'flang'"
      Running "flang -V" gave "[Errno 2] No such file or directory: 'flang'"
      Running "nvfortran --version" gave "[Errno 2] No such file or directory: 'nvfortran'"
      Running "nvfortran -V" gave "[Errno 2] No such file or directory: 'nvfortran'"
      Running "pgfortran --version" gave "[Errno 2] No such file or directory: 'pgfortran'"
      Running "pgfortran -V" gave "[Errno 2] No such file or directory: 'pgfortran'"
      Running "ifort --version" gave "[Errno 2] No such file or directory: 'ifort'"
      Running "ifort -V" gave "[Errno 2] No such file or directory: 'ifort'"
      Running "g95 --version" gave "[Errno 2] No such file or directory: 'g95'"
      Running "g95 -V" gave "[Errno 2] No such file or directory: 'g95'"
      
      A full log can be found at /tmp/pip-install-l7sjy5uk/scipy_7ef57d7b7c744ee4b01c593bebc81ccd/.mesonpy-mwri95hi/build/meson-logs/meson-log.txt
      + meson setup --native-file=/tmp/pip-install-l7sjy5uk/scipy_7ef57d7b7c744ee4b01c593bebc81ccd/.mesonpy-native-file.ini -Ddebug=false -Doptimization=2 --prefix=/home/flaqua/.conda/envs/umm-nieren /tmp/pip-install-l7sjy5uk/scipy_7ef57d7b7c744ee4b01c593bebc81ccd /tmp/pip-install-l7sjy5uk/scipy_7ef57d7b7c744ee4b01c593bebc81ccd/.mesonpy-mwri95hi/build
      Traceback (most recent call last):
        File "/home/flaqua/.conda/envs/umm-nieren/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/flaqua/.conda/envs/umm-nieren/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/flaqua/.conda/envs/umm-nieren/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-dhlsvuw_/overlay/lib/python3.11/site-packages/mesonpy/__init__.py", line 909, in get_requires_for_build_wheel
          with _project(config_settings) as project:
        File "/home/flaqua/.conda/envs/umm-nieren/lib/python3.11/contextlib.py", line 137, in __enter__
          return next(self.gen)
                 ^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-dhlsvuw_/overlay/lib/python3.11/site-packages/mesonpy/__init__.py", line 888, in _project
          with Project.with_temp_working_dir(
        File "/home/flaqua/.conda/envs/umm-nieren/lib/python3.11/contextlib.py", line 137, in __enter__
          return next(self.gen)
                 ^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-dhlsvuw_/overlay/lib/python3.11/site-packages/mesonpy/__init__.py", line 547, in with_temp_working_dir
          yield cls(source_dir, tmpdir, build_dir)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-dhlsvuw_/overlay/lib/python3.11/site-packages/mesonpy/__init__.py", line 463, in __init__
          self._configure(reconfigure=bool(build_dir) and not native_file_mismatch)
        File "/tmp/pip-build-env-dhlsvuw_/overlay/lib/python3.11/site-packages/mesonpy/__init__.py", line 494, in _configure
          self._meson(
        File "/tmp/pip-build-env-dhlsvuw_/overlay/lib/python3.11/site-packages/mesonpy/__init__.py", line 477, in _meson
          return self._proc('meson', *args)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-dhlsvuw_/overlay/lib/python3.11/site-packages/mesonpy/__init__.py", line 472, in _proc
          subprocess.check_call(list(args))
        File "/home/flaqua/.conda/envs/umm-nieren/lib/python3.11/subprocess.py", line 413, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['meson', 'setup', '--native-file=/tmp/pip-install-l7sjy5uk/scipy_7ef57d7b7c744ee4b01c593bebc81ccd/.mesonpy-native-file.ini', '-Ddebug=false', '-Doptimization=2', '--prefix=/home/flaqua/.conda/envs/umm-nieren', '/tmp/pip-install-l7sjy5uk/scipy_7ef57d7b7c744ee4b01c593bebc81ccd', '/tmp/pip-install-l7sjy5uk/scipy_7ef57d7b7c744ee4b01c593bebc81ccd/.mesonpy-mwri95hi/build']' returned non-zero exit status 1.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

I think this is related to this issue in scipy build 1.9.0.
scipy/scipy#16737 (comment)

I would propose bumping scipy==1.9.3 in the requirements.txt. This should work and we wouldn't need to this too much about changes in scipy at the current point in time.

Preprocessing features fails during machine learning

Describe the bug

Trying to use Machine Learning in the self-hosted webapp, as well as in example_WORC.ipynb fails.

Steps/Code to Reproduce

import pandas as pd
from pathlib import Path
from autorad.external.download_WORC import download_WORCDatabase

# Set where we will save our data and results
base_dir = Path.cwd() / "autorad_tutorial"
data_dir = base_dir / "data"
result_dir = base_dir / "results"
data_dir.mkdir(exist_ok=True, parents=True)
result_dir.mkdir(exist_ok=True, parents=True)

%load_ext autoreload
%autoreload 2

download data (it may take a few minutes)
download_WORCDatabase(
dataset="Desmoid",
data_folder=data_dir,
n_subjects=100,
)

from autorad.utils.preprocessing import get_paths_with_separate_folder_per_case

# create a table with all the paths
paths_df = get_paths_with_separate_folder_per_case(data_dir, relative=True)
paths_df.sample(5)


from autorad.data.dataset import ImageDataset
from autorad.feature_extraction.extractor import FeatureExtractor
import logging

logging.getLogger().setLevel(logging.CRITICAL)

image_dataset = ImageDataset(
    paths_df,
    ID_colname="ID",
    root_dir=data_dir,
)

# Let's take a look at the data, plotting random 10 cases
image_dataset.plot_examples(n=10, window=None)

extractor = FeatureExtractor(image_dataset, extraction_params="MR_default.yaml")
feature_df = extractor.run()

feature_df.head()

label_df = pd.read_csv(data_dir / "labels.csv")
label_df.sample(5)

from autorad.data.dataset import FeatureDataset

merged_feature_df = feature_df.merge(label_df, left_on="ID",
    right_on="patient_ID", how="left")
feature_dataset = FeatureDataset(
    merged_feature_df,
    target="diagnosis",
    ID_colname="ID"
)

splits_path = result_dir / "splits.json"
feature_dataset.split(method="train_val_test", save_path=splits_path)

from autorad.models.classifier import MLClassifier
from autorad.training.trainer import Trainer

models = MLClassifier.initialize_default_sklearn_models()
print(models)

trainer = Trainer(
    dataset=feature_dataset,
    models=models,
    result_dir=result_dir,
    experiment_name="Fibromatosis_vs_sarcoma_classification",
)
trainer.run_auto_preprocessing(
        selection_methods=["boruta"],
        oversampling=False,
        )

Expected Results

Initialising the trainer and running preprocessing on the features

Actual Results

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [15], in <cell line: 7>()
      1 trainer = Trainer(
      2     dataset=feature_dataset,
      3     models=models,
      4     result_dir=result_dir,
      5     experiment_name="Fibromatosis_vs_sarcoma_classification",
      6 )
----> 7 trainer.run_auto_preprocessing(
      8         selection_methods=["boruta"],
      9         oversampling=False,
     10         )

File ~/AutoRadiomics/autorad/training/trainer.py:78, in Trainer.run_auto_preprocessing(self, oversampling, selection_methods)
     70 preprocessor = Preprocessor(
     71     normalize=True,
     72     feature_selection_method=selection_method,
     73     oversampling_method=oversampling_method,
     74 )
     75 try:
     76     preprocessed[selection_method][
     77         oversampling_method
---> 78     ] = preprocessor.fit_transform(self.dataset.data)
     79 except AssertionError:
     80     log.error(
     81         f"Preprocessing with {selection_method} and {oversampling_method} failed."
     82     )

File ~/AutoRadiomics/autorad/preprocessing/preprocessor.py:66, in Preprocessor.fit_transform(self, data)
     64 result_y = {}
     65 all_features = X.train.columns.tolist()
---> 66 X_train_trans, y_train_trans = self.pipeline.fit_transform(
     67     X.train, y.train
     68 )
     69 self.selected_features = self.pipeline["select"].selected_features(
     70     column_names=all_features
     71 )
     72 result_X["train"] = pd.DataFrame(
     73     X_train_trans, columns=self.selected_features
     74 )

File ~/miniconda3/envs/AutoRadiomics/lib/python3.10/site-packages/sklearn/pipeline.py:434, in Pipeline.fit_transform(self, X, y, **fit_params)
    432 fit_params_last_step = fit_params_steps[self.steps[-1][0]]
    433 if hasattr(last_step, "fit_transform"):
--> 434     return last_step.fit_transform(Xt, y, **fit_params_last_step)
    435 else:
    436     return last_step.fit(Xt, y, **fit_params_last_step).transform(Xt)

File ~/AutoRadiomics/autorad/feature_selection/selector.py:47, in CoreSelector.fit_transform(self, X, y)
     44 def fit_transform(
     45     self, X: np.ndarray, y: np.ndarray
     46 ) -> tuple[np.ndarray, np.ndarray]:
---> 47     self.fit(X, y)
     48     return X[:, self.selected_columns], y

File ~/AutoRadiomics/autorad/feature_selection/selector.py:124, in BorutaSelector.fit(self, X, y, verbose)
    122 with warnings.catch_warnings():
    123     warnings.simplefilter("ignore")
--> 124     model.fit(X, y)
    125 self.selected_columns = np.where(model.support_)[0].tolist()
    126 if not self.selected_columns:

File ~/miniconda3/envs/AutoRadiomics/lib/python3.10/site-packages/boruta/boruta_py.py:201, in BorutaPy.fit(self, X, y)
    188 def fit(self, X, y):
    189     """
    190     Fits the Boruta feature selection with the provided estimator.
    191 
   (...)
    198         The target values.
    199     """
--> 201     return self._fit(X, y)

File ~/miniconda3/envs/AutoRadiomics/lib/python3.10/site-packages/boruta/boruta_py.py:251, in BorutaPy._fit(self, X, y)
    249 def _fit(self, X, y):
    250     # check input params
--> 251     self._check_params(X, y)
    252     self.random_state = check_random_state(self.random_state)
    253     # setup variables for Boruta

File ~/miniconda3/envs/AutoRadiomics/lib/python3.10/site-packages/boruta/boruta_py.py:517, in BorutaPy._check_params(self, X, y)
    513 """
    514 Check hyperparameters as well as X and y before proceeding with fit.
    515 """
    516 # check X and y are consistent len, X is Array and y is column
--> 517 X, y = check_X_y(X, y)
    518 if self.perc <= 0 or self.perc > 100:
    519     raise ValueError('The percentile should be between 0 and 100.')

File ~/miniconda3/envs/AutoRadiomics/lib/python3.10/site-packages/sklearn/utils/validation.py:964, in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, estimator)
    961 if y is None:
    962     raise ValueError("y cannot be None")
--> 964 X = check_array(
    965     X,
    966     accept_sparse=accept_sparse,
    967     accept_large_sparse=accept_large_sparse,
    968     dtype=dtype,
    969     order=order,
    970     copy=copy,
    971     force_all_finite=force_all_finite,
    972     ensure_2d=ensure_2d,
    973     allow_nd=allow_nd,
    974     ensure_min_samples=ensure_min_samples,
    975     ensure_min_features=ensure_min_features,
    976     estimator=estimator,
    977 )
    979 y = _check_y(y, multi_output=multi_output, y_numeric=y_numeric)
    981 check_consistent_length(X, y)

File ~/miniconda3/envs/AutoRadiomics/lib/python3.10/site-packages/sklearn/utils/validation.py:746, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    744         array = array.astype(dtype, casting="unsafe", copy=False)
    745     else:
--> 746         array = np.asarray(array, order=order, dtype=dtype)
    747 except ComplexWarning as complex_warning:
    748     raise ValueError(
    749         "Complex data not supported\n{}\n".format(array)
    750     ) from complex_warning

ValueError: could not broadcast input array from shape (60,1015) into shape (60,)

Add SHAP explanations

Add intuitive visualizations using https://github.com/slundberg/shap

IBSI feature correspondence

For the purposes of harmonization and to enable comparisons across toolkits (e.g., against pyradiomics), it would be quite helpful if there was assignment of IBSI codes to those features that match the definitions in https://arxiv.org/abs/1612.07003. Did you consider doing something like this?

example_WORC.ipynb not being up to date with the repository

Describe the bug

In example_WORC.ipynb there are function calls that do not work due to code in the repository being changed while the example_WORC.ipynb code wasn't updated to reflect those changes

Steps/Code to Reproduce

import pandas as pd
from pathlib import Path
from autorad.external.download_WORC import download_WORCDatabase

# Set where we will save our data and results
base_dir = Path.cwd() / "autorad_tutorial"
data_dir = base_dir / "data"
result_dir = base_dir / "results"
data_dir.mkdir(exist_ok=True, parents=True)
result_dir.mkdir(exist_ok=True, parents=True)

%load_ext autoreload
%autoreload 2

download data (it may take a few minutes)
download_WORCDatabase(
dataset="Desmoid",
data_folder=data_dir,
n_subjects=100,
)



from autorad.data.utils import get_paths_with_separate_folder_per_case  # 1

# create a table with all the paths
paths_df = get_paths_with_separate_folder_per_case(data_dir, relative=True)
paths_df.sample(5)


from autorad.data.dataset import ImageDataset
from autorad.feature_extraction.extractor import FeatureExtractor
import logging

logging.getLogger().setLevel(logging.CRITICAL)

image_dataset = ImageDataset(
    paths_df,
    ID_colname="ID",
    root_dir=data_dir,
)

# Let's take a look at the data, plotting random 10 cases
image_dataset.plot_examples(n=10, window=None)

extractor = FeatureExtractor(image_dataset, extraction_params="default_MR.yaml") # 2
feature_df = extractor.run()

Expected Results

1: Importing the function get_paths_with_separate_folder_per_case

2: Using default_MR.yaml as value for extraction_params

Actual Results

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 from autorad.data.utils import get_paths_with_separate_folder_per_case
      3 # create a table with all the paths
      4 paths_df = get_paths_with_separate_folder_per_case(data_dir, relative=True)

ModuleNotFoundError: No module named 'autorad.data.utils'

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [18], in <cell line: 1>()
----> 1 extractor = FeatureExtractor(image_dataset, extraction_params="default_MR.yaml")
      2 feature_df = extractor.run()

File ~/AutoRadiomics/autorad/feature_extraction/extractor.py:41, in FeatureExtractor.__init__(self, dataset, feature_set, extraction_params, n_jobs)
     39 self.dataset = dataset
     40 self.feature_set = feature_set
---> 41 self.extraction_params = self._get_extraction_param_path(
     42     extraction_params
     43 )
     44 log.info(f"Using extraction params from {self.extraction_params}")
     45 self.n_jobs = set_n_jobs(n_jobs)

File ~/AutoRadiomics/autorad/feature_extraction/extractor.py:55, in FeatureExtractor._get_extraction_param_path(self, extraction_params)
     53     result = default_extraction_param_dir / extraction_params
     54 else:
---> 55     raise ValueError(
     56         f"Extraction parameter file {extraction_params} not found."
     57     )
     58 return result

ValueError: Extraction parameter file default_MR.yaml not found.

Fix

1: change from autorad.data.utils to from autorad.utils.preprocessing
2: change extractor = FeatureExtractor(image_dataset, extraction_params="default_MR.yaml") to extractor = FeatureExtractor(image_dataset, extraction_params="MR_default.yaml")

Thanks for your great job!

	try:
	feature_vector = self.extractor.execute(image_path, mask_path)
	except ValueError:
	log.error(f"Error extracting features for case {id_}")
	raise ValueError(f"Error extracting features for case {id_}")
	# copy the all the metadata for the case
	feature_series = pd.concat([case, pd.Series(feature_vector)])

	def get_features_parallel(self, num_threads: int) -> pd.DataFrame:
	df = self.dataset.get_df()
	try:
	with Parallel(n_jobs=num_threads) as parallel:
	results = parallel(
	delayed(self._get_features_for_single_case)(df_row)
	for _, df_row in df.iterrows()
	)
	feature_df = pd.concat(results, axis=1).T
	return feature_df

pwoznicki / autoradiomics Goto Github PK

autoradiomics's People

Contributors

Stargazers

Watchers

Forkers

autoradiomics's Issues

Describe the bug

Steps/Code to Reproduce

Expected Results

Actual Results

Describe the bug

Steps/Code to Reproduce

Expected Results

Actual Results

Fix

Recommend Projects

Recommend Topics

Recommend Org