hzdr / paris

Portable and Accelerated 3D Reconstruction tool for radiation based Imaging Systems

License: GNU General Public License v3.0

C++ 78.41% Cuda 14.71% CMake 6.88%

paris's Issues

Question: Synchronization in stage and not in framework ddrf

I am currently writing the paper, and I am wondering why the synchronization between the stages is no longer performed through the queue but inside every stage.
As far as I understand, this leads to a lot of code duplication, e.g. save_pop and push in each stage. With this design, users who want to use ddrf have to take care of the synchronization themselves. With a thread-safe queue this would not be necessary.

I have probably missed the point so far. It would be nice if you could explain it to me. ;)
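
For illustration, the kind of thread-safe queue meant here could look roughly like the sketch below. It is a minimal example and not ddrf's actual queue: the class name `blocking_queue` and its `push`/`pop` members are made up for this issue. The point is only that the locking lives in one place, so the stages themselves need no synchronization code.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical illustration; names are assumptions, not ddrf's API.
template <class T>
class blocking_queue
{
public:
    // Enqueue an item and wake one waiting consumer.
    void push(T item)
    {
        {
            std::lock_guard<std::mutex> lock{mutex_};
            items_.push(std::move(item));
        }
        not_empty_.notify_one();
    }

    // Block until an item is available, then remove and return it.
    T pop()
    {
        std::unique_lock<std::mutex> lock{mutex_};
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop();
        return item;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};
```

A producing stage would then only call `push` and a consuming stage only `pop`, with all waiting and locking hidden inside the queue.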

Add support for OpenMP

The current program only supports parallelization by using CUDA. The user should also be able to run it on the Xeon Phi architecture.

Slices disappearing when the volume is not evenly divisible by the number of subvolumes

As in the title. Example: The output volume has 422 slices and is reconstructed by an 8 subvolume reconstruction. 422 / 8 = 52.75

This leads to some weird offsets. The program doesn't crash but the output volume is missing some slices.
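
One possible way to handle the remainder is sketched below. This is only an assumption about how the split could be done, not the current implementation: `subvolume_range` and `split_slices` are hypothetical names. The first `slices % parts` subvolumes each receive one extra slice, so the counts always sum to the full volume.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: a contiguous block of slices.
struct subvolume_range
{
    std::size_t first_slice; // index of the first slice in this subvolume
    std::size_t slice_count; // number of slices in this subvolume
};

// Split `slices` into `parts` contiguous ranges.
// The first (slices % parts) ranges get one slice more than the rest.
std::vector<subvolume_range> split_slices(std::size_t slices, std::size_t parts)
{
    std::vector<subvolume_range> ranges;
    if (parts == 0) {
        return ranges; // nothing to split into
    }

    const std::size_t base = slices / parts;
    const std::size_t remainder = slices % parts;

    std::size_t first = 0;
    for (std::size_t i = 0; i < parts; ++i) {
        const std::size_t count = base + (i < remainder ? 1 : 0);
        ranges.push_back({first, count});
        first += count;
    }
    return ranges;
}
```

For the example above, `split_slices(422, 8)` yields six subvolumes with 53 slices followed by two with 52, which adds up to 422 again.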

Implementation of real geometry parameters

Adaptation of reconstruction volume orientation

Adhere to modern C++ style.

snake_case instead of camelCase
4 spaces instead of tabs
exceptions everywhere

Perform performance tests with updated GLADOS

As described in the (poetic) title

life sign

Hello Jan. Can I still reach you via this board? I would like to work with the Paris and Glados programs again in the near future and want to check whether you can help me through this channel. Best regards... André.

Integrate CPT_2016_Extend-FDK

The results look promising, so they should be integrated into one of the next releases.

Utilize symmetries

The reconstruction algorithm is the bottleneck of the application. In order to decrease computation time we should utilize horizontal and/or vertical symmetries.

Improve backend interfaces

After refactoring there are still two major interface issues:

There are too many assumptions about the (accelerator) buffer types in the frontend. There is no guarantee that the buffer type will always expose unique_ptr-like behavior. AFAIK OpenCL uses an opaque buffer type, not pointers, thus making ddrf's unique_ptr clone unusable.
The backend interface is not consistent. Sometimes the buffer is passed directly, sometimes the projection structure itself. This should be cleaned up.

Refactor the code

This is a prerequisite for #15, #16 and #20. The current program is tightly coupled with the CUDA runtime, even in places where this wouldn't be necessary (e.g. geometry calculation, loading and saving). In order to support platforms other than CUDA this coupling should be removed.

Change licence to GPL

Internal discussions revealed that we should use GPL instead of the EUPL.

Artifact generation with a large number of subvolumes

A volume size of 4.4 GiB requires up to 8 subvolumes during reconstruction on standard consumer GPUs. Reconstruction succeeds but generates artifacts on some slices. This doesn't happen on e.g. Tesla GPUs, which need a smaller number of subvolumes.

Saving crashes with active ROI

Syntax correction

Change:
./ddafa [geometry,input,output] --quality 2

To
./ddafa [geometry,input,output] --quality n

Add support for small CPU RAM sizes

Reconstructing reasonably large volumes (1025x1025x422, float) currently fails on systems with little CPU RAM, as the out-of-memory killer will stop the process.

Add preprocessing support

attenuation calculation
inclination correction

These have to be done manually at the moment, it would be nice to have support in ddafa as well.

ddafa crashes randomly after reconstruction on multiple GPUs

As the error messages vary from run to run (cuFFT errors, cudaMemcpy errors, ...), it seems something is wrong with the memcpys.

Add generic support

The program should be able to run on platforms without CUDA, OpenCL or OpenMP (hence, CPU only).

Analysing the FDK cuda code

As Matthias is currently Jan's supervisor on the ZiH (TU Dresden) side, I asked him (and Stephan Boden too) to analyze the current code and to propose enhancements if necessary. Let's go!

Modify the name of the program

... can we find something in the Star Trek universe, as I proposed as an alternative for ddrf?

Update documentation for 0.3.0

Before releasing 0.3.0 the documentation should be updated to include the new features.

Add Header specifications to the descriptions

Header size and Storage Version

Rework the calculation of needed GPU memory

See title. Currently we don't take the pitch into consideration, which might explain some of the issues we are currently experiencing when allocating large volumes.
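
To illustrate the effect, here is a rough host-side sketch of how padding each row to an alignment boundary changes the required size. The function names and the 512-byte alignment used in the example are assumptions for illustration only. On a real device the pitch is whatever `cudaMallocPitch` reports and should be queried rather than computed.

```cpp
#include <cstddef>

// Round row_bytes up to the next multiple of alignment (alignment must be > 0).
constexpr std::size_t pitched_row_bytes(std::size_t row_bytes, std::size_t alignment)
{
    return ((row_bytes + alignment - 1) / alignment) * alignment;
}

// Bytes needed for a width x height x depth volume when every row is padded.
// The alignment is a placeholder; the real value is device dependent.
constexpr std::size_t pitched_volume_bytes(std::size_t width, std::size_t height,
                                           std::size_t depth, std::size_t element_size,
                                           std::size_t alignment)
{
    return pitched_row_bytes(width * element_size, alignment) * height * depth;
}
```

With an assumed alignment of 512 bytes, a row of 1025 floats (4100 bytes) is padded to 4608 bytes, so the padded volume is noticeably larger than `width * height * depth * sizeof(float)`.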

Change build system to CMake

The current build system relies on Eclipse CDT's .cproject files being shared. This is a nuisance once those files are changed. We should generate those files once with CMake and otherwise ignore them.

Improve GPU memory handling

Currently we are constantly allocating and freeing the memory needed for the projections on the GPU. We should use some sort of memory pool instead to prevent unnecessary waiting between kernel executions.
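
A minimal sketch of the pooling idea follows. It is an assumption about one possible design, not existing code: `buffer_pool` is a hypothetical name, and host-side `std::vector<float>` buffers stand in for device allocations so that the example runs without CUDA. Buffers are handed back to the pool instead of being freed, so a new allocation only happens when no free buffer is available.

```cpp
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical sketch: host vectors stand in for GPU projection buffers.
class buffer_pool
{
public:
    using buffer = std::vector<float>;

    explicit buffer_pool(std::size_t elements_per_buffer)
    : elements_{elements_per_buffer}
    {}

    // Hand out a free buffer if one exists; allocate only when the pool is empty.
    buffer acquire()
    {
        std::lock_guard<std::mutex> lock{mutex_};
        if (free_.empty()) {
            ++allocations_;
            return buffer(elements_);
        }
        buffer b = std::move(free_.back());
        free_.pop_back();
        return b;
    }

    // Return a buffer to the pool for later reuse instead of freeing it.
    void release(buffer b)
    {
        std::lock_guard<std::mutex> lock{mutex_};
        free_.push_back(std::move(b));
    }

    // Number of real allocations performed so far.
    std::size_t allocations() const
    {
        std::lock_guard<std::mutex> lock{mutex_};
        return allocations_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<buffer> free_;
    std::size_t elements_;
    std::size_t allocations_ = 0;
};
```

In steady state every projection would reuse a previously released buffer, so the allocation count stays at the number of buffers in flight rather than growing with the number of projections.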

Correct version number 0.2.0

Add usability features

It would be nice to be able to specify a region of interest interactively.

Default vs. Manual (interactive communication)
Default: Geometry, Input, Output
Manual: Geometry, Input, Output, ROI, GPU and subvolume distribution choice, projection position/angle file, tbc
Set Quality (Number of projections to be processed)

Improve CPU memory handling

Currently the CPU loads many more projections than needed, which leads to problems on systems with smaller RAM sizes. We should try to load projections (more or less) on the fly instead of keeping hundreds of them in a buffer.

Abandon GLADOS pipelines

Recent profiling runs have shown that GLADOS pipelines no longer benefit the development of PARIS. The CUDA and OpenCL backends can easily emulate the sequential order of operations by using streams / OpenCL pipelines, while the "each stage in its own thread" approach is really bad for the OpenMP backend in general.

Additionally, 50% - 90% of the runtime (depending on the backend) is spent in the backprojection kernel, forcing the other stages to wait and thus consuming more resources than necessary.

I therefore propose the removal of GLADOS pipelines from the PARIS program.

Please add more description

Papers (Kak et al, Hou et al ..... all papers we looked for and used)
Link to Tobias Conference Paper
Link to Tobias CPC Paper (as far as available :-))
Add Interpolation Method,
Data processing strategy
Memory management
etc.
