hzdr / paris
Portable and Accelerated 3D Reconstruction tool for radiation based Imaging Systems
License: GNU General Public License v3.0
I am currently writing the paper, and I am wondering why the synchronization between the stages is no longer performed through the queue but inside every stage.
As far as I understand, this leads to a lot of code duplication, e.g. save_pop and push in each stage. This way, a user who wants to use ddrf has to take care of the synchronization themselves. With a thread-safe queue this would not be necessary.
I have probably missed the point so far. It would be nice if you could explain it to me. ;)
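To make the question more concrete, here is a minimal sketch of what "synchronization through the queue" could look like. The class name blocking_queue, its members, and the capacity limit are assumptions made for illustration only; this is not ddrf's actual queue or its save_pop implementation.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical bounded blocking queue (illustration only, not ddrf code).
// All locking lives here, so the stages just call push() and pop().
template <class T>
class blocking_queue
{
public:
    explicit blocking_queue(std::size_t limit) : limit_{limit}
    {
        assert(limit > 0); // a zero-sized queue would block every push forever
    }

    // Blocks while the queue is full, then stores the item.
    void push(T item)
    {
        std::unique_lock<std::mutex> lock{mutex_};
        not_full_.wait(lock, [this] { return items_.size() < limit_; });
        items_.push(std::move(item));
        lock.unlock();
        not_empty_.notify_one();
    }

    // Blocks while the queue is empty, then removes and returns the oldest item.
    T pop()
    {
        std::unique_lock<std::mutex> lock{mutex_};
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop();
        lock.unlock();
        not_full_.notify_one();
        return item;
    }

private:
    std::size_t limit_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};
```

With something like this, a stage would only see push and pop, and the locking would not have to be repeated in every stage.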
The current program only supports parallelization by using CUDA. The user should also be able to run it on the Xeon Phi architecture.
As in the title. Example: The output volume has 422 slices and is reconstructed in 8 subvolumes. 422 / 8 = 52.75, which is not a whole number of slices per subvolume.
This leads to some weird offsets. The program doesn't crash but the output volume is missing some slices.
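One possible way to handle the remainder, sketched under assumptions: give every subvolume the integer quotient and hand the leftover slices to the first few subvolumes. The names slice_range and split_slices are hypothetical and do not come from the program. For the example above, 422 = 8 * 52 + 6, so six subvolumes would get 53 slices and two would get 52.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one subvolume: first slice index and slice count.
struct slice_range
{
    std::size_t first;
    std::size_t count;
};

// Splits `slices` into `parts` contiguous ranges whose sizes differ by at most one.
std::vector<slice_range> split_slices(std::size_t slices, std::size_t parts)
{
    assert(parts > 0);
    const std::size_t base = slices / parts;  // e.g. 422 / 8 = 52
    const std::size_t extra = slices % parts; // e.g. 422 % 8 = 6

    std::vector<slice_range> ranges;
    ranges.reserve(parts);

    std::size_t first = 0;
    for (std::size_t i = 0; i < parts; ++i)
    {
        // The first `extra` ranges take one additional slice each.
        const std::size_t count = base + (i < extra ? 1 : 0);
        ranges.push_back({first, count});
        first += count;
    }
    return ranges;
}
```

Because each range starts where the previous one ends, the ranges cover all slices without gaps, which is the property the missing slices suggest is currently violated.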
Adaptation of reconstruction volume orientation
As described in the (poetic) title
Hello Jan. Can I still reach you via this board? I would like to work with the Paris and Glados programs again in the near future and want to check whether you can help me via this channel. Best regards... André.
The results look promising, so they should be integrated into one of the next releases.
The reconstruction algorithm is the bottleneck of the application. In order to decrease computation time we should utilize horizontal and/or vertical symmetries.
After refactoring there are still two major interface issues:
unique_ptr-like behavior. AFAIK OpenCL uses an opaque buffer type, not pointers, thus making ddrf's unique_ptr clone unusable.
Internal discussions revealed that we should use GPL instead of the EUPL.
A volume size of 4.4 GiB requires up to 8 subvolumes during reconstruction on standard consumer GPUs. Reconstruction succeeds but generates artifacts on some slices. This doesn't happen on e.g. Tesla GPUs, which need a smaller number of subvolumes.
Change:
./ddafa [geometry,input,output] --quality 2
To
./ddafa [geometry,input,output] --quality n
Reconstructing reasonably large volumes (1025x1025x422, float) currently fails on systems with little CPU RAM, as the out-of-memory killer will stop the process.
These have to be done manually at the moment; it would be nice to have support in ddafa as well.
As the error messages are exchangeable (cuFFT errors, cudaMemcpy errors, ...) it seems something is wrong with the memcpys.
The program should be able to run on platforms without CUDA, OpenCL or OpenMP (hence, CPU only).
As Matthias is currently Jan's supervisor on the ZiH (TU Dresden) side, I asked him (and Stephan Boden too) to analyze the current code and to propose enhancements if necessary. Let's go!
... do we find something in the StarTrek universe like I proposed as alternative for ddrf?
Before releasing 0.3.0 the documentation should be updated to include the new features.
Header size and Storage Version
See title. Currently we don't take the pitch into consideration, which might lead to some of the issues with allocating large volumes that we are currently experiencing.
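To illustrate why the pitch matters for the size estimate, here is a rough sketch. The function name pitched_bytes and the 512-byte alignment used in the comments are assumptions for illustration; on a real device the pitch is chosen by the CUDA runtime and reported by cudaMallocPitch or cudaMalloc3D, so it should be queried rather than guessed.

```cpp
#include <cassert>
#include <cstddef>

// Estimates the bytes needed for a width x height x depth volume when every
// row is padded up to a multiple of `alignment` bytes.
// `alignment` is a hypothetical stand-in for the pitch granularity of the device.
std::size_t pitched_bytes(std::size_t width, std::size_t height, std::size_t depth,
                          std::size_t elem_size, std::size_t alignment)
{
    assert(alignment > 0);
    const std::size_t row = width * elem_size; // unpadded row size in bytes
    // Round the row size up to the next multiple of the alignment.
    const std::size_t pitch = (row + alignment - 1) / alignment * alignment;
    return pitch * height * depth;
}
```

For a 1025x1025x422 float volume the unpadded size is 1,773,455,000 bytes. With the assumed 512-byte alignment each 4100-byte row grows to 4608 bytes, giving 1,993,190,400 bytes, roughly 12% more than an estimate that ignores the pitch.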
The current build system relies on Eclipse CDT's .cproject files being shared. This is a nuisance once those files are changed. We should generate those files once with CMake and otherwise ignore them.
Currently we are constantly allocating and freeing the memory needed for the projections on the GPU. We should use some sort of memory pool instead to prevent unnecessary waiting between kernel executions.
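A minimal sketch of the pool idea, with loud assumptions: the class buffer_pool and its interface are hypothetical, and host memory stands in for device memory so that the example runs without a GPU. A CUDA version would allocate the buffers once with cudaMalloc and free them with cudaFree when the pool is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical pool of equally sized projection buffers (illustration only).
// All buffers are allocated once up front and then recycled.
class buffer_pool
{
public:
    buffer_pool(std::size_t buffers, std::size_t floats_per_buffer)
    {
        for (std::size_t i = 0; i < buffers; ++i)
            free_.push_back(std::unique_ptr<float[]>(new float[floats_per_buffer]()));
    }

    // Hands out a free buffer, or nullptr if all buffers are currently in use.
    std::unique_ptr<float[]> acquire()
    {
        std::lock_guard<std::mutex> lock{mutex_};
        if (free_.empty())
            return nullptr;
        auto buffer = std::move(free_.back());
        free_.pop_back();
        return buffer;
    }

    // Gives a buffer back to the pool so it can be reused without reallocation.
    void release(std::unique_ptr<float[]> buffer)
    {
        assert(buffer != nullptr);
        std::lock_guard<std::mutex> lock{mutex_};
        free_.push_back(std::move(buffer));
    }

    // Number of buffers that are currently free.
    std::size_t available() const
    {
        std::lock_guard<std::mutex> lock{mutex_};
        return free_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<float[]>> free_;
};
```

Returning nullptr when the pool is exhausted is just one design choice; blocking until a buffer is released would be an alternative.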
It would be nice to be able to specify a region of interest interactively.
Currently the CPU loads many more projections than needed, which leads to problems on systems with smaller RAM sizes. We should try to load projections (more or less) on the fly instead of keeping hundreds of them in a buffer.
Recent profiling has shown that GLADOS pipelines no longer benefit the development of PARIS. The CUDA and OpenCL backends can easily emulate the sequential order of operations by using streams / OpenCL pipelines, while the "each stage in its own thread" approach is really bad for the OpenMP backend in general.
Additionally, 50% - 90% of the runtime (depending on the backend) is spent in the backprojection kernel, forcing the other stages to wait and thus consuming more resources than necessary.
I therefore propose the removal of GLADOS pipelines from the PARIS program.
Volumes can become quite large. It would be better to save the subvolumes as soon as possible instead of the whole volume.
The program is currently limited to NVIDIA GPUs. It would be nice if it could also run on AMD and Intel GPUs, so there should be an OpenCL implementation as well.
The license texts and program strings need to be updated.