Giter Club home page Giter Club logo

tike's Introduction

Tike

Tike is a toolbox for tomographic reconstruction of 3D objects from ptychography data.

The aim of Tike is to provide fast and accurate implementations of a wide variety of reconstruction algorithms, and to provide a common platform for the synchrotron research community.

Current Features (March 2021)

Scan

  • Lissajous and 2D spiral trajectories
  • hexagonal and rectangular grids

Ptychography

  • FFT-based operator with linear position interpolation
  • single-energy
  • multiple probes per diffraction pattern (multi-mode probes)
  • probe variation correction (eigen probes)
  • one shared (multi-modal) probe per angular view
  • multi-GPU conjugate-gradient descent solver
  • multi-GPU least-squares + gradient descent solver

Laminography

  • USFFT-based operator for cubic field-of-view
  • single tilt angle
  • multi-GPU conjugate-gradient descent solver

Alignment

  • Lanczos-based rotation and flow operators
  • Cross-correlation rigid alignment solver

Citations

This software has a DOI for use in citations.

Gursoy, Doga, and Ching, Daniel J. "Tike." Computer software. December 01, 2022. https://github.com/AdvancedPhotonSource/tike.git. https://doi.org/10.11578/dc.20230202.1.

tike's People

Contributors

a4894z avatar carterbox avatar dgursoy avatar mdw771 avatar nikitinvv avatar xiaodong-yu avatar xmap avatar yudongyao avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

tike's Issues

Default data types

Enforce the use of 32bit data types in order to make the creation plugins and extensions easier. Also, reduce the the complexity of tike.utils; I think there is some redundancy that could be removed.

New ePIE, DM, MLs, API based on the new needed signature in ptychopy.

1 Need to add more functionality in ptychopy.c.
2 The new calling sequence would be different from the original one, keeping the original one to keep the ability for doing reconstruction from passing a file path.Ex, ePIE_File(), ePIE_Array();
3 Two sets of API, one set is for step mode to check the middle results. Another set could be for the whole reconstruction process.

Separate cost functions into own module

I would suggest having a separate module for cost functions and for gradients. Any cost function and gradient only need fwd and adj operators to be computed. So users can define their own cost functions and gradients. It would be nice to have def cost_gaussian(fwd,adj,...), def gradient_gaussian(fwd,adj,..),etc. I think the same concept is used in ODL.https://odlgroup.github.io/odl/ @nikitinvv

Memory leak when restarting reconstruction?

I seems that restarting a reconstruction has a memory overhead that increases with each restart. This causes out of memory error. For example, restarting the reconstruction 10 times with 10 iterations each time consumes more memory than doing one reconstruction with 100 iterations.

Clean all references to num_batch from code

The number of batches, is a much more natural way of expressing how to divide something in the context of multiple-devices. Given the size of a batch, you have to do some math to divide the size between all the devices.

However, the real problem is that these two methods of expressing the same thing: num_batch vs batch_size are both used which is confusing for users.

Compare linear interpolation with Fourier interpolation in Ptycho operator

Currently, linear interpolation in the near plane domain is performed in order to handle floating-point scan positions. Another and accurate way of doing so is to perform shifts in the frequency domain. Say a scan position is given as s = si+sf where si is integer part and sf is floating part. We need to compute the Fourier transform of Q\psi(x+si+sf). By using Fourier properties, F(Q\psi(x+si+sf))(\xi)=F(Q\psi(x+si))(\xi) * exp(-2\pi i \xi sf), and the floating-point position problem boils down to multiplication in the frequency domain.
It would be good to compare two approaches (linear interpolation and multiplication in the frequency domain) in terms of accuracy and performance on CPU and GPU.

Normalize the object array to 1 on average

In order to avoid amplitude ambiguity in psi and keep it 1 on average, on each iteration we can compute |FQψ| and compare its norm with the norm of data. Then compute the constant c = norm(|FQψ) / norm(sqrt(data)) and do Q/=c. So this way we simply replace prb by prb/c and ψ by ψc, which does not affect the functional |FQ/c ψc| = |FQψ|). @nikitinvv

Allow arbitrary grid size for USFFT/Lamino Operators

Currently, the Laminography operator has a fixes size ratio between projections sampling and field of view sampling. i.e. for projections sized 32x32, the reconstructed volume is 32x32x32 and is sampled in a 64x64x64 frequency space. We should break this fixed ratio so that we can reconstruct downsampled fields of view i.e. 32x32 projections through a 8x8x8 field of view or 32x32 projections in a 32x32x32 frequency space for 32x32x32 field of view.

Handling of Jupyter Notebooks in the repository

This issue is to discuss if we need a version control on Notebooks in the repository.

The Notebooks are meant to be interactive, which means we encourage users to play with them. So it is a question if we need versioning for these files? I'd rather have static Python scripts for tutorials that devs have control over and optimize in time, and have ipynb extensions listed in the .gitignore file.

multislice ptychography

A lot of ptycho-tomography/laminography data we acquired recently are from thick objects for which the single-slice ptychography model no longer holds. We need to incorporate the multislice model to get better reconstruction.

Unifying communicator API

There are a lot of these types of switches in the solver implementations. The Comm API should be reworked to avoid these switches.

if comm.use_mpi:
    cost = comm.Allreduce_reduce(cost, 'cpu')
else:
    cost = comm.reduce(cost, 'cpu')

Reorder scanning coordinates

theta, h, v coordinates convention should be changed to theta, v, h coordinates because of the following:

  • This is a more natural order for data collection where h is a slower axis than v
  • This matches the ordering for tomopy input arrays

v0.4.0 profiling benchmark

35.249 <module>  profile_admm.py:3
└─ 35.225 admm_profile_workload  profile_admm.py:59
   ├─ 29.160 admm  tike/tike.py:88
x   │  ├─ 15.025 forward  tike/tomo.py:256
   │  ├─ 12.937 reconstruct  tike/ptycho.py:433
   │  │  └─ 12.929 grad  tike/ptycho.py:345
   │  │     ├─ 1.873 combine_grids  tike/ptycho.py:204
   │  │     │  └─ 1.616 shift  scipy/ndimage/interpolation.py:474
x   │  │     ├─ 1.768 angle  numpy/lib/function_base.py:1914
   │  │     ├─ 1.688 uncombine_grids  tike/ptycho.py:274
   │  │     │  └─ 1.535 shift  scipy/ndimage/interpolation.py:474
   │  │     ├─ 1.069 ifft2  numpy/fft_intel/fftpack.py:880
   │  │     │  └─ 1.069 ifftn  numpy/fft_intel/fftpack.py:692
   │  │     ├─ 1.062 fft2  numpy/fft_intel/fftpack.py:791
   │  │     │  └─ 1.062 fftn  numpy/fft_intel/fftpack.py:594
x   │  │     └─ 0.956 pad  numpy/lib/arraypad.py:1094
   │  │        ├─ 0.525 _append_const  numpy/lib/arraypad.py:111
   │  │        └─ 0.371 _prepend_const  numpy/lib/arraypad.py:77
   │  └─ 0.798 reconstruct  tike/tomo.py:161
   │     └─ 0.793 recon  tomopy/recon/algorithm.py:75
   │        └─ 0.777 _dist_recon  tomopy/recon/algorithm.py:362
   │           └─ 0.665 __exit__  concurrent/futures/_base.py:580
   │              └─ 0.665 shutdown  concurrent/futures/thread.py:133
   │                 └─ 0.665 join  threading.py:1022
   │                    └─ 0.665 _wait_for_tstate_lock  threading.py:1060

  • Replace tike.forward with tomopy.forward
  • Replace np.angle with np.arctan2
  • Replace np.pad with tike.fast_pad

There are still 6 seconds of time inside tike.ptycho.grad which need to be investigated line-by-line.

Let's do lazy imports

tike has the potential for having a lot of optional dependencies, so instead of importing all external solvers at the module level, let's do lazy imports, where they are only imported if actually needed.

import numpy  # importing here requires it to be installed even if never used

def recon(x, algorithm='default'):
    """Reconstruct some stuff."""
    if algorithm is 'default':
        return numpy.default_method(x)
    elif algorithm is 'optional':
        from option import optional_method
        return option.optional_method(x)  # this method is only imported if chosen

Unique h,v, positions at each angle

This change will remove the ability of ptycho.reconstruct to process more than one view at a time, and require adding a for loop around the call to ptycho.reconstruct in tike.admm.

This is because our language for describing the probe geometry is not unified between the tomography and ptychography problems. Each is described by only part of the parameters: theta or (v and h).

I think it is strange that our tomography solver accepts 3D input, but our ptychography solver accepts only 2D input. The mismatch between the two is a constraint that I do not like.

Using logging statements for development

We want to set up a system for logging information about the progress of reconstruction algorithms such as convergence criteria and which number of iterations have been completed. This system should probably use the standard logger of python and print statements for storing information optionally.

Instead of changing the API of tike to accommodate development, we can use the following tools to capture data in text files and display messages to stout.

Example

Here's an example that demonstrates these concepts.

dev_logging.py:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""Define an example of the logging system to be used in tike."""

import numpy as np
import logging
# First set the logging level to logging.INFO.
logging.basicConfig(level=logging.INFO)
# Then we need to grab a reference to the existing logger or make a new one.
logger = logging.getLogger(__name__)

if __name__ == '__main__':
    # Go to https://pyformat.info/ for a summary of how to use format()
    print("{} is a random number.\nThis string will go to the standard output "
          "stream".format(np.random.randint(0, 100)))
    logger.info("This is an info level statement.\nLogging statements go to "
                "standard error.")
    logger.debug("This is a debugging level statement.\nIt appears when the "
                 "logging level is det to DEBUG or higher.")
    if __debug__:
        print("This statement appears if Python was unoptimized.\n"
              "i.e. if it was started without the -O option.")
    logger.info("Good bye!")

Run the script in a bash terminal:

$ python -O dev_logging.py > some_data.txt
INFO:__main__:This is an info level statement.
INFO:__main__:Good bye!

some_data.txt:

46 is a random number.
This string will go to the standard outputstream

Modify the ptychopy API to accept Python types such as arrays, dictionaries, integers, so that Tike can pass these parameters to ptychopy

An ideal interface would be something like this:

import ptychopy

# Function accepts inputs such as integers, tuples, and numpy arrays
ptychopy.epieinit(
    jobID='sim512',
    beamSize=110e-9,
    scanDims=(30, 30),
    step=(50e-9, 50e-9),
    i=60,
    size=512,
    lamda=2.4796837508399954e-10,
    dx_d=172e-6,
    z=1,
    simulate=1,
    blind=0,
)
ptychopy.epiestep(its=3)
data = ptychopy.epieresobj()
ptychopy.epiepost()


def epieinit(psi, probe, data, scan, ....):
    """Initializes a reconstruction for the ePIE algorithm.

    Parameters
    ----------
    psi : (..., nz, n) complex64
        The complex wavefront modulation of the object.
    probe (..., 1, 1, nprobe, probe_shape, probe_shape) : complex64
        The complex illuminatio function.
    data (..., nframe, detector_shape, detector_shape) : float32
        The square of the absolute value of `farplane` summed over `fly` and
        `modes`.
    scan : (..., nscan, 2) float32
        Coordinates of the minimum corner of the probe grid for each
        measurement in the coordinate system of psi. Vertical coordinates
        first, horizontal coordinates second.

    """
    pass


def epiestep(num_iter=1):
    """Execute one step of ePIe"""
    pass

Use sphinx "substitutions" to remove redundant text in documentation

http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#substitutions
http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#substitution-definitions

We can use substitutions to only write docstrings once for parameters that are used for multiple functions:

Example:

"""Do something module

.. |theta_docstring| replace:: This text describes theta for all functions. 
"""
import numpy

def func_a(theta):
"""Something

Parameters
---------------
theta :
    |theta_docstring|

"""
def func_b(theta):
"""Something

Parameters
---------------
theta :
    |theta_docstring|

"""

This approach is beneficial because it can reduce source code length and makes maintenance easier because updating one string updates many strings.

Reduce memory footprint of USFFT and Lamino Operators

It looks like the largest memory allocation events are for fftshift, fftplan, and pad.

fftshift

While cupy.fft.fftn has an overwrite_x parameter, fftshift does not. This is because implementation of fftshift is completely written in Python and relies on another Python-only implementation of numpy.roll.

We should write our own in-place fftshift, ifftshift, and roll functions using a CUDA Kernel because I don't think the NumPy API was designed to handle in-place reordering of dimensions.

fftplan

There is a CuPy option to avoid plans and use a series of 1D plans. These plans appear to be 686MiB, which seems quite large, but maybe they are worth it.

pad

This padding operation in eq2us may be avoidable if we do some modular arithmetic in the gather function? Likewise, there is an _unpad function which could be avoided after scatter if this function used modular arithmetic.

    Fe = xp.pad(Fe0, m, mode='wrap')
    F = gather(xp, Fe, x, n, m, mu)

Cuda kernels are written inefficiently

Cuda kernels in usfft.cu, convolution.cu are not optimal.

They are heavy and use a huge number of registers which may significantly slowdown the code. They should be split into smaller ones.

  1. If there exist 'if,else' statement like for fwd/adj operator then it is better to split it into 2, like it was at the beginning.

  2. Loops with many iterations inside the kernels should be avoided by adding new threads, this way allowing the scheduler to switch threads more efficiently, like it was at the beginning

  3. Non-sequential and uncoallesced memory access in the loops make the code slower, the following loop structure is unacceptable from cuda optimization point of view. It also looks unreadable.

 // for each image
  for (int ti = blockIdx.z; ti < nimage; ti += gridDim.z) {
    // for each scan position
    for (int ts = blockIdx.y; ts < nscan; ts += gridDim.y) {
      ....
      // for x,y coords in patch
      for (int py = blockIdx.x; py < patch_shape; py += gridDim.x) {
        for (int px = threadIdx.x; px < patch_shape; px += blockDim.x) {

Use 3d thread blocks and grids associated with each array dimension, like it was at the beginning. This will give you natural coalesced memory access.

  1. Block size should be a power of 2, optimal combinations (1024,1,1), (256,4,4), (32,32,1), (16,16,4).. What we have in the code:
block = (min(self.scatter_kernel.max_threads_per_block, (2 * m)**3),)

Say m=3, then block = 216 - not optimal.

Would It be better to return to my initial kernels implementations?

Reduce the number of arguments by packing params into dictionaries

We want to reduce the number of parameters for outward facing functions or maybe all functions in general by packing the long lists of parameters into topical dictionaries. For examples:

Parameters theta, v, h could become trajectory by defining

trajectory = {
    'theta' : None,
    'v' : None,
    'h' : None,
}

This kind of packing will definitely be employed for the reconstruction parameters of the tomo.reconstruct and pytcho.reconstruct because those need to be passed through tike.admm.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.