Giter Club home page Giter Club logo

Comments (13)

robbert-vdh avatar robbert-vdh commented on June 24, 2024 1

About how to implement a convolution: I rewrote your code to the following, and this seems to work, but I don't really know much about Gaussian convolutions, so please test if this indeed does what you intended.

<snip>

In case anyone runs into this in the future, keep in mind that does not actually compute convolution. Instead it computes cross correlation. Since the kernel is symmetrical those two will yield the same result, but you need to change the {kerx,kery} - n `A.div` 2 into n `A.div` 2 - {kerx,kery} to get convolution.

from accelerate.

tomsmeding avatar tomsmeding commented on June 24, 2024

First let me explain the error. Accelerate has a very strict separation between array computations (Acc) and scalar-level computations (Exp). You cannot perform array computations within an Exp context. This is also explained under "Avoiding nested parallelism" in the documentation. In your program, you're entering the Exp context by writing generate (A.constant (Z :. h :. w)); the operation that you then perform in that context is compute_re . compute_plat, which does array operations. This is disallowed, hence the error.

As for how to implement convolution: if the kernel size (m,n in your code) is known at Haskell compile time, then a 2-dimensional stencil (for usage, see the docs for 1-dimensional stencils) is without doubt the most runtime-efficient way of implementing this. Stencils in general (see I guess this wikipedia article, but not a super clear source) are literally made for this. However, with Accelerate's built-in stencil operations, the size of the kernel influences the type of the stencil operation, meaning that you cannot just have a size in two Int values and do a stencil of that size. Arbitrary-size built-in stencil operations have been asked for earlier (#426); they just haven't been implemented yet.

from accelerate.

tomsmeding avatar tomsmeding commented on June 24, 2024

About how to implement a convolution: I rewrote your code to the following, and this seems to work, but I don't really know much about Gaussian convolutions, so please test if this indeed does what you intended.

EDIT: as correctly observed by robbert-vdh below, if the kernel array is non-symmetric, this is technically cross-correlation, not convolution. (This is the same behaviour as the poster's original code.) Either reverse the kernel array in both dimensions, or replace ker{x,y} - {m,n} `A.div` 2 by {m,n} `A.div` 2 - ker{x,y}, to get convolution even for non-symmetric kernel arrays.

module Main where

import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Exp, Acc, Array, DIM2, DIM3, DIM4, Z(..), (:.)(..))
import qualified Data.Array.Accelerate.Interpreter as I


-- Note that I've removed the additional Int parameters; you can use `shape` to
-- request the size of an array.
gauss_convolution :: Acc (Array DIM2 Double) -> Acc (Array DIM2 Double) -> Acc (Array DIM2 Double)
gauss_convolution gauss_op image = compute_re gauss_op (compute_plat gauss_op image)

-- Add two dimensions on the bottom. This function replaces every cell in the
-- input image with a 2-dimensional array the size of gauss_op, thereby making
-- a 4-dimensional array.
compute_plat :: Acc (Array DIM2 Double) -> Acc (Array DIM2 Double) -> Acc (Array DIM4 Double)
compute_plat gauss_op image = A.generate (A.I4 h w n m) myfun
  where
    -- Size of the gaussian kernel. Use the pattern synonyms instead of `lift`
    -- and `unlift`; they are much nicer. Alternative notation here would be
    -- `Z_ ::. n ::. m`.
    A.I2 n m = A.shape gauss_op

    -- Size of the full image
    A.I2 h w = A.shape image

    -- Now this function is going to be called for the whole 4-dimensional
    -- array, which means that it gets the index in the image as the first two
    -- components and the index in the kernel as the last two components.
    --
    -- About yuan{x,y}:  `floor ((fromIntegral x :: Double) / 2)` is precisely
    --                   what `div` does. ;)
    --
    -- kery and kerx are what you called y1 and x1.
    myfun :: Exp DIM4 -> Exp Double
    myfun (A.I4 y x kery kerx) =
        image A.! A.I2 (A.min (h - 1) (A.max 0 (y + kery - n `A.div` 2)))
                       (A.min (w - 1) (A.max 0 (x + kerx - m `A.div` 2)))

-- Fold the 2-dimensional subarrays into a scalar again, bringing the
-- 4-dimensional array down to 2 dimensions again. The first argument is the
-- gaussian kernel.
compute_re :: Acc (Array DIM2 Double) -> Acc (Array DIM4 Double) -> Acc (Array DIM2 Double)
compute_re gauss_op pla =
    let A.I4 h w n m = A.shape pla
        -- Replicate the gaussian kernel over all positions in the image
        replicated_gauss_op :: Acc (Array DIM4 Double)
        replicated_gauss_op = A.replicate (A.I4 h w A.All_ A.All_) gauss_op
        -- Multiply the kernel elementwise with each neighbourhood in the image
        multiplied :: Acc (Array DIM4 Double)
        multiplied = A.zipWith (A.*) replicated_gauss_op pla
        -- Contract each neighbourhood (2-dimensional subarray) down to a
        -- 1-dimensional vector, so that we can fold it away in one go
        contracted :: Acc (Array DIM3 Double)
        contracted = A.reshape (A.I3 h w (n * m)) multiplied
    in -- Finally, sum this contracted vector to get the 2-dimensional result, the size of the original image.
       A.fold (A.+) 0 contracted

main :: IO ()
main = do
    let image = A.fromList (Z :. (7 :: Int) :. (8 :: Int))
                  [1, 0, 0, 0, 0, 0, 0, 1
                  ,0, 1, 0, 0, 0, 0, 1, 0
                  ,0, 0, 1, 0, 0, 1, 0, 0
                  ,0, 0, 0, 1, 1, 0, 0, 0
                  ,0, 0, 1, 0, 0, 1, 0, 0
                  ,0, 1, 0, 0, 0, 0, 1, 0
                  ,1, 0, 0, 0, 0, 0, 0, 1]
        gauss_op = A.fromList (Z :. (3 :: Int) :. (5 :: Int))
                     (map (/ 17)
                        [0, 0, 2, 0, 0
                        ,1, 3, 5, 3, 1
                        ,0, 0, 2, 0, 0])

    print $ I.runN gauss_convolution gauss_op image

from accelerate.

wujilingfeng avatar wujilingfeng commented on June 24, 2024
the

First let me explain the error. Accelerate has a very strict separation between array computations (Acc) and scalar-level computations (Exp). You cannot perform array computations within an Exp context. This is also explained under "Avoiding nested parallelism" in the documentation. In your program, you're entering the Exp context by writing generate (A.constant (Z :. h :. w)); the operation that you then perform in that context is compute_re . compute_plat, which does array operations. This is disallowed, hence the error.

As for how to implement convolution: if the kernel size (m,n in your code) is known at Haskell compile time, then a 2-dimensional stencil (for usage, see the docs for 1-dimensional stencils) is without doubt the most runtime-efficient way of implementing this. Stencils in general (see I guess this wikipedia article, but not a super clear source) are literally made for this. However, with Accelerate's built-in stencil operations, the size of the kernel influences the type of the stencil operation, meaning that you cannot just have a size in two Int values and do a stencil of that size. Arbitrary-size built-in stencil operations have been asked for earlier (#426); they just haven't been implemented yet.

Thanks, It's effective!

from accelerate.

wujilingfeng avatar wujilingfeng commented on June 24, 2024

About how to implement a convolution: I rewrote your code to the following, and this seems to work, but I don't really know much about Gaussian convolutions, so please test if this indeed does what you intended.

module Main where

import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Exp, Acc, Array, DIM2, DIM3, DIM4, Z(..), (:.)(..))
import qualified Data.Array.Accelerate.Interpreter as I


-- Note that I've removed the additional Int parameters; you can use `shape` to
-- request the size of an array.
gauss_convolution :: Acc (Array DIM2 Double) -> Acc (Array DIM2 Double) -> Acc (Array DIM2 Double)
gauss_convolution gauss_op image = compute_re gauss_op (compute_plat gauss_op image)

-- Add two dimensions on the bottom. This function replaces every cell in the
-- input image with a 2-dimensional array the size of gauss_op, thereby making
-- a 4-dimensional array.
compute_plat :: Acc (Array DIM2 Double) -> Acc (Array DIM2 Double) -> Acc (Array DIM4 Double)
compute_plat gauss_op image = A.generate (A.I4 h w n m) myfun
  where
    -- Size of the gaussian kernel. Use the pattern synonyms instead of `lift`
    -- and `unlift`; they are much nicer. Alternative notation here would be
    -- `Z_ ::. n ::. m`.
    A.I2 n m = A.shape gauss_op

    -- Size of the full image
    A.I2 h w = A.shape image

    -- Now this function is going to be called for the whole 4-dimensional
    -- array, which means that it gets the index in the image as the first two
    -- components and the index in the kernel as the last two components.
    --
    -- About yuan{x,y}:  `floor ((fromIntegral x :: Double) / 2)` is precisely
    --                   what `div` does. ;)
    --
    -- kery and kerx are what you called y1 and x1.
    myfun :: Exp DIM4 -> Exp Double
    myfun (A.I4 y x kery kerx) =
        image A.! A.I2 (A.min (h - 1) (A.max 0 (y + kery - n `A.div` 2)))
                       (A.min (w - 1) (A.max 0 (x + kerx - m `A.div` 2)))

-- Fold the 2-dimensional subarrays into a scalar again, bringing the
-- 4-dimensional array down to 2 dimensions again. The first argument is the
-- gaussian kernel.
compute_re :: Acc (Array DIM2 Double) -> Acc (Array DIM4 Double) -> Acc (Array DIM2 Double)
compute_re gauss_op pla =
    let A.I4 h w n m = A.shape pla
        -- Replicate the gaussian kernel over all positions in the image
        replicated_gauss_op :: Acc (Array DIM4 Double)
        replicated_gauss_op = A.replicate (A.I4 h w A.All_ A.All_) gauss_op
        -- Multiply the kernel elementwise with each neighbourhood in the image
        multiplied :: Acc (Array DIM4 Double)
        multiplied = A.zipWith (A.*) replicated_gauss_op pla
        -- Contract each neighbourhood (2-dimensional subarray) down to a
        -- 1-dimensional vector, so that we can fold it away in one go
        contracted :: Acc (Array DIM3 Double)
        contracted = A.reshape (A.I3 h w (n * m)) multiplied
    in -- Finally, sum this contracted vector to get the 2-dimensional result, the size of the original image.
       A.fold (A.+) 0 contracted

main :: IO ()
main = do
    let image = A.fromList (Z :. (7 :: Int) :. (8 :: Int))
                  [1, 0, 0, 0, 0, 0, 0, 1
                  ,0, 1, 0, 0, 0, 0, 1, 0
                  ,0, 0, 1, 0, 0, 1, 0, 0
                  ,0, 0, 0, 1, 1, 0, 0, 0
                  ,0, 0, 1, 0, 0, 1, 0, 0
                  ,0, 1, 0, 0, 0, 0, 1, 0
                  ,1, 0, 0, 0, 0, 0, 0, 1]
        gauss_op = A.fromList (Z :. (3 :: Int) :. (5 :: Int))
                     (map (/ 17)
                        [0, 0, 2, 0, 0
                        ,1, 3, 5, 3, 1
                        ,0, 0, 2, 0, 0])

    print $ I.runN gauss_convolution gauss_op image

WOW! Beautiful job! I need to retry it.

from accelerate.

wujilingfeng avatar wujilingfeng commented on June 24, 2024

About how to implement a convolution: I rewrote your code to the following, and this seems to work, but I don't really know much about Gaussian convolutions, so please test if this indeed does what you intended.

module Main where

import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Exp, Acc, Array, DIM2, DIM3, DIM4, Z(..), (:.)(..))
import qualified Data.Array.Accelerate.Interpreter as I


-- Note that I've removed the additional Int parameters; you can use `shape` to
-- request the size of an array.
gauss_convolution :: Acc (Array DIM2 Double) -> Acc (Array DIM2 Double) -> Acc (Array DIM2 Double)
gauss_convolution gauss_op image = compute_re gauss_op (compute_plat gauss_op image)

-- Add two dimensions on the bottom. This function replaces every cell in the
-- input image with a 2-dimensional array the size of gauss_op, thereby making
-- a 4-dimensional array.
compute_plat :: Acc (Array DIM2 Double) -> Acc (Array DIM2 Double) -> Acc (Array DIM4 Double)
compute_plat gauss_op image = A.generate (A.I4 h w n m) myfun
  where
    -- Size of the gaussian kernel. Use the pattern synonyms instead of `lift`
    -- and `unlift`; they are much nicer. Alternative notation here would be
    -- `Z_ ::. n ::. m`.
    A.I2 n m = A.shape gauss_op

    -- Size of the full image
    A.I2 h w = A.shape image

    -- Now this function is going to be called for the whole 4-dimensional
    -- array, which means that it gets the index in the image as the first two
    -- components and the index in the kernel as the last two components.
    --
    -- About yuan{x,y}:  `floor ((fromIntegral x :: Double) / 2)` is precisely
    --                   what `div` does. ;)
    --
    -- kery and kerx are what you called y1 and x1.
    myfun :: Exp DIM4 -> Exp Double
    myfun (A.I4 y x kery kerx) =
        image A.! A.I2 (A.min (h - 1) (A.max 0 (y + kery - n `A.div` 2)))
                       (A.min (w - 1) (A.max 0 (x + kerx - m `A.div` 2)))

-- Fold the 2-dimensional subarrays into a scalar again, bringing the
-- 4-dimensional array down to 2 dimensions again. The first argument is the
-- gaussian kernel.
compute_re :: Acc (Array DIM2 Double) -> Acc (Array DIM4 Double) -> Acc (Array DIM2 Double)
compute_re gauss_op pla =
    let A.I4 h w n m = A.shape pla
        -- Replicate the gaussian kernel over all positions in the image
        replicated_gauss_op :: Acc (Array DIM4 Double)
        replicated_gauss_op = A.replicate (A.I4 h w A.All_ A.All_) gauss_op
        -- Multiply the kernel elementwise with each neighbourhood in the image
        multiplied :: Acc (Array DIM4 Double)
        multiplied = A.zipWith (A.*) replicated_gauss_op pla
        -- Contract each neighbourhood (2-dimensional subarray) down to a
        -- 1-dimensional vector, so that we can fold it away in one go
        contracted :: Acc (Array DIM3 Double)
        contracted = A.reshape (A.I3 h w (n * m)) multiplied
    in -- Finally, sum this contracted vector to get the 2-dimensional result, the size of the original image.
       A.fold (A.+) 0 contracted

main :: IO ()
main = do
    let image = A.fromList (Z :. (7 :: Int) :. (8 :: Int))
                  [1, 0, 0, 0, 0, 0, 0, 1
                  ,0, 1, 0, 0, 0, 0, 1, 0
                  ,0, 0, 1, 0, 0, 1, 0, 0
                  ,0, 0, 0, 1, 1, 0, 0, 0
                  ,0, 0, 1, 0, 0, 1, 0, 0
                  ,0, 1, 0, 0, 0, 0, 1, 0
                  ,1, 0, 0, 0, 0, 0, 0, 1]
        gauss_op = A.fromList (Z :. (3 :: Int) :. (5 :: Int))
                     (map (/ 17)
                        [0, 0, 2, 0, 0
                        ,1, 3, 5, 3, 1
                        ,0, 0, 2, 0, 0])

    print $ I.runN gauss_convolution gauss_op image

I read your code, it's good job. But the process of creating a four-dimensional array consumes a lot of memory, especially when I work with large images

from accelerate.

tomsmeding avatar tomsmeding commented on June 24, 2024

But the process of creating a four-dimensional array consumes a lot of memory, especially when I work with large images

This is an understandable worry. However, have you tried it? Does it indeed consume much more memory at runtime than expected?

The reason I'm asking is that Accelerate will not actually create ("manifest") this intermediate 4-dimensional array in memory. (At least, if all is right.) The reason is array fusion: see the "Fusion" subheading in the documentation for Acc.

See the image (I hope it's readable...) for a diagram showing how this works for the program that I wrote above. In black is the original program, in red some of the fusion rules, and in blue the fused program. Note that Accelerate can actually give you the fused program: show gauss_convolution is a String that shows what it will actually compute. Notice that there is just one fold over a delayed array, which contains an expression computing each element of the three-dimensional array. (That expression is kind of unreadable, which is unfortunate, but at least you can see that there are no other array primitives left.)

gauss_convolution_fusion

from accelerate.

wujilingfeng avatar wujilingfeng commented on June 24, 2024

But the process of creating a four-dimensional array consumes a lot of memory, especially when I work with large images

This is an understandable worry. However, have you tried it? Does it indeed consume much more memory at runtime than expected?

The reason I'm asking is that Accelerate will not actually create ("manifest") this intermediate 4-dimensional array in memory. (At least, if all is right.) The reason is array fusion: see the "Fusion" subheading in the documentation for Acc.

See the image (I hope it's readable...) for a diagram showing how this works for the program that I wrote above. In black is the original program, in red some of the fusion rules, and in blue the fused program. Note that Accelerate can actually give you the fused program: show gauss_convolution is a String that shows what it will actually compute. Notice that there is just one fold over a delayed array, which contains an expression computing each element of the three-dimensional array. (That expression is kind of unreadable, which is unfortunate, but at least you can see that there are no other array primitives left.)

gauss_convolution_fusion

Thank you for your professional help.I solved it.
By the way , what functions should I use to extract some rows and columns of the matrix?
Do I need to call slice function repeatedly?

from accelerate.

tomsmeding avatar tomsmeding commented on June 24, 2024

You can indeed use calls to slice to get a few rows and columns of a matrix; does that not work?

from accelerate.

wujilingfeng avatar wujilingfeng commented on June 24, 2024

The problem is that I need to call it repeatly, I just wan to sample every other row and col .

from accelerate.

tomsmeding avatar tomsmeding commented on June 24, 2024

if you want, for example, the 0th, 2nd, 4th, 6th, row etc. of a matrix m :: Acc (Matrix a), why not use:

let I2 h w = shape m
in backpermute (I2 ((h + 1) `div` 2) w)         -- new shape: height goes 0->0, 1->1, 2->1, 3->2, 4->2, 5->3, etc.
               (\(I2 y x) -> m ! I2 (2 * y) x)  -- given index in new array, produce index in old array
               m                                -- array to read values from

Getting the 1st, 3rd, 5th, etc. rows is an exercise to the reader. :)

from accelerate.

wujilingfeng avatar wujilingfeng commented on June 24, 2024
backpermute

Thank you!
This question seems to go beyond the subject, I think it's time to finish it perfectly.
Thank you once again. I will study it.

from accelerate.

tomsmeding avatar tomsmeding commented on June 24, 2024

Hope this was useful :)

from accelerate.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.