
TensorCast.jl's Introduction

TensorCast.jl


This package lets you work with multi-dimensional arrays in index notation, by defining a few macros which translate this to broadcasting, permuting, and reducing operations.

The first is @cast, which deals both with "casting" into new shapes (including going to and from an array-of-arrays) and with broadcasting:

@cast A[row][col] := B[row, col]        # slice a matrix B into rows, also @cast A[r] := B[r,:]

@cast C[(i,j), (k,ℓ)] := D.x[i,j,k,ℓ]   # reshape a 4-tensor D.x to give a matrix

@cast E[φ,γ] = F[φ]^2 * exp(G[γ])       # broadcast E .= F.^2 .* exp.(G') into existing E

@cast _[i] := isodd(i) ? log(i) : V[i]  # broadcast a function of the index values

@cast T[x,y,n] := outer(M[:,n])[x,y]    # generalised mapslices, vector -> matrix function

Second, @reduce takes sums (or other reductions) over the indicated directions. Among such sums is matrix multiplication, which can be done more efficiently using @matmul instead:

@reduce K[_,b] := prod(a,c) L.field[a,b,c]           # product over dims=(1,3), drop dims=3

@reduce S[i] = sum(n) -P[i,n] * log(P[i,n]/Q[n])     # sum!(S, @. -P*log(P/Q')) into existing S

@matmul M[i,j] := sum(k,k′) U[i,k,k′] * V[(k,k′),j]  # matrix multiplication, plus reshape

The same notation with @cast applies a function accepting the dims keyword, without reducing:

@cast W[i,j,c,n] := cumsum(c) X[c,i,j,n]^2           # permute, broadcast, cumsum(; dims=3)

All of these are converted into array commands like reshape and permutedims and eachslice, plus a broadcasting expression if needed, and sum / sum!, or * / mul!. This package just provides a convenient notation.
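
For orientation, the first two examples above correspond roughly to the following hand-written Julia; this is a sketch of the equivalent whole-array code, not the exact expressions the macros generate:

A = collect(eachrow(B))                         # ≈ @cast A[row][col] := B[row, col]
C = reshape(D.x, size(D.x,1)*size(D.x,2), :)    # ≈ @cast C[(i,j), (k,ℓ)] := D.x[i,j,k,ℓ], index i varies fastest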

From version 0.4, it relies on TransmuteDims.jl to handle re-ordering of dimensions, and LazyStack.jl to handle slices. It should also now work with OffsetArrays.jl:

using OffsetArrays
@cast R[n,c] := n^2 + rand(3)[c]  (n in -5:5)        # arbitrary indexing

And it can be used with some packages which modify broadcasting:

using Strided, LoopVectorization, LazyArrays
@cast @strided E[φ,γ] = F[φ]^2 * exp(G[γ])           # multi-threaded
@reduce @turbo S[i] := sum(n) -P[i,n] * log(P[i,n])  # SIMD-enhanced
@reduce @lazy M[i,j] := sum(k) U[i,k] * V[j,k]       # non-materialised

It should work automatically with most array types. This includes GPU arrays such as those from CUDA.jl, whose broadcasting is executed on the device.
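
A minimal sketch of GPU use, assuming a working CUDA.jl installation (the names here are illustrative; the same pattern appears in one of the issues below):

using TensorCast, CUDA
C = cu(ones(10,2)); L = cu(ones(10,3))
@cast T[p,m,a] := C[p,a] + L[p,m]    # broadcasting runs on the device, T is a 10×3×2 CuArray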

Installation

using Pkg; Pkg.add("TensorCast")

The current version requires Julia 1.6 or later. There are a few pages of documentation.

Elsewhere

Similar notation is also used by some other packages, although all of them use an implicit sum over repeated indices. TensorOperations.jl performs Einstein-convention contractions and traces:

@tensor A[i] := B[i,j] * C[j,k] * D[k]      # matrix multiplication, A = B * C * D
@tensor D[i] := 2 * E[i] + F[i,k,k]         # partial trace of F only, Dᵢ = 2Eᵢ + Σⱼ Fᵢⱼⱼ

More general contractions are allowed by OMEinsum.jl, but only one term:

@ein Z[i,j,ξ] := X[i,k,ξ] * Y[j,k,ξ]        # batched matrix multiplication
Z = ein" ikξ,jkξ -> ijξ "(X,Y)              # numpy-style notation

Einsum.jl and Tullio.jl allow arbitrary (element-wise) functions:

@einsum S[i] := -P[i,n] * log(P[i,n]/Q[n])  # sum over n, for each i (also with @reduce above)
@einsum G[i] := 2 * E[i] + F[i,k,k]         # the sum includes everything:  Gᵢ = Σⱼ (2Eᵢ + Fᵢⱼⱼ)
@tullio Z[i,j] := abs(A[i+x, j+y] * K[x,y]) # convolution, summing over x and y

Notice that @einsum and @tullio sum the entire right hand side, like @reduce does, while @tensor sums individual terms.

These produce very different code for actually doing what you request: The macros @tensor and @ein work out a sequence of basic tensor operations (like contraction and traces), while @einsum and @tullio write the necessary set of nested loops directly (plus optimisations). This package's macros @cast, @reduce and @matmul instead write everything in terms of whole-array operations (like reshape, permutedims and broadcasting).

For those who speak Python, @cast and @reduce allow similar operations to einshape or einops (minus the cool video, but plus broadcasting). In the tests, this file translates many examples. Python's einsum is closer to OMEinsum.@ein and TensorOperations.@tensor, and to this package's @matmul.

About

This was a holiday project to learn a bit of metaprogramming, originally TensorSlice.jl. But it suffered a little scope creep.

TensorCast.jl's People

Contributors

ararslan, ericphanson, github-actions[bot], jfeist, juliatagbot, kristofferc, marwahaha, mcabbott, pallharaldsson, romeov


TensorCast.jl's Issues

Hope to relax the both-sides index check

In a specific problem, I need to use nested @cast calls, but the check that indices appear on both sides prevents this. Here is a simplified case:

using TensorCast
table=rand(2,2,2,2);
@cast testm1[i,j]:=table[i,j,1,1];
testm2 = sum(testm1,dims=1)
# 1×2 Matrix{Float64}:
#  1.15043  0.772946

@cast test[k,l]:=sum(@cast _[i,j]:=table[i,j,k,l],dims=1)
# ERROR: LoadError: index k appears only on the right
#     inner @cast _[i, j] := (table[i, j, k, l], dims) = 1

My purpose is simple: for given indices k and l, take the matrix indexed by i,j and sum it over i; doing this for every k,l then gives a new array. The first example should make the idea clear. However, @cast always checks that indices appear on both sides, so I can't do this. Maybe it would be better if this index check could be relaxed.
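
One possible workaround, sketched here (assuming the aim is simply to sum over i while keeping j, k and l): a single @reduce avoids the nested call entirely.

using TensorCast
table = rand(2,2,2,2)
@reduce test[j,k,l] := sum(i) table[i,j,k,l]    # 2×2×2, summed over i only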

Friendly request for new tag, for compat

Good day,

Would it be possible to make a new tag at present? I'm running into a difficult compat case in dev work on RoME.jl. Locally, I can work around it via TensorCast#master, which includes commit 6262005 (allowing StaticArrays v1.0).

Dependencies force me to require StaticArrays v1.0, but the compat entry in the current TensorCast v0.3.2 unfortunately does not allow it:

StaticArrays = "0.10, 0.11, 0.12"

Alternatively, maybe just a patch release branch for v0.3.3 adding 6262005 (if v0.4.0 is not ready yet)?

This is a great package btw, thanks!

Best,
Dehann


xref, JuliaRobotics/RoME.jl#424

 ERROR: Unsatisfiable requirements detected for package StaticArrays [90137ffa]:
 StaticArrays [90137ffa] log:
 ├─possible versions are: 0.8.0-1.0.1 or uninstalled
 ├─restricted by compatibility requirements with CoordinateTransformations [150eb455] to versions: 0.8.0-1.0.1
 │ └─CoordinateTransformations [150eb455] log:
 │   ├─possible versions are: 0.5.0-0.6.1 or uninstalled
 │   └─restricted to versions 0.5-0.7 by RoME [91fb55c2], leaving only versions 0.5.0-0.6.1
 │     └─RoME [91fb55c2] log:
 │       ├─possible versions are: 0.13.0 or uninstalled
 │       └─RoME [91fb55c2] is fixed to version 0.13.0
 ├─restricted by compatibility requirements with Manifolds [1cead3c2] to versions: 1.0.0-1.0.1
 │ └─Manifolds [1cead3c2] log:
 │   ├─possible versions are: 0.1.0-0.4.20 or uninstalled
 │   └─restricted to versions 0.4.19-0.4 by RoME [91fb55c2], leaving only versions 0.4.19-0.4.20
 │     └─RoME [91fb55c2] log: see above
 └─restricted by compatibility requirements with TensorCast [02d47bb6] to versions: 0.10.0-0.12.5 — no versions left
   └─TensorCast [02d47bb6] log:
     ├─possible versions are: 0.1.0-0.3.2 or uninstalled
     └─restricted to versions 0.2-0.4 by RoME [91fb55c2], leaving only versions 0.2.0-0.3.2
       └─RoME [91fb55c2] log: see above

cannot @cast on views of other @cast results

This does not work,

using TensorCast, StaticArrays, LinearAlgebra
world_basis_to_P(R,c; K=SMatrix{3,3}(one(eltype(R))I)) = K*[R' -R'*c]
R = rand(3,3,100)
X₁ = rand(3,100)
@cast P[i,j,k] := world_basis_to_P(R[:,:,k],X₁[:,k])[i,j]
HH = view(P,:,[1,2,4],:)
@cast invH[i,j,k] := inv(HH[:,:,k])[i,j]

but this does

using TensorCast, StaticArrays, LinearAlgebra
world_basis_to_P(R,c; K=SMatrix{3,3}(one(eltype(R))I)) = K*[R' -R'*c]
R = rand(3,3,100)
X₁ = rand(3,100)
@cast P[i,j,k] := world_basis_to_P(R[:,:,k],X₁[:,k])[i,j]
HH = P[:,[1,2,4],:]
@cast invH[i,j,k] := inv(HH[:,:,k])[i,j]

Smarter repeat?

Would be neat if this worked (from this thread):

julia> X = randn(1000, 1000);

julia> Y = zeros(2000, 2000);

julia> Y .= repeat(X, inner=(2,2));

julia> @cast Y[(a,b), (c,d)] = X[b, d];
ERROR: LoadError: index a appears only on the left
    @cast Y[(a, b), (c, d)] = X[b, d]  
    @ Main REPL[51]:1
Stacktrace:
 [1] checkallseen(canon::Vector{Any}, store::NamedTuple{(:dict, :assert, :mustassert, :seen, :need, :top, :main), Tuple{Dict{Any, Any}, Vector{Any}, Vector{Any}, Vector{Any}, Vector{Any}, Vector{Any}, Vector{Any}}}, call::TensorCast.CallInfo)
   @ TensorCast ~/.julia/packages/TensorCast/eabry/src/macro.jl:1466

julia> @cast Y[(a,b), (c,d)] = X[b, d]  a in 1:2, c in 1:2;
ERROR: LoadError: index a appears only on the left

julia> @cast Y[(a,b), (c,d)] := X[b, d]  a in 1:2, c in 1:2;
ERROR: LoadError: index a appears only on the left

Use vector of indexes

I wanted to reduce into a 2-vector by using the indexes stored in ixs. Currently it seems to ignore ixs. What do you think about this idea?

using TensorCast, Statistics
vs = [[10,20,30,40,50,60] .+ rand(6) for _ in 1:10]
ixs = [1,2,1,2,1,2]
@reduce v[ixs[j]] := mean(i) vs[i][j]

@cast on functions with tuples as lvalues

Is there a way to use TensorCast for functions that return tuples? E.g.,

using LinearAlgebra

function axis_to_random_rotation3(x::AbstractVector)
    x′ = similar(x)
    x′ .= rand(eltype(x),3)
    y = cross(x,x′)
    z = cross(x,y)
    R = [x y z]
    R .= R./norm.(eachcol(R))'

    return R
end

renorm!(n) = n ./= norm(n)

@views function plane_to_random_pose(Π)
    renorm!(Π)
    R = axis_to_random_rotation3(Π[1:3])
    t = -Π[4].*Π[1:3]
    
    return R, t
end

Here I want to use TensorCast to shape the returned poses from a set of planes, e.g.

Π = rand(4,100)
@cast something something  := plane_to_random_pose(Π[:,j])[i]
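
One possible workaround, sketched here with illustrative names: collect the tuples with an ordinary comprehension, split them, and then let @cast stack each part.

poses = [plane_to_random_pose(Π[:, j]) for j in axes(Π, 2)]   # Vector of (R, t) tuples
Rs, ts = first.(poses), last.(poses)
@cast Rstack[i,k,j] := Rs[j][i,k]    # 3×3×100
@cast tstack[i,j] := ts[j][i]        # 3×100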

Slightly misleading error (DimensionMismatch) when destination is actually not defined

Hi @mcabbott! I found the following error (= instead of :=) to be rather misleading. Instead of a nonexistent destination, the reported error is a DimensionMismatch:

using TensorCast

V = [10, 20, 30]
M = collect(reshape(1:12, 3, 4))

@cast A[j, i] = M[i, j] + 10 * V[i]
# => ERROR: DimensionMismatch("array could not be broadcast to match destination")

@cast A[j, i] := M[i, j] + 10 * V[i]
# => 4×3 Array{Int64,2}:
#     101  202  303
#     104  205  306
#     107  208  309
#     110  211  312

Do you think a different error could be thrown in this case?

EDIT:

(@v1.4) pkg> status TensorCast
Status `~/.julia/environments/v1.4/Project.toml`
  [02d47bb6] TensorCast v0.3.2

bugs with OffsetArrays

Compare

using OffsetArrays, LinearAlgebra, TensorCast

A = OffsetArray(rand(3, 3, 5, 5), 1:3, 1:3, -2:2, -2:2)
B = OffsetArray(rand(3,    5)   , 1:3,      -2:2)
δ1 = I(3)
δ2 = OffsetArray(I(5), -2:2, -2:2)

Ap = A.parent
Bp = B.parent
δ2p = δ2.parent

@cast Cp[a, b, c, d] := Ap[a, b, c, d] + Bp[a, c] * δ1[a, b] * δ2p[c, d]
@cast C[ a, b, c, d] := A[ a, b, c, d] + B[ a, c] * δ1[a, b] * δ2[ c, d]  # errors due to mistaken bounds checking

Slices and CuArrays

@MasonProtter points out that, somewhat to my surprise, mapslices-like things work on the GPU:

julia> using Zygote, TensorCast, CuArrays, BenchmarkTools
julia> CuArrays.allowscalar(false)
julia> let X = randn(10, 10, 10, 10, 10)
           Xcu = cu(X)
           f(A::AbstractArray{T, 4}) where {T} = reshape(sum(A, dims=(3,4)), size(A,1), size(A, 2))
           g(X) = @cast y[i, j, k] := f(X[:, :, :, :, k])[i, j]
           h_cu(a) = sum(g(a * Xcu))
           h_cpu(a)= sum(g(a * X))
           @show length(X)
           @btime $h_cu'(1)
           @btime $h_cpu'(1)
       end
length(X) = 100000
  1.660 ms (4676 allocations: 200.19 KiB)
  747.911 μs (374 allocations: 3.84 MiB)
351.3479618616162

julia> @pretty @cast y[i, j, k] := f(X[:, :, :, :, k])[i, j]
begin
    local armadillo = sliceview(X, (:, :, :, :, *))
    local porpoise = @__dot__(f(armadillo))
    local jaguar = red_glue(porpoise, (:, :, *))
    y = jaguar
end

It would be nice to (1) make sure this doesn't break, e.g. with #17, (2) understand which cases work or whether some don't, and (3) ideally make it fast?

support multiple return values?

I'm trying to slice-broadcast a function that has multiple return values. Ideally I'd collect them into separate arrays. Is this possible with TensorCast?

Here's a MWE of basically what I'm trying to do:

x = randn(10, 20)
@cast v[k], i[k] := findmax(x[:, k]) 

But this throws the error:

LoadError: don't know what to do with left = (v[k], i[k])

Which is a great error message that leads me to believe I can't do this kind of thing, but I wanted to check.

In this case it's probably not too bad to just collect into a single Vector{Tuple{Float64, Int}} and split it afterwards, but my actual example has some more complicated indexing going on that would benefit from some TensorCast magic.
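
A workaround along exactly those lines, sketched outside TensorCast (names are illustrative):

vi = findmax.(eachcol(x))    # Vector{Tuple{Float64, Int}}
v = first.(vi)
i = last.(vi)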

Inference failure in broadcasting

Related to mcabbott/TransmuteDims.jl#25 I think:

julia> using TensorCast

julia> fun5(G, E, x, h, μ=0.1, σ=0.1) = 
           @reduce out[j,b] := sum(i) G[i,j] * exp(((x[i,b]) - μ) * σ) * (E[i,j] - h[j,b])
fun5 (generic function with 3 methods)

julia> begin
         G = rand(3,3)
         E = rand(3,3)
         h = rand(3,10)
         x = rand(3,10)
       end;

julia> fun5(G, E, x, h, 0.2, 0.2)
3×10 Matrix{Float64}:
  0.27324    -0.695186  -0.467073  -0.105391     0.130849  -0.0926917  -0.505134  0.244531
 -0.0923791  -0.526902  -1.01656    0.562381     -0.331856  -0.34439     1.21358   0.118339
 -0.270235    0.133173   0.474618  -0.116692      0.667222  -0.531803   -0.862496  0.58566

julia> @code_warntype fun5(G, E, x, h, 0.2, 0.2)
Variables
  #self#::Core.Const(fun5)
  G::Matrix{Float64}
  E::Matrix{Float64}
  x::Matrix{Float64}
  h::Matrix{Float64}
  μ::Float64
  σ::Float64
  nobroadcast#271::Union{Array{Float64, 3}, TransmuteDims.TransmutedDimsArray{Float64, 3, (0, 1, 2), (2, 3), Matrix{Float64}}}
  nobroadcast#270::Union{Array{Float64, 3}, TransmuteDims.TransmutedDimsArray{Float64, 3, (1, 0, 2), (1, 3), Matrix{Float64}}}
  out::Any

Body::Any
1 ──        Core.NewvarNode(:(nobroadcast#271))
│           Core.NewvarNode(:(nobroadcast#270))
│           Core.NewvarNode(:(out))
└───        goto #4 if not $(Expr(:boundscheck))
...
32 ─        Core.Const(:(Base.throw))
│           Core.Const(:(Main.ArgumentError("expected a 2-tensor h[j, b]")))
└───        Core.Const(:((%92)(%93)))
33 ─ %95  = TensorCast.transmute::Core.Const(TransmuteDims.transmute)
│           (nobroadcast#270 = (%95)(x, (1, nothing, 2)))
│    %97  = TensorCast.transmute::Core.Const(TransmuteDims.transmute)
│           (nobroadcast#271 = (%97)(h, (nothing, 1, 2)))
│    %99  = (:dims,)::Core.Const((:dims,))
│    %100 = Core.apply_type(Core.NamedTuple, %99)::Core.Const(NamedTuple{(:dims,), T} where T<:Tuple)
│    %101 = Core.tuple(1)::Core.Const((1,))
│    %102 = (%100)(%101)::Core.Const((dims = 1,))
│    %103 = Core.kwfunc(Main.sum)::Core.Const(Base.var"#sum##kw"())
│    %104 = Base.broadcasted(Main.:-, nobroadcast#270, μ)::Union{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(-), Tuple{Array{Float64, 3}, Float64}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(-), Tuple{TransmuteDims.TransmutedDimsArray{Float64, 3, (1, 0, 2), (1, 3), Matrix{Float64}}, Float64}}}
│    %105 = Base.broadcasted(Main.:*, %104, σ)::Union{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(*), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(-), Tuple{Array{Float64, 3}, Float64}}, Float64}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(*), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(-), Tuple{TransmuteDims.TransmutedDimsArray{Float64, 3, (1, 0, 2), (1, 3), Matrix{Float64}}, Float64}}, Float64}}}
│    %106 = Base.broadcasted(Main.exp, %105)::Union{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(exp), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(*), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(-), Tuple{Array{Float64, 3}, Float64}}, Float64}}}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(exp), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(*), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(-), Tuple{TransmuteDims.TransmutedDimsArray{Float64, 3, (1, 0, 2), (1, 3), Matrix{Float64}}, Float64}}, Float64}}}}}
│    %107 = Base.broadcasted(Main.:-, E, nobroadcast#271)::Union{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(-), Tuple{Matrix{Float64}, Array{Float64, 3}}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(-), Tuple{Matrix{Float64}, TransmuteDims.TransmutedDimsArray{Float64, 3, (0, 1, 2), (2, 3), Matrix{Float64}}}}}
│    %108 = Base.broadcasted(Main.:*, G, %106, %107)::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3}, Nothing, typeof(*), Args} where Args<:Tuple
│    %109 = Base.materialize(%108)::Any
│    %110 = (%103)(%102, Main.sum, %109)::Any
│    %111 = (:dims,)::Core.Const((:dims,))
│    %112 = Core.apply_type(Core.NamedTuple, %111)::Core.Const(NamedTuple{(:dims,), T} where T<:Tuple)
│    %113 = Core.tuple(1)::Core.Const((1,))
│    %114 = (%112)(%113)::Core.Const((dims = 1,))
│    %115 = Base.dropdims::Core.Const(dropdims)
│    %116 = Core.kwfunc(%115)::Core.Const(Base.var"#dropdims##kw"())
│    %117 = Base.dropdims::Core.Const(dropdims)
│    %118 = (%116)(%114, %117, %110)::Any
│           (out = %118)
└───        return %118

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Macro hygiene of `TensorCast` symbol

Seems like there is a macro hygiene issue for complex tensor casts:

julia> using TensorCast: @cast

julia> X = ones(2, 3); Y = ones(5);

julia> @cast Z[i, j, k] := X[i, j] * Y[k];
ERROR: UndefVarError: `TensorCast` not defined
Stacktrace:
 [1] top-level scope
   @ ~/.julia/packages/TensorCast/gBujg/src/macro.jl:209

@cast works for hcat but not for [ ]

This works,

@cast P[i,j,k] := hcat(R[:,:,k],X₁[:,k])[i,j]

but this does not

@cast P[i,j,k] := [R[:,:,k] X₁[:,k]][i,j]

It gives the error,

DimensionMismatch: stack expects uniform slices, got axes(x) == (1:3,) while first had (1:3, 1:4)

Should it? Thanks!

Performance of nested reductions

What can be done to improve TensorCast's performance on the following nested reductions:

using TensorCast, BenchmarkTools

M = [i+j for i=1:4, j=0:4:12]
B = [M[i:i+1, j:j+1] for i in 1:2:size(M,1), j in 1:2:size(M,2)]

M2 = reduce(hcat, reduce.(vcat, eachcol(B)))
@cast M3[ik,jl] |= B[k,l][i,j]                      # \otimes<tab>;  

M == M2 == M3   # true

@btime M2 = reduce(hcat, reduce.(vcat, eachcol($B)))    #  392 ns (4 allocs: 512 bytes)
@btime @cast M3[ik,jl] |= $B[k,l][i,j]              # 1.250 μs (15 allocs: 640 bytes)

Cheers.

array arguments in @cast indexing

Here is a short example (Julia 1.9.2)

bsz = (8, 8);

# this doesn't work:
@cast b[(b1,b2), t, h,w] := img[(b1, h), (b2, w), t] (b1 in 1:bsz[1], b2 in 1:bsz[2]);

# this works:
bsz1 = bsz[1];
bsz2 = bsz[2];
@cast b[(b1,b2), t, h,w] := img[(b1, h), (b2, w), t] (b1 in 1:bsz1, b2 in 1:bsz2);

Is this expected behavior? I'm just learning Julia and couldn't find an explanation in the docs or other issues.

Repeat example from the docs is broken

julia> M = Array(reshape(1:12, 3,4))
3×4 Matrix{Int64}:
 1  4  7  10
 2  5  8  11
 3  6  9  12

julia> @cast R[r,(n,c)] := M[r,c]^2  (n in 1:3)
ERROR: LoadError: index n appears only on the left

julia> V = [10,20,30];

julia> @cast _[i,j] := V[i]   (j in 1:4)  # simpler version
ERROR: LoadError: index j appears only on the left

julia> @cast _[i,j] := V[i] + 0j   (j in 1:4)  # this works!
3×4 Matrix{Int64}:
 10  10  10  10
 20  20  20  20
 30  30  30  30

Issues with Strided

First, the current tagged version has an issue with strided:

julia> using Strided, TensorCast

julia> A = rand(3,3); B = rand(3,5);

julia> @reduce C[i, j] := sum(l) A[i, l] * B[l, j] strided
ERROR: UndefVarError: strided_permutedims not defined
Stacktrace:
 [1] top-level scope at none:0

but that is fixed by moving to master, so I'd recommend tagging a new release.

Second, using |= with strided returns a view instead of collecting to an array.

julia> @reduce C[i, j] |= sum(l) A[i, l] * B[l, j] strided
3×5 reshape(::StridedView{Float64,3,Array{Float64,1},typeof(identity)}, 3, 5) with eltype Float64:
 0.308848  0.406922  0.604683  0.240265  0.262697
 0.229795  0.423606  0.431261  0.454207  0.633704
 0.28358   0.41366   0.60017   0.28484   0.274456

Thanks for all your work on this wonderful package!

Add singleton dimensions

First off thanks for this package it works great! It would be nice if you could also add singleton dimensions, maybe using syntax similar to einops e.g:

A = rand(5,5,5)

@cast B[a,b,(),c] := A[a,b,c]

size(B) # (5,5,1,5)
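
As a stopgap, a plain reshape produces the same result today (a sketch; the einops-style () syntax above would still be nicer):

B = reshape(A, size(A,1), size(A,2), 1, size(A,3))
size(B)  # (5, 5, 1, 5)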

World Age Issues with TensorCast calls from PyJulia

I am getting a world-age bug when running TensorCast through PyJulia, but not when using it normally.

If I run an example function such as

function testfunc()
    V = [10,20,30]
    M = collect(reshape(1:12, 3,4))
    @cast A[j,i] := M[i,j] + 10 * V[i]
    return A
end

it works when I call it in Julia via

A = testfunc()

However, when I run the Python script

from julia.api import Julia
jl = Julia(compiled_modules=False)
from julia import Main
Main.include("testfunction.jl")
from julia.Main import testfunc

A = testfunc()

I get the error

<PyCall.jlwrap (in a Julia function called from Python)
JULIA: MethodError: no method matching _trex(::Symbol, ::Type{Matrix{Int64}}, ::Tuple{Int64, Int64})
The applicable method may be too new: running in world age 32486, while current world is 38140.
Closest candidates are:
  _trex(::Any, ::Any, ::Any) at /home/andrewhardy/.julia/packages/TransmuteDims/mGkQB/src/TransmuteDims.jl:372 (method too new to be called from this world context.)

which only occurs for the @cast call. What can I do?

CUDA array scalar getindex error

The following:

using TensorCast, CUDA; CUDA.allowscalar(false)
C = cu(ones(10,2))
L = cu(ones(10,3))
@reduce D[m,a] := sum(p) C[p,a] + L[p,m]

gives a scalar getindex is disallowed error. But using @cast as an intermediate step or re-ordering indices both work fine:

@cast T[p,m,a] := C[p,a] + L[p,m]
D = reshape(sum(T, dims=1), (3,2))

or

C = cu(ones(2,10))
L = cu(ones(3,10))
@reduce D[m,a] := sum(p) C[a,p] + L[m,p]

both produce

3×2 CuArray{Float32,2,CuArray{Float32,3,Nothing}}:
 20.0  20.0
 20.0  20.0
 20.0  20.0

Question was initially raised here: TensorCast & CUDA

Extra singleton dimension appears during casting

using TensorCast, EllipsisNotation

@views project(x) = x ./ x[[end],..]
@views project!(x) = @. x = x / x[[end],..]
euclideanize(x) = project(x)[1:end-1,..]

P = rand(3,4,100)   
@cast u[i,k] := euclideanize(P[:,4,:])[i,k]

The result should be 2-dimensional, but it is 3-dimensional with a singleton dimension added. However, this works as expected:

@cast u[i,k] := Affine.euclideanize(view(P, :, 4, :))[i,k]

Error in gradient of stacked vectors

This should work:

julia> using TensorCast, Zygote

julia> f1(x,y) = [x^2, x*y, y^2];

julia> x = 1:2; y = 1:4;

julia> @cast out[i,j,k] := f1(x[i], y[j])[k]
2×4×3 transmute(stack(::Matrix{Vector{Int64}}), (2, 3, 1)) with eltype Int64:
[:, :, 1] =
 1  1  1  1
 4  4  4  4

[:, :, 2] =
 1  2  3  4
 2  4  6  8

[:, :, 3] =
 1  4  9  16
 1  4  9  16

julia> gradient(x -> sum(@cast out[i,j,k] := f1(x[i], y[j])[k]), 1:2)
ERROR: BoundsError: attempt to access 3×2×4 transmute(::FillArrays.Fill{Int64, 3, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}, Base.OneTo{Int64}}}, (3, 1, 2)) with eltype Int64 at index [1:3, 1]

Allow naked indices?

Sometimes it is useful to directly use the index value on the right hand side. At the moment this can be done by explicitly indexing like axes(x,1)[i], but it would be nice if the index i used alone could be understood:

julia> using TensorCast, Tullio

julia> x = ones(3, 5);

julia> xi, xj = axes(x);

julia> @cast out[i,j] := xi[i] + log(x[i,j] + xj[j])  # @. xi + log(x + xj')
3×5 Matrix{Float64}:
 1.69315  2.09861  2.38629  2.60944  2.79176
 2.69315  3.09861  3.38629  3.60944  3.79176
 3.69315  4.09861  4.38629  4.60944  4.79176

julia> @tullio out[i,j] := i + log(x[i,j] + j)  # loops not broadcasting
3×5 Matrix{Float64}:
 1.69315  2.09861  2.38629  2.60944  2.79176
 2.69315  3.09861  3.38629  3.60944  3.79176
 3.69315  4.09861  4.38629  4.60944  4.79176
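
Note that the README example near the top, @cast _[i] := isodd(i) ? log(i) : V[i], suggests that current versions do understand a bare index on the right; assuming that applies here, the first example could be written directly as:

@cast out[i,j] := i + log(x[i,j] + j)    # assuming bare indices are accepted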

Support for AxisKeys

I would like to keep track of dimensions with AxisKeys through my TensorCast calls. Would that be possible?

let a = KeyedArray(reshape(1:6, 2, 3); time=1:2, place=1:3)
    # The ticks are \prime not single-quote.
    @cast b[time=i, time′=i′, place=j, place′=j′] := a[time=i, place=j] + a[time=i′, place=j′]
end

scalar getindex when shared index changes order

I was asked to make this an issue, but I'm not sure there's anything to be done about it.

TensorCast uses scalar indexing when a shared index changes position, which is nightmarishly slow for big calculations on the GPU.

using TensorCast
using CUDA
CUDA.allowscalar(false)

a, b, c, d = 1, 2, 3, 4
A = cu(rand(a,b,c))
B = cu(rand(a,c,d))

# Fast
@reduce C[a,b,d] := sum(c) A[a,b,c] * B[a,c,d]

# Fast
@reduce C[a,d,b] := sum(c) A[a,b,c] * B[a,c,d]

# Not fast
@reduce C[b,a,d] := sum(c) A[a,b,c] * B[a,c,d]

# Fast
@reduce C[b,c,d] := sum(a) A[a,b,c] * B[a,c,d]

# Not fast
@reduce C[c,b,d] := sum(a) A[a,b,c] * B[a,c,d]
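
One possible workaround, sketched here (Ctmp is an illustrative name): compute the reduction in one of the fast index orders above, then permute the already-reduced result, which is a single permutedims on a much smaller array and stays on the device.

@reduce Ctmp[a,b,d] := sum(c) A[a,b,c] * B[a,c,d]   # a fast ordering from above
C = permutedims(Ctmp, (2, 1, 3))                    # b,a,d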

Error in summation within @reduce

There are some issues when using @reduce for summation. Here, when I define a matrix, TensorCast works well:

julia> using TensorCast

julia> A=rand(4,4);

julia> @reduce _[]:=sum(i,j) A[i,j]
0-dimensional Array{Float64, 0}:
8.519131257815888

julia> @reduce _[]:=sum(i) A[i,i]
0-dimensional Array{Float64, 0}:
1.6983845333674332

However, if I now define a 4th-order array, something goes wrong:

julia> B=rand(4,4,4,4);

julia> @reduce _[]:=sum(i) B[i,i,i,i,]
ERROR: LoadError: index i repeated in [i, i, i, i]

julia>  @reduce _[]:=sum(i,j) B[i,j,i,j]
ERROR: LoadError: index i repeated in [i, j, i, j]
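
For diagonal sums like these, a plain generator expression is one workaround, since @reduce rejects repeated indices (a sketch):

s1 = sum(B[i,i,i,i] for i in axes(B,1))
s2 = sum(B[i,j,i,j] for i in axes(B,1), j in axes(B,2))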

Error when calculating gradients with Zygote

I get the following error when I try to calculate gradients:

ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations do not execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use allowscalar or @allowscalar
to enable scalar iteration globally or for the operations in question.
Stacktrace:

using TensorCast
using Flux, AMDGPU, OMEinsum

struct LayerNorm{T<:AbstractArray}
    g::T
end

function LayerNorm(dim::Int)
    g = ones(Float32, (1, 1, dim, 1))
    LayerNorm(g)
end

Flux.@layer :expand LayerNorm

function (ln::LayerNorm)(x)
    dims = 3
    eps::Float32 = 1e-5
    μ = mean(x; dims=dims)
    σ² = var(x; dims=dims)
    x_normalized = (x .- μ) .* sqrt.((eps .+ σ²))
    return x_normalized .* ln.g
end


struct LinearAttention{T<:Chain,S<:Conv}
    scale::Float32
    heads::Int
    to_qkv::S
    to_out::T
end


function LinearAttention(dim::Int; heads::Int=4, dim_head::Int=32)
    scale::Float32 = dim_head^-0.5
    hidden_dim = dim_head * heads
    to_qkv = Conv((1, 1), dim => hidden_dim * 3, bias=false)
    to_out = Chain(
        Conv((1, 1), hidden_dim => dim),
        LayerNorm(dim)  # Assuming a LayerNorm implementation as earlier
    )
    return LinearAttention(scale, heads, to_qkv, to_out)
end

Flux.@layer :expand LinearAttention
using Statistics: mean, var
function (la::LinearAttention)(x::ROCArray)
    h, w, c, b = size(x)
    qkv = Flux.chunk(la.to_qkv(x), 3; dims=3)
    q, k, v = qkv
    q, k, v = q[:, :, :, :], k[:, :, :, :], v[:, :, :, :]
    @cast q[c, (x, y), h, b] |= q[x, y, (c, h), b] h in 1:la.heads
    @cast k[c, (x, y), h, b] |= k[x, y, (c, h), b] h in 1:la.heads
    @cast v[c, (x, y), h, b] |= v[x, y, (c, h), b] h in 1:la.heads
    println(typeof(q))

    q = softmax(q, dims=1)
    k = softmax(k, dims=2)

    q *= la.scale

    v /= (h * w)
    println("typeof of k", typeof(k))
    println("typeof v", typeof(v))


    context = ein"dnhb,enhb->dehb"(k, v)
    println(typeof(context))

    # println("context: ", size(context))
    out = ein"dehb,dnhb->enhb"(context, q)
    println(typeof(out))
    # println("out: ", size(out))

    @cast out[x, y, (c, h), b] |= out[c, (x, y), h, b] (h in 1:la.heads, x in 1:h, y in 1:w)
    println(typeof(out))
    return la.to_out(out)
end


x = AMDGPU.randn(256, 256, 64, 8)
loss, grad = Flux.withgradient(layer) do l
    a = layer(x)
    sum(a)
end        # this doesn't work
layer(x)   # this works

The type of q is ROCArray during the forward pass, but when calculating gradients it becomes

Base.ReshapedArray{Float32, 4, TransmuteDims.TransmutedDimsArray{Float32, 5, (3, 1, 2, 4, 5), (2, 3, 1, 4, 5), ROCArray{Float32, 5, AMDGPU.Runtime.Mem.HIPBuffer}}, NTuple{4, Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}

Optimization opportunity?

I don't know how common this is, but the kernel that showed up in this thread is diag(A'B), which in TensorCast language would be

@reduce C[i] := sum(j) A[j, i] * B[j, i]

is currently slower than

dot.(eachcol(A), eachcol(B))

(however, we beat that approach if the kernel were diag(A*B) instead, since eachrow is slow)

I think it would be pretty easy to detect this kernel and make it lower to dot.(eachcol(A), eachcol(B)).

Ternary operator not understood

The following won't work:

julia> using TensorCast
[ Info: Precompiling TensorCast [02d47bb6-7ce6-556a-be16-bb1710789e2b]

julia> x = rand(5, 5)
5×5 Array{Float64,2}:
 0.26903    0.601561  0.148919   0.567794   0.287378
 0.0263811  0.899677  0.631467   0.312277   0.927105
 0.932034   0.827717  0.772556   0.7509     0.669956
 0.75523    0.903847  0.0268497  0.964181   0.315231
 0.665162   0.367656  0.9304     0.0565779  0.776945

julia> y = zeros(5, 5)
5×5 Array{Float64,2}:
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0

julia> z = ones(5, 5)
5×5 Array{Float64,2}:
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0

julia> @cast q[i, j] := (x[i, j] > 0.5) ? z[i, j] : y[i, j]
ERROR: TypeError: non-boolean (BitArray{2}) used in boolean context
Stacktrace:
 [1] top-level scope at /Users/darsnack/.julia/packages/TensorCast/rCpV9/src/macro.jl:214

For this operation, @cast is not really necessary. In practice, my x, y, and z are some sequence of operations where @cast is nice to use. I could break up the computation and use map, but it would be nice to consistently use TensorCast.jl in my code.
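
A possible workaround, sketched here: ifelse works elementwise under broadcasting where the ? : operator may not, so either of these should do it.

@cast q[i,j] := ifelse(x[i,j] > 0.5, z[i,j], y[i,j])   # assuming @cast broadcasts the comparison too
q = ifelse.(x .> 0.5, z, y)                            # plain broadcasting equivalent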

Concatenation / forced indexing

Hi @mcabbott,
Really enjoying this package, thanks for making it.

I was thinking about a more intuitive way of doing concatenation in higher dimensions (with differently-shaped arrays), and I wondered if the following trick with TensorCast.jl would work:

X = randn(5, 100)
y = randn(100)

@cast data[i, j] := (1 <= i <= 5) ? X[i, j] : y[j] (i in 1:6)

Essentially what I am attempting to do here is creating a new array of shape (6, 100), where the first 5 rows are from X, and the last row is from y. I see the following error:

ERROR: DimensionMismatch: range of index i must agree
Stacktrace:
 [1] top-level scope
   @ ~/.julia/packages/TensorCast/mQB8h/src/macro.jl:209

I know that the range of indices is usually inferred from arrays, but I thought: perhaps if I pass the range explicitly like this (i in 1:6), it would ignore the inferred range.

Is this syntax for concatenation possible in any way, or does it break key assumptions in the macro?

Thanks!
Miles
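
For this particular shape, plain concatenation already does it (a sketch; data is an illustrative name):

data = vcat(X, y')    # 6×100, rows 1:5 from X, last row from y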

Expr -> Symbol MethodError when combining mapslices and reshapes

I'm trying to write an absolute minimal amount of code to implement a multi-headed self-attention layer. I want to try to do this with TensorCast.jl, both to learn the syntax better, and perhaps as a nice demo of ML in Julia.

Right now I am trying to compute the query matrix in one go. This works:

using Flux, TensorCast

batch = 4
length = 100
m = 32

# Data:
x = randn(Float32, length, m, batch)

heads = 10

# Layer to compute Q for all heads:
Q = Dense(m, m*heads)

# Computation:
@cast q1[ℓ,ch,n] := Q(x[ℓ,:,n])[ch]
@cast q2[ℓ,c,h,n] := q1[ℓ,(c,h),n] (h in 1:heads)

However, if I try to combine them in one go:

@cast q2[ℓ,c,h,n] := Q(x[ℓ,:,n])[(c,h)] (h in 1:heads)

it gives me the error:

ERROR: LoadError: MethodError: Cannot `convert` an object of type Expr to an object of type Symbol

Closest candidates are:
  convert(::Type{T}, ::T) where T
   @ Base Base.jl:64
  Symbol(::Any...)
   @ Base strings/basic.jl:229

Stacktrace:
  [1] setindex!(h::Dict{Symbol, Nothing}, v0::Nothing, key0::Expr)
    @ Base ./dict.jl:361
  [2] push!(s::Set{Symbol}, x::Expr)
    @ Base ./set.jl:103
  [3] checknorepeats(flat::Vector{Any}, call::TensorCast.CallInfo, msg::String)
    @ TensorCast ~/.julia/packages/TensorCast/mQB8h/src/macro.jl:1484
  [4] standardglue(ex::Any, target::Vector{Any}, store::NamedTuple{(:dict, :assert, :mustassert, :seen, :need, :top, :main), Tuple{Dict{Any, Any}, Vararg{Vector{Any}, 6}}}, call::TensorCast.CallInfo)
    @ TensorCast ~/.julia/packages/TensorCast/mQB8h/src/macro.jl:420
  [5] (::TensorCast.var"#3#5"{TensorCast.CallInfo, NamedTuple{(:dict, :assert, :mustassert, :seen, :need, :top, :main), Tuple{Dict{Any, Any}, Vararg{Vector{Any}, 6}}}})(x::Expr)
    @ TensorCast ~/.julia/packages/TensorCast/mQB8h/src/macro.jl:189
  [6] walk(x::Expr, inner::Function, outer::TensorCast.var"#3#5"{TensorCast.CallInfo, NamedTuple{(:dict, :assert, :mustassert, :seen, :need, :top, :main), Tuple{Dict{Any, Any}, Vararg{Vector{Any}, 6}}}})
    @ MacroTools ~/.julia/packages/MacroTools/qijNY/src/utils.jl:112
  [7] postwalk(f::Function, x::Expr)
    @ MacroTools ~/.julia/packages/MacroTools/qijNY/src/utils.jl:122
  [8] _macro(exone::Expr, extwo::Expr, exthree::Nothing; call::TensorCast.CallInfo, dict::Dict{Any, Any})
    @ TensorCast ~/.julia/packages/TensorCast/mQB8h/src/macro.jl:189
  [9] _macro
    @ ~/.julia/packages/TensorCast/mQB8h/src/macro.jl:154 [inlined]
 [10] var"@cast"(__source__::LineNumberNode, __module__::Module, exs::Vararg{Any})
    @ TensorCast ~/.julia/packages/TensorCast/mQB8h/src/macro.jl:74

Is this sort of thing possible? Or maybe it's too tricky to stack notation like this?

Thanks,
Miles

@mul gives Array{TrackedReal} instead of TrackedArray

Similar to mcabbott/SliceMap.jl#3

The code below produces an array of tracked reals instead of a tracked array. If the arrays are CuArrays, then the code fails since there is no constructor for CuArray(::Array{TrackedReal})

julia> Zd = TrackedArray(randn(4,3,2)); M = TrackedArray(randn(3,2));

julia> @mul E[a,d] := Zd[a,b,d]*M[b,d]
4×2 Array{Tracker.TrackedReal{Float64},2}:
  2.1719     0.248542
  2.50329   -0.100249
 -0.90269   -0.509704
 -0.290149   1.33654 

Exploit `einops` for docs / tests / advertising

When last I checked this package did everything they do. Would be nice to

  • See if they have weird edge cases which make good tests
  • Translate some of their tutorials, either just as a source of structure / pretty pictures, or as an explicit guide to learning X if you know Y.

Xref #46 (comment)

Their docs: https://einops.rocks/#tutorials

Their paper: https://openreview.net/pdf?id=oapKSVM2bcj

Old notebook here: https://github.com/mcabbott/TensorCast.jl/blob/master/docs/einops.ipynb
