
Comments (10)

wsmoses commented on June 26, 2024

@ZuseZ4 no, gold is fine here.

@brian-kelley can you add -g so we can look at a backtrace of where the error comes from? It probably makes sense to mark some mutex or something as inactive there.

Also, FYI, Enzyme does differentiate through AMD GPUs as well, per your earlier comment.

from enzyme.

brian-kelley commented on June 26, 2024

@ZuseZ4 Thanks for the tip. I started with that example CMakeLists.txt, and made Kokkos a subdirectory. I do get through everything until the final executable link:

(from verbose makefile)
clang++ -O0 -stdlib=libc++ -Wl,-rpath,blah/lib/x86_64-unknown-linux-gnu -fuse-ld=lld -Wl,-mllvm -Wl,-load=/blah/enzyme-install/lib/LLDEnzyme-16.so -Wl,--load-pass-plugin=/blah/enzyme-install/lib/LLDEnzyme-16.so -DKOKKOS_DEPENDENCE CMakeFiles/myProgram.dir/myProgram.cpp.o -o myProgram  kokkos/containers/src/libkokkoscontainers.a kokkos/core/src/libkokkoscore.a -ldl kokkos/simd/src/libkokkossimd.a

ld.lld: error: <unknown>:0:0: in function main i32 (i32, ptr): Enzyme: Cannot cast __enzyme_autodiff primal argument 1, found i32 0, type i32 - to arg 0 ptr

clang-16: error: linker command failed with exit code 1 (use -v to see invocation)

I assume the primal argument 1 is talking about v above? Can I tweak my declaration of __enzyme_autodiff or nrm2 to work around this? If it helps, the same error happens with the non-parallel version of nrm2 uncommented, which worked under ClangEnzyme.


wsmoses commented on June 26, 2024

You probably want to declare enzyme_dup as extern, per that error.

I will forewarn that LTO comes with various compile-time implications.

If it works now, that's a great starting point, but we'll probably want to add some attributes directly for ease of Kokkos users.
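A minimal sketch of what the extern suggestion might look like, assuming the "found i32 0" in the error came from enzyme_dup being defined (and so zero-initialized) rather than only declared. The nrm2 here is a serial stand-in for the function in the thread, nrm2_gradient is a hypothetical wrapper, and SKETCH_USE_ENZYME is a hypothetical macro that you would define only when building with the Enzyme plugin. Without it, the sketch falls back to the analytic gradient so it still compiles and runs with a plain compiler.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Serial stand-in for the thread's nrm2: Euclidean norm of v[0..n).
double nrm2(const double* v, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += v[i] * v[i];
    return std::sqrt(sum);
}

#ifdef SKETCH_USE_ENZYME  // hypothetical macro, not part of Enzyme
// Declared extern and never defined here, so the call site passes a
// reference to the global instead of a folded constant such as i32 0.
extern int enzyme_dup;
extern int enzyme_const;
void __enzyme_autodiff(void*, ...);
#endif

// Accumulates d(nrm2)/dv into dv; the caller zero-initializes dv.
void nrm2_gradient(const double* v, double* dv, std::size_t n) {
#ifdef SKETCH_USE_ENZYME
    // v is duplicated (primal plus shadow dv); n is marked constant.
    __enzyme_autodiff((void*)nrm2, enzyme_dup, v, dv, enzyme_const, n);
#else
    // Analytic fallback for builds without Enzyme: d||v||/dv_i = v_i / ||v||.
    const double norm = nrm2(v, n);
    assert(norm > 0.0);
    for (std::size_t i = 0; i < n; ++i) dv[i] += v[i] / norm;
#endif
}
```

Calling nrm2_gradient on a small vector and comparing the Enzyme build against the fallback build is one way to check that the LLDEnzyme link step is really differentiating nrm2.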


wsmoses commented on June 26, 2024

Yeah, my same comment about marking the function as allocation-like (assuming I'm reading this correctly as an allocation function) applies here.


ZuseZ4 commented on June 26, 2024

Re visibility of symbols for ClangEnzyme: did you try LLDEnzyme or one of the other paths described here? https://enzyme.mit.edu/getting_started/UsingEnzyme/#differentiating-cc


wsmoses commented on June 26, 2024

In the case of several of the above functions, I think the right solution is to mark them as allocation-like (we have an attribute for this).


brian-kelley commented on June 26, 2024

Thanks @wsmoses, that fixed it for the for-loop version but not the parallel_reduce version. The error messages look very similar to the ones I got with ClangEnzyme, so maybe I misdiagnosed the original problem:

ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos4Impl11ViewTrackerINS_4ViewIPdJEEEED2Ev void (ptr): Enzyme: No augmented forward pass found for _ZN6Kokkos4Impl22SharedAllocationRecordIvvE9decrementEPS2_
 at context:   %8 = call noundef ptr @_ZN6Kokkos4Impl22SharedAllocationRecordIvvE9decrementEPS2_(ptr noundef %7) #28


ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos4Impl11ViewTrackerINS_4ViewIPdJEEEED2Ev void (ptr): Enzyme: No reverse pass found for _ZN6Kokkos4Impl22SharedAllocationRecordIvvE9decrementEPS2_
 at context:   %8 = call noundef ptr @_ZN6Kokkos4Impl22SharedAllocationRecordIvvE9decrementEPS2_(ptr noundef %7) #30

freeing without malloc   %19 = load ptr, ptr %18, align 8

ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos5Tools12Experimental4Impl19profile_fence_eventINS_6SerialEZNKS4_5fenceERKNSt3__112basic_stringIcNS5_11char_traitsIcEENS5_9allocatorIcEEEEEUlvE_EEvSD_NS2_19DirectFenceIDHandleERKT0_ void (ptr, i32, ptr): Enzyme: No augmented forward pass found for _ZN6Kokkos5Tools8endFenceEm
 at context:   call void @_ZN6Kokkos5Tools8endFenceEm(i64 noundef %8) #32


ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos5Tools12Experimental4Impl19profile_fence_eventINS_6SerialEZNKS4_5fenceERKNSt3__112basic_stringIcNS5_11char_traitsIcEENS5_9allocatorIcEEEEEUlvE_EEvSD_NS2_19DirectFenceIDHandleERKT0_ void (ptr, i32, ptr): Enzyme: No augmented forward pass found for _ZN6Kokkos5Tools10beginFenceENSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEEjPm
 at context:   call void @_ZN6Kokkos5Tools10beginFenceENSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEEjPm(ptr noundef %5, i32 noundef %7, ptr noundef %4) #32


ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos15parallel_reduceI12Nrm2_FunctordEENSt3__19enable_ifIXntoooosr6Kokkos7is_viewIT0_EE5valuesr6Kokkos10is_reducerIS4_EE5valuesr3std10is_pointerIS4_EE5valueEvE4typeERKmRKT_RS4_ void (ptr, ptr, ptr): Enzyme: No augmented forward pass found for _ZN6Kokkos6SerialC1Ev
 at context:   call void @_ZN6Kokkos6SerialC1Ev(ptr noundef nonnull align 8 dereferenceable(16) %5) #32


wsmoses commented on June 26, 2024

But this is still odd, because it implies that you didn't do full LTO with wherever these functions were implemented, and thus Enzyme couldn't find the definition to differentiate and complained.

I do think this is probably something we should handle by marking a custom derivative at a higher level, but it would still be good to confirm that it works when given the definitions in LLVM.


brian-kelley commented on June 26, 2024

@wsmoses Made some progress - I wasn't very familiar with the usage of LTO before. I still had to add -flto to the compilation of the Kokkos libraries and also install the LLVMgold.so plugin.

Now the errors are "cannot deduce type of memset" and "cannot deduce type of copy":

ld.lld: error: <unknown>:0:0: in function preprocess__ZNSt3__15mutexC2B7v160006Ev void (ptr): Enzyme: Cannot deduce type of memset   call void @llvm.memset.p0.i64(ptr align 8 %2, i8 0, i64 40, i1 false) #46
<analysis>
ptr %0: {[-1]:Pointer}, intvals: {}
  %2 = getelementptr inbounds %"class.std::__1::mutex", ptr %0, i32 0, i32 0: {[-1]:Pointer}, intvals: {}
</analysis>

ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos5Tools12Experimental23invoke_kokkosp_callbackIPFv28Kokkos_Profiling_SpaceHandlePKcPKvmEJRKS3_S5_RS7_RKmEEEvNS1_23MayRequireGlobalFencingERKT_DpOT0_ void (ptr, ptr, ptr, ptr, ptr): Enzyme: Cannot deduce type of copy   call void @llvm.memcpy.p0.p0.i64(ptr align 1 %7, ptr align 1 %1, i64 64, i1 false) #46
<analysis>

Is this the kind of thing that would be fixed by adding the allocation-like attribute to the Kokkos functions in question?
If so, is there an example of this attribute being used? I found this line in customalloc.c but I'm not sure if this is what you're referring to:

void* __enzyme_allocation_like[4] = {(void*)myallocator, (void*)1, (void*)"2,-1", (void*)myfree};


ZuseZ4 commented on June 26, 2024

Not recommended for production use cases, but try https://enzyme.mit.edu/getting_started/UsingEnzyme/#loose-type-analysis to get started. I think there might also be an example in the CMake files showing how to add it (but I'm not sure).

Also, LLVMgold.so looks suspicious, I think it should use LLD and not gold(?), but maybe Billy knows more.

