Comments (10)
@ZuseZ4 no, gold is fine here
@brian-kelley can you add -g so we can look at a backtrace of where the error comes from? It probably makes sense to mark some mutex or something similar as inactive there.
also FYI Enzyme does differentiate through AMD GPUs as well per your earlier comment
from enzyme.
@ZuseZ4 Thanks for the tip. I started with that example CMakeLists.txt, and made Kokkos a subdirectory. I do get through everything until the final executable link:
(from verbose makefile)
clang++ -O0 -stdlib=libc++ -Wl,-rpath,blah/lib/x86_64-unknown-linux-gnu -fuse-ld=lld -Wl,-mllvm -Wl,-load=/blah/enzyme-install/lib/LLDEnzyme-16.so -Wl,--load-pass-plugin=/blah/enzyme-install/lib/LLDEnzyme-16.so -DKOKKOS_DEPENDENCE CMakeFiles/myProgram.dir/myProgram.cpp.o -o myProgram kokkos/containers/src/libkokkoscontainers.a kokkos/core/src/libkokkoscore.a -ldl kokkos/simd/src/libkokkossimd.a
ld.lld: error: <unknown>:0:0: in function main i32 (i32, ptr): Enzyme: Cannot cast __enzyme_autodiff primal argument 1, found i32 0, type i32 - to arg 0 ptr
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
I assume the primal argument 1 is talking about v above? Can I tweak my declaration of __enzyme_autodiff or nrm2 to work around this? If it helps, the same error happens with the non-parallel version of nrm2 uncommented, which worked under ClangEnzyme.
from enzyme.
You probably want to declare enzyme_dup as extern, per that error.
I will forewarn that LTO comes with various compile-time implications.
If it works now, that's a great starting point, but we'll probably want to add some attributes directly to make this easier for Kokkos users.
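As a rough illustration of the extern suggestion above: the error reports "found i32 0" where a pointer was expected, which is consistent with enzyme_dup having been defined as an ordinary int and folded to the constant 0 before Enzyme ran. Declaring it extern, without a definition, leaves the symbol in place for Enzyme to recognize. The sketch below is an assumption-laden stand-in, not the code from this issue: nrm2 here is a plain-pointer Euclidean norm, nrm2_grad is a hypothetical wrapper, and USE_ENZYME is a hypothetical macro so that the file still builds and runs without the Enzyme plugin (using a hand-written gradient instead).

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for the nrm2 discussed in this thread:
// the Euclidean norm of a plain array (the original presumably uses Kokkos).
double nrm2(const double* v, std::size_t n) {
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i) sum += v[i] * v[i];
  return std::sqrt(sum);
}

#ifdef USE_ENZYME  // hypothetical macro: define it only when building with Enzyme
// Declarations only, no definitions. Enzyme recognizes these symbols by name;
// giving enzyme_dup a definition can let the optimizer replace it with 0.
extern int enzyme_dup;
extern int enzyme_const;
template <typename... Args>
void __enzyme_autodiff(void*, Args...);
#endif

// Accumulates d(nrm2)/dv into dv. The caller must zero-initialize dv,
// because reverse mode adds into the shadow buffer.
void nrm2_grad(const double* v, double* dv, std::size_t n) {
#ifdef USE_ENZYME
  // v is passed with its shadow dv (enzyme_dup); n is marked inactive.
  __enzyme_autodiff((void*)nrm2, enzyme_dup, v, dv, enzyme_const, n);
#else
  // Hand-written fallback so the sketch runs without Enzyme:
  // d/dv_i sqrt(sum_j v_j^2) = v_i / nrm2(v). Undefined for the zero vector.
  const double norm = nrm2(v, n);
  for (std::size_t i = 0; i < n; ++i) dv[i] += v[i] / norm;
#endif
}
```

For v = {3, 4} and a zeroed dv, the fallback path leaves dv at approximately {0.6, 0.8}. The Enzyme path is expected to match, but that depends on the build setup discussed in this thread and is not verified here.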
from enzyme.
Yeah, my same comment about marking the function as allocation-like (assuming I'm reading this correctly as an allocation function) applies here.
from enzyme.
Re visibility of symbols for ClangEnzyme, did you try LLDEnzyme or one of the other paths described here? https://enzyme.mit.edu/getting_started/UsingEnzyme/#differentiating-cc
from enzyme.
In the case of several of the above functions, I think the right solution is to mark them as allocation-like (we have an attribute for this).
from enzyme.
Thanks @wsmoses, that fixed it for the for-loop version but not the parallel_reduce version. The error messages look very similar to the ones from ClangEnzyme, so maybe I misdiagnosed the original problem:
ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos4Impl11ViewTrackerINS_4ViewIPdJEEEED2Ev void (ptr): Enzyme: No augmented forward pass found for _ZN6Kokkos4Impl22SharedAllocationRecordIvvE9decrementEPS2_
at context: %8 = call noundef ptr @_ZN6Kokkos4Impl22SharedAllocationRecordIvvE9decrementEPS2_(ptr noundef %7) #28
ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos4Impl11ViewTrackerINS_4ViewIPdJEEEED2Ev void (ptr): Enzyme: No reverse pass found for _ZN6Kokkos4Impl22SharedAllocationRecordIvvE9decrementEPS2_
at context: %8 = call noundef ptr @_ZN6Kokkos4Impl22SharedAllocationRecordIvvE9decrementEPS2_(ptr noundef %7) #30
freeing without malloc %19 = load ptr, ptr %18, align 8
ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos5Tools12Experimental4Impl19profile_fence_eventINS_6SerialEZNKS4_5fenceERKNSt3__112basic_stringIcNS5_11char_traitsIcEENS5_9allocatorIcEEEEEUlvE_EEvSD_NS2_19DirectFenceIDHandleERKT0_ void (ptr, i32, ptr): Enzyme: No augmented forward pass found for _ZN6Kokkos5Tools8endFenceEm
at context: call void @_ZN6Kokkos5Tools8endFenceEm(i64 noundef %8) #32
ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos5Tools12Experimental4Impl19profile_fence_eventINS_6SerialEZNKS4_5fenceERKNSt3__112basic_stringIcNS5_11char_traitsIcEENS5_9allocatorIcEEEEEUlvE_EEvSD_NS2_19DirectFenceIDHandleERKT0_ void (ptr, i32, ptr): Enzyme: No augmented forward pass found for _ZN6Kokkos5Tools10beginFenceENSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEEjPm
at context: call void @_ZN6Kokkos5Tools10beginFenceENSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEEjPm(ptr noundef %5, i32 noundef %7, ptr noundef %4) #32
ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos15parallel_reduceI12Nrm2_FunctordEENSt3__19enable_ifIXntoooosr6Kokkos7is_viewIT0_EE5valuesr6Kokkos10is_reducerIS4_EE5valuesr3std10is_pointerIS4_EE5valueEvE4typeERKmRKT_RS4_ void (ptr, ptr, ptr): Enzyme: No augmented forward pass found for _ZN6Kokkos6SerialC1Ev
at context: call void @_ZN6Kokkos6SerialC1Ev(ptr noundef nonnull align 8 dereferenceable(16) %5) #32
from enzyme.
But this is also still odd, because it implies that you didn't do full LTO with wherever these functions were implemented, and thus Enzyme couldn't find the definition to differentiate and complained.
I do think this is probably something we should provide a custom derivative for at a higher level, but it would still be good to confirm that it works when given the definitions in LLVM.
from enzyme.
@wsmoses Made some progress - I wasn't very familiar with the usage of LTO before. I still had to add -flto to the compilation of the Kokkos libraries and also install the LLVMgold.so plugin.
Now, it's "cannot deduce type of memset" and "cannot deduce type of copy"
ld.lld: error: <unknown>:0:0: in function preprocess__ZNSt3__15mutexC2B7v160006Ev void (ptr): Enzyme: Cannot deduce type of memset call void @llvm.memset.p0.i64(ptr align 8 %2, i8 0, i64 40, i1 false) #46
<analysis>
ptr %0: {[-1]:Pointer}, intvals: {}
%2 = getelementptr inbounds %"class.std::__1::mutex", ptr %0, i32 0, i32 0: {[-1]:Pointer}, intvals: {}
</analysis>
ld.lld: error: <unknown>:0:0: in function preprocess__ZN6Kokkos5Tools12Experimental23invoke_kokkosp_callbackIPFv28Kokkos_Profiling_SpaceHandlePKcPKvmEJRKS3_S5_RS7_RKmEEEvNS1_23MayRequireGlobalFencingERKT_DpOT0_ void (ptr, ptr, ptr, ptr, ptr): Enzyme: Cannot deduce type of copy call void @llvm.memcpy.p0.p0.i64(ptr align 1 %7, ptr align 1 %1, i64 64, i1 false) #46
<analysis>
Is this the kind of thing that would be fixed by adding the allocation-like attribute to the Kokkos functions in question?
If so, is there an example of this attribute being used? I found this line in customalloc.c but I'm not sure if this is what you're referring to:
void* __enzyme_allocation_like[4] = {(void*)myallocator, (void*)1, (void*)"2,-1", (void*)myfree};
from enzyme.
Not recommended for production use, but try https://enzyme.mit.edu/getting_started/UsingEnzyme/#loose-type-analysis to get started. I think there might also be an example in the CMake files on how to add it (but I'm not sure).
Also, LLVMgold.so looks suspicious, I think it should use LLD and not gold(?), but maybe Billy knows more.
from enzyme.