Comments (6)
A relatively nice compiler fix would be to emulate the "rewrite as function" workaround:
- Before running instcombine, rewrite each getelementptr over the private storage class as a call to a non-pure function.
- Then run instcombine.
- Then convert the getelementptr function calls back into real GEPs.
The main complexity is in generating all the right types for the functions.
from clspv.
A simpler variation: rather than rewriting the getelementptr instruction, wrap its result in a call to an opaque function. That's a lower-complexity solution.
from clspv.
Wrapping the pointer with a function call is not enough. The instcombine pass still ends up with a phi over the two wrapped pointers followed by a single load, generating code like this:
cond.true: ; preds = %entry
  %arrayidx = getelementptr inbounds [3 x float], [3 x float] addrspace(2)* @kFirst, i32 0, i32 %i
  %wrapped0 = call float addrspace(2)* @wrap(float addrspace(2)* %arrayidx) #1
  br label %cond.end

cond.false: ; preds = %entry
  %arrayidx1 = getelementptr inbounds [3 x float], [3 x float] addrspace(2)* @kSecond, i32 0, i32 %i
  %wrapped1 = call float addrspace(2)* @wrap(float addrspace(2)* %arrayidx1) #1
  br label %cond.end

cond.end: ; preds = %cond.false, %cond.true
  %cond.in = phi float addrspace(2)* [ %wrapped0, %cond.true ], [ %wrapped1, %cond.false ]
  %cond = load float, float addrspace(2)* %cond.in, align 4
  store float %cond, float addrspace(1)* %A, align 4
  ret void
}
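For readers who prefer source over IR, the shape of that output can be sketched in plain C. This is only an illustration, not the original test case: the array contents are made up, and `wrap_ptr` and `foo_pointer_join` are hypothetical names standing in for the pointer-returning `@wrap` and the kernel. The point is that a pointer, not a float, is what meets at the join.

```c
/* Hypothetical contents: only the names and the 3-float shape come from the IR. */
static const float kFirst[3]  = {1.0f, 2.0f, 3.0f};
static const float kSecond[3] = {10.0f, 20.0f, 30.0f};

/* Identity on pointers, standing in for the IR's pointer-returning @wrap. */
static const float *wrap_ptr(const float *p) { return p; }

/* Each branch yields a pointer, the pointers meet at the join,
 * and a single load follows (the role of %cond.in and %cond). */
static void foo_pointer_join(float *A, int c, int i) {
    const float *cond_in;
    if (c == 0) {
        cond_in = wrap_ptr(&kFirst[i]);
    } else {
        cond_in = wrap_ptr(&kSecond[i]);
    }
    *A = *cond_in; /* the one load, after the join */
}
```

A C compiler is free to inline `wrap_ptr`, so this mirrors only the data flow, not the optimization barrier that the call provides in the IR.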
from clspv.
Wrapping the result of the load is sufficient: instcombine then no longer produces a select between the pointers. This input survives intact through instcombine:
define float @wrap(float %A) {
entry:
  ret float %A
}

; Function Attrs: nounwind
define spir_kernel void @foo(float addrspace(1)* %A, i32 %c, i32 %i) #0 !kernel_arg_addr_space !3 !kernel_arg_access_qual !4 !kernel_arg_type !5 !kernel_arg_base_type !5 !kernel_arg_type_qual !6 {
entry:
  %cmp = icmp eq i32 %c, 0
  br i1 %cmp, label %cond.true, label %cond.false

cond.true: ; preds = %entry
  %arrayidx = getelementptr inbounds [3 x float], [3 x float] addrspace(2)* @kFirst, i32 0, i32 %i
  %0 = load float, float addrspace(2)* %arrayidx, align 4
  %wrapped0 = call float @wrap(float %0)
  br label %cond.end

cond.false: ; preds = %entry
  %arrayidx1 = getelementptr inbounds [3 x float], [3 x float] addrspace(2)* @kSecond, i32 0, i32 %i
  %1 = load float, float addrspace(2)* %arrayidx1, align 4
  %wrapped1 = call float @wrap(float %1)
  br label %cond.end

cond.end: ; preds = %cond.false, %cond.true
  %cond = phi float [ %wrapped0, %cond.true ], [ %wrapped1, %cond.false ]
  store float %cond, float addrspace(1)* %A, align 4
  ret void
}
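In source terms, the IR above corresponds roughly to the plain C sketch below. It is an illustration under assumptions, not the original kernel: the array contents are invented, and `foo_value_join` is a hypothetical name for the kernel body. Each branch performs its own load and wraps the loaded value, so only floats meet at the join.

```c
/* Hypothetical contents: only the names and the 3-float shape come from the IR. */
static const float kFirst[3]  = {1.0f, 2.0f, 3.0f};
static const float kSecond[3] = {10.0f, 20.0f, 30.0f};

/* Identity on floats, mirroring the IR's @wrap(float). */
static float wrap(float a) { return a; }

/* Each branch loads from its own array and wraps the value;
 * the join merges two floats (the role of the float phi %cond). */
static void foo_value_join(float *A, int c, int i) {
    float cond;
    if (c == 0) {
        cond = wrap(kFirst[i]);
    } else {
        cond = wrap(kSecond[i]);
    }
    *A = cond;
}
```

As with any C rendering, the compiler may inline `wrap`, so the sketch shows the intended data flow rather than the barrier effect the call has on instcombine.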
from clspv.
I have a pretty good prototype at https://github.com/dneto0/clspv/commits/hide-constant-loads
It fails only one test, but that's expected. It resolves the test case above.
from clspv.
Fixed by #77
from clspv.