nunoplopes / alive Goto Github PK
View Code? Open in Web Editor NEWAlive: Automatic LLVM's Instcombine Verifier
License: Apache License 2.0
Alive: Automatic LLVM's Instcombine Verifier
License: Apache License 2.0
AIUI Alive can deal with undef
s. In particular, undef
should be among potential values for input variables. Right? Then there is a problem: according to the LangRef undef
s are unstable and this seems not to be accounted for in Alive.
Simple example -- https://rise4fun.com/Alive/yjtD:
%r = and i32 %Y, 0
=>
%r = xor i32 %Y, %Y
Both and
and xor
are taken from examples in LangRef, and
here should always give 0
and xor
here gives undef
, so this transformation is not valid. But Alive says that it is.
More realistic example -- https://rise4fun.com/Alive/PpoX:
%c = icmp eq i32 %X, %Y
%r = select i1 %c, i32 %X, 0
=>
%c = icmp eq i32 %X, %Y
%r = select i1 %c, i32 %Y, 0
When %Y
is undef
, the comparison gives undef
, and select
in the source can choose either %X
or 0
. But we get undef
in the target.
This is kinda simplified from https://bugs.llvm.org/show_bug.cgi?id=44512.
The following 2 examples are accepted as valid:
%v = add 0, 1
=>
unreachable
And:
%v = add 0, 1
=>
%v = 1
store undef, * null
Alive could produce better error messages for when we have undefined behavior flowing to replaced instructions. If this flow is from newly-introduced instructions, then the error message will not be optimal.
Maybe we could detect that there's no model for these variables? Not 100% accurate? Or do additional SMT queries on error?
Example (%v0 has undefined behavior):
Pre: (((C1 % C3) == 0) && ((C2 % C3) == 0))
%add = add nsw nuw i5 C1, C2
%urem = srem %add, C3
=>
%v0 = srem i5 C1, C3
%v1 = srem i5 C2, C3
%urem = add i5 %v0, %v1
ERROR: Domain of definedness of Target is smaller than Source's for i5 %urem
Example:
C1 i5 = 0x10 (16, -16)
C2 i5 = 0x08 (8)
C3 i5 = 0x1F (31, -1)
%add i5 = 0x18 (24, -8)
%v0 i5 = 0x00 (0)
%v1 i5 = 0x00 (0)
Source value: 0x00 (0)
Target value: undef
According to the comment in value.py
https://github.com/nunoplopes/alive/blob/master/value.py#L289
only integers up to 64 bits are supported. Consequently, I couldn't check for the following optimization that was produced when compiling LLVM:
Name: 1
Pre: Known(%6, false)
%0 = i64 %x1
%1 = and i64 18446744073709551612, %0
%2 = icmp eq i64 0, %1
%3 = i80 %x2
%4 = and i80 302231454903657293676544, %3
%5 = icmp eq i80 0, %4
%6 = or i1 %2, %5
=>
%2 = false
Are there any fundamental reasons to not support >64 bit integers?
Alive accepts this transformation:
%x = undef
%y = xor %x, %x
=>
%y = 0
However, the language ref seems to require %y = undef
.
fshl
, fshr
ctlz
cttz
bswap
bitreverse
popcount
Alive crashes if the first argument of a select
is UnknownType
.
[Reading from terminal...]
%a = select %x, %y, %z
%b = select %a, %u, %v
=>
%b = 0
Traceback (most recent call last):
File "./alive.py", line 526, in <module>
main()
File "./alive.py", line 510, in main
opts = parse_opt_file(f.read())
File "/Users/dave/Documents/Research/alive/alive/parser.py", line 572, in parse_opt_file
return opt_file.parseString(txt)
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1101, in parseString
loc, tokens = self._parse( instring, 0 )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1032, in _parseCache
value = self._parseNoCache( instring, loc, doActions, callPreParse )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 975, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 2348, in parseImpl
loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1032, in _parseCache
value = self._parseNoCache( instring, loc, doActions, callPreParse )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 975, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 2753, in parseImpl
loc, tokens = self.expr._parse( instring, loc, doActions, callPreParse=False )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1032, in _parseCache
value = self._parseNoCache( instring, loc, doActions, callPreParse )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 975, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 2365, in parseImpl
loc, exprtokens = e._parse( instring, loc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1032, in _parseCache
value = self._parseNoCache( instring, loc, doActions, callPreParse )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1001, in _parseNoCache
tokens = fn( instring, tokensStart, retTokens )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 765, in wrapper
ret = func(*args[limit[0]:])
File "/Users/dave/Documents/Research/alive/alive/parser.py", line 551, in parseOpt
return _parseOpt(s, loc, toks)
File "/Users/dave/Documents/Research/alive/alive/parser.py", line 527, in _parseOpt
src, ident_src, used_src, _ = parse_llvm(src)
File "/Users/dave/Documents/Research/alive/alive/parser.py", line 423, in parse_llvm
prog.parseString(txt)
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1101, in parseString
loc, tokens = self._parse( instring, 0 )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1032, in _parseCache
value = self._parseNoCache( instring, loc, doActions, callPreParse )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 975, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 2365, in parseImpl
loc, exprtokens = e._parse( instring, loc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1032, in _parseCache
value = self._parseNoCache( instring, loc, doActions, callPreParse )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 975, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 2719, in parseImpl
loc, tokens = self.expr._parse( instring, loc, doActions, callPreParse=False )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1032, in _parseCache
value = self._parseNoCache( instring, loc, doActions, callPreParse )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 975, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 2365, in parseImpl
loc, exprtokens = e._parse( instring, loc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1032, in _parseCache
value = self._parseNoCache( instring, loc, doActions, callPreParse )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 975, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 2470, in parseImpl
ret = e._parse( instring, loc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1032, in _parseCache
value = self._parseNoCache( instring, loc, doActions, callPreParse )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 975, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 2365, in parseImpl
loc, exprtokens = e._parse( instring, loc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1032, in _parseCache
value = self._parseNoCache( instring, loc, doActions, callPreParse )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 975, in _parseNoCache
loc,tokens = self.parseImpl( instring, preloc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 2470, in parseImpl
ret = e._parse( instring, loc, doActions )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1032, in _parseCache
value = self._parseNoCache( instring, loc, doActions, callPreParse )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 1001, in _parseNoCache
tokens = fn( instring, tokensStart, retTokens )
File "/Users/dave/Documents/Research/alive/alive/pyparsing/pyparsing.py", line 765, in wrapper
ret = func(*args[limit[0]:])
File "/Users/dave/Documents/Research/alive/alive/parser.py", line 30, in z
return fn(toks)
File "/Users/dave/Documents/Research/alive/alive/parser.py", line 145, in parseSelect
return Select(t, toks[2], parseOperand(toks[4], t), parseOperand(toks[6], t))
File "/Users/dave/Documents/Research/alive/alive/language.py", line 761, in __init__
assert isinstance(c.type, IntType)
AssertionError
Here is the relevant code:
class Select(Instr):
def __init__(self, type, c, v1, v2):
assert isinstance(type, Type)
assert isinstance(c, Value)
assert isinstance(c.type, IntType) # <---- this line
assert isinstance(v1, Value)
assert isinstance(v2, Value)
self.type = type.ensureFirstClass()
self.c = c
self.v1 = v1
self.v2 = v2
Alive newsema accepts this optimization:
Pre: isPowerOf2(%a)
%a = shl i4 1, %x
%b = select 1, %x, %a
%c = icmp ult %b, 4
=>
%c = true
See also http://rise4fun.com/Alive/U4o
LLVM's isKnownToBeAPowerOfTwo
analysis is always true for %a
, no matter what is known about %x
. In other words, the precondition should not constrain %x
at all. This is allowed by LLVM because the analysis is not required to be accurate when its input is poison, which it will be in any case where the optimization is not a refinement.
Hi there,
I wanted to use alive to verify an optimization but I know very little about how it works.
The optimization looks like:
Name: Mul128MatchLo
Pre: hasOneUse(%xl) && hasOneUse(%xh) && hasOneUse(%yl) && hasOneUse(%yh)
%xl = and i64 %X, 4294967295
%xh = lshr i64 %X, 32
%yl = and i64 %Y, 4294967295
%yh = lshr i64 %Y, 32
%mul = mul nuw i64 %xl, %yl
%and3 = and i64 %mul, 4294967295
%shr4 = lshr i64 %mul, 32
%mul5 = mul nuw i64 %yl, %xh
%add = add i64 %shr4, %mul5
%and7 = and i64 %add, 4294967295
%mul8 = mul nuw i64 %yh, %xl
%add9 = add i64 %and7, %mul8
%shl = shl i64 %add9, 32
%L = or i64 %shl, %and3
=>
%L = mul i64 %X, %Y
It might be to heavy for Z3, because it looks like it hangs and does not report anything back. There is not Done:
progress reported and it's impossible to terminate it with Ctrl+C.
We need a story for UB in constant exprs. Do the underlying APIs provide deterministic behavior? Should the generated code be made deterministic or the UB exposed in the DSL?
E.g.
%a = shl nsw i8 %x, C1
%b = icmp sgt %a, C0
=>
%b = icmp sgt %x, (C0 >> C1)
What if C1 >= width(C1)
in C0 >>C1
?
The follow input crashes alive (because %1 is replaced, but not %2). Test case by David Majnemer.
Pre: isPowerOf2(C1)
%1 = srem i32 %x, C1
%2 = ashr i32 %1, C2
%ret = and i32 %2, C1
=>
%cmp = icmp eq C2, 0
%1 = srem i32 %x, C1
%ret = select i1 %cmp, C1, 0
I noticed that the following test succeed:
Name: sext-seteq1-in
%a = sext %X to i32
%r = icmp eq i32 %a, 1
=>
%r = icmp eq %X, 1
but it should fail if %X is i1. This can be directly verified by giving the type explicitly:
Name: sext-seteq1-i1
%a = sext i1 %X to i32
%r = icmp eq i32 %a, 1
=>
%r = icmp eq i1 %X, 1
which correctly fails.
Are the tests not done for i1? Am I missing something?
BTW, I noticed that the version running on https://rise4fun.com/Alive correctly
fails even when not given the explicit type. Is it running another version?
Thanks,
-- Luc
Alive rejects the following (apparently it's not using the hasOneUse() condition at all):
Pre: hasOneUse(%Ptr)
%Ptr = alloca
store %x, %Ptr
=>
skip
The WillNotOverflowSignedAdd(%a,%b)
predicate corresponds to a function in LLVM which attempts to show that %a + %b has no signed wrap. Thus, !WillNotOverflowSignedAdd(%a,%b)
means "cannot be proven not to overflow", not "will overflow". This is not currently reflected in the SMT translation, which has perfect knowledge of the inputs.
Test case:
Pre: C >= 0 && !WillNotOverflowSignedAdd(%a,C)
%0 = add %a, C
%r = icmp sge %a, %0
=>
%r = true
Alive accepts this, but this is incorrect when translated to C++.
Note that WillNotOverflowSignedMul
and WillNotOverflowUnsignedMul
do not have this problem, as they are restricted to constants (meaning the compiler has perfect knowledge).
Related issue: Should the C++ translator handle WillNotOverflowSignedAdd(C1,C2)
specially?
this hangs overnight, but with Python using 100% of a CPU, not Z3
----------------------------------------
Optimization: 1
Precondition: true
%1 = urem i32 %0, 679
%out = icmp eq 0, %1
=>
%3a = mul %0, 2068415767
%out = icmp ult %3a, 6325431
The following is broken with --use-array-th:
%Ptr = alloca
store %x, %Ptr
=>
skip
At present, the following is disallowed, because expressions are not permitted in the source:
%1 = or undef, 1
%2 = and undef, ~1
%3 = add %1, %2
%4 = and %3, 1
=>
%4 = 1
This could be resolved by
Adding general support for expressions in the source is not recommended, as this makes the C++ translation too complex.
From David Majnemer:
Optimization: 1
Precondition: ((C0 & (computeKnownZeroBits(%x) ^ computeKnownOneBits(%x))) == 0)
%r = and %x, C0
=>
%r = (computeKnownZeroBits(%x) & C0)
Traceback (most recent call last):
File "./alive.py", line 507, in <module>
main()
File "./alive.py", line 500, in main
check_opt(opt)
File "./alive.py", line 461, in check_opt
check_typed_opt(pre, src, ident_src, tgt, ident_tgt, types, users)
File "./alive.py", line 342, in check_typed_opt
tgtv = toSMT(tgt, ident_tgt, False)
File "~/alive/language.py", line 950, in toSMT
assert defined == [] and poison == []
Optimization: 1
Precondition: true
%1 = add nsw i32 %b, 3
%2 = mul nsw i32 %1, %s
%3 = add nsw i32 %b, 15
%4 = mul nsw i32 %3, %s
=>
%1 = add nsw i32 %b, 3
%2 = mul nsw i32 %1, %s
%5 = mul %s, 12
%4 = add nsw i32 %2, %5
ERROR: Domain of poisoness of Target is smaller than Source's for i32 %4
Example:
%b i32 = 0xFFFFFFF0 (4294967280, -16)
%s i32 = 0xC020013A (3223322938, -1071644358)
%1 i32 = 0xFFFFFFF3 (4294967283, -13)
%2 i32 = 0x3E5FF00E (1046474766)
%3 i32 = 0xFFFFFFFF (4294967295, -1)
%5 i32 = 0x01800EB8 (25169592)
Source value: 0x3FDFFEC6 (1071644358)
Target value: poison
I ran into a seemingly bug. The transformation assumes %1 * %s doesn't sign-overflow, but In the counter example %1 * %s (-13 * -1071644358) sign-overflows.
While investigating https://bugs.llvm.org/show_bug.cgi?id=36682,
i've come down to this: https://godbolt.org/g/bZyQ7r
But i'm unable to verify those optimizations via alive: https://rise4fun.com/Alive/aaAn
ERROR: Unknown conversion instruction (line: 1)
%1 = sitofp i32 %0 to double
^
Which makes sense, because not only the sitofp
instruction is not supported, but there seems to be no base support for floating-point at all.
Alive says that the following optimization is correct, but I don't think it is.
The problem is that LLVM doesn't have a copy operation, but rather does RAUW. This means that each use of %a will get an undef value, and these values will be distinct of one another. The recent RFC by David Majnemer to remove poison may change things, though.
A way to fix the optimization below would be to require hasOneUse(%a). Then we know there won't be multiple undef values.
%a = load * undef
=>
%a = undef
Done: 128
Optimization is correct!
We have been asked multiple times for a switch to check only for the root instruction. This is slightly dangerous so it could (should) produce a warning if intermediate values cannot have outside uses.
A freeze cannot be changed to a constant at will; only if it has no other users.
This is currently not captured by the vcgen.
e.g.,
%v = freeze poison
=>
%v = 42
only correct if hasOneUse(%v)
This is newsema branch only!
It seems like https://rise4fun.com/Alive fails on some inputs that Alive can handle locally. e.g.
%a = sdiv %X, C
%r = sub undef, %a
=>
%r = sdiv %X, -C
fails online with ERROR: no output from Alive
. Locally, alive reports ERROR: Domain of definedness of Target is smaller than Source's for i4 %r
.
Without the undef
, online Alive produces usable output.
I'm unable to run this out of the box. If I try to run any of the samples, I get:
./alive.py tests/instcombine/addsub.opt
Traceback (most recent call last):
File "./alive.py", line 19, in
from parser import parse_llvm, parse_opt_file
File "/Users/matt/src/alive/parser.py", line 304, in
pred_args <<= (cnst_expr + ZeroOrMore(comma + cnst_expr)) | Empty()
TypeError: Empty() takes exactly 1 argument (0 given)
If I remove the | Empty(), I'm able to run it.
There's a new syntax for GEP:
%r = getelementptr ty, ty* ptrval {, ty idx}*
We could support read/writing poison values through memory.
I don't think there is any optimization that depends on this right now, so it's low priority.
Pre: C1 u< width(C1)
%Op0 = shl i7 C1, %A
%r = shl %Op0, C2
=>
%r = shl (C1 << C2), %A
gives
ERROR: Unknown function: (line: 5)
%r = shl (C1 << C2), %A
^
For example, Alive will crash if I have an instruction %r
in the target which depends on %y
from the source, but %y
depends on a value %x
which is replaced in the target.
[Reading from terminal...]
%x = sub %A, 0
%y = add %x, C2
%r = add %y, C3
=>
%x = %A
%r = add %y, C3
----------------------------------------
Optimization: 1
Precondition: true
%x = sub %A, 0
%y = add %x, C2
%r = add %y, C3
=>
%x = %A
%r = add %y, C3
Traceback (most recent call last):
File "./alive.py", line 526, in <module>
main()
File "./alive.py", line 517, in main
check_opt(opt)
File "./alive.py", line 460, in check_opt
check_typed_opt(pre, src, ident_src, tgt, ident_tgt, types, users)
File "./alive.py", line 340, in check_typed_opt
tgtv = toSMT(tgt, ident_tgt, False)
File "/Users/dave/Documents/Research/alive/alive/language.py", line 1170, in toSMT
smt = v.toSMT(defined, poison, state, qvars)
File "/Users/dave/Documents/Research/alive/alive/language.py", line 357, in toSMT
v1 = state.eval(self.v1, defined, poison, qvars)
File "/Users/dave/Documents/Research/alive/alive/language.py", line 151, in eval
(smt, d, p, q) = self.vars[v.getUniqueName()]
KeyError: '%x'
Either %y
needs to be reevaluated in the target, or we should disallow replacing values that are used by non-replaced values.
Alternately, we could disallow replacement for non-roots entirely. In the Alive suite, we only seem to do it for values which also have HasOneUse
, so there's on real difference between replacement and just creating a new value. (See also #26.)
Z3 has a relatively restricted license, which prevents companies from using it for commercial purposes. I wonder if you have considered or estimated how difficult it is to add other backends such as STP.
The following optimization should have precise analysis result for isPowerOf2 because its operands are constant expressions (right now we only detect constant vars):
Name: AndOrXor:1628-1
Pre: C1 u> C3 && C2 u> C3 && isPowerOf2(C1 ^ C2) && isPowerOf2(-C2 ^ -C1) && -C2 ^ -C1 == (C3-C2) ^ (C3-C1) && abs(C1-C2) u> C3
%a1 = add i64 %A, C1
%a2 = add %A, C2
%cmp1 = icmp ult %a1, C3
%cmp2 = icmp ult %a2, C3
%r = or %cmp1, %cmp2
=>
%newand = and %A, ~(C1^C2)
%newadd = add %newand, umax(C1, C2)
%r = icmp ult %newadd, C3
We should either reconsider adding +1 bit for positive numbers, have a way to downcast it or different syntax or...?
(test case from John Regehr)
Precondition: (computeKnownZeroBits(%in) == 4294443007)
%1 = icmp eq i32 %in, 0
%out = select i1 %1, 1, 2
=>
%3 = lshr i32 %in, 20
%out = add %3, 1
Precondition does not type check
$ cat x.opt
Name: func4912511
%13 = sub nuw i2 undef, %3
%14 = and i2 undef, %13
%15 = add nuw nsw i2 %0, %14
ret i2 %15
=>
ret i2 undef
$ ./alive.py x.opt
----------------------------------------
Optimization: func4912511
Precondition: true
%13 = sub nuw i2 undef, %3
%14 = and i2 undef, %13
%15 = add nuw nsw i2 %0, %14
ret i2 %15
=>
Done: 1
Optimization is correct!
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.