Comments (6)
After creating a t0 directory, I ran:
./tf -h
Usage: ./tf <exponent> <start bit> [<end bit>]
LLVM ERROR: Cannot select: 0x560d6354d550: i64,glue = sube 0x560d6337e1f0, 0x560d6354c910, 0x560d635621b0:1
0x560d6337e1f0: i64 = AssertZext 0x560d633e2040, ValueType:ch:i32
0x560d633e2040: i64,ch = CopyFromReg 0x560d633e2190:1, Register:i64 %vreg9
0x560d63424fb0: i64 = Register %vreg9
0x560d6354c910: i64 = zero_extend 0x560d6337e340
0x560d6337e340: i32 = add 0x560d63550430, 0x560d634d3530
0x560d63550430: i32 = add 0x560d6354cf30, 0x560d6354d8b0
0x560d6354cf30: i32 = truncate 0x560d63382de0:1
0x560d63382de0: i64,i64 = umul_lohi 0x560d63380130, 0x560d635c5d50
0x560d63380130: i64,glue = addc 0x560d6337fe20, 0x560d635c6e40
0x560d6337fe20: i64 = or 0x560d635c1840, 0x560d6337fb10
0x560d635c1840: i64 = srl 0x560d6360efe0, Constant:i32<32>
0x560d6360efe0: i64,i64 = umul_lohi 0x560d635c42e0, 0x560d635c5ea0
0x560d6360f4b0: i32 = Constant<32>
0x560d6337fb10: i64 = shl 0x560d6360efe0:1, Constant:i32<32>
0x560d6360efe0: i64,i64 = umul_lohi 0x560d635c42e0, 0x560d635c5ea0
0x560d6360f4b0: i32 = Constant<32>
0x560d635c6e40: i64 = add 0x560d635c6ac0, 0x560d63509200
0x560d635c6ac0: i64 = zero_extend 0x560d634c4f80
0x560d634c4f80: i32 = llvm.HSAIL.mulhi.u32 TargetConstant:i32<204>, 0x560d6339ffd0, 0x560d634f2400
0x560d63509200: i64 = zero_extend 0x560d634432f0
0x560d634432f0: i32 = llvm.HSAIL.mulhi.u32 TargetConstant:i32<204>, 0x560d63559430, 0x560d6337e490
0x560d635c5d50: i64 = or 0x560d634d3450, Constant:i64<1>
0x560d634d3450: i64,i64 = umul_lohi 0x560d63384e70, 0x560d6355a230
0x560d63384e70: i64 = add 0x560d634c5220, 0x560d63644f50
0x560d634c5220: i64 = mul 0x560d633e1b70, Constant:i64<60060>
0x560d63644f50: i64,ch = CopyFromReg 0x560d633c4e40, Register:i64 %vreg68
0x560d6355a230: i64 = zero_extend 0x560d6355cf60
0x560d6355cf60: i32,ch = CopyFromReg 0x560d633c4e40, Register:i32 %vreg5
0x560d634f2780: i64 = Constant<1>
0x560d6354d8b0: i32 = mul 0x560d63561810, 0x560d634d33e0
0x560d63561810: i32 = truncate 0x560d63380130
0x560d63380130: i64,glue = addc 0x560d6337fe20, 0x560d635c6e40
0x560d6337fe20: i64 = or 0x560d635c1840, 0x560d6337fb10
0x560d635c1840: i64 = srl 0x560d6360efe0, Constant:i32<32>
0x560d6360efe0: i64,i64 = umul_lohi 0x560d635c42e0, 0x560d635c5ea0
0x560d6360f4b0: i32 = Constant<32>
0x560d6337fb10: i64 = shl 0x560d6360efe0:1, Constant:i32<32>
0x560d6360efe0: i64,i64 = umul_lohi 0x560d635c42e0, 0x560d635c5ea0
0x560d6360f4b0: i32 = Constant<32>
0x560d635c6e40: i64 = add 0x560d635c6ac0, 0x560d63509200
0x560d635c6ac0: i64 = zero_extend 0x560d634c4f80
0x560d634c4f80: i32 = llvm.HSAIL.mulhi.u32 TargetConstant:i32<204>, 0x560d6339ffd0, 0x560d634f2400
0x560d63509200: i64 = zero_extend 0x560d634432f0
0x560d634432f0: i32 = llvm.HSAIL.mulhi.u32 TargetConstant:i32<204>, 0x560d63559430, 0x560d6337e490
0x560d634d33e0: i32 = truncate 0x560d634d3450:1
0x560d634d3450: i64,i64 = umul_lohi 0x560d63384e70, 0x560d6355a230
0x560d63384e70: i64 = add 0x560d634c5220, 0x560d63644f50
0x560d634c5220: i64 = mul 0x560d633e1b70, Constant:i64<60060>
0x560d633e1b70: i64 = zero_extend 0x560d633e1d30
0x560d634c78d0: i64 = Constant<60060>
0x560d63644f50: i64,ch = CopyFromReg 0x560d633c4e40, Register:i64 %vreg68
0x560d635c6dd0: i64 = Register %vreg68
0x560d6355a230: i64 = zero_extend 0x560d6355cf60
0x560d6355cf60: i32,ch = CopyFromReg 0x560d633c4e40, Register:i32 %vreg5
0x560d635c3b00: i32 = Register %vreg5
0x560d634d3530: i32 = mul 0x560d6354e330, 0x560d6354d240
0x560d6354e330: i32 = truncate 0x560d635c3d30
0x560d635c3d30: i64,glue = adde 0x560d635c21e0, Constant:i64<0>, 0x560d63380130:1
0x560d635c21e0: i64 = srl 0x560d6360efe0:1, Constant:i32<32>
0x560d6360efe0: i64,i64 = umul_lohi 0x560d635c42e0, 0x560d635c5ea0
0x560d635c42e0: i64 = or 0x560d634f2da0, 0x560d634d31b0
0x560d634f2da0: i64 = shl 0x560d633831d0, Constant:i32<32>
0x560d634d31b0: i64 = srl 0x560d6354fef0, Constant:i32<32>
0x560d635c5ea0: i64 = or 0x560d635c5960, 0x560d6354cb40
0x560d635c5960: i64 = shl 0x560d635072c0, Constant:i32<32>
0x560d6354cb40: i64 = and 0x560d63443280, Constant:i64<4294967295>
0x560d6360f4b0: i32 = Constant<32>
0x560d63550820: i64 = Constant<0>
0x560d63380130: i64,glue = addc 0x560d6337fe20, 0x560d635c6e40
0x560d6337fe20: i64 = or 0x560d635c1840, 0x560d6337fb10
0x560d635c1840: i64 = srl 0x560d6360efe0, Constant:i32<32>
0x560d6360efe0: i64,i64 = umul_lohi 0x560d635c42e0, 0x560d635c5ea0
0x560d6360f4b0: i32 = Constant<32>
0x560d6337fb10: i64 = shl 0x560d6360efe0:1, Constant:i32<32>
0x560d6360efe0: i64,i64 = umul_lohi 0x560d635c42e0, 0x560d635c5ea0
0x560d6360f4b0: i32 = Constant<32>
0x560d635c6e40: i64 = add 0x560d635c6ac0, 0x560d63509200
0x560d635c6ac0: i64 = zero_extend 0x560d634c4f80
0x560d634c4f80: i32 = llvm.HSAIL.mulhi.u32 TargetConstant:i32<204>, 0x560d6339ffd0, 0x560d634f2400
0x560d63509200: i64 = zero_extend 0x560d634432f0
0x560d634432f0: i32 = llvm.HSAIL.mulhi.u32 TargetConstant:i32<204>, 0x560d63559430, 0x560d6337e490
0x560d6354d240: i32 = truncate 0x560d635c5d50
0x560d635c5d50: i64 = or 0x560d634d3450, Constant:i64<1>
0x560d634d3450: i64,i64 = umul_lohi 0x560d63384e70, 0x560d6355a230
0x560d63384e70: i64 = add 0x560d634c5220, 0x560d63644f50
0x560d634c5220: i64 = mul 0x560d633e1b70, Constant:i64<60060>
0x560d633e1b70: i64 = zero_extend 0x560d633e1d30
0x560d634c78d0: i64 = Constant<60060>
0x560d63644f50: i64,ch = CopyFromReg 0x560d633c4e40, Register:i64 %vreg68
0x560d635c6dd0: i64 = Register %vreg68
0x560d6355a230: i64 = zero_extend 0x560d6355cf60
0x560d6355cf60: i32,ch = CopyFromReg 0x560d633c4e40, Register:i32 %vreg5
0x560d635c3b00: i32 = Register %vreg5
0x560d634f2780: i64 = Constant<1>
0x560d635621b0: i64,glue = subc Constant:i64<0>, 0x560d63382de0
0x560d63550820: i64 = Constant<0>
0x560d63382de0: i64,i64 = umul_lohi 0x560d63380130, 0x560d635c5d50
0x560d63380130: i64,glue = addc 0x560d6337fe20, 0x560d635c6e40
0x560d6337fe20: i64 = or 0x560d635c1840, 0x560d6337fb10
0x560d635c1840: i64 = srl 0x560d6360efe0, Constant:i32<32>
0x560d6360efe0: i64,i64 = umul_lohi 0x560d635c42e0, 0x560d635c5ea0
0x560d635c42e0: i64 = or 0x560d634f2da0, 0x560d634d31b0
0x560d634f2da0: i64 = shl 0x560d633831d0, Constant:i32<32>
0x560d633831d0: i64,glue = adde 0x560d634d36f0, Constant:i64<0>, 0x560d6354fef0:1
0x560d6360f4b0: i32 = Constant<32>
0x560d634d31b0: i64 = srl 0x560d6354fef0, Constant:i32<32>
0x560d6354fef0: i64,glue = addc 0x560d6354fbe0, 0x560d63561a40
0x560d6360f4b0: i32 = Constant<32>
0x560d635c5ea0: i64 = or 0x560d635c5960, 0x560d6354cb40
0x560d635c5960: i64 = shl 0x560d635072c0, Constant:i32<32>
0x560d635072c0: i64 = srl 0x560d63443050, 0x560d63561c00
0x560d6360f4b0: i32 = Constant<32>
0x560d6354cb40: i64 = and 0x560d63443280, Constant:i64<4294967295>
0x560d63443280: i64 = srl 0x560d63391490, 0x560d63561c00
0x560d635c5a40: i64 = Constant<4294967295>
0x560d6360f4b0: i32 = Constant<32>
0x560d6337fb10: i64 = shl 0x560d6360efe0:1, Constant:i32<32>
0x560d6360efe0: i64,i64 = umul_lohi 0x560d635c42e0, 0x560d635c5ea0
0x560d635c42e0: i64 = or 0x560d634f2da0, 0x560d634d31b0
0x560d634f2da0: i64 = shl 0x560d633831d0, Constant:i32<32>
0x560d633831d0: i64,glue = adde 0x560d634d36f0, Constant:i64<0>, 0x560d6354fef0:1
0x560d6360f4b0: i32 = Constant<32>
0x560d634d31b0: i64 = srl 0x560d6354fef0, Constant:i32<32>
0x560d6354fef0: i64,glue = addc 0x560d6354fbe0, 0x560d63561a40
0x560d6360f4b0: i32 = Constant<32>
0x560d635c5ea0: i64 = or 0x560d635c5960, 0x560d6354cb40
0x560d635c5960: i64 = shl 0x560d635072c0, Constant:i32<32>
0x560d635072c0: i64 = srl 0x560d63443050, 0x560d63561c00
0x560d6360f4b0: i32 = Constant<32>
0x560d6354cb40: i64 = and 0x560d63443280, Constant:i64<4294967295>
0x560d63443280: i64 = srl 0x560d63391490, 0x560d63561c00
0x560d635c5a40: i64 = Constant<4294967295>
0x560d6360f4b0: i32 = Constant<32>
0x560d635c6e40: i64 = add 0x560d635c6ac0, 0x560d63509200
0x560d635c6ac0: i64 = zero_extend 0x560d634c4f80
0x560d634c4f80: i32 = llvm.HSAIL.mulhi.u32 TargetConstant:i32<204>, 0x560d6339ffd0, 0x560d634f2400
0x560d633e17f0: i32 = TargetConstant<204>
0x560d6339ffd0: i32 = truncate 0x560d635072c0
0x560d635072c0: i64 = srl 0x560d63443050, 0x560d63561c00
0x560d63443050: i64,ch = CopyFromReg 0x560d633c4e40, Register:i64 %vreg3
0x560d63561c00: i32 = select 0x560d63383710, Constant:i32<31>, 0x560d63383780
0x560d634f2400: i32 = truncate 0x560d6354fef0
0x560d6354fef0: i64,glue = addc 0x560d6354fbe0, 0x560d63561a40
0x560d6354fbe0: i64 = zero_extend 0x560d63382150
0x560d63561a40: i64 = or 0x560d63382e50, 0x560d6354d0f0
0x560d63509200: i64 = zero_extend 0x560d634432f0
0x560d634432f0: i32 = llvm.HSAIL.mulhi.u32 TargetConstant:i32<204>, 0x560d63559430, 0x560d6337e490
0x560d633e17f0: i32 = TargetConstant<204>
0x560d63559430: i32 = truncate 0x560d6355d120
0x560d6355d120: i64 = srl 0x560d634438a0, 0x560d63561c00
0x560d634438a0: i64,ch = CopyFromReg 0x560d633c4e40, Register:i64 %vreg6
0x560d63561c00: i32 = select 0x560d63383710, Constant:i32<31>, 0x560d63383780
0x560d6337e490: i32 = truncate 0x560d633831d0
0x560d633831d0: i64,glue = adde 0x560d634d36f0, Constant:i64<0>, 0x560d6354fef0:1
0x560d634d36f0: i64 = srl 0x560d63644a80, Constant:i32<32>
0x560d63550820: i64 = Constant<0>
0x560d6354fef0: i64,glue = addc 0x560d6354fbe0, 0x560d63561a40
0x560d635c5d50: i64 = or 0x560d634d3450, Constant:i64<1>
0x560d634d3450: i64,i64 = umul_lohi 0x560d63384e70, 0x560d6355a230
0x560d63384e70: i64 = add 0x560d634c5220, 0x560d63644f50
0x560d634c5220: i64 = mul 0x560d633e1b70, Constant:i64<60060>
0x560d633e1b70: i64 = zero_extend 0x560d633e1d30
0x560d633e1d30: i32,ch = load<LD4[%arrayidx(addrspace=1)](tbaa=<0x560d633fa8c8>)> 0x560d633c4e40, 0x560d63644460, undef:i64
0x560d63644460: i64 = add 0x560d636449a0, 0x560d6360e720
0x560d63550190: i64 = undef
0x560d634c78d0: i64 = Constant<60060>
0x560d63644f50: i64,ch = CopyFromReg 0x560d633c4e40, Register:i64 %vreg68
0x560d635c6dd0: i64 = Register %vreg68
0x560d6355a230: i64 = zero_extend 0x560d6355cf60
0x560d6355cf60: i32,ch = CopyFromReg 0x560d633c4e40, Register:i32 %vreg5
0x560d635c3b00: i32 = Register %vreg5
0x560d634f2780: i64 = Constant<1>
In function: __OpenCL_tf_kernel
from gpuowl.
This happens because amdgpu-pro does not support 128-bit integers in OpenCL.
This works on ROCm 1.8.2. Hopefully amdgpu-pro will catch up.
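For illustration, here is a minimal sketch in plain C of how a 64x64 -> 128-bit product can be built from 64-bit operations only, which is the usual way to avoid a native 128-bit integer type. This is not the code from the tf kernel; the names `u128_t`, `mul_64x64` and `mul_64x64_selfcheck` are hypothetical, and whether an OpenCL port of this approach would get past the amdgpu-pro compiler error above is untested.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit value stored as two 64-bit halves. */
typedef struct { uint64_t hi, lo; } u128_t;

/* Full 64x64 -> 128-bit product using only 64-bit arithmetic.
   Each operand is split into 32-bit halves (schoolbook multiplication). */
static u128_t mul_64x64(uint64_t a, uint64_t b) {
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* weight 2^0  */
    uint64_t p1 = a_lo * b_hi;   /* weight 2^32 */
    uint64_t p2 = a_hi * b_lo;   /* weight 2^32 */
    uint64_t p3 = a_hi * b_hi;   /* weight 2^64 */

    /* Middle column: the sum of three values below 2^32 cannot overflow. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    u128_t r;
    r.lo = (mid << 32) | (uint32_t)p0;
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}

/* Self-check against products whose 128-bit value is known by hand. */
static void mul_64x64_selfcheck(void) {
    /* (2^64 - 1)^2 = 2^128 - 2^65 + 1 */
    u128_t r = mul_64x64(UINT64_MAX, UINT64_MAX);
    assert(r.hi == 0xFFFFFFFFFFFFFFFEULL && r.lo == 1);

    /* 2^32 * 2^32 = 2^64 */
    r = mul_64x64(1ULL << 32, 1ULL << 32);
    assert(r.hi == 1 && r.lo == 0);
}
```

OpenCL C also provides a `mul_hi` built-in for integer types, which returns the high half of a product directly; the sketch above spells the steps out so that nothing wider than 64 bits is ever requested from the compiler.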
Well, just today I discovered that a new version of amdgpu-pro, 18.30, has been out since August 17, so maybe there is hope ...
I tried amdgpu-pro 18.30; it is still not working.
Which system are you on? Ubuntu?
OK, but I will probably get better stability for my GPUs. There is an annoying, recurring timeout error that hits both of my GPU systems.
And I'm starting to suspect that this problem is specific to the amdgpu driver. I definitely have the option to try ROCm, so why not...
Hi, basically the response from the ROCm developers is that ROCm requires Gen3 atomics and can use at most one GPU, so ROCm doesn't work on my hardware. Future versions of ROCm may work with many GPUs, so I am waiting for news. If you think it is worth offering a smaller version of TF without 128-bit integers, I can use it and help you test it again; otherwise I will continue as I do now, with both gpuowl and mfakto.