Comments (1)
Because `train` is a recursive Haskell function, this amounts to loop unrolling in the Accelerate program; you are just building up a larger and larger embedded program, which the rest of the compiler then has to chew on (for no good reason). What you should do is either:
- rewrite this into a loop in the embedded program (using `awhile`); or
- split the loop body into a function you feed to `runN` (to compile once) and then repeatedly apply it (via Haskell recursion)
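As a rough sketch of both options: the code below assumes accelerate 1.3 or later (for the `T2` pattern synonym) and uses the reference interpreter only so that it is self-contained; you would swap in the `runN` and `run` from your LLVM backend. The `step` function is a hypothetical stand-in for one training iteration, since the original `train` is not shown here, and `trainLoop` and `trainHost` are illustrative names.

```haskell
import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Vector, Scalar, Z(..), (:.)(..))
-- Assumption: the interpreter backend, for portability of the sketch.
import qualified Data.Array.Accelerate.Interpreter as I

-- Hypothetical stand-in for one training step (not the original code).
step :: Acc (Vector Float) -> Acc (Vector Float)
step = A.map (* 0.5)

-- Loop state: the data being updated plus an iteration counter.
type State = (Vector Float, Scalar Int)

-- Option 1: express the loop inside the embedded program with awhile,
-- so the program contains a single copy of `step`.
trainLoop :: Int -> Acc (Vector Float) -> Acc (Vector Float)
trainLoop n ws0 = A.afst (A.awhile cond body (A.T2 ws0 (A.unit 0)))
  where
    cond :: Acc State -> Acc (Scalar Bool)
    cond st = A.unit (A.the (A.asnd st) A.< A.constant n)

    body :: Acc State -> Acc State
    body st = A.T2 (step (A.afst st)) (A.unit (A.the (A.asnd st) + 1))

-- Option 2: compile the loop body once with runN ...
stepCompiled :: Vector Float -> Vector Float
stepCompiled = I.runN step

-- ... and drive the iteration from ordinary Haskell recursion.
trainHost :: Int -> Vector Float -> Vector Float
trainHost n ws
  | n <= 0    = ws
  | otherwise = trainHost (n - 1) (stepCompiled ws)

main :: IO ()
main = do
  let ws0 = A.fromList (Z :. 4) [1, 2, 3, 4] :: Vector Float
  print (A.toList (I.run (trainLoop 3 (A.use ws0))))
  print (A.toList (trainHost 3 ws0))
```

In both variants the embedded program handed to the compiler stays the same size no matter how many iterations you ask for, which is the point of the two suggestions.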
Either of those should work around your immediate problem. Of course I'd still like to improve the performance of the compiler internals, but that will take much more time than for you to just give it a simpler program to begin with.
For reference, here's the `-ddump-simpl-stats` output for one step of the 100-epoch program (you are asking it to do a lot!):
Total ticks:     627376

  8744    Inline
    8744    Var
  25510   RuleFired
    5576    zipWithD
    3984    backpermuteD
    2800    aletD/float
    2792    generateD
    2788    replicateD
    2384    aletD/bind
    1993    mapD
    1992    x*1
    800     aletD/eliminate
    397     commutes (*)
    4       reshapeD
  34544   BetaReduce
    34544   inline exp
  485729  Substitution
    199868  rebuild
    175976  weakenE
    60512   shrinkE
    32269   weaken
    8744    inline
    5172    strengthenE
    2392    replaceE/shape
    796     replaceE/!
  72849   SimplifierDone
    72849
from accelerate.