llvm-next-function-merging's Introduction

LLVM Next Function Merging

LLVM Next Function Merging is an experimental LLVM pass plugin that lets you apply state-of-the-art function merging techniques to your program.

The optimization passes are derived from the paper "F3M: Fast Focused Function Merging" (CGO'22) by Sean Sterling, Rodrigo C. O. Rocha, Hugh Leather, Kim Hazelwood, Michael O'Boyle, and Pavlos Petoumenos, and from its forked LLVM repository, with a few changes so that they build as a plugin (licensed under the Apache License v2.0 with LLVM Exceptions).

This repository is a fork of the original repository. It supports development outside the LLVM tree, so that work can focus on the essential improvements.

Quick start

  1. Clone the repository
$ git clone https://github.com/kateinoigakukun/llvm-next-function-merging.git
$ cd llvm-next-function-merging
  2. Configure the build directory

You need LLVM 13 to build this plugin. If LLVM cannot be found automatically, add -DLLVM_DIR:PATH=/path/to/lib/cmake/llvm to the CMake configuration.

$ cmake -B build -G Ninja
  3. Build the plugin
$ cmake --build ./build
  4. Use the plugin with opt

Note: Most passes other than func-merging still use the legacy pass manager. To run those, pass --enable-new-pm=false to opt and use --load instead of --load-pass-plugin. The func-merging example below uses the new pass manager.

opt --load-pass-plugin ./build/lib/Transforms/IPO/LLVMNextFM.so \
  -S ./test/Transforms/NextFM/basic.ll \
  --passes=func-merging

Optimization passes

This plugin provides the following passes:

| Pass name | Short description | Paper | Original source |
| --- | --- | --- | --- |
| func-merging | The current state of the art. Better pair-finding algorithm based on MinHash, LSH, and fastfm | F3M: Fast Focused Function Merging (CGO'22), HyFM: Function Merging for Free (LCTES'21) | ppetoumenos/llvm-project by Pavlos Petoumenos |
| fastfm | Better merging algorithm (SalSSA) | Effective Function Merging in the SSA Form (PLDI'20) | rcorcs/llvm-project by Rodrigo Rocha |
| fmsa | Merges any two functions by pairwise sequence alignment | Function Merging by Sequence Alignment (CGO'19) | rcorcs/fmsa by Rodrigo Rocha |
| mergesimilarfunc | Merges two functions with isomorphic CFGs | Exploiting Function Similarity for Code Size Reduction (LCTES 2014) | LLVM Patch D22051 by Tobias Edler von Koch |
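
The MinHash and LSH pair finding mentioned for func-merging can be illustrated with a small sketch. This is not the plugin's code (the plugin is written in C++ against LLVM). It is a minimal Python illustration that assumes each function is summarized as a list of opcode names. The function names (shingles, minhash, lsh_candidates, estimated_similarity) and the parameters (k, num_hashes, bands) are hypothetical and chosen only for this example.

```python
import hashlib
from itertools import combinations


def shingles(opcodes, k=2):
    """Return the set of k-long opcode windows of one function (assumed encoding)."""
    return {tuple(opcodes[i:i + k]) for i in range(len(opcodes) - k + 1)}


def _hash(seed, shingle):
    # Seeded 64-bit hash of one shingle; blake2b is just a convenient stable hash.
    data = f"{seed}:{'|'.join(shingle)}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash(shingle_set, num_hashes=64):
    """MinHash fingerprint: for each seed, keep the smallest hash over all shingles."""
    if not shingle_set:
        raise ValueError("function is too short to produce any shingle")
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(num_hashes)]


def estimated_similarity(fp_a, fp_b):
    """Fraction of matching slots, which estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(fp_a, fp_b)) / len(fp_a)


def lsh_candidates(fingerprints, bands=16):
    """Return name pairs whose fingerprints agree on at least one whole band."""
    if not fingerprints:
        return set()
    rows = len(next(iter(fingerprints.values()))) // bands
    buckets = {}
    for name, fp in fingerprints.items():
        for b in range(bands):
            key = (b, tuple(fp[b * rows:(b + 1) * rows]))
            buckets.setdefault(key, set()).add(name)
    pairs = set()
    for names in buckets.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs


if __name__ == "__main__":
    # Made-up opcode sequences, for illustration only.
    functions = {
        "f": ["load", "load", "select", "select", "sext", "icmp", "br"],
        "g": ["load", "load", "select", "select", "icmp", "br"],
        "h": ["alloca", "store", "call", "ret"],
    }
    fps = {name: minhash(shingles(ops)) for name, ops in functions.items()}
    for a, b in sorted(lsh_candidates(fps)):
        print(a, b, estimated_similarity(fps[a], fps[b]))
```

The point of the banding step is that only functions sharing a bucket are compared, so a merging pass does not have to score every pair of functions in the module. Candidate pairs would then be handed to the actual merging algorithm.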

Executing the Tests

Like other projects in the LLVM family, this project uses lit to run its tests.

$ cmake -B build -G Ninja -D BUILD_TESTING:BOOL=ON
$ cmake --build ./build
$ lit ./build/test

Project Structure

The directory structure of this project is designed to make upstreaming to the LLVM project easy. Most of the tree mirrors the structure of the original LLVM repository.

llvm-next-function-merging's People

Contributors

kateinoigakukun

llvm-next-function-merging's Issues

Merged block is much larger (~2x) than the original one

Example in 433.milc

Merged

.LBB3_23:                               # %m.bb.for.end189
        movq    -168(%rbp), %rax                # 8-byte Reload
        movq    even_sites_on_node@GOTPCREL(%rip), %rcx
        movl    (%rcx), %esi
        movzbl  -56(%rbp), %ecx                 # 1-byte Folded Reload
        movzbl  %dl, %edx
        testb   $1, %r15b
        cmovnel %ecx, %edx
        xorl    %edi, %edi
        testb   $1, %dl
        cmovnel %esi, %edi
        testb   $1, %r15b
        movzbl  -140(%rbp), %ecx                # 1-byte Folded Reload
        movzbl  -96(%rbp), %edx                 # 1-byte Folded Reload
        cmovnel %ecx, %edx
        movq    %rdi, -128(%rbp)                # 8-byte Spill
        movslq  %edi, %rcx
        cmoveq  %rax, %rcx
        movq    %rcx, -168(%rbp)                # 8-byte Spill
        movl    %esi, %eax
        testb   $1, %dl
        jne     .LBB3_25
# %bb.24:                               # %m.bb.for.end189
        movq    sites_on_node@GOTPCREL(%rip), %rax
        movl    (%rax), %eax
.LBB3_25:                               # %m.bb.for.end189
        cmpl    %eax, -128(%rbp)                # 4-byte Folded Reload
        movl    %eax, -144(%rbp)                # 4-byte Spill
        jge     .LBB3_26

Original

# %bb.34:                               # %for.end189
        movq    even_sites_on_node@GOTPCREL(%rip), %rax
        movl    (%rax), %eax
        movl    %eax, %r14d
        cmpl    $2, 20(%rsp)                    # 4-byte Folded Reload
        je      .LBB2_36
# %bb.35:                               # %for.end189
        movq    sites_on_node@GOTPCREL(%rip), %rcx
        movl    (%rcx), %r14d
.LBB2_36:                               # %for.end189
        xorl    %ecx, %ecx
        cmpl    $1, 20(%rsp)                    # 4-byte Folded Reload
        cmovel  %eax, %ecx
        cmpl    %r14d, %ecx
        jge     .LBB2_39

for.end122:                                       ; preds = %for.body114
  %cmp124 = icmp eq i32 %parity, 2
  %40 = load i32, i32* @even_sites_on_node, align 4
  %41 = load i32, i32* @sites_on_node, align 4
  %cond129 = select i1 %cmp124, i32 %40, i32 %41
  %cmp130 = icmp eq i32 %parity, 1
  %cond135 = select i1 %cmp130, i32 %40, i32 0
  %idxprom136 = sext i32 %cond135 to i64
  %cmp139409 = icmp slt i32 %cond135, %cond129
  br i1 %cmp139409, label %for.body141.preheader, label %for.body173.preheader

for.end189:                                       ; preds = %for.body183
  %71 = load i32, i32* @even_sites_on_node, align 4
  %72 = load i32, i32* @sites_on_node, align 4
  %cond196 = select i1 %cmp124, i32 %71, i32 %72
  %cond202 = select i1 %cmp130, i32 %71, i32 0
  %cmp206405 = icmp slt i32 %cond202, %cond196
  br i1 %cmp206405, label %for.body208.preheader, label %for.body242.preheader

sw.epilog:                                        ; preds = %if.end12, %sw.bb14, %sw.bb13
  %cmp50 = phi i1 [ false, %sw.bb14 ], [ true, %sw.bb13 ], [ false, %if.end12 ]
  %cmp52 = phi i1 [ false, %sw.bb14 ], [ false, %sw.bb13 ], [ true, %if.end12 ]
  %cmp19 = icmp eq i32 %start, 1
  %6 = bitcast %struct.su3_vector* %src to i8*
  br label %for.body18

for.end49:                                        ; preds = %for.inc47
  %18 = load i32, i32* @even_sites_on_node, align 4
  %19 = load i32, i32* @sites_on_node, align 4
  %cond = select i1 %cmp50, i32 %18, i32 %19
  %cond57 = select i1 %cmp52, i32 %18, i32 0
  %idxprom58 = sext i32 %cond57 to i64
  %cmp61438 = icmp slt i32 %cond57, %cond
  br i1 %cmp61438, label %for.body63.preheader, label %for.body91.preheader
