clang-tutor

Example Clang plugins for C and C++ - based on Clang 18

clang-tutor is a collection of self-contained reference Clang plugins. It's a tutorial that targets novice and aspiring Clang developers. Key features:

  • Modern - based on the latest version of Clang (and updated with every release)
  • Complete - includes build scripts, LIT tests and CI set-up
  • Out of tree - builds against a binary Clang installation (no need to build Clang from sources)

Corrections and feedback always welcome!

Overview

Clang (together with LibTooling) provides a very powerful API and infrastructure for analysing and modifying source files from the C language family. With Clang's plugin framework one can relatively easily create bespoke tools that aid development and improve productivity. The aim of clang-tutor is to showcase this framework through small, self-contained and testable examples, implemented using idiomatic LLVM.

This document explains how to set up your environment, build and run the project, and go about debugging. Apart from the code itself, the source files contain comments that will guide you through the implementation. The tests highlight which edge cases are supported, so you may want to skim through them as well.

HelloWorld

The HelloWorld plugin from HelloWorld.cpp is a self-contained reference example. The corresponding CMakeLists.txt implements the minimum set-up for an out-of-tree plugin.

HelloWorld extracts some interesting information from the input translation unit. It visits all C++ record declarations (more specifically class, struct and union declarations) and counts them. Recall that a translation unit consists of the input source file and all the header files that it includes (directly or indirectly).

HelloWorld prints the results on a file by file basis, i.e. separately for every header file that has been included. It visits all declarations - including the ones in header files included by other header files. This may lead to some surprising results!
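
To make the file-by-file bookkeeping concrete, here is a minimal sketch that models it with the C++ standard library only. It does not use the Clang API: `SeenDecl` and `countRecordsPerFile` are hypothetical names introduced for illustration, standing in for the declarations that a plugin visits and for the per-file counters it would keep.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a declaration encountered while walking the AST.
struct SeenDecl {
  std::string File; // file in which the declaration appears
  bool IsRecord;    // true for class, struct and union declarations
};

// Counts record declarations separately for every file. Declarations that
// come from headers are attributed to the header, not to the file that
// includes it.
std::map<std::string, unsigned>
countRecordsPerFile(const std::vector<SeenDecl> &Decls) {
  std::map<std::string, unsigned> Counts;
  for (const SeenDecl &D : Decls)
    if (D.IsRecord)
      ++Counts[D.File];
  return Counts;
}
```

Files that contain no record declarations simply do not appear in the result of this sketch.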

You can build and run HelloWorld like this:

# Build the plugin
export Clang_DIR=<installation/dir/of/clang/18>
export CLANG_TUTOR_DIR=<source/dir/clang/tutor>
mkdir build
cd build
cmake -DCT_Clang_INSTALL_DIR=$Clang_DIR $CLANG_TUTOR_DIR/HelloWorld/
make
# Run the plugin
$Clang_DIR/bin/clang -cc1 -load ./libHelloWorld.{so|dylib} -plugin hello-world $CLANG_TUTOR_DIR/test/HelloWorld-basic.cpp

You should see the following output:

# Expected output
(clang-tutor) file: <source/dir/clang/tutor>/test/HelloWorld-basic.cpp
(clang-tutor)  count: 3

How To Analyze STL Headers

In order to see what happens with multiple indirectly included header files, you can run HelloWorld on one of the header files from the Standard Template Library. For example, you can use the following C++ file that simply includes the <vector> header:

// file.cpp
#include <vector>

When running a Clang plugin on a C++ file that includes headers from the STL, it is easier to run it with clang++ (rather than clang -cc1) like this:

$Clang_DIR/bin/clang++ -c -Xclang -load -Xclang libHelloWorld.dylib -Xclang -plugin -Xclang hello-world file.cpp

This way you can be confident that all the necessary include paths (required to locate STL headers) are automatically added. For the above input file, HelloWorld will print:

  • an overview of all header files included when using #include <vector>, and
  • the number of C++ records declared in each.

Note that there are no explicit declarations in file.cpp and only one header file is included. However, the output on my system consists of 37 header files (one of which contains 371 declarations). Note that the actual output depends on your host OS, the C++ standard library implementation and its version. Your results are likely to be different.

Development Environment

Platform Support And Requirements

clang-tutor has been tested on Ubuntu 20.04 and Mac OS X 10.14.6. In order to build clang-tutor you will need:

  • LLVM 18 and Clang 18
  • C++ compiler that supports C++17
  • CMake 3.13.4 or higher

As Clang is a subproject within llvm-project, it depends on LLVM (i.e. clang-tutor requires development packages for both Clang and LLVM).

There are additional requirements for tests (these will be satisfied by installing LLVM 18):

  • lit (aka llvm-lit, LLVM tool for executing the tests)
  • FileCheck (LIT requirement, it's used to check whether tests generate the expected output)

Installing Clang 18 On Mac OS X

On Darwin you can install Clang 18 and LLVM 18 with Homebrew:

brew install llvm

If you already have an older version of Clang and LLVM installed, you can upgrade it to Clang 18 and LLVM 18 like this:

brew upgrade llvm

Once the installation (or upgrade) is complete, all the required header files, libraries and tools will be located in /usr/local/opt/llvm/ (on Apple Silicon, Homebrew uses /opt/homebrew/opt/llvm/ instead).

Installing Clang 18 On Ubuntu

On Ubuntu Jammy Jellyfish, you can install modern LLVM from the official repository:

wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
sudo apt-add-repository "deb http://apt.llvm.org/jammy/ llvm-toolchain-jammy-18 main"
sudo apt-get update
sudo apt-get install -y llvm-18 llvm-18-dev libllvm18 llvm-18-tools clang-18 libclang-common-18-dev libclang-18-dev libmlir-18 libmlir-18-dev

This will install all the required header files, libraries and tools in /usr/lib/llvm-18/.

Building Clang 18 From Sources

Building from sources can be slow and tricky to debug. It is not necessary, but might be your preferred way of obtaining LLVM/Clang 18. The following steps will work on Linux and Mac OS X:

git clone https://github.com/llvm/llvm-project.git
cd llvm-project
git checkout release/18.x
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=host -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi" <llvm-project/root/dir>/llvm/
cmake --build .

For more details read the official documentation.

Note for macOS users

As per this great description by Arthur O’Dwyer, add -DDEFAULT_SYSROOT="$(xcrun --show-sdk-path)" to your CMake invocation when building Clang from sources. Otherwise, clang won't be able to find standard C headers (e.g. wchar.h).

Building & Testing

You can build clang-tutor (and all the provided plugins) as follows:

cd <build/dir>
cmake -DCT_Clang_INSTALL_DIR=<installation/dir/of/clang/18> <source/dir/clang-tutor>
make

The CT_Clang_INSTALL_DIR variable should be set to the root of either the installation or build directory of Clang 18. It is used to locate the corresponding LLVMConfig.cmake script that is used to set the include and library paths.

In order to run the tests, you need llvm-lit (aka lit). If it is not bundled with your LLVM 18 packages, you can install it with pip:

# Install lit - note that this installs lit globally
pip install lit

Running the tests is as simple as:

$ lit <build_dir>/test

Voilà! You should see all tests passing.

Overview of The Plugins

This table contains a summary of the examples available in clang-tutor. The Framework column refers to a plugin framework available in Clang that was used to implement the corresponding example. This is either RecursiveASTVisitor, ASTMatcher or both.

Name | Description | Framework
HelloWorld | counts the number of class, struct and union declarations in the input translation unit | RecursiveASTVisitor
LACommenter | adds comments to literal arguments in function calls | ASTMatcher
CodeStyleChecker | issues a warning if the input file does not follow one of LLVM's coding style guidelines | RecursiveASTVisitor
Obfuscator | obfuscates integer addition and subtraction | ASTMatcher
UnusedForLoopVar | issues a warning if a for-loop variable is not used | RecursiveASTVisitor + ASTMatcher
CodeRefactor | renames class/struct method names | ASTMatcher

Once you've built this project, you can experiment with every plugin separately. All of them accept C and C++ files as input. Below you will find more detailed descriptions (except for HelloWorld, which is documented above).

LACommenter

The LACommenter (Literal Argument Commenter) plugin will comment literal arguments in function calls. For example, in the following input code:

extern void foo(int some_arg);

void bar() {
  foo(123);
}

LACommenter will decorate the invocation of foo as follows:

extern void foo(int some_arg);

void bar() {
  foo(/*some_arg=*/123);
}

This commenting style follows LLVM's official guidelines. LACommenter will comment character, integer, floating point, boolean and string literal arguments.
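
As a rough illustration of the rewriting rule, the sketch below builds the decorated call as a plain string. It is not how the plugin is implemented (the plugin works on the AST). `CallArg` and `decorateCall` are hypothetical names, and whether an argument is a literal is simply supplied by the caller here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of a single call argument.
struct CallArg {
  std::string ParamName; // name of the matching parameter in the callee
  std::string Text;      // the argument as written in the source
  bool IsLiteral;        // true for literal arguments (e.g. 123, 'c', "str")
};

// Rebuilds a call expression, prefixing every literal argument with a
// /*param_name=*/ comment. Non-literal arguments are emitted unchanged.
std::string decorateCall(const std::string &Callee,
                         const std::vector<CallArg> &Args) {
  std::string Out = Callee + "(";
  for (std::size_t I = 0; I < Args.size(); ++I) {
    if (I != 0)
      Out += ", ";
    if (Args[I].IsLiteral && !Args[I].ParamName.empty())
      Out += "/*" + Args[I].ParamName + "=*/";
    Out += Args[I].Text;
  }
  return Out + ")";
}
```

For the example above, a single literal argument `123` bound to `some_arg` yields `foo(/*some_arg=*/123)`.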

This plugin is based on a similar example by Peter Smith presented here.

Run the plugin

You can test LACommenter on the example presented above. Assuming that it was saved in input_file.cpp, you can add comments to it as follows:

$Clang_DIR/bin/clang -cc1 -load <build_dir>/lib/libLACommenter.dylib -plugin LAC input_file.cpp

Run the plugin through ct-la-commenter

ct-la-commenter is a standalone tool that will run the LACommenter plugin, but without the need of using clang and loading the plugin:

<build_dir>/bin/ct-la-commenter input_file.cpp --

If you don't append -- at the end of the tool's invocation, Clang's tooling will complain about a missing compilation database as follows:

Error while trying to load a compilation database:
Could not auto-detect compilation database for file "input_file.cpp"
No compilation database found in <source/dir/clang-tutor> or any parent directory
fixed-compilation-database: Error while opening fixed database: No such file or directory
json-compilation-database: Error while opening JSON database: No such file or directory
Running without flags.

Another way to solve the issue is to set the CMAKE_EXPORT_COMPILE_COMMANDS flag during the CMake invocation. This will generate a compilation database, named compile_commands.json, in your build directory. A more detailed explanation can be found on Eli Bendersky's blog.

CodeStyleChecker

This plugin demonstrates how to use Clang's DiagnosticEngine to generate custom compiler warnings. Essentially, CodeStyleChecker checks whether names of classes, functions and variables in the input translation unit adhere to LLVM's style guide. If not, a warning is printed. For every warning, CodeStyleChecker generates a suggestion that would fix the corresponding issue. This is done with the FixItHint API. SourceLocation API is used to generate valid source location.

CodeStyleChecker is robust enough to cope with complex examples like the <vector> header from the STL, yet the actual implementation is fairly compact. For example, it can correctly analyze names expanded from macros and knows that it should ignore user-defined conversion operators.

Run the plugin

Let's test CodeStyleChecker on the following file:

// file.cpp
class clangTutor_BadName;

The name of the class doesn't follow LLVM's coding guide and CodeStyleChecker indeed captures that:

$Clang_DIR/bin/clang -cc1 -fcolor-diagnostics -load libCodeStyleChecker.dylib -plugin CSC file.cpp
file.cpp:2:7: warning: Type and variable names should start with upper-case letter
class clangTutor_BadName;
      ^~~~~~~~~~~~~~~~~~~
      ClangTutor_BadName
file.cpp:2:17: warning: `_` in names is not allowed
class clangTutor_BadName;
      ~~~~~~~~~~^~~~~~~~~
      clangTutorBadName
2 warnings generated.

There are two warnings generated as two rules have been violated. Alongside every warning, there is a suggestion (i.e. a FixItHint) that would make the corresponding warning go away. Note that CodeStyleChecker also supplements the warnings with correct source code information.
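
The two suggestions in the output above can be reproduced with plain string manipulation. The sketch below is only a model of the two rules, assuming each fix is computed independently from the original name. `upperCaseFirstLetter` and `removeUnderscores` are hypothetical helpers, not functions from the plugin, which attaches its suggestions through the FixItHint API.

```cpp
#include <cctype>
#include <string>

// Models the fix for "Type and variable names should start with
// upper-case letter": only the first character is changed.
std::string upperCaseFirstLetter(std::string Name) {
  if (!Name.empty())
    Name[0] = static_cast<char>(
        std::toupper(static_cast<unsigned char>(Name[0])));
  return Name;
}

// Models the fix for "`_` in names is not allowed": every underscore is
// dropped and all other characters are kept as they are.
std::string removeUnderscores(const std::string &Name) {
  std::string Out;
  for (char C : Name)
    if (C != '_')
      Out += C;
  return Out;
}
```

Applied to `clangTutor_BadName`, the first helper returns `ClangTutor_BadName` and the second returns `clangTutorBadName`, matching the two suggestions printed by the plugin.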

-fcolor-diagnostics above instructs Clang to generate color output (unfortunately Markdown doesn't render the colors here).

Run the plugin through ct-code-style-checker

ct-code-style-checker is a standalone tool that will run the CodeStyleChecker plugin, but without the need of using clang and loading the plugin:

<build_dir>/bin/ct-code-style-checker input_file.cpp --

Obfuscator

The Obfuscator plugin will rewrite integer addition and subtraction according to the following formulae:

a + b == (a ^ b) + 2 * (a & b)
a - b == (a + ~b) + 1

The above transformations are often used in code obfuscation. You may also know them from Hacker's Delight.
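
If you want to convince yourself that both identities hold, here is a small sketch that evaluates the right-hand sides directly. It uses unsigned 32-bit integers so that wrap-around is well defined. That choice, and the names `obfuscatedAdd` and `obfuscatedSub`, are assumptions made for this illustration rather than something taken from the plugin.

```cpp
#include <cstdint>

// a + b rewritten as (a ^ b) + 2 * (a & b): the XOR adds the bits without
// carrying, and the AND (doubled) adds the carries back in.
std::uint32_t obfuscatedAdd(std::uint32_t A, std::uint32_t B) {
  return (A ^ B) + 2u * (A & B);
}

// a - b rewritten as (a + ~b) + 1: in two's complement ~b + 1 equals -b.
std::uint32_t obfuscatedSub(std::uint32_t A, std::uint32_t B) {
  return (A + ~B) + 1u;
}
```

For any pair of inputs, `obfuscatedAdd(A, B)` equals `A + B` and `obfuscatedSub(A, B)` equals `A - B`, both modulo 2^32.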

The plugin runs twice over the input file. First it scans for integer additions. If any are found, the input file is updated and printed to stdout. If there are no integer additions, there is no output. Similar logic is implemented for integer subtraction.

Similar code transformations are possible at the LLVM IR level. In particular, see MBASub and MBAAdd in llvm-tutor.

Run the plugin

Let's use the following file as our input:

int foo(int a, int b) {
  return a + b;
}

You can run the plugin like this:

$Clang_DIR/bin/clang -cc1 -load <build_dir>/lib/libObfuscator.dylib -plugin Obfuscator input.cpp

You should see the following output on your screen.

int foo(int a, int b) {
  return (a ^ b) + 2 * (a & b);
}

UnusedForLoopVar

This plugin detects unused for-loop variables (more specifically, the variables defined inside the traditional and range-based for loop statements) and issues a warning when one is found. For example, in function foo the loop variable j is not used:

int foo(int var_a) {
  for (int j = 0; j < 10; j++)
    var_a++;

  return var_a;
}

UnusedForLoopVar will warn you about it. Clearly the for loop in this case can be replaced with var_a += 10;, so UnusedForLoopVar does a great job of drawing the developer's attention to it. It can also detect unused loop variables in range-based for loops, for example:

#include <vector>

int bar(std::vector<int> var_a) {
  int var_b = 10;
  for (auto some_integer: var_a)
    var_b++;

  return var_b;
}

In this case, some_integer is not used and UnusedForLoopVar will highlight it. The loop could be replaced with a much simpler expression: var_b += var_a.size();.

Obviously, unused loop variables may indicate an issue, a potential optimisation (e.g. unroll the loop) or a simplification (e.g. replace the loop with one arithmetic operation). However, that does not have to be the case and sometimes we have good reasons not to use the loop variable. If the name of a loop variable matches the regular expression [U|u][N|n][U|u][S|s][E|e][D|d], then it will be ignored by UnusedForLoopVar. For example, the following modified version of the first example above will not be reported:

int foo(int var_a) {
  for (int unused = 0; unused < 10; unused++)
    var_a++;

  return var_a;
}
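
The naming exemption can be tried out in isolation with std::regex. The sketch below reuses the pattern quoted above verbatim. `isExemptLoopVarName` is a hypothetical helper, and the use of search semantics (the pattern may match anywhere in the name) is an assumption made for this example.

```cpp
#include <regex>
#include <string>

// Returns true if the variable name contains "unused" in any mix of upper
// and lower case. Inside a character class the `|` is an ordinary
// character, so [U|u] accepts 'U', 'u' and also '|'.
bool isExemptLoopVarName(const std::string &Name) {
  static const std::regex Pattern("[U|u][N|n][U|u][S|s][E|e][D|d]");
  return std::regex_search(Name, Pattern);
}
```

With this helper, `unused` and `UNUSED` are exempt, whereas `j` and `some_integer` are not.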

UnusedForLoopVar mixes both the ASTMatcher and RecursiveASTVisitor frameworks. It is an example of how to leverage both of them to solve a slightly more complex problem. The generated warnings are labelled so that you can see which framework was used to capture a particular case of an unused for-loop variable. For example, for the first example above you will get the following warning:

warning: (Recursive AST Visitor) regular for-loop variable not used

The second example leads to the following warning:

warning: (AST Matcher) range for-loop variable not used

Reading the source code should help you understand why different frameworks are needed in different cases. I have also added a few test files that you can use as reference examples (e.g. UnusedForLoopVar_regular_loop.cpp).

Run the plugin

$Clang_DIR/bin/clang -cc1 -fcolor-diagnostics -load <build_dir>/lib/libUnusedForLoopVar.dylib -plugin UFLV input.cpp

CodeRefactor

This plugin will rename a specified member method in a class (or a struct) and in all classes derived from it. It will also update all call sites in which the method is used so that the code remains semantically correct.

The following example contains all cases supported by CodeRefactor.

// file.cpp
struct Base {
    virtual void foo() {};
};

struct Derived: public Base {
    void foo() override {};
};

void StaticDispatch() {
  Base B;
  Derived D;

  B.foo();
  D.foo();
}

void DynamicDispatch() {
  Base *B = new Base();
  Derived *D = new Derived();

  B->foo();
  D->foo();
}

We will use CodeRefactor to rename Base::foo as Base::bar. Note that this consists of two steps:

  • update the declaration and the definition of foo in the base class (i.e. Base) as well as in all the derived classes (i.e. Derived)
  • update all call sites that use static dispatch (e.g. B.foo()) and dynamic dispatch (e.g. B->foo()).

CodeRefactor will do all this refactoring for you! See below how to run it.
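
The decision behind both steps boils down to one question: does a given method belong to the specified class or to a class derived from it? Here is a minimal model of that check using only the standard library. `Hierarchy`, `RenameRequest`, `isClassOrDerived` and `renamedMethod` are hypothetical names, and the sketch assumes single inheritance and an acyclic hierarchy. The plugin itself answers this question with AST matchers.

```cpp
#include <map>
#include <string>

// Hypothetical class hierarchy: maps a class to its direct base class
// (an empty string means "no base class").
using Hierarchy = std::map<std::string, std::string>;

// Hypothetical counterpart of -class-name, -old-name and -new-name.
struct RenameRequest {
  std::string ClassName;
  std::string OldName;
  std::string NewName;
};

// Walks up the chain of base classes starting from Class and returns true
// if Target is found on the way (including Class itself).
bool isClassOrDerived(const Hierarchy &H, std::string Class,
                      const std::string &Target) {
  while (!Class.empty()) {
    if (Class == Target)
      return true;
    auto It = H.find(Class);
    Class = (It == H.end()) ? std::string() : It->second;
  }
  return false;
}

// Returns the name that Method of Class should have after the refactoring.
std::string renamedMethod(const Hierarchy &H, const RenameRequest &R,
                          const std::string &Class,
                          const std::string &Method) {
  if (Method == R.OldName && isClassOrDerived(H, Class, R.ClassName))
    return R.NewName;
  return Method;
}
```

With Base and Derived from the example, a request to rename Base::foo to bar affects foo in both classes and leaves methods of unrelated classes untouched.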

The implementation of CodeRefactor is rather straightforward, but it can only operate on one file at a time. clang-rename is much more powerful in this respect.

Run the plugin

CodeRefactor requires 3 command line arguments: -class-name, -old-name, -new-name. Hopefully these are self-explanatory. Passing the arguments to the plugin is a bit cumbersome and probably best demonstrated with an example:

$Clang_DIR/bin/clang -cc1 -load <build_dir>/lib/libCodeRefactor.dylib -plugin CodeRefactor -plugin-arg-CodeRefactor -class-name -plugin-arg-CodeRefactor Base  -plugin-arg-CodeRefactor -old-name -plugin-arg-CodeRefactor foo  -plugin-arg-CodeRefactor -new-name -plugin-arg-CodeRefactor bar file.cpp
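
The pattern in the command above is mechanical: every token meant for the plugin (option names and values alike) is preceded by its own -plugin-arg-<plugin name> flag. The sketch below spells that expansion out. `toPluginArgs` is a hypothetical helper written for this illustration and is not part of clang-tutor.

```cpp
#include <string>
#include <vector>

// Expands the arguments intended for a plugin into the form expected by
// `clang -cc1`: each token is preceded by "-plugin-arg-<Plugin>".
std::vector<std::string> toPluginArgs(const std::string &Plugin,
                                      const std::vector<std::string> &Args) {
  std::vector<std::string> Out;
  for (const std::string &Arg : Args) {
    Out.push_back("-plugin-arg-" + Plugin);
    Out.push_back(Arg);
  }
  return Out;
}
```

Passing `{"-class-name", "Base", "-old-name", "foo", "-new-name", "bar"}` for the CodeRefactor plugin produces the twelve tokens that appear between `-plugin CodeRefactor` and `file.cpp` in the invocation above.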

It is much easier when you run the plugin through a stand-alone tool like ct-code-refactor!

Run the plugin through ct-code-refactor

ct-code-refactor is a standalone tool that is basically a wrapper for CodeRefactor. You can use it to refactor your input file as follows:

<build_dir>/bin/ct-code-refactor --class-name=Base --new-name=bar --old-name=foo file.cpp  --

ct-code-refactor uses LLVM's CommandLine 2.0 library for parsing command line arguments. It is very well documented, relatively easy to integrate and the end result is a very intuitive interface.

License

This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

In jurisdictions that recognize copyright laws, the author or authors of this software dedicate any and all copyright interest in the software to the public domain. We make this dedication for the benefit of the public at large and to the detriment of our heirs and successors. We intend this dedication to be an overt act of relinquishment in perpetuity of all present and future rights to this software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

For more information, please refer to http://unlicense.org/

clang-tutor's People

Contributors

banach-space, hftrader, ivafanas, lanza, samuelmarks, xgupta

clang-tutor's Issues

[IDEA] Convert types to correct ones: `for(int i=0; i<vec.size(); i++)` => `for(size_t i=0; i<vec.size(); i++)`

Often when building third-party libraries I get a bunch of warnings "comparison between signed and unsigned types is UB".

Not every such occurrence has a trivial solution. But—in my experience—most do. Usually switching just one var from int to size_t also requires tracing all use of that var and changing all those types to size_t also.

From:

unsigned long f() {return 0;}
const size_t v = f();

int main() {
    std::vector<float> vec;
    for(int i=0; i<vec.size(); i++) {}
}

To:

unsigned long f() {return 0;}
const unsigned long v = f();

int main() {
    std::vector<float> vec;
    for(size_t i=0; i<vec.size(); i++) {}
}

PS: I'm aware that size_type isn't necessarily size_t… but using it here anyway. Just to reiterate: C++ is an afterthought, my main target is C.

Having this tool may aid in compiler optimisation, will reduce UB, and draw down the number of warnings.
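
For context, here is a small sketch of the pitfall that such warnings point at. When an int is compared with a size_t, the int is first converted to the unsigned type, so a negative value turns into a very large one. `mixedLess` and `safeLess` are hypothetical helpers, and the explicit cast in `mixedLess` only spells out the conversion that the compiler would otherwise perform implicitly.

```cpp
#include <cstddef>

// What `I < N` effectively computes when I is an int and N is a size_t:
// I is converted to size_t before the comparison takes place.
bool mixedLess(int I, std::size_t N) {
  return static_cast<std::size_t>(I) < N;
}

// A comparison that treats every negative I as smaller than any N.
bool safeLess(int I, std::size_t N) {
  return I < 0 || static_cast<std::size_t>(I) < N;
}
```

For example, `mixedLess(-1, 10)` is false while `safeLess(-1, 10)` is true. Declaring the loop variable as size_t in the first place avoids the mixed comparison altogether.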

Happy to build it myself and release it under CC0 to be compatible with your philosophy. But I'm new to all this, so would benefit from your insights/aid.

How about it? 😃

Thanks for your consideration

Windows build fails to work with clang++

I have the weirdest issue. I have followed your clang-tutor and your llvm-tutor pages - great effort and very informative! Since in llvm-tutor there is a windows version for the project cmake file, I thought I could build your hello plugin on windows too. I managed to get something to build, but it only works with the -cc1 command, not with the -Xclang version. Furthermore, it only seems to work with clang.exe, couldn't get it to work with clang++.exe. I am probably doing something very wrong.

There are 2 things I have tried, when it comes to making the cmake file work: at first, I tried to make your cmake file work. It doesn't quite work out of the box. When cmake generates a visual studio solution, it links (correctly) the needed libs to the project, but also tries to find a "clang.lib", to no avail. When building LLVM/Clang (12), this library seems to stay in the build directory and not get copied over when cmake executes the install target. Fair enough, I just manually linked the odd one out clang.lib from the original build directory. Now the plugin compiles & works, but only with clang.exe and -cc1.

Second thing I tried, is that, on windows, this clang.lib is all that is needed for the linker, to build the plugin. All the libs in the llvm install dir/lib are unnecessary (so it would seem?). So I could build the whole hello plugin with just clang.lib and the include dirs at the install/include. However, the plugin still only works when called from clang.exe with -cc1.

Have you tried making the hello plugin work on windows? Do you have any insight into why a plugin would run correctly with -cc1 and not with -Xclang? Or with clang.exe but not with clang++.exe?

[IDEA] const-qualify tool

I want to work on a const qualify tool… concentrating on C with C++ as an afterthought (so don't want to get caught up on the constexpr, const ref, and const method conditions).

Whenever I go to a new codebase I'm [usually] annoyed at the lack of care taken to differentiate changing (var) names and unchanging (val/binding-only) names. Probably all those functional languages rubbing off on me!

I'd expect this to aid in compiler optimisations also…

So what I'm thinking is to use libclang and/or libtooling to modify code automatically from:

int f(int n) {
    int a = 5, b = 6, c;
    puts("nop");
    c = b;
    return n;
}

To:

const int f(const int n) {
    const int a = 5, b = 6;
    int c;
    puts("nop");
    c = b;
    return n;
}

…and get it to work nicely on enums, structs, objects (instantiations of classes and structs), functions, globals, and { scoped bodies (e.g., function bodies).

On the surface I'm not thinking this would be particularly difficult… at least for the majority of cases.

What do you think? - I don't have much experience here so happy to write it myself but could use some oversight 😅
(happy to release under CC0 to be compliant with your philosophy)

Thanks for your consideration

EDIT: Oh and I was watching this CppCon talk from a guy who made some progress in this direction: https://www.youtube.com/watch?v=tUBUqJSGr54

Visual Studio compiler lib import problem

1>HelloWorld.obj : error LNK2001: unresolved external symbol "protected: virtual bool __cdecl clang::FrontendAction::shouldEraseOutputFiles(void)" (?shouldEraseOutputFiles@FrontendAction@clang@@MEAA_NXZ)
1>HelloWorld.obj : error LNK2001: unresolved external symbol "protected: virtual void __cdecl clang::ASTFrontendAction::ExecuteAction(void)" (?ExecuteAction@ASTFrontendAction@clang@@MEAAXXZ)
1>HelloWorld.obj : error LNK2001: unresolved external symbol "private: virtual void __cdecl clang::PluginASTAction::anchor(void)" (?anchor@PluginASTAction@clang@@EEAAXXZ)
1>HelloWorld.obj : error LNK2019: unresolved external symbol "public: __cdecl clang::FrontendAction::FrontendAction(void)" (??0FrontendAction@clang@@QEAA@XZ) referenced in function "class std::unique_ptr<class FindNamedClassAction,struct std::default_delete<class FindNamedClassAction> > __cdecl std::make_unique<class FindNamedClassAction,0>(void)" (??$make_unique@VFindNamedClassAction@@$$V$0A@@std@@YA?AV?$unique_ptr@VFindNamedClassAction@@U?$default_delete@VFindNamedClassAction@@@std@@@0@XZ)
1>HelloWorld.obj : error LNK2019: unresolved external symbol "public: virtual __cdecl clang::FrontendAction::~FrontendAction(void)" (??1FrontendAction@clang@@UEAA@XZ) referenced in function "public: virtual void * __cdecl FindNamedClassAction::`scalar deleting destructor'(unsigned int)" (??_GFindNamedClassAction@@UEAAPEAXI@Z)
1>HelloWorld.obj : error LNK2019: unresolved external symbol "public: static void __cdecl llvm::Registry<class clang::PluginASTAction>::add_node(class llvm::Registry<class clang::PluginASTAction>::node *)" (?add_node@?$Registry@VPluginASTAction@clang@@@llvm@@SAXPEAVnode@12@@Z) referenced in function "void __cdecl `dynamic initializer for 'X''(void)" (??__EX@@YAXXZ)
1>clangAST.lib(OpenMPClause.obj) : error LNK2019: unresolved external symbol "class llvm::StringRef __cdecl llvm::omp::getOpenMPDirectiveName(enum llvm::omp::Directive)" (?getOpenMPDirectiveName@omp@llvm@@YA?AVStringRef@2@W4Directive@12@@Z) referenced in function "public: void __cdecl clang::OMPClausePrinter::VisitOMPIfClause(class clang::OMPIfClause *)" (?VisitOMPIfClause@OMPClausePrinter@clang@@QEAAXPEAVOMPIfClause@2@@Z)
1>clangAST.lib(StmtPrinter.obj) : error LNK2001: unresolved external symbol "class llvm::StringRef __cdecl llvm::omp::getOpenMPDirectiveName(enum llvm::omp::Directive)" (?getOpenMPDirectiveName@omp@llvm@@YA?AVStringRef@2@W4Directive@12@@Z)
1>clangAST.lib(OpenMPClause.obj) : error LNK2019: unresolved external symbol "enum llvm::omp::TraitSet __cdecl llvm::omp::getOpenMPContextTraitSetForProperty(enum llvm::omp::TraitProperty)" (?getOpenMPContextTraitSetForProperty@omp@llvm@@YA?AW4TraitSet@12@W4TraitProperty@12@@Z) referenced in function "public: void __cdecl clang::OMPTraitInfo::getAsVariantMatchInfo(class clang::ASTContext &,struct llvm::omp::VariantMatchInfo &)const " (?getAsVariantMatchInfo@OMPTraitInfo@clang@@QEBAXAEAVASTContext@2@AEAUVariantMatchInfo@omp@llvm@@@Z)
1>clangAST.lib(OpenMPClause.obj) : error LNK2019: unresolved external symbol "class llvm::StringRef __cdecl llvm::omp::getOpenMPContextTraitSetName(enum llvm::omp::TraitSet)" (?getOpenMPContextTraitSetName@omp@llvm@@YA?AVStringRef@2@W4TraitSet@12@@Z) referenced in function "public: void __cdecl clang::OMPTraitInfo::print(class llvm::raw_ostream &,struct clang::PrintingPolicy const &)const " (?print@OMPTraitInfo@clang@@QEBAXAEAVraw_ostream@llvm@@AEBUPrintingPolicy@2@@Z)
1>clangAST.lib(OpenMPClause.obj) : error LNK2019: unresolved external symbol "class llvm::StringRef __cdecl llvm::omp::getOpenMPContextTraitSelectorName(enum llvm::omp::TraitSelector)" (?getOpenMPContextTraitSelectorName@omp@llvm@@YA?AVStringRef@2@W4TraitSelector@12@@Z) referenced in function "public: void __cdecl clang::OMPTraitInfo::print(class llvm::raw_ostream &,struct clang::PrintingPolicy const &)const " (?print@OMPTraitInfo@clang@@QEBAXAEAVraw_ostream@llvm@@AEBUPrintingPolicy@2@@Z)
1>clangAST.lib(OpenMPClause.obj) : error LNK2019: unresolved external symbol "enum llvm::omp::TraitProperty __cdecl llvm::omp::getOpenMPContextTraitPropertyKind(enum llvm::omp::TraitSet,class llvm::StringRef)" (?getOpenMPContextTraitPropertyKind@omp@llvm@@YA?AW4TraitProperty@12@W4TraitSet@12@VStringRef@2@@Z) referenced in function "public: __cdecl clang::OMPTraitInfo::OMPTraitInfo(class llvm::StringRef)" (??0OMPTraitInfo@clang@@QEAA@VStringRef@llvm@@@Z)
1>clangAST.lib(OpenMPClause.obj) : error LNK2019: unresolved external symbol "class llvm::StringRef __cdecl llvm::omp::getOpenMPContextTraitPropertyName(enum llvm::omp::TraitProperty)" (?getOpenMPContextTraitPropertyName@omp@llvm@@YA?AVStringRef@2@W4TraitProperty@12@@Z) referenced in function "public: class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl clang::OMPTraitInfo::getMangledName(void)const " (?getMangledName@OMPTraitInfo@clang@@QEBA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@XZ)
1>clangAST.lib(OpenMPClause.obj) : error LNK2019: unresolved external symbol "bool __cdecl llvm::omp::isValidTraitSelectorForTraitSet(enum llvm::omp::TraitSelector,enum llvm::omp::TraitSet,bool &,bool &)" (?isValidTraitSelectorForTraitSet@omp@llvm@@YA_NW4TraitSelector@12@W4TraitSet@12@AEA_N2@Z) referenced in function "public: class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl clang::OMPTraitInfo::getMangledName(void)const " (?getMangledName@OMPTraitInfo@clang@@QEBA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@XZ)
1>clangAST.lib(StmtPrinter.obj) : error LNK2019: unresolved external symbol "public: static class llvm::StringRef __cdecl clang::Lexer::getSourceText(class clang::CharSourceRange,class clang::SourceManager const &,class clang::LangOptions const &,bool *)" (?getSourceText@Lexer@clang@@SA?AVStringRef@llvm@@VCharSourceRange@2@AEBVSourceManager@2@AEBVLangOptions@2@PEA_N@Z) referenced in function "bool __cdecl printExprAsWritten(class llvm::raw_ostream &,class clang::Expr *,class clang::ASTContext const *)" (?printExprAsWritten@@YA_NAEAVraw_ostream@llvm@@PEAVExpr@clang@@PEBVASTContext@4@@Z)
1>clangAST.lib(Expr.obj) : error LNK2019: unresolved external symbol "public: __cdecl clang::Lexer::Lexer(class clang::SourceLocation,class clang::LangOptions const &,char const *,char const *,char const *)" (??0Lexer@clang@@QEAA@VSourceLocation@1@AEBVLangOptions@1@PEBD22@Z) referenced in function "public: class clang::SourceLocation __cdecl clang::StringLiteral::getLocationOfByte(unsigned int,class clang::SourceManager const &,class clang::LangOptions const &,class clang::TargetInfo const &,unsigned int *,unsigned int *)const " (?getLocationOfByte@StringLiteral@clang@@QEBA?AVSourceLocation@2@IAEBVSourceManager@2@AEBVLangOptions@2@AEBVTargetInfo@2@PEAI3@Z)
1>clangAST.lib(Expr.obj) : error LNK2019: unresolved external symbol "private: bool __cdecl clang::Lexer::Lex(class clang::Token &)" (?Lex@Lexer@clang@@AEAA_NAEAVToken@2@@Z) referenced in function "public: class clang::SourceLocation __cdecl clang::StringLiteral::getLocationOfByte(unsigned int,class clang::SourceManager const &,class clang::LangOptions const &,class clang::TargetInfo const &,unsigned int *,unsigned int *)const " (?getLocationOfByte@StringLiteral@clang@@QEBA?AVSourceLocation@2@IAEBVSourceManager@2@AEBVLangOptions@2@AEBVTargetInfo@2@PEAI3@Z)
1>clangAST.lib(Expr.obj) : error LNK2019: unresolved external symbol "public: static unsigned int __cdecl clang::Lexer::getTokenPrefixLength(class clang::SourceLocation,unsigned int,class clang::SourceManager const &,class clang::LangOptions const &)" (?getTokenPrefixLength@Lexer@clang@@SAIVSourceLocation@2@IAEBVSourceManager@2@AEBVLangOptions@2@@Z) referenced in function "public: class clang::SourceLocation __cdecl clang::StringLiteral::getLocationOfByte(unsigned int,class clang::SourceManager const &,class clang::LangOptions const &,class clang::TargetInfo const &,unsigned int *,unsigned int *)const " (?getLocationOfByte@StringLiteral@clang@@QEBA?AVSourceLocation@2@IAEBVSourceManager@2@AEBVLangOptions@2@AEBVTargetInfo@2@PEAI3@Z)
1>clangAST.lib(Expr.obj) : error LNK2019: unresolved external symbol "public: unsigned int __cdecl clang::StringLiteralParser::getOffsetOfStringByte(class clang::Token const &,unsigned int)const " (?getOffsetOfStringByte@StringLiteralParser@clang@@QEBAIAEBVToken@2@I@Z) referenced in function "public: class clang::SourceLocation __cdecl clang::StringLiteral::getLocationOfByte(unsigned int,class clang::SourceManager const &,class clang::LangOptions const &,class clang::TargetInfo const &,unsigned int *,unsigned int *)const " (?getLocationOfByte@StringLiteral@clang@@QEBA?AVSourceLocation@2@IAEBVSourceManager@2@AEBVLangOptions@2@AEBVTargetInfo@2@PEAI3@Z)
1>clangAST.lib(Expr.obj) : error LNK2019: unresolved external symbol "private: void __cdecl clang::StringLiteralParser::init(class llvm::ArrayRef<class clang::Token>)" (?init@StringLiteralParser@clang@@AEAAXV?$ArrayRef@VToken@clang@@@llvm@@@Z) referenced in function "public: class clang::SourceLocation __cdecl clang::StringLiteral::getLocationOfByte(unsigned int,class clang::SourceManager const &,class clang::LangOptions const &,class clang::TargetInfo const &,unsigned int *,unsigned int *)const " (?getLocationOfByte@StringLiteral@clang@@QEBA?AVSourceLocation@2@IAEBVSourceManager@2@AEBVLangOptions@2@AEBVTargetInfo@2@PEAI3@Z)
1>clangAST.lib(TextNodeDumper.obj) : error LNK2019: unresolved external symbol "class llvm::StringRef __cdecl llvm::omp::getOpenMPClauseName(enum llvm::omp::Clause)" (?getOpenMPClauseName@omp@llvm@@YA?AVStringRef@2@W4Clause@12@@Z) referenced in function "public: void __cdecl <lambda_9a749cc2600948731067c16499d3e43b>::operator()(void)const " (??R<lambda_9a749cc2600948731067c16499d3e43b>@@QEBAXXZ)
1>clangAST.lib(JSONNodeDumper.obj) : error LNK2019: unresolved external symbol "public: static unsigned int __cdecl clang::Lexer::MeasureTokenLength(class clang::SourceLocation,class clang::SourceManager const &,class clang::LangOptions const &)" (?MeasureTokenLength@Lexer@clang@@SAIVSourceLocation@2@AEBVSourceManager@2@AEBVLangOptions@2@@Z) referenced in function "private: void __cdecl clang::JSONNodeDumper::writeBareSourceLocation(class clang::SourceLocation,bool)" (?writeBareSourceLocation@JSONNodeDumper@clang@@AEAAXVSourceLocation@2@_N@Z)
1>clangAST.lib(CommentSema.obj) : error LNK2019: 无法解析的外部符号 "public: class llvm::StringRef __cdecl clang::Preprocessor::getLastMacroWithSpelling(class clang::SourceLocation,class llvm::ArrayRef<class clang::TokenValue>)const " (?getLastMacroWithSpelling@Preprocessor@clang@@QEBA?AVStringRef@llvm@@VSourceLocation@2@V?$ArrayRef@VTokenValue@clang@@@4@@Z),函数 "public: void __cdecl clang::comments::Sema::checkDeprecatedCommand(class clang::comments::BlockCommandComment const *)" (?checkDeprecatedCommand@Sema@comments@clang@@QEAAXPEBVBlockCommandComment@23@@Z) 中引用了该符号

Error when including an STL header

Hi, author!
I have recently been working on a new LibTooling-based tool that uses the Clang API. I am on macOS arm64 with LLVM 12.0.0. When I analyze C++ code containing #include <iostream>, my tool reports the error /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/wchar.h:89:10: fatal error: 'stdarg.h' file not found, whereas clang itself analyzes the same code without problems. stdarg.h is present in the LLVM header search path. What should I do to fix this?

Fix building against LLVM/Clang configured with BUILD_SHARED_LIBS=ON

The tools from clang-tutor fail to build when using a Clang/LLVM that was built with BUILD_SHARED_LIBS set. This is the error that I see on Ubuntu 16.04:

[7/9] Linking CXX executable bin/ct-la-commenter
FAILED: : && /usr/bin/clang++-11 -Wall    -fdiagnostics-color=always -fno-rtti -fvisibility-inlines-hidden -g  tools/CMakeFiles/ct-la-commenter.dir/LACommenterMain.cpp.o tools/CMakeFiles/ct-la-commenter.dir/__/lib/LACommenter.cpp.o -o bin/ct-la-commenter  -Wl,-rpath,/home/andwar02/work/release-11/build/release/lib  /home/bs/release-11/build/release/lib/libclangTooling.so.11  -Wl,-rpath-link,/home/bs/release-11/build/release/lib && :
/usr/bin/ld: tools/CMakeFiles/ct-la-commenter.dir/LACommenterMain.cpp.o: undefined reference to symbol '_ZN5clang14FrontendAction22shouldEraseOutputFilesEv'
/home/bs/work/release-11/build/release/lib/libclangFrontend.so.11: error adding symbols: DSO missing from command line
clang: error: linker command failed with exit code 1 (use -v to see invocation)

SOLUTION 1
A solution was submitted by @xgupta here. It went through a few iterations. The original approach looked like this:

$ git diff
diff --git a/tools/CMakeLists.txt b/tools/CMakeLists.txt
index 3673d81..6a33776 100644
--- a/tools/CMakeLists.txt
+++ b/tools/CMakeLists.txt
@@ -34,6 +34,14 @@ foreach( tool ${CLANG_TUTOR_TOOLS} )
     target_link_libraries(
       ${tool}
       clangTooling
+      clangFrontend
+      clangSerialization
+      clangRewrite
+      clangASTMatchers
+      clangAST
+      clangBasic
+      LLVMFrontendOpenMP
+      LLVMSupport
     )

     # Configure include directories for 'tool'

However, we discovered that this approach didn't work well with the pre-built binary packages from Ubuntu 16.04. This is the error that I was getting at run-time:

 bin/ct-la-commenter <clang-tutor-dir>/test/LACBool.cpp 2>&1
: CommandLine Error: Option 'help-list' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
Aborted

So this was a regression. It's hard to fix, as it's not clear how the Ubuntu packages are built or how to reproduce the problem with manually built packages.

SOLUTION 2
@xgupta kindly submitted an updated solution that is based on Clang's reference example. However, for this to work clang-tutor needs access to:

  • add_clang_executable
  • clang_target_link_libraries

This would normally be achieved with include(AddClang) in one of the CMake scripts. However, AddClang is currently not copied/installed in the build directory. A fix has been proposed here. One can also work around this by pointing clang-tutor to Clang's source directory (i.e. the location of AddClang), but that would mean that Ubuntu's pre-built packages alone are no longer sufficient to build the project.

SUMMARY
Both approaches break clang-tutor when using Ubuntu's pre-built LLVM/Clang packages. I think that SOLUTION 2 would be more canonical, but we may need to wait for D94533 to be merged.

Questions about the rationale behind ASTMatcher and RecursiveASTVisitor

Hi, author. Thanks for your excellent tutorial. I have some questions about your code in UnusedForLoopVar:

  • Why does the ASTMatcher approach work? In line 64 of UnusedForLoopVar.cpp you call LoopVar->isUsed(true) to check whether LoopVar is used. How does the "used" attribute of LoopVar get set to true? Does Clang set it while parsing the source code into the AST?

  • In line 134 of UnusedForLoopVar.cpp you call RecursiveASTVisitor::TraverseStmt(S->getBody()); instead of this->TraverseStmt(S->getBody());. What is the difference between the two?
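As a general C++ note on the second question: a qualified call such as Base::f() is bound to the base-class implementation, while this->f() first finds a function of that name declared in the derived class, if there is one. The sketch below uses a hypothetical ToyVisitor (a stand-in, not Clang's RecursiveASTVisitor and not the code from UnusedForLoopVar.cpp) to show that the two spellings only differ when the derived class declares its own TraverseStmt. All type and function names here are assumptions made for illustration.

```cpp
#include <string>

// Hypothetical CRTP base, standing in for a visitor base class.
template <typename Derived> struct ToyVisitor {
  std::string TraverseStmt() { return "base"; }
};

// Derived class that does NOT declare its own TraverseStmt.
struct NoOverride : ToyVisitor<NoOverride> {
  // Qualified call: always the base-class implementation.
  std::string qualified() { return ToyVisitor::TraverseStmt(); }
  // Unqualified call via this: name lookup finds the base version here.
  std::string viaThis() { return this->TraverseStmt(); }
};

// Derived class that DOES declare its own TraverseStmt.
struct WithOverride : ToyVisitor<WithOverride> {
  std::string TraverseStmt() { return "derived"; }
  // Qualified call: still the base-class implementation.
  std::string qualified() { return ToyVisitor::TraverseStmt(); }
  // Unqualified call via this: the derived version hides the base one.
  std::string viaThis() { return this->TraverseStmt(); }
};
```

In this toy model, both spellings reach the same function in NoOverride, while in WithOverride the qualified form skips the derived version. Whether that distinction matters in UnusedForLoopVar.cpp depends on which Traverse functions the visitor there declares.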

JIT related?

Great work!
So this tutorial focuses only on the front-end? Will you add tutorials on LLVM's JIT and similar topics?
Thanks

What's the difference between TraverseForStmt and VisitForStmt?

Hi,

I am working through this excellent tutorial and I am confused about the UnusedForLoopVar tool. In its RecursiveASTVisitor implementation, TraverseForStmt() is used to visit the ForStmt nodes in the AST, rather than VisitForStmt(). What is the difference between these two methods when visiting AST nodes? Why does UnusedForLoopVar use TraverseForStmt() rather than VisitForStmt()?

Thank you very much!!!
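The general pattern behind this question can be sketched without Clang. In RecursiveASTVisitor, a Traverse* function drives the walk over a node and its children, whereas a Visit* function is a callback for a single node. The toy model below is an assumption-laden sketch: Node, ToyVisitor and LoopTracker are hypothetical names, not Clang classes, and the model collapses Clang's per-node-type functions into one TraverseNode/VisitNode pair. It shows why overriding the Traverse hook is handy when state must be set up before a loop's subtree is walked and torn down afterwards.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy AST node: a stand-in for clang::Stmt, NOT the real class.
struct Node {
  std::string Kind; // e.g. "Block", "For", "Ref"
  std::vector<std::unique_ptr<Node>> Children;
};

// Helper for building toy trees.
inline std::unique_ptr<Node> makeNode(std::string Kind) {
  auto N = std::make_unique<Node>();
  N->Kind = std::move(Kind);
  return N;
}

// Toy CRTP visitor, loosely modelled on clang::RecursiveASTVisitor.
template <typename Derived> struct ToyVisitor {
  Derived &self() { return static_cast<Derived &>(*this); }

  // Traverse hook: drives the recursion (callback first, then children).
  bool TraverseNode(Node *N) {
    if (!self().VisitNode(N))
      return false;
    for (auto &Child : N->Children)
      if (!self().TraverseNode(Child.get()))
        return false;
    return true;
  }

  // Visit hook: per-node callback; the default does nothing.
  bool VisitNode(Node *) { return true; }
};

// Records an event log to show when each hook fires.
struct LoopTracker : ToyVisitor<LoopTracker> {
  int LoopDepth = 0;
  std::vector<std::string> Log;

  // Overriding the Traverse hook: code runs before AND after the subtree.
  bool TraverseNode(Node *N) {
    const bool IsLoop = N->Kind == "For";
    if (IsLoop) {
      ++LoopDepth;
      Log.push_back("enter For");
    }
    // The base-class version performs the actual walk over the children.
    const bool Ok = ToyVisitor::TraverseNode(N);
    if (IsLoop) {
      --LoopDepth;
      Log.push_back("leave For");
    }
    return Ok;
  }

  // Overriding the Visit hook: sees one node at a time, no "after" step.
  bool VisitNode(Node *N) {
    if (N->Kind == "Ref")
      Log.push_back(LoopDepth > 0 ? "Ref inside loop" : "Ref outside loop");
    return true;
  }
};
```

For a tree of the shape Block { Ref, For { Ref } }, calling TraverseNode on the root of a LoopTracker logs the first reference as outside the loop and the second one between the "enter For" and "leave For" events. A Visit-only visitor could not emit the "leave For" event, because the callback returns before the children are walked.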

Run on Windows 11 Native

This works with conda-forge packages.
The following environment may be sufficient:

name: llvm

channels:
  - conda-forge
  - defaults

dependencies:

# build tools
- cmake
- conan 
- ninja
# - vcpkg

# https://github.com/conda-forge/clangdev-feedstock
- clang
- clang-format
# - clang-format-16
- clang-tools
- clangdev
# - clangxx
- libclang
- libclang-cpp
# - libclang-cpp16
# - libclang13
# - python-clang

# https://github.com/conda-forge/llvmdev-feedstock
- libllvm16
- lit
- llvm
- llvm-tools 
- llvmdev

Since this is conda, it should work on any platform.

I am working on a PR.

Obfuscator output is weird for the input with both + and - operations

Please consider the following input file:

int foo(int a, int b) {
  return a + b;
}

int bar(int a, int b) {
  return a - b;
}

Actual obfuscator plugin output is:

int foo(int a, int b) {
  return (a ^ b) + 2 * (a & b);
}

int bar(int a, int b) {
  return a - b;
}

int foo(int a, int b) {
  return a + b;
}

int bar(int a, int b) {
  return (a + ~b) + 1;
}

Expected obfuscator plugin output is:

int foo(int a, int b) {
  return (a ^ b) + 2 * (a & b);
}

int bar(int a, int b) {
  return (a + ~b) + 1;
}

Maybe it would be possible to share a single Rewriter between the matchers and apply the rewrites at the end of the ObfuscatorASTConsumer::HandleTranslationUnit call.
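The suggestion can be illustrated with a small model that does not depend on Clang. It assumes, without having checked the plugin's source, that the duplicated output comes from each matcher callback owning its own rewriter and emitting the whole buffer. ToyRewriter, AddHandler and SubHandler are hypothetical stand-ins, and the string search is a crude substitute for real AST matching.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for clang::Rewriter: holds an edited copy of the source.
struct ToyRewriter {
  std::string Buffer;
  explicit ToyRewriter(std::string Source) : Buffer(std::move(Source)) {}

  // Replace the first occurrence of From with To (no-op if absent).
  void replaceFirst(const std::string &From, const std::string &To) {
    const auto Pos = Buffer.find(From);
    if (Pos != std::string::npos)
      Buffer.replace(Pos, From.size(), To);
  }
};

// Toy "matcher callbacks": each one knows a single rewrite rule.
struct AddHandler {
  ToyRewriter &RW;
  void run() { RW.replaceFirst("a + b", "(a ^ b) + 2 * (a & b)"); }
};
struct SubHandler {
  ToyRewriter &RW;
  void run() { RW.replaceFirst("a - b", "(a + ~b) + 1"); }
};

// Assumed current behaviour: one rewriter per handler, so the file is
// emitted twice and each copy contains only one of the two rewrites.
std::vector<std::string> obfuscateSeparately(const std::string &Source) {
  ToyRewriter AddRW(Source), SubRW(Source);
  AddHandler{AddRW}.run();
  SubHandler{SubRW}.run();
  return {AddRW.Buffer, SubRW.Buffer};
}

// Proposed behaviour: both handlers share one rewriter, and the buffer is
// emitted once after every handler has run.
std::string obfuscateShared(const std::string &Source) {
  ToyRewriter RW(Source);
  AddHandler{RW}.run();
  SubHandler{RW}.run();
  return RW.Buffer;
}
```

With the input file from this report, obfuscateSeparately yields two copies that each carry one transformation, which mirrors the actual output above, while obfuscateShared yields a single copy with both foo and bar rewritten.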

No CMAKE_CXX_COMPILER could be found.

Running cmake -DCT_Clang_INSTALL_DIR=$Clang_DIR $CLANG_TUTOR_DIR/HelloWorld/ fails with the following error:


CMake Warning (dev) in CMakeLists.txt:
  No project() command is present.  The top-level CMakeLists.txt file must
  contain a literal, direct call to the project() command.  Add a line of
  code such as

    project(ProjectName)

  near the top of the file, but after cmake_minimum_required().

  CMake is pretending there is a "project(Project)" command on the first
  line.
This warning is for project developers.  Use -Wno-dev to suppress it.

-- The CXX compiler identification is unknown
CMake Error in CMakeLists.txt:
  No CMAKE_CXX_COMPILER could be found.

  Tell CMake where to find the compiler by setting either the environment
  variable "CXX" or the CMake cache entry CMAKE_CXX_COMPILER to the full path
  to the compiler, or to the compiler name if it is in the PATH.


CMake Warning (dev) in CMakeLists.txt:
  No cmake_minimum_required command is present.  A line of code such as

    cmake_minimum_required(VERSION 3.16)

  should be added at the top of the file.  The version specified may be lower
  if you wish to support older CMake versions for this project.  For more
  information run "cmake --help-policy CMP0000".
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Configuring incomplete, errors occurred!

Example for instrumenting source code

Thank you! This is very helpful. Do you have any example plugins for adding functions to source code? I would like an example that adds a new function, with both simple and complex variables (such as structures), and that also uses inputs from the original source code. For example, a plugin that combines ASTMatcher with the Rewriter/Refactoring/Replacement tools.

How to compile with these clang tools?

Thank you very much for this amazing project. I've successfully run all the examples, but I am still confused about one thing: how should I use these tools when compiling a real project?

For example, I want to use libCodeStyleChecker.so in the actual build of a large project. Normally, I would do this by setting the environment variable CC, such as:

CC="clang -cc1 -load /path/to/libCodeStyleChecker.so -plugin CSC " make

But I don't get any output files. Furthermore, I made the following attempt:

clang -cc1 -fcolor-diagnostics -load ./libCodeStyleChecker.so -plugin CSC ../../test/CodeStyleCheckerFunction.cpp -o test.o
../../test/CodeStyleCheckerFunction.cpp:6:6: warning: Function names should start with lower-case letter
void ClangTutorFuncBad();
     ^~
     clangTutorFuncBad
../../test/CodeStyleCheckerFunction.cpp:12:8: warning: Function names should start with lower-case letter
  void ClangTutorMemberMethodBad();
       ^~
       clangTutorMemberMethodBad
2 warnings generated.

I got the following result:

# xxx @ XXX in ~/Workspace/Testspace/clang-ast/clang-tutor/build/lib on git:main x 15:51:11 
$ ls
CMakeFiles           libCodeRefactor.so      libHelloWorld.so   libObfuscator.so        Makefile
cmake_install.cmake  libCodeStyleChecker.so  libLACommenter.so  libUnusedForLoopVar.so

As we can see, the -o option I passed had no effect (no test.o was produced), and I couldn't compile any cpp files.

Please tell me what I should do to compile a large project while using Clang plugins.
