Subject: Re: [boost] Interest in an LLVM library?
From: Andy Jost (Andrew.Jost_at_[hidden])
Date: 2015-11-18 20:41:23


Andrey Semashev wrote:
> I'm not sure I understand the purpose of this library. It seems to build an AST

It builds an intermediate representation of the program.

LLVM is a top-of-the-line compiler infrastructure. It does many things incredibly well but, unfortunately, does not provide an expressive API for defining programs. The API it provides is far too low-level.

Let me demonstrate. Here is a five-line "hello world" program written in my EDSL:

auto const puts = extern_<Function>(i32(*char_), "puts");
auto const main = extern_<Function>(i32(), "main", [&] {
    puts("hello world\n");
    return_(0);
});

The product of this is an intermediate representation of the program. Regarding its use, the LLVM website says the following (http://llvm.org/docs/LangRef.html#introduction):

        "The LLVM code representation is designed to be used in three different forms: as an in-memory compiler IR, as an on-disk bitcode representation (suitable for fast loading by a Just-In-Time compiler), and as a human readable assembly language representation... The three different forms of LLVM are all equivalent."

The EDSL produces the in-memory IR form. Here is the human-readable equivalent for this program:

@.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00", align 1
declare i32 @puts(i8*)
define i32 @main() {
.entry:
  %0 = call i32 @puts(i8* getelementptr inbounds ([13 x i8]* @.str, i32 0, i32 0))
  ret i32 0
}
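
To make the first line of that listing less mysterious: the array type is [13 x i8] because "hello world\n" is 12 characters plus the terminating NUL, and the non-printable bytes are written as a backslash followed by two hex digits. The following is a minimal sketch of that encoding in plain C++. The function name ir_string_constant is hypothetical; it is not part of LLVM or of the proposed library, and it only produces text, not in-memory IR.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper for illustration only (not an LLVM or EDSL function).
// Renders a C string as the textual form of an LLVM global constant.
std::string ir_string_constant(const std::string& name, const std::string& text) {
    assert(!name.empty() && "a global needs a name");
    static const char hex[] = "0123456789ABCDEF";

    std::string bytes = text;
    bytes.push_back('\0');  // C strings carry a trailing NUL byte

    std::string escaped;
    for (unsigned char c : bytes) {
        if (std::isprint(c) && c != '"' && c != '\\') {
            escaped.push_back(static_cast<char>(c));  // printable: emit as-is
        } else {
            escaped.push_back('\\');                  // otherwise: \XX in hex
            escaped.push_back(hex[c >> 4]);
            escaped.push_back(hex[c & 0x0F]);
        }
    }

    // The array length counts every byte, including the NUL.
    return "@" + name + " = private unnamed_addr constant [" +
           std::to_string(bytes.size()) + " x i8] c\"" + escaped + "\", align 1";
}
```

Calling ir_string_constant(".str", "hello world\n") reproduces the @.str line shown above.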

Building this representation directly from LLVM C++ API calls is extremely cumbersome. A file that does exactly that (produced by the LLVM tool llc with the option -march=cpp) is 127 lines and 3761 bytes long. Here is how it declares the puts function using API calls:

Function* func_puts = mod->getFunction("puts");
if (!func_puts) {
    func_puts = Function::Create(
        /*Type=*/FuncTy_2,
        /*Linkage=*/GlobalValue::ExternalLinkage,
        /*Name=*/"puts", mod); // (external, no body)
    func_puts->setCallingConv(CallingConv::C);
}
AttributeSet func_puts_PAL;
func_puts->setAttributes(func_puts_PAL);

I hope you'll agree that this style of API is not particularly expressive, simple, or fun to use. It is too low level. The proposed library aims to simplify encoding programs dynamically by raising the level of abstraction.
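
To give a rough idea of what "raising the level of abstraction" means for the snippet above, here is a toy sketch in plain C++. ToyModule, ToyFunction, and declare are stand-ins invented for this illustration; they are neither the LLVM classes nor the proposed library's implementation. The point is only that the look-up, null check, and create sequence can be folded into a single call.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Stand-in types for illustration only; these are NOT the LLVM classes.
struct ToyFunction {
    std::string name;
    std::string signature;  // e.g. "i32 (i8*)"
};

class ToyModule {
public:
    // Lookup half of the pattern: returns nullptr when the name is unknown.
    ToyFunction* getFunction(const std::string& name) {
        auto it = functions_.find(name);
        return it == functions_.end() ? nullptr : it->second.get();
    }

    // Creation half of the pattern: registers a new declaration.
    ToyFunction* create(const std::string& name, const std::string& signature) {
        auto& slot = functions_[name];
        slot = std::make_unique<ToyFunction>(ToyFunction{name, signature});
        return slot.get();
    }

    std::size_t size() const { return functions_.size(); }

private:
    std::map<std::string, std::unique_ptr<ToyFunction>> functions_;
};

// Hypothetical wrapper: one call replaces the lookup / null check / create
// sequence that the caller would otherwise spell out by hand.
ToyFunction& declare(ToyModule& mod, const std::string& name,
                     const std::string& signature) {
    assert(!name.empty() && "a function needs a name");
    if (ToyFunction* existing = mod.getFunction(name)) {
        return *existing;  // already declared: reuse it
    }
    return *mod.create(name, signature);
}
```

With this in place, declare(mod, "puts", "i32 (i8*)") can be called any number of times and always yields the same single declaration, which is the behavior the generated code above implements by hand.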

> but then it doesn't seem offer to do anything with it?

With the intermediate representation in hand, doing things with it through the LLVM API is comparatively straightforward. For example, to JIT compile and invoke the main function above (assuming it resides in module m), we do the following:

    ExecutionEngine * jit = EngineBuilder(m.ptr())
        .setEngineKind(EngineKind::JIT)
        .create();
    void * fp = jit->getPointerToFunction(m->getFunction("main"));
    auto main = reinterpret_cast<int(*)()>(fp);
    main();

This could easily be simplified, and it is worth considering whether to do so, if only to make the proposed library more complete and the examples more coherent. But I consider that a peripheral issue: not necessarily out of scope for the proposed library, but certainly not its focus.

> Is it just a wrapper for LLVM AST API?

Essentially, yes. Though it's not really an AST, as discussed above. Like Boost.Python, this library makes the native API of some popular external library much easier (and more fun!) to use.

-Andy

