Tracking variables and their values
To be useful, the type metadata described in the previous section needs to be associated with variables of the source program. For a global variable, this is pretty easy. The createGlobalVariableExpression() function of the llvm::DIBuilder class creates the metadata to describe a global variable. This includes the name of the variable in the source, the mangled name, the source file, and so on. A global variable in LLVM IR is represented by an instance of the GlobalVariable class. This class has a method called addDebugInfo(), which associates the metadata node returned from createGlobalVariableExpression() with the global variable.
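In the textual IR, the result of these two calls looks roughly like the following minimal sketch. The variable name counter, the file example.c, the directory, and the DW_LANG_C language tag are assumptions made for illustration; they are not taken from the example in the previous section.

```llvm
; A global variable with its debug metadata attached through !dbg.
; This attachment is what addDebugInfo() creates.
@counter = global i32 0, align 4, !dbg !0

!llvm.dbg.cu = !{!2}
!llvm.module.flags = !{!6, !7}

; The node returned by createGlobalVariableExpression().
!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
; Source-level description: name, scope, file, line, and type.
!1 = distinct !DIGlobalVariable(name: "counter", scope: !2, file: !3, line: 1, type: !5, isLocal: false, isDefinition: true)
!2 = distinct !DICompileUnit(language: DW_LANG_C, file: !3, producer: "example", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !4)
!3 = !DIFile(filename: "example.c", directory: "/tmp")
!4 = !{!0}
!5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!6 = !{i32 2, !"Debug Info Version", i32 3}
!7 = !{i32 2, !"Dwarf Version", i32 4}
```

A mangled name, if the source language has one, would appear as an additional linkageName field on the !DIGlobalVariable node.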
For local variables, we need to take another approach. LLVM IR has no class that represents a local variable; it only knows about values. The solution the LLVM community has developed is to insert calls to intrinsic functions into the IR code of a function. An intrinsic function is a function that LLVM knows about and can therefore treat specially. In most cases, intrinsic functions do not result in a subroutine call at the machine level. Here, the function call is a convenient vehicle to associate the metadata with a value. The most important intrinsic functions for debug metadata are llvm.dbg.declare and llvm.dbg.value.
The llvm.dbg.declare intrinsic is generated once by the frontend to declare a local variable. Essentially, this intrinsic describes the address of a local variable. During optimization, passes can replace this intrinsic with (possibly multiple) calls to llvm.dbg.value to preserve the debug information and to track the local source variables. After optimization, multiple calls to llvm.dbg.declare may be present, as the intrinsic is used to describe the program points where the local variable lives in memory.
On the other hand, the llvm.dbg.value intrinsic is called whenever a local variable is set to a new value. This intrinsic describes the value of a local variable, not its address.
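To make the difference concrete, here is a minimal sketch of how a function might look after a local variable i has been promoted from memory to SSA values. The function name, file name, and line numbers are made up for illustration, and the intrinsic-call syntax assumes an LLVM version that still prints debug information in this form, as the text does.

```llvm
define i32 @func(i32 %a) !dbg !4 {
entry:
  ; Source: i = a. The variable i now has the value %a.
  call void @llvm.dbg.value(metadata i32 %a, metadata !8, metadata !DIExpression()), !dbg !9
  %inc = add nsw i32 %a, 1, !dbg !10
  ; Source: i = i + 1. From here on, i has the value %inc.
  call void @llvm.dbg.value(metadata i32 %inc, metadata !8, metadata !DIExpression()), !dbg !10
  ret i32 %inc, !dbg !11
}

declare void @llvm.dbg.value(metadata, metadata, metadata)

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2, !3}

!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "example", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
!1 = !DIFile(filename: "example.c", directory: "/tmp")
!2 = !{i32 2, !"Debug Info Version", i32 3}
!3 = !{i32 2, !"Dwarf Version", i32 4}
!4 = distinct !DISubprogram(name: "func", scope: !1, file: !1, line: 6, type: !5, scopeLine: 6, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
!5 = !DISubroutineType(types: !6)
!6 = !{!7, !7}
!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!8 = !DILocalVariable(name: "i", scope: !4, file: !1, line: 7, type: !7)
!9 = !DILocation(line: 7, column: 5, scope: !4)
!10 = !DILocation(line: 8, column: 5, scope: !4)
!11 = !DILocation(line: 9, column: 3, scope: !4)
```

Note that no alloca remains: there is no address to describe, so each call records only the SSA value that currently holds i.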
How does all of this work? The LLVM IR representation and the programmatic creation via the llvm::DIBuilder class differ a bit, so we will look at both.
Continuing with our example from the previous section, we’ll allocate local storage for the i variable inside the Func function with the alloca instruction:
%i = alloca i32
After that, we must add a call to the llvm.dbg.declare intrinsic:
call void @llvm.dbg.declare(metadata ptr %i,
                            metadata !1, metadata !DIExpression())
The first parameter is the address of the local variable. The second parameter is the metadata describing the local variable, which is created by calling a method of the llvm::DIBuilder class: createAutoVariable() for a local variable or createParameterVariable() for a parameter. Finally, the third parameter describes an address expression, which will be explained later.
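Put into a complete module, the alloca and the llvm.dbg.declare call might look like the following sketch. The metadata numbering differs from the fragment above (the variable is !8 here rather than !1), and the file name, directory, language tag, function line, and the stored constant are assumptions for illustration only.

```llvm
define i32 @Func() !dbg !4 {
entry:
  ; Storage for the local variable i.
  %i = alloca i32, align 4
  ; Tie the address %i to the source variable described by !8.
  call void @llvm.dbg.declare(metadata ptr %i, metadata !8, metadata !DIExpression()), !dbg !9
  store i32 42, ptr %i, align 4, !dbg !9
  %v = load i32, ptr %i, align 4, !dbg !10
  ret i32 %v, !dbg !10
}

declare void @llvm.dbg.declare(metadata, metadata, metadata)

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2, !3}

!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "example", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug)
!1 = !DIFile(filename: "example.c", directory: "/tmp")
!2 = !{i32 2, !"Debug Info Version", i32 3}
!3 = !{i32 2, !"Dwarf Version", i32 4}
!4 = distinct !DISubprogram(name: "Func", scope: !1, file: !1, line: 6, type: !5, scopeLine: 6, spFlags: DISPFlagDefinition, unit: !0)
!5 = !DISubroutineType(types: !6)
!6 = !{!7}
!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
; The local variable i, declared at line 7 inside Func.
!8 = !DILocalVariable(name: "i", scope: !4, file: !1, line: 7, type: !7)
!9 = !DILocation(line: 7, column: 5, scope: !4)
!10 = !DILocation(line: 8, column: 3, scope: !4)
```

The call itself carries a !dbg location; the IR verifier expects one on debug intrinsic calls inside a function that has a subprogram attached.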
Let’s implement the IR creation. You can allocate the storage for the local %i variable with a call to the CreateAlloca() method of the llvm::IRBuilder<> class:
llvm::Type *IntTy = llvm::Type::getInt32Ty(LLVMCtx);
llvm::Value *Val = Builder.CreateAlloca(IntTy, nullptr, "i");
The LLVMCtx variable is the LLVM context in use, and Builder is the llvm::IRBuilder<> instance used to emit the instructions.
A local variable also needs to be described by metadata:
llvm::DILocalVariable *DbgLocalVar =
    DBuilder.createAutoVariable(DbgFunc, "i", DbgFile,
                                7, DbgIntTy);
Using the values from the previous section, we can specify that the variable is part of the DbgFunc function, is called i, is defined in the DbgFile file at line 7, and is of the DbgIntTy type.
Finally, we associate the debug metadata with the address of the variable using the llvm.dbg.declare intrinsic. Using llvm::DIBuilder shields you from all of the details of adding a call:
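llvm::DILocation *DbgLoc =
    llvm::DILocation::get(LLVMCtx, 7, 5, DbgFunc);
// Val is a pointer to llvm::Value, which has no getParent() method, so the
// call is appended to the basic block the builder is currently inserting into.
DBuilder.insertDeclare(Val, DbgLocalVar,
                       DBuilder.createExpression(), DbgLoc,
                       Builder.GetInsertBlock());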