A robust, flow-sensitive static analysis tool built as an LLVM plugin. This project utilizes Data-Flow Analysis (DFA) to identify variables that may be read before they have been initialized. It is designed to work with the LLVM 20 New Pass Manager and provides source-level warnings including function names and line numbers.
Detecting uninitialized variables is a critical aspect of software security and reliability. This analyzer implements a "May-Uninitialized" logic: if there exists any execution path where a variable is not initialized before a load, a warning is generated.
- Flow-Sensitive: Distinguishes between different execution paths (e.g., if-else branches).
- Source Mapping: Uses LLVM Debug Metadata to point to exact C source lines.
- Worklist Algorithm: Iterates the data-flow equations to a fixed point, revisiting only blocks whose inputs may have changed.
- Custom Lattice Logic: Implements a three-state lattice to handle complex control-flow merges.
The analysis operates on a lattice representing the state of each stack-allocated variable (alloca):
- Undefined: The variable has been allocated but not yet stored to.
- Initialized: The variable has been explicitly assigned a value.
- MaybeNull/Overdefined: A state used for complex pointer analysis or conflicting paths.
The "Meet" operator defines how states merge at Join Points in the Control Flow Graph (CFG).
- Initialized $\sqcap$ Initialized = Initialized
- Initialized $\sqcap$ Undefined = Undefined (warn if there is any dangerous path)
- Undefined $\sqcap$ Undefined = Undefined
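The merge rules listed above can be sketched as a small C++ function. This is a minimal illustration, not the project's actual code: the names `State` and `meet` are hypothetical, and the third state (MaybeNull/Overdefined) is left out because its merge rules are not listed here.

```cpp
#include <cassert>

// Hypothetical two-state model of the lattice. The MaybeNull/Overdefined
// state is omitted because its merge behaviour is not specified above.
enum class State { Undefined, Initialized };

// Meet of the states arriving at a join point: a variable stays Initialized
// only if it is Initialized on every incoming path.
constexpr State meet(State a, State b) {
    return (a == State::Initialized && b == State::Initialized)
               ? State::Initialized
               : State::Undefined;
}

// Usage: the three merge rules, one assertion per rule.
inline void checkMeetRules() {
    assert(meet(State::Initialized, State::Initialized) == State::Initialized);
    assert(meet(State::Initialized, State::Undefined) == State::Undefined);
    assert(meet(State::Undefined, State::Undefined) == State::Undefined);
}
```

Because Undefined wins every mixed merge, a single path that skips the store is enough to make a later load suspicious, which is what the "May-Uninitialized" policy calls for.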
The analysis updates states based on instructions:
- StoreInst: Transitions the pointer operand to the Initialized state.
- LoadInst: Checks the current state; if Undefined, a warning is emitted.
- AllocaInst: Starts in the Undefined state at the entry of the function.
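To make the three transfer rules concrete, here is a sketch that applies them to a toy instruction list. It assumes a simplified model: `Instr`, `Op`, `VarStates`, and `runTransfer` are hypothetical stand-ins for the LLVM instruction classes and are not taken from the project's sources.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class State { Undefined, Initialized };
enum class Op { Alloca, Store, Load };

// Toy stand-in for an LLVM instruction: the opcode, the stack variable it
// touches, and the source line it came from.
struct Instr {
    Op op;
    std::string var;
    int line;
};

using VarStates = std::map<std::string, State>;

// Applies the transfer rules in program order and returns the source lines
// of every load that reads a variable still in the Undefined state.
std::vector<int> runTransfer(const std::vector<Instr>& code, VarStates& states) {
    std::vector<int> warnings;
    for (const Instr& i : code) {
        switch (i.op) {
        case Op::Alloca:  // allocated but not yet stored to
            states[i.var] = State::Undefined;
            break;
        case Op::Store:   // explicit assignment
            states[i.var] = State::Initialized;
            break;
        case Op::Load: {  // read: warn if the variable is still Undefined
            auto it = states.find(i.var);
            if (it != states.end() && it->second == State::Undefined)
                warnings.push_back(i.line);
            break;
        }
        }
    }
    return warnings;
}

// Usage: `a` is read without a store, `b` is stored before its read.
inline std::vector<int> exampleStraightLine() {
    VarStates states;
    return runTransfer({{Op::Alloca, "a", 1},
                        {Op::Alloca, "b", 2},
                        {Op::Store, "b", 2},
                        {Op::Load, "a", 4},
                        {Op::Load, "b", 6}},
                       states);
}
```

In this example only the load of `a` on line 4 is reported. The sketch covers a single straight-line block; merging states across branches is the job of the meet operator.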
.
├── CMakeLists.txt          # Top-level build configuration
├── src/
│   ├── CMakeLists.txt      # Plugin build rules
│   ├── AnalysisPass.cpp    # Main logic and Worklist Algorithm
│   ├── DataFlowAnalysis.h  # Class definitions and Lattice States
│   └── AnalysisPrep.cpp    # LLVM Plugin registration code
└── test/
    ├── test.c              # C test cases (True/False Positives)
    └── test.ll             # Generated LLVM IR
- LLVM 20 (including opt, clang, and FileCheck)
- CMake 3.15+
- Ninja or Make
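- Create a build directory:
  mkdir build && cd build
- Configure with CMake (pass -G Ninja so that the ninja command in the next step finds a build.ninja file; with the default Makefile generator, run make instead):
  cmake -G Ninja ..
- Compile the plugin:
  ninja
  This generates src/LLVMStaticAnalyzer.so.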
- Generate LLVM IR
  Compile your C code with debug symbols and without optimizations to preserve the structure for analysis. The -S -emit-llvm flags produce textual IR (.ll) rather than binary bitcode. From the repository root:
  clang -g -S -emit-llvm -Xclang -disable-O0-optnone test/test.c -o test/test.ll
- Run the Analyzer
  Use the opt tool to load the plugin and run the pass. From the build directory:
  opt -load-pass-plugin ./src/LLVMStaticAnalyzer.so \
      -passes="static-analyzer" \
      -disable-output ../test/test.ll
void test_func(int x) {
    int a;
    int b = 10;
    if (x > 5) {
        printf("%d\n", a); // bug: this should trigger a warning
    } else {
        printf("%d\n", b);
    }
}

void test_logic(int x) {
    int val;
    if (x > 0) {
        val = 1;
    } else {
        val = 2;
    }
    printf("%d", val); // this should be safe
}

void test_partial(int x) {
    int bug;
    if (x > 100) {
        bug = 42;
    }
    printf("%d", bug);
}
Expected Output:
Analyzing Function: test_func
Warning: Uninitialized read in test_func at line 8
Analyzing Function: test_logic
Analyzing Function: test_partial
Warning: Uninitialized read in test_partial at line 29
Analyzing Function: main
The analyzer uses a std::deque to track Basic Blocks that need processing. When a block's OutState changes, its successors are re-added to the worklist; blocks whose inputs are unchanged are not revisited, so the algorithm reaches a fixed point even in the presence of loops.
Plugin Registration
The project uses the PassPluginLibraryInfo structure to register with the LLVM pipeline. This allows the pass to be invoked through the standard opt tool via the -passes flag.
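The worklist iteration described earlier can be sketched on a toy control-flow graph. This is an illustration under stated assumptions, not the project's implementation: it tracks a single variable, block 0 is taken to be the entry, and `Block`, `meet`, and `solve` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

enum class State { Undefined, Initialized };

// Toy basic block for one tracked variable: whether the block stores to it,
// and the indices of its successor blocks.
struct Block {
    bool stores;
    std::vector<std::size_t> succs;
};

State meet(State a, State b) {
    return (a == State::Initialized && b == State::Initialized)
               ? State::Initialized
               : State::Undefined;
}

// Forward analysis over the CFG; returns each block's OutState at the fixed point.
std::vector<State> solve(const std::vector<Block>& cfg) {
    assert(!cfg.empty());
    const std::size_t n = cfg.size();

    // Derive predecessor lists from the successor edges.
    std::vector<std::vector<std::size_t>> preds(n);
    for (std::size_t b = 0; b < n; ++b)
        for (std::size_t s : cfg[b].succs) {
            assert(s < n);
            preds[s].push_back(b);
        }

    // Start optimistically and let the iteration lower states as needed.
    std::vector<State> out(n, State::Initialized);
    std::deque<std::size_t> worklist;
    std::vector<bool> queued(n, true);
    for (std::size_t b = 0; b < n; ++b) worklist.push_back(b);

    while (!worklist.empty()) {
        const std::size_t b = worklist.front();
        worklist.pop_front();
        queued[b] = false;

        // InState: Undefined at function entry, otherwise the meet of all
        // predecessor OutStates.
        State in = (b == 0) ? State::Undefined : State::Initialized;
        for (std::size_t p : preds[b]) in = meet(in, out[p]);

        const State newOut = cfg[b].stores ? State::Initialized : in;
        if (newOut != out[b]) {
            out[b] = newOut;
            // Only a changed OutState puts the successors back on the worklist.
            for (std::size_t s : cfg[b].succs)
                if (!queued[s]) { queued[s] = true; worklist.push_back(s); }
        }
    }
    return out;
}
```

Modelling test_partial as three blocks (entry, the `bug = 42` branch, and the join) with `{{false, {1, 2}}, {true, {2}}, {false, {}}}` leaves the join block in the Undefined state, which matches the warning in the expected output. The `queued` flags are a design choice that keeps a block from sitting in the deque more than once.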
Atharva Pingale ( https://github.com/atharva0300 )