llvm-ir-static-analyzer

LLVM Static Analyzer: Uninitialized Variable Detection

A flow-sensitive static analysis tool built as an LLVM plugin. It uses Data-Flow Analysis (DFA) to identify variables that may be read before they have been initialized. It targets the LLVM 20 New Pass Manager and reports source-level warnings that include function names and line numbers.


Overview

Detecting uninitialized variables is a critical aspect of software security and reliability. This analyzer implements a "May-Uninitialized" logic: if there exists any execution path where a variable is not initialized before a load, a warning is generated.
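The "any execution path" rule can be illustrated with a small stand-alone sketch. It enumerates paths explicitly, which is not how the analyzer itself works (it runs data-flow analysis over the CFG). Event, readsBeforeWrite, and mayBeUninitialized are hypothetical names introduced only for this illustration.

```cpp
#include <vector>

// Hypothetical events affecting one variable along a single execution path.
enum class Event { Store, Load };

// True if, on this path, a Load occurs before any Store.
bool readsBeforeWrite(const std::vector<Event>& path) {
    for (Event e : path) {
        if (e == Event::Store) return false; // initialized before any read
        if (e == Event::Load) return true;   // read with no prior store
    }
    return false; // the variable is never read on this path
}

// "May" semantics: one offending path is enough to warn.
bool mayBeUninitialized(const std::vector<std::vector<Event>>& paths) {
    for (const auto& path : paths) {
        if (readsBeforeWrite(path)) return true;
    }
    return false;
}
```

For a variable assigned in only one branch of an if, the paths {Store, Load} and {Load} make mayBeUninitialized return true. When every path stores before loading, it returns false.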

Key Features:

  • Flow-Sensitive: Distinguishes between different execution paths (e.g., if-else branches).
  • Source Mapping: Uses LLVM Debug Metadata to point to exact C source lines.
  • Worklist Algorithm: Efficiently computes fixed-point convergence for iterative data-flow.
  • Custom Lattice Logic: Implements a three-state lattice to handle complex control-flow merges.

Technical Design

1. Data-Flow Lattice

The analysis operates on a lattice representing the state of each stack-allocated variable (alloca):

  • Undefined: The variable has been allocated but not yet stored to.
  • Initialized: The variable has been explicitly assigned a value.
  • MaybeNull/Overdefined: A state used for complex pointer analysis or conflicting paths.

2. The Meet Operator

The "Meet" operator defines how states merge at Join Points in the Control Flow Graph (CFG).

  • Initialized $\sqcap$ Initialized = Initialized
  • Initialized $\sqcap$ Undefined = Undefined (at least one incoming path leaves the variable unset, so a later load triggers a warning)
  • Undefined $\sqcap$ Undefined = Undefined
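The three rules above can be written as a single function. This is a minimal sketch, not the project's actual code: the State enum and the meet function are assumed names, and the third lattice state is left out because the rules above do not say how it merges.

```cpp
// States named after the lattice above; the enum itself is an assumption.
enum class State { Undefined, Initialized };

// Meet at a CFG join: the result is Initialized only if both inputs are.
constexpr State meet(State a, State b) {
    return (a == State::Initialized && b == State::Initialized)
               ? State::Initialized
               : State::Undefined;
}

// Compile-time checks mirroring the three rules listed above.
static_assert(meet(State::Initialized, State::Initialized) == State::Initialized,
              "Initialized meet Initialized");
static_assert(meet(State::Initialized, State::Undefined) == State::Undefined,
              "Initialized meet Undefined");
static_assert(meet(State::Undefined, State::Undefined) == State::Undefined,
              "Undefined meet Undefined");
```

Because Initialized acts as the identity element here, folding meet over any number of predecessors yields Undefined as soon as one of them is Undefined.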

3. Transfer Functions

The analysis updates states based on instructions:

  • StoreInst: Transitions the pointer operand to the Initialized state.
  • LoadInst: Checks the current state; if Undefined, a warning is emitted.
  • AllocaInst: Places the newly allocated variable in the Undefined state at function entry.
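The transfer rules above can be sketched without LLVM by using a toy instruction model. Op, Inst, VarStates, and transfer are hypothetical stand-ins for the real LLVM instruction classes and the project's own data structures, which are not shown in this README.

```cpp
#include <map>
#include <string>
#include <vector>

enum class State { Undefined, Initialized };

// Hypothetical stand-ins for AllocaInst, StoreInst and LoadInst.
enum class Op { Alloca, Store, Load };

struct Inst {
    Op op;
    std::string var; // name of the stack slot the instruction touches
    int line;        // source line, as debug metadata would supply it
};

using VarStates = std::map<std::string, State>;

// Applies one instruction to the state map. A load from a tracked slot that
// is still Undefined appends its line number to `warnings`.
void transfer(const Inst& inst, VarStates& states, std::vector<int>& warnings) {
    switch (inst.op) {
    case Op::Alloca:
        states[inst.var] = State::Undefined;   // allocated, not yet written
        break;
    case Op::Store:
        states[inst.var] = State::Initialized; // explicit assignment
        break;
    case Op::Load: {
        auto it = states.find(inst.var);
        // Slots that were never allocated here are not tracked, so no warning.
        if (it != states.end() && it->second == State::Undefined) {
            warnings.push_back(inst.line);
        }
        break;
    }
    }
}
```

Applying Alloca, Load, Store, Load to the same slot in that order records a warning for the first load only, since the store moves the slot to Initialized before the second load.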

Project Structure

.
├── CMakeLists.txt             # Top-level build configuration
├── src/
│   ├── CMakeLists.txt         # Plugin build rules
│   ├── AnalysisPass.cpp       # Main logic and Worklist Algorithm
│   ├── DataFlowAnalysis.h     # Class definitions and Lattice States
│   └── AnalysisPrep.cpp       # LLVM Plugin registration code
└── test/
    ├── test.c                 # C test cases (True/False Positives)
    └── test.ll                # Generated LLVM IR

Build Instructions

Prerequisites

  1. LLVM 20 (including opt, clang, and FileCheck)
  2. CMake 3.15+
  3. Ninja or Make

Compilation

  1. Create a build directory:

    mkdir build && cd build
    
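  2. Configure with CMake, selecting the Ninja generator so that the ninja command in the next step applies (with the default Makefile generator, run make instead):

    cmake -G Ninja ..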
    
  3. Compile the plugin:

    ninja
    

This generates src/LLVMStaticAnalyzer.so inside the build directory.

Usage

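  1. Generate LLVM IR

    From the repository root, compile your C code to textual LLVM IR with debug info (-g) and without optimizations, so that the structure and line numbers are preserved for analysis. The -Xclang -disable-O0-optnone flag keeps clang from marking functions optnone at -O0: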

    clang -g -S -emit-llvm -Xclang -disable-O0-optnone test/test.c -o test/test.ll
    
  2. Run the Analyzer

    From the build directory, use the opt tool to load the plugin and run the pass:

    opt -load-pass-plugin ./src/LLVMStaticAnalyzer.so \
        -passes="static-analyzer" \
        -disable-output ../test/test.ll
    

Example Test Case

#include <stdio.h>

void test_func(int x){
    int a;
    int b = 10;

    if(x > 5){
        printf("%d\n" , a); // bug : this should trigger a warning
    }else{
        printf("%d\n" , b);
    }
}

void test_logic(int x){
    int val;
    if(x > 0){
        val = 1;
    }else{
        val = 2;
    }
    printf("%d" , val); //  this should be safe
}

void test_partial(int x){
    int bug;
    if(x > 100){
        bug = 42;
    }
    printf("%d" , bug); // bug : uninitialized when x <= 100, this should trigger a warning
}

Expected Output (the main function listed at the end is part of test/test.c but is not shown above):

Analyzing Function: test_func
Warning: Uninitialized read in test_func at line 8
Analyzing Function: test_logic
Analyzing Function: test_partial
Warning: Uninitialized read in test_partial at line 29
Analyzing Function: main

Implementation Details

The Worklist Algorithm

The analyzer uses a std::deque to track Basic Blocks that need processing. A block is re-added to the worklist only if its OutState changes, ensuring the algorithm reaches a fixed point (convergence) efficiently even in the presence of loops.

Plugin Registration

The project uses the PassPluginLibraryInfo structure to register with the LLVM pipeline. This allows it to be used seamlessly with the standard opt toolchain via the -passes flag.
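To make the worklist algorithm described earlier in this section concrete, here is a minimal LLVM-free sketch for a single tracked variable. It follows the conventional forward formulation, where a change in a block's OutState queues that block's successors. Block, analyze, and the field names are assumptions made for illustration, not the project's actual types.

```cpp
#include <deque>
#include <vector>

enum class State { Undefined, Initialized };

constexpr State meet(State a, State b) {
    return (a == State::Initialized && b == State::Initialized)
               ? State::Initialized
               : State::Undefined;
}

// Hypothetical basic block for one tracked variable. Block 0 is the entry.
struct Block {
    std::vector<int> preds; // indices of predecessor blocks
    std::vector<int> succs; // indices of successor blocks
    bool stores;            // the block writes the variable
    int loadLine;           // line of a read that precedes any write in the block; 0 if none
};

// Returns the source lines of reads that may see an uninitialized value.
std::vector<int> analyze(const std::vector<Block>& cfg) {
    const int n = static_cast<int>(cfg.size());
    // Optimistic start: Initialized is the identity element of meet.
    std::vector<State> in(n, State::Initialized), out(n, State::Initialized);

    std::deque<int> worklist;
    for (int b = 0; b < n; ++b) worklist.push_back(b);

    while (!worklist.empty()) {
        const int b = worklist.front();
        worklist.pop_front();

        // The variable is Undefined on entry to the function.
        State s = (b == 0) ? State::Undefined : State::Initialized;
        for (int p : cfg[b].preds) s = meet(s, out[p]);
        in[b] = s;

        const State newOut = cfg[b].stores ? State::Initialized : s;
        if (newOut != out[b]) {
            out[b] = newOut;
            // Only a changed OutState causes more work to be queued.
            for (int succ : cfg[b].succs) worklist.push_back(succ);
        }
    }

    // Report after convergence so that every warning reflects the fixed point.
    std::vector<int> warnings;
    for (int b = 0; b < n; ++b) {
        if (cfg[b].loadLine != 0 && in[b] == State::Undefined) {
            warnings.push_back(cfg[b].loadLine);
        }
    }
    return warnings;
}
```

States can only move from Initialized to Undefined in this sketch, so each block's OutState changes at most once and the loop terminates. Modeling test_partial as three blocks (entry, the then-branch that stores, and a join block that reads at line 29) yields a single warning for line 29.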

Contributor

Atharva Pingale ( https://github.com/atharva0300 )
