The Causal Analysis Toolkit provides a comprehensive pipeline for:
- Dataset Preprocessing
- Exploratory Data Analysis (EDA): Histograms and scatterplots
- Pairwise Dependence Analysis: Using the Kernel-based Conditional Independence (KCI) test, the Randomized Conditional Independence Test (RCIT), and the Hilbert-Schmidt Independence Criterion (HSIC) test
- Causal Discovery: Using the PC algorithm from causal-learn, as well as GES.
This repository aims to provide researchers with an easy-to-use framework for causal discovery from observational data.
✅ Preprocessing: Handles dataset loading and cleaning
✅ Visualizations: Generates histograms and pairwise scatterplots
✅ Statistical Independence Tests: Computes conditional independence (CI) p-values with several tests (KCI, RCIT, HSIC)
✅ Causal Discovery: Uses the PC or GES algorithm to infer causal graphs
git clone https://github.com/lokali/causal_analysis_toolkit.git
cd causal_analysis_toolkit
pip install -r requirements.txt
jupyter notebook 01_data_analysis.ipynb
This notebook will:
- Load and visualize the dataset
- Generate histograms and scatterplots
- Compute KCI, RCIT, and HSIC dependence matrices
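To make the idea of a pairwise dependence matrix concrete, here is a minimal NumPy sketch of a permutation-based HSIC test applied to every pair of columns. It is not the toolkit's implementation (that lives in utils.py and is not reproduced here); the function names `rbf_gram`, `hsic_pvalue`, and `hsic_pvalue_matrix`, the median-heuristic bandwidth, and the permutation count are all assumptions made for illustration.

```python
import numpy as np


def rbf_gram(x, bandwidth=None):
    """Gaussian (RBF) Gram matrix of a 1-D sample (hypothetical helper)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    sq_dists = (x - x.T) ** 2
    if bandwidth is None:
        # Median heuristic on the off-diagonal squared distances (an assumption).
        off_diag = sq_dists[np.triu_indices_from(sq_dists, k=1)]
        median_sq = np.median(off_diag)
        bandwidth = np.sqrt(0.5 * median_sq) if median_sq > 0 else 1.0
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def hsic_pvalue(x, y, n_permutations=200, seed=0):
    """Permutation p-value for the biased HSIC statistic trace(K H L H) / n^2."""
    rng = np.random.default_rng(seed)
    K, L = rbf_gram(x), rbf_gram(y)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H  # center K once; trace(K H L H) equals sum(Kc * L)
    observed = np.sum(Kc * L) / n ** 2
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(n)
        # Permuting y breaks any dependence between x and y.
        permuted = np.sum(Kc * L[np.ix_(perm, perm)]) / n ** 2
        if permuted >= observed:
            exceed += 1
    return (exceed + 1) / (n_permutations + 1)


def hsic_pvalue_matrix(data, n_permutations=200, seed=0):
    """Symmetric matrix of pairwise HSIC p-values; the diagonal is set to 1."""
    data = np.asarray(data, dtype=float)
    d = data.shape[1]
    pvals = np.ones((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            p = hsic_pvalue(data[:, i], data[:, j], n_permutations, seed)
            pvals[i, j] = pvals[j, i] = p
    return pvals


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.normal(size=150)
    demo = np.column_stack([x, x ** 2 + 0.1 * rng.normal(size=150), rng.normal(size=150)])
    print(hsic_pvalue_matrix(demo))
```

A small p-value in entry (i, j) suggests that columns i and j are dependent. The permutation approach is simple but costs O(n^2) per permutation, so a production implementation would typically use an analytic approximation of the null distribution instead.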
Run run_pc in 01_data_analysis.ipynb, or run the PC algorithm from causal-learn directly:
from utils import run_pc
cg, path = run_pc(data=df.values, alpha=0.01, indep_test='fisherz', label=df.columns.values)
or
from causallearn.search.ConstraintBased.PC import pc
cg = pc(df.values, alpha=0.01, indep_test="fisherz")
This will infer the causal structure and plot the causal graph.
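For readers wondering what `indep_test="fisherz"` and `alpha=0.01` mean in the calls above, the following is a minimal sketch of the textbook Fisher-z partial correlation test under a linear-Gaussian assumption. It is an illustration only, not causal-learn's code; the function name `fisherz_pvalue` and its signature are hypothetical.

```python
import math

import numpy as np


def fisherz_pvalue(data, i, j, cond=()):
    """p-value for 'column i is independent of column j given columns cond'."""
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    idx = [i, j, *cond]
    corr = np.corrcoef(data[:, idx], rowvar=False)
    # Partial correlation of the first two variables given the rest,
    # read off the inverse correlation (precision) matrix.
    precision = np.linalg.pinv(corr)
    r = -precision[0, 1] / math.sqrt(precision[0, 0] * precision[1, 1])
    r = min(max(r, -0.999999), 0.999999)  # keep the log below finite
    z = 0.5 * math.log((1 + r) / (1 - r))  # Fisher z-transform
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(stat)).
    return math.erfc(stat / math.sqrt(2))


if __name__ == "__main__":
    # Synthetic chain X -> Y -> Z: X and Z are dependent marginally,
    # but independent once Y is conditioned on.
    rng = np.random.default_rng(0)
    x = rng.normal(size=500)
    y = x + rng.normal(size=500)
    z = y + rng.normal(size=500)
    chain = np.column_stack([x, y, z])
    print("X vs Z:", fisherz_pvalue(chain, 0, 2))
    print("X vs Z given Y:", fisherz_pvalue(chain, 0, 2, cond=(1,)))
```

PC removes the edge between two variables whenever some conditioning set yields a p-value above `alpha`, so a smaller `alpha` tends to produce a sparser graph. Fisher-z assumes roughly linear relationships with Gaussian noise; for nonlinear data, a kernel-based test such as KCI is the usual alternative.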
causal_analysis_toolkit/
│── 01_data_analysis.ipynb # Jupyter Notebook for data analysis
│── utils.py # Utility functions for CI tests & visualization
│── requirements.txt # List of dependencies
│── README.md # Documentation
│── results/ # Output directory for figures & logs
For questions or feedback, feel free to contact me via Longkang.Li@mbzuai.ac.ae.