- Abstract
- Key Features
- Quick Start
- Dataset Generation
- Training
- Inference
- Requirements
- Performance
- Citation
- License
AsterNav by the Perception and Autonomous Robotics Group at the Department of Robotics Engineering, Worcester Polytechnic Institute.
Autonomous aerial navigation in absolute darkness is crucial for post-disaster search and rescue operations, where disaster-zone power outages often leave the environment unlit. Yet, due to resource constraints, tiny aerial robots, otherwise perfectly suited for these operations, are unable to navigate safely in the dark to find survivors. In this paper, we present an autonomous aerial robot for navigation in the dark that combines an Infra-Red (IR) monocular camera with a large-aperture coded lens and structured light, without external infrastructure such as GPS or motion capture. Our approach obtains depth-dependent defocus cues (each structured light point appears as a pattern whose shape depends on depth), which act as a strong prior for our AsterNet deep depth estimation model. The model is trained in simulation on data generated with a simple optical model and transfers directly to the real world without any fine-tuning or retraining. AsterNet runs onboard the robot at 20 Hz on an NVIDIA Jetson Orin Nano. Furthermore, our network is robust to changes in the structured light pattern and to the relative placement of the pattern emitter and IR camera, leading to simplified and cost-effective construction. We evaluate and demonstrate our proposed navigation approach AsterNav, which uses depth from AsterNet, in extensive real-world experiments using only onboard sensing and computation, including dark matte obstacles and thin ropes (6.25 mm diameter), achieving an overall success rate of 95.5% with unknown object shapes, locations, and materials. To the best of our knowledge, this is the first work on monocular, structured-light-based quadrotor navigation in absolute darkness.
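The defocus cue can be made concrete with standard thin-lens geometry (a textbook relation, not an equation from the paper): for a lens of focal length $f$ and aperture diameter $A$ focused at distance $d_f$, a point source at distance $d$ is imaged as a blur spot of diameter

$$
b(d) = A\,\frac{f}{d_f - f}\,\frac{|d - d_f|}{d},
$$

so the size of each projected structured-light dot, shaped by the coded aperture, encodes its distance and gives AsterNet its depth prior.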
- Real-time Depth Estimation: 20 Hz inference on Jetson Orin Nano
- Autonomous Navigation: Potential field-based path planning with obstacle avoidance
- Coded Aperture Imaging: PSF-based depth estimation for improved accuracy
- Edge Deployment: Optimized for NVIDIA Jetson platforms
- Synthetic Dataset Generation: Automated creation of training data
- Uncertainty Estimation: Dual-output model with depth and uncertainty prediction
- MAVLink Integration: Direct control of Pixhawk-based drones
# Clone the repository
git clone https://github.com/your-username/AsterNav.git
cd AsterNav
# Install dependencies
pip install -r requirements.txt
# Option 1: Generate data and train
cd Training
python datagen.py --config config.yaml
python train.py --config config.yaml
# Option 2: Use existing data (skip datagen.py)
cd Training
python train.py --config config.yaml
# Run inference
cd ../Inference
python infer_depth.py --input assets/augmented_imgs/ --output results/
The dataset generation uses a YAML configuration file (Training/config.yaml) with the following key parameters:
input_img_folder: '../coco2014/images/train2014' # Base images
psf_base_folder: 'Lens_16mm_PSF_Data' # PSF patterns
distances: ['0.25', '0.50', '0.75', ..., '3.50'] # Distance ranges (meters)
max_images: 25000 # Target dataset size
target_size: {width: 640, height: 480} # Output resolution
cd Training
# Generate 20-30k training images (recommended)
python datagen.py --config config.yaml --max_images 25000
# With custom parameters
python datagen.py --config config.yaml --output_dir custom_data --seed 42
Recommended: Generate 20,000-30,000 images for optimal model performance. The dataset generation process:
- Loads base images from COCO dataset
- Applies PSF patterns at different distances
- Creates irregular polygon overlays
- Generates corresponding depth maps
- Applies optional sensor noise simulation
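The core of these steps can be sketched in a few lines. The snippet below is only an illustrative approximation of what datagen.py might do, assuming the PSF for each distance is stored as a grayscale kernel image and that the simulation reduces to convolving a base image with that kernel; the actual script, file layout, and output names may differ.

```python
# Illustrative sketch of PSF-based data generation (assumed workflow, not the repo's exact code).
import cv2
import numpy as np
import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

base = cv2.imread("example_base.jpg", cv2.IMREAD_GRAYSCALE)          # base scene image (hypothetical file)
base = cv2.resize(base, (cfg["target_size"]["width"], cfg["target_size"]["height"]))

distance = cfg["distances"][0]                                        # e.g. '0.25' metres
psf = cv2.imread(f"{cfg['psf_base_folder']}/{distance}.png",          # assumed per-distance PSF file layout
                 cv2.IMREAD_GRAYSCALE).astype(np.float32)
psf /= psf.sum()                                                      # normalise so brightness is preserved

blurred = cv2.filter2D(base.astype(np.float32), -1, psf)              # apply the depth-dependent PSF
depth_map = np.full(base.shape, float(distance), dtype=np.float32)    # matching ground-truth depth (metres)

cv2.imwrite("augmented_imgs/example_0.png", np.clip(blurred, 0, 255).astype(np.uint8))
np.save("depth_maps/example_0.npy", depth_map)
```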
If you already have generated training data, you can skip the data generation step and directly specify the data paths in the configuration:
# In Training/config.yaml, update the data section:
data:
  train_dir: "path/to/your/augmented_imgs"   # Path to existing training images
  depth_dir: "path/to/your/depth_maps"       # Path to existing depth maps
Note: Comment out or skip the data generation step if your data is already available. The training script will automatically load data from the specified directories.
Before training, ensure your data paths are correctly configured in Training/config.yaml:
data:
  train_dir: "training_data/augmented_imgs"   # Path to training images
  depth_dir: "training_data/depth_maps"       # Path to depth maps
For existing data: Update these paths to point to your pre-generated dataset directories.
cd Training
# Basic training
python train.py --config config.yaml
# With Weights & Biases logging
export WANDB_API_KEY=your_key_here
python train.py --config config.yaml --wandb
# Resume training
python train.py --config config.yaml --resume checkpoint.pth
Key training parameters in config.yaml:
model:
  input_width: 640
  input_height: 480
  pretrained: true
  freeze_encoder: false
  memory_efficient: true
training:
  batch_size: 8
  epochs: 100
  learning_rate: 1e-4
  mixed_precision: true
data:
  train_dir: "training_data/augmented_imgs"   # Update this path for existing data
  depth_dir: "training_data/depth_maps"       # Update this path for existing data
The training script automatically saves:
- Best model: densenet_best_215AM.pth (based on validation loss)
- Checkpoints: Regular checkpoints during training
- TensorRT engine: Optimized model for inference
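The mixed_precision flag maps onto PyTorch automatic mixed precision (AMP). As a rough illustration of how the parameters above come together, the sketch below shows a minimal AMP training step; the stand-in model, assumed L1 loss, and the hypothetical PairedDepthDataset from the earlier sketch are placeholders, not the code in train.py.

```python
# Minimal AMP training-step sketch reflecting batch_size, learning_rate and mixed_precision above.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

model = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 1, 3, padding=1)).cuda()     # placeholder for the real depth network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)        # learning_rate: 1e-4
scaler = torch.cuda.amp.GradScaler()                             # mixed_precision: true
criterion = nn.L1Loss()                                          # assumed depth loss

loader = DataLoader(PairedDepthDataset("training_data/augmented_imgs",
                                       "training_data/depth_maps"),
                    batch_size=8, shuffle=True)                  # batch_size: 8

for images, depths in loader:
    images, depths = images.cuda(), depths.cuda()
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():                              # forward pass in mixed precision
        loss = criterion(model(images), depths)
    scaler.scale(loss).backward()                                # scale to avoid fp16 gradient underflow
    scaler.step(optimizer)
    scaler.update()
```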
Minimal hardware requirements for inference:
- Any modern CPU/GPU (runs at 20 Hz on a Jetson Orin Nano)
- 8GB RAM minimum
- Python 3.8+
cd Inference
# Single image
python infer_depth.py --input image.jpg --output depth.png
# Batch processing
python infer_depth.py --input assets/augmented_imgs/ --output results/ --batch
# With uncertainty visualization
python infer_depth.py --input image.jpg --output depth.png --uncertainty
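Under the hood, single-image inference with the PyTorch checkpoint reduces to a resize, a forward pass, and a rescale for visualization. The sketch below is illustrative only: it assumes the checkpoint stores a callable model and a single-channel 640x480 input, which may not match the repository; infer_depth.py remains the supported entry point.

```python
# Illustrative single-image forward pass (assumptions: callable checkpoint, 1-channel 640x480 input).
import cv2
import numpy as np
import torch

model = torch.load("densenet_best_215AM.pth", map_location="cuda")
model.eval()

img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (640, 480)).astype(np.float32) / 255.0        # match input_width/input_height
x = torch.from_numpy(img)[None, None].cuda()                         # 1x1xHxW tensor

with torch.no_grad():
    out = model(x)                                                    # dual output: depth (and uncertainty)
depth = out[0] if isinstance(out, (tuple, list)) else out

depth_np = depth.squeeze().cpu().numpy()
cv2.imwrite("depth.png", (255 * depth_np / depth_np.max()).astype(np.uint8))
```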
# Basic navigation
python autonomous_navigation.py
# With custom configuration
python autonomous_navigation.py --config navigation_config.yaml
# Disable logging for performance
python autonomous_navigation.py --no_log
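The navigation node plans with potential fields over the predicted depth (see Key Features). The snippet below is a toy illustration of that idea, not the logic in autonomous_navigation.py: close pixels push the commanded velocity away from obstacles while a constant attractive term pulls the robot forward.

```python
# Toy potential-field command from a depth map: attractive pull forward, repulsive push from close pixels.
import numpy as np

def potential_field_command(depth, d_safe=1.0, k_rep=0.5, v_forward=0.5):
    """depth: HxW array in metres. Returns (vx, vy, vz) in the camera frame (z forward)."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    close = depth < d_safe                                   # pixels that exert a repulsive force
    if not np.any(close):
        return 0.0, 0.0, v_forward                           # path clear: fly straight ahead

    weight = (d_safe - depth[close]) / d_safe                # closer pixels push harder
    px = (xs[close] - w / 2) / (w / 2)                       # pixel offsets from image centre in [-1, 1]
    py = (ys[close] - h / 2) / (h / 2)
    vx = -k_rep * np.mean(weight * px)                       # lateral command (away from obstacles)
    vy = -k_rep * np.mean(weight * py)                       # vertical command (image y axis)
    vz = v_forward * (1.0 - np.mean(weight))                 # slow down as obstacles get closer
    return float(vx), float(vy), float(vz)
```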
# Convert PyTorch model to TensorRT
python convert_to_trt.py --model densenet_best_215AM.pth --output densenet_32_unet.trt --precision fp16
# Run optimized inference
python infer_depth.py --input image.jpg --backend tensorrt --model densenet_32_unet.trt
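If you need to reproduce the conversion without the provided script, one common route is PyTorch to ONNX followed by trtexec; this is an assumed workflow, and convert_to_trt.py may do something different (e.g. torch2trt).

```python
# Assumed PyTorch -> ONNX export step; convert_to_trt.py may follow a different path.
import torch

model = torch.load("densenet_best_215AM.pth", map_location="cpu")    # assumes a callable model object
model.eval()
dummy = torch.zeros(1, 1, 480, 640)                                   # assumed single-channel 640x480 input
torch.onnx.export(model, dummy, "asternet.onnx",
                  input_names=["image"], output_names=["depth", "uncertainty"],
                  opset_version=17)
# The ONNX file can then be built into an FP16 engine on the target device, e.g.:
#   trtexec --onnx=asternet.onnx --saveEngine=densenet_32_unet.trt --fp16
```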
torch>=2.0.0
torchvision>=0.15.0
opencv-python>=4.8.0
numpy>=1.24.0
scipy>=1.10.0
tqdm>=4.65.0
pyyaml>=6.0
tensorrt>=8.0.0 # For optimized inference
wandb>=0.15.0 # For training logging
pymavlink>=2.4.0 # For drone control
pyrealsense2>=2.54.0 # For RealSense cameras
Training:
- NVIDIA GPU with 16GB+ VRAM
- 32GB RAM
Inference:
- Any modern CPU/GPU
- 8GB RAM
- Compatible with Jetson Nano/Xavier/Orin
Note: The model achieves 20 Hz inference on a Jetson Orin Nano with TensorRT optimization.
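To sanity-check throughput on your own hardware, a rough timing loop like the one below is enough; it reuses the same loading assumption as the inference sketch above, and you would swap in the TensorRT-backed call for on-device numbers.

```python
# Rough throughput check (assumes a callable PyTorch checkpoint and a 1x1x480x640 input).
import time
import torch

model = torch.load("densenet_best_215AM.pth", map_location="cuda")
model.eval()
x = torch.zeros(1, 1, 480, 640, device="cuda")

with torch.no_grad():
    for _ in range(10):                     # warm-up iterations
        model(x)
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    n = 100
    for _ in range(n):
        model(x)
    torch.cuda.synchronize()
print(f"{n / (time.perf_counter() - t0):.1f} inferences/s")
```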
If you use AsterNav in your research, please cite our paper:
@ARTICLE{11346995,
author={Singh, Deepak and Khobragade, Shreyas and Sanket, Nitin J.},
journal={IEEE Robotics and Automation Letters},
title={AsterNav: Autonomous Aerial Robot Navigation In Darkness Using Passive Computation},
year={2026},
volume={},
number={},
pages={1-8},
keywords={Aerial systems;Cameras;Navigation;Uncertainty;Estimation;Autonomous aerial vehicles;Depth estimation;Autonomous navigation;computational imaging;coded aperture;darkness;low-light;passive sensing},
doi={10.1109/LRA.2026.3653388}
}
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.
For questions or collaborations, please contact the authors.
AsterNav: Autonomous Aerial Robot Navigation In Darkness Using Passive Computation
