Abstract
This study develops a real-time UAV-based wildlife surveillance system that detects camouflaged and nocturnal animals using thermal infrared (TIR) imaging. RGB and night-vision cameras perform poorly in low-light and densely vegetated environments; the study addresses this limitation with a unified deep learning approach tailored to TIR data. The method uses the BIRDSAI aerial thermal dataset and adapts the YOLOR architecture through multi-channel TIR augmentation and adaptive thresholding. The model was evaluated against YOLOv5 and CenterNet2 under identical configurations, using mean average precision (mAP), inference speed, and precision-recall analysis. Experiments were run on both synthetic and real TIR sequences, with extensive augmentation to improve robustness. The proposed YOLOR-based framework achieves an mAP of 38.2% and processes frames in real time at 73.6 FPS, outperforming YOLOv5 and CenterNet2 in detecting small, low-contrast, and camouflaged animals. Adaptive thresholding improved precision by 4%, particularly for species with overlapping heat signatures. Class merging and multi-channel enhancement further improved detection stability under limited-data conditions. These results indicate that UAV-mounted TIR imaging combined with unified deep learning models offers an efficient solution for nocturnal wildlife protection, anti-poaching operations, and remote habitat monitoring. The system’s real-time capability supports large-scale conservation applications in environments where traditional visual-spectrum methods fail.

