Rapid Detection and Identification of Roadside Wildlife using CCTV with Artificial-Intelligence

Vedant Srinivas

Fraser Shilling, Road Ecology Center, UC Davis

Topic Area

Advancing innovative technologies

Event

Wildlife-vehicle conflict poses injury and mortality risks to both drivers and wildlife. Roadside Animal Detection Systems (RADS) are being developed and deployed to detect animals near or on roads and to warn drivers to reduce speed and increase attentiveness. Animal detection systems include video/still cameras, radar, thermal imaging, and advanced Light Detection and Ranging (LiDAR) sensors. Camera/video systems provide validation information for other sensor types and are sensors in their own right. Detecting and identifying/classifying objects in moving imagery (e.g., closed-circuit television, CCTV, feeds) offers an opportunity both to map an object's trajectory and to determine whether it is of concern from a driver-safety point of view. We combined low-cost, commercial UltraHD (4K) outdoor CCTV cameras with a customized version of an existing artificial-intelligence-based computer vision model (YOLO, v3) that can use video streams to allow very rapid (<<200 milliseconds) detection and classification of moving objects. The CCTV system is part of an investigation for Nevada DOT along USA Parkway (NV 439) comparing 4 types of animal sensors. The AI system uses three processing and analysis steps: 1) creation of a bounding box around moving objects of interest, 2) classification of the object as vehicle, human, or animal, and possibly animal type, and 3) mapping of the trajectories of moving objects of interest relative to the roadway. The AI model was pre-trained with 200K original images from the COCO dataset across 80 categories, in addition to 1000 images of horses, including images from the study area. Tests of the model were conducted using 248 images not used in the training set. The trained model was 91.5% accurate at classifying horses in test images over distances of 10 m to 100 m, corresponding to apparent sizes of 300 x 200 pixels and 100 x 50 pixels in the image, respectively. The model misclassified horses as sheep, cows, or dogs 5.2% of the time. It never misclassified horses as vehicles or people, or vice versa. It also detected and classified vehicles at 99.9% accuracy. Running on a GeForce RTX 3060 with 3584 CUDA cores, the model ran inference at 14 frames per second (FPS), or about 70 milliseconds per frame, for object classification. Both the AI model and the CCTV camera on the deployed system could be adjusted (e.g., optical zoom, infrared capability, resolution, and frame rate) through remote desktop access. The combination of rapid and accurate large-animal classification with remotely adjustable hardware and software systems provides the possibility of highly controllable and adjustable RADS in real-world settings.
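
As a rough illustration of steps 2 and 3 and of the frame-rate figure, the following Python sketch groups COCO class labels into the coarse categories named above, reduces bounding boxes to centroids that form a trajectory, and converts frames per second to per-frame latency. It is not the deployed system's code: the `Detection` structure, the function names, and the particular grouping of COCO labels into "vehicle" and "animal" are assumptions made for illustration, and the detector itself (YOLO v3) is not reproduced here.

```python
from dataclasses import dataclass

# COCO class names grouped into coarse categories.
# The grouping is an illustrative assumption, not the authors' mapping.
VEHICLES = {"bicycle", "car", "motorcycle", "bus", "truck"}
ANIMALS = {"bird", "cat", "dog", "horse", "sheep", "cow",
           "elephant", "bear", "zebra", "giraffe"}


def coarse_category(label: str) -> str:
    """Map a COCO class label to vehicle, human, animal, or other."""
    if label == "person":
        return "human"
    if label in VEHICLES:
        return "vehicle"
    if label in ANIMALS:
        return "animal"
    return "other"


@dataclass
class Detection:
    """Hypothetical per-frame detection: label plus pixel bounding box."""
    label: str
    x: float  # left edge of the box
    y: float  # top edge of the box
    w: float  # box width
    h: float  # box height

    def centroid(self) -> tuple[float, float]:
        """Center of the bounding box in pixel coordinates."""
        return (self.x + self.w / 2, self.y + self.h / 2)


def trajectory(detections: list[Detection]) -> list[tuple[float, float]]:
    """Ordered centroids of one object's detections across frames."""
    return [d.centroid() for d in detections]


def ms_per_frame(fps: float) -> float:
    """Convert a frame rate to per-frame processing time in milliseconds."""
    return 1000.0 / fps
```

For example, `ms_per_frame(14)` gives roughly 71.4, consistent with the approximately 70 milliseconds per frame reported above, and `coarse_category("horse")` returns `"animal"`. Relating the centroid trajectory to the roadway would additionally require a camera-specific description of the road in image coordinates, which is not sketched here.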