TY - GEN
T1 - Towards Fast Detection and Classification of Moving Objects
AU - Palma-Ugarte, Joaquin
AU - Estacio-Cerquin, Laura
AU - Flores-Benites, Victor
AU - Mora-Colque, Rensso
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
N2 - The detection and classification of moving objects are fundamental tasks in computer vision. However, current solutions typically employ two isolated processes for detecting and classifying moving objects. First, all objects within the scene are detected; then a separate algorithm determines the subset of objects that are in motion. Furthermore, many solutions employ complex networks that require substantial computational resources, whereas lightweight solutions could enable widespread use. We propose an enhancement, along with an extended explanation, of TRG-Net, a unified model that can be executed on computationally limited devices to detect and classify only moving objects. This proposal is based on the Faster R-CNN architecture, MobileNetV3 as a feature extractor, and an improved GMM-based method for a fast and flexible search of regions of interest. TRG-Net reduces the inference time by unifying the moving object detection and image classification tasks and by limiting the region proposals to a configurable fixed number of potential moving objects. Experiments on heterogeneous surveillance videos and the KITTI dataset for 2D object detection show that our approach reduces the inference time of Faster R-CNN (from 0.176 s to 0.149 s) while using fewer parameters (from 18.91 M to 18.30 M) and maintaining average precision (AP = 0.423). Therefore, the enhanced TRG-Net achieves a more favorable trade-off between precision and speed, and it could be applied to address real-world problems.
AB - The detection and classification of moving objects are fundamental tasks in computer vision. However, current solutions typically employ two isolated processes for detecting and classifying moving objects. First, all objects within the scene are detected; then a separate algorithm determines the subset of objects that are in motion. Furthermore, many solutions employ complex networks that require substantial computational resources, whereas lightweight solutions could enable widespread use. We propose an enhancement, along with an extended explanation, of TRG-Net, a unified model that can be executed on computationally limited devices to detect and classify only moving objects. This proposal is based on the Faster R-CNN architecture, MobileNetV3 as a feature extractor, and an improved GMM-based method for a fast and flexible search of regions of interest. TRG-Net reduces the inference time by unifying the moving object detection and image classification tasks and by limiting the region proposals to a configurable fixed number of potential moving objects. Experiments on heterogeneous surveillance videos and the KITTI dataset for 2D object detection show that our approach reduces the inference time of Faster R-CNN (from 0.176 s to 0.149 s) while using fewer parameters (from 18.91 M to 18.30 M) and maintaining average precision (AP = 0.423). Therefore, the enhanced TRG-Net achieves a more favorable trade-off between precision and speed, and it could be applied to address real-world problems.
KW - Classification
KW - Detection
KW - Gaussian mixture
KW - Lightweight model
KW - Moving objects
UR - http://www.scopus.com/inward/record.url?scp=85202597991&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-66743-5_8
DO - 10.1007/978-3-031-66743-5_8
M3 - Conference contribution
AN - SCOPUS:85202597991
SN - 9783031667428
T3 - Communications in Computer and Information Science
SP - 161
EP - 180
BT - Computer Vision, Imaging and Computer Graphics Theory and Applications - 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics, VISIGRAPP 2023, Revised Selected Papers
A2 - de Sousa, A. Augusto
A2 - Bashford-Rogers, Thomas
A2 - Paljic, Alexis
A2 - Ziat, Mounia
A2 - Hurter, Christophe
A2 - Purchase, Helen
A2 - Radeva, Petia
A2 - Farinella, Giovanni Maria
A2 - Bouatouch, Kadi
PB - Springer Science and Business Media Deutschland GmbH
T2 - 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2023
Y2 - 19 February 2023 through 21 February 2023
ER -