TY - JOUR
T1 - A Lightweight Gaussian-Based Model for Fast Detection and Classification of Moving Objects
AU - Palma-Ugarte, Joaquin
AU - Estacio-Cerquin, Laura
AU - Flores-Benites, Victor
AU - Mora-Colque, Rensso
N1 - Publisher Copyright: © 2022 by SCITEPRESS-Science and Technology Publications, Lda.
PY - 2023
Y1 - 2023
AB - Moving object detection and classification are fundamental tasks in computer vision. However, current solutions first detect all objects and then rely on a second algorithm to determine which of them are in motion. Furthermore, many solutions employ complex networks that demand substantial computational resources, whereas lightweight solutions could enable widespread use. We introduce TRG-Net, a unified model that can run on computationally limited devices to detect and classify only moving objects. The proposal is based on the Faster R-CNN architecture, with MobileNetV3 as the feature extractor and a Gaussian mixture model for a fast, motion-based search of regions of interest. TRG-Net reduces inference time by unifying the moving object detection and image classification tasks and by limiting the number of regions of interest to the number of moving objects. Experiments on surveillance videos and the KITTI dataset for 2D object detection show that our approach reduces the inference time of Faster R-CNN (from 0.221 s to 0.138 s) while using fewer parameters (18.30 M versus 18.91 M) and maintaining average precision (AP = 0.423). TRG-Net therefore achieves a balance between precision and speed and could be applied in a variety of real-world scenarios.
KW - Classification
KW - Detection
KW - Gaussian Mixture
KW - Lightweight Model
KW - Moving Objects
UR - http://www.scopus.com/inward/record.url?scp=85184960676&partnerID=8YFLogxK
U2 - 10.5220/0011697200003417
DO - 10.5220/0011697200003417
M3 - Conference article
AN - SCOPUS:85184960676
SN - 2184-5921
VL - 5
SP - 173
EP - 184
JO - Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
JF - Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
T2 - 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2023
Y2 - 19 February 2023 through 21 February 2023
ER -