Title: |
FasterMLP efficient vision networks combining attention mechanisms and wavelet downsampling. |
Authors: |
Ma, Chenhao (AUTHOR), Liu, Xueyuan (AUTHOR), Cao, Yong (AUTHOR), Rong, Jian (AUTHOR) swordrong@swfu.edu.cn |
Source: |
Scientific Reports. 2/15/2025, Vol. 15 Issue 1, p1-14. 14p. |
Subject Terms: |
*MULTILAYER perceptrons, *CONVOLUTIONAL neural networks, *ARTIFICIAL intelligence, *FEATURE extraction, *VISUAL perception |
Abstract: |
The integration of Multi-layer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), and attention mechanisms has been demonstrated to significantly enhance model performance across various computer vision tasks. In this paper, a novel lightweight neural network architecture, FasterMLP, is proposed to achieve high computational efficiency and accuracy, particularly in resource-constrained and real-time applications. FasterMLP is designed to combine the local connectivity and weight-sharing properties of CNNs with the global feature representation capabilities of MLPs, while feature extraction is enhanced through the Convolutional Block Attention Module and spatial dimensions are effectively reduced using Haar wavelet downsampling without sacrificing critical feature information. The architecture, structured into four stages, has been rigorously evaluated on multiple benchmarks. On the ImageNet-1K dataset, a top-1 accuracy 3.9% higher than that of MobileViT-XXS is achieved by FasterMLP-S, while being 2× and 2.7× faster on GPU and CPU, respectively. On the COCO dataset, the performance of FasterMLP-L is shown to be comparable to FasterNet-L with significantly fewer parameters, and on the Cityscapes dataset, a mean Intersection-over-Union of 81.7% is achieved, surpassing existing methods such as CCNet and DANet. These results demonstrate that FasterMLP can effectively balance computational efficiency and accuracy, making it particularly suitable for visual perception tasks in resource-constrained and real-time environments such as autonomous driving. Code is available at https://github.com/windisl/FasterMLP. [ABSTRACT FROM AUTHOR] |
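The abstract's claim that Haar wavelet downsampling halves spatial resolution without discarding feature information can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function name `haar_downsample`, the channels-first layout, and the sub-band ordering are assumptions made for illustration. A single-level orthonormal Haar transform maps each 2×2 block to four sub-band coefficients, so a (C, H, W) feature map becomes (4C, H/2, W/2) and remains exactly invertible.

```python
import numpy as np

def haar_downsample(x: np.ndarray) -> np.ndarray:
    """Single-level 2D Haar transform used as a lossless downsampler.

    Hypothetical helper for illustration only. Expects x of shape
    (C, H, W) with even H and W; returns shape (4C, H/2, W/2).
    """
    if x.shape[1] % 2 or x.shape[2] % 2:
        raise ValueError("H and W must be even")
    # The four pixels of every non-overlapping 2x2 block.
    a = x[:, 0::2, 0::2]  # top-left
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    # Orthonormal Haar sub-bands (the 1/2 factor keeps total energy unchanged).
    ll = (a + b + c + d) / 2.0  # low-pass approximation
    lh = (a - b + c - d) / 2.0  # difference between columns
    hl = (a + b - c - d) / 2.0  # difference between rows
    hh = (a - b - c + d) / 2.0  # diagonal difference
    # Stack sub-bands along the channel axis: spatial size halves, channels x4.
    return np.concatenate([ll, lh, hl, hh], axis=0)

x = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
y = haar_downsample(x)
print(x.shape, "->", y.shape)
```

Because the transform is orthonormal, the sum of squared values is the same before and after, which is the sense in which no information is lost. A network would typically follow this step with a learned layer that mixes or reduces the 4C channels; that part is omitted here.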
Database: |
Academic Search Complete |