FLPC: Fusing language and point cloud for 3D object classification

Xiaozheng Gan, Chengtian Song*, Jili Li, Lizhi Pan, Keyu Xu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This study improves the accuracy of point cloud classification by introducing a novel architecture that fuses language with point clouds, drawing inspiration from recent advances in multimodal fusion. Conventional neural networks depend extensively on images as intermediaries between language and point clouds, an approach that lacks robustness and undermines accuracy. To address this, we propose FLPC, a fusion method for point cloud classification that uses an attention mechanism to integrate semantic information from textual descriptions with geometric features extracted from point cloud data. Our approach leverages a pre-trained model to extract both geometric and semantic features from the input data. These features are then integrated by a classifier module designed to exploit both feature types to improve classification performance. Within the classifier module, three distinct fusion attention architectures (CFA, SFA, PFA) are proposed. This design, which combines point cloud features with language features, yields a significant improvement in overall performance. Extensive experiments show that both CFA and SFA achieve competitive performance. Notably, PFA not only markedly outperforms the previous multimodal classification baseline but also surpasses traditional unimodal classification models, achieving state-of-the-art accuracy. Specifically, on the ModelNet40 benchmark, the proposed FLPC method improves on PointMLP by approximately 1.5%, and on the ScanObjectNN benchmark it surpasses PointMLP by 8.7%. These results underscore the efficacy of FLPC in leveraging multimodal information for 3D classification tasks.
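The abstract does not specify how CFA, SFA, or PFA are built, so the following is only a minimal sketch of the general idea of attention-based fusion, not the authors' method. It assumes a single point cloud feature vector acting as the query and one text embedding per class description acting as keys and values, with no learned projections. The function name `cross_attention_fuse` and all input values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_fuse(point_feat, text_feats):
    """Fuse one point cloud feature with a set of text features.

    Assumption: scaled dot-product attention with the point feature as
    the query and the text features as both keys and values. The fused
    vector is the point feature concatenated with the attended text
    feature. This is an illustrative stand-in, not FLPC itself.
    """
    d = len(point_feat)
    # Similarity between the point feature and each text embedding.
    scores = [
        sum(p * t for p, t in zip(point_feat, text)) / math.sqrt(d)
        for text in text_feats
    ]
    weights = softmax(scores)
    # Attention-weighted sum of the text embeddings.
    attended = [
        sum(w * text[i] for w, text in zip(weights, text_feats))
        for i in range(d)
    ]
    return point_feat + attended, weights


# Hypothetical toy inputs: a 4-dimensional point feature and three
# text embeddings, one per class description.
point_feat = [1.0, 0.0, 0.0, 0.0]
text_feats = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
fused, weights = cross_attention_fuse(point_feat, text_feats)
```

In this sketch the fused vector would be passed to a classification head; a real model would add learned query, key, and value projections and operate on batches of per-point features.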

Original language: English
Article number: 128430
Journal: Expert Systems with Applications
Volume: 296
Publication status: Published - 15 Jan 2026

Keywords

  • Attention mechanism
  • Classification
  • Multimodal fusion
  • Point cloud
