Abstract
Point cloud upsampling is a critical task in 3D computer vision, aiming to generate dense and uniformly distributed point sets from sparse inputs. While current self-supervised methods show promise, they often struggle with preserving fine-grained geometric details, especially for highly sparse point clouds. To address these limitations, we propose PointUpsampleLLM (PULLM), a novel multimodal framework that leverages the power of large language models (LLMs) to enhance 3D point cloud upsampling. PULLM integrates a pretrained Point Cloud LLM (PointLLM) with visual features extracted from point clouds, learning a unified representation that captures both geometric and semantic information. At the core of our approach is the Feature Aware Translator (FAT) module, which effectively bridges the modality gap between visual and textual features, enhancing the spatial understanding of the LLM. PULLM generates textual descriptions of point clouds on the fly, eliminating the need for large paired datasets. Extensive experiments on the PU1K and PUGAN benchmarks demonstrate that PULLM consistently outperforms state-of-the-art methods, achieving significant improvements in Chamfer Distance, Hausdorff Distance, and Point-to-Plane distance metrics. For instance, on the PUGAN dataset with sparse inputs, PULLM achieves a 56.15% improvement in Chamfer Distance over the best baseline. Our qualitative results further illustrate PULLM's superior ability to preserve fine details and generate high-quality upsampled point clouds across various object types and geometries.
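For readers unfamiliar with two of the metrics named in the abstract, the following is a minimal sketch under commonly used definitions: Chamfer Distance as the sum of the mean squared nearest-neighbor distances in both directions, and Hausdorff Distance as the larger of the two directed max-of-min Euclidean distances. The function names are hypothetical, and the paper's actual evaluation protocol (normalization, squared versus unsquared distances) is not stated here and may differ.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def _nearest_sq_dist(p: Point, cloud: Sequence[Point]) -> float:
    """Squared Euclidean distance from p to its nearest neighbor in cloud."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in cloud)


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer Distance (assumed variant: mean of squared distances)."""
    a_to_b = sum(_nearest_sq_dist(p, b) for p in a) / len(a)
    b_to_a = sum(_nearest_sq_dist(q, a) for q in b) / len(b)
    return a_to_b + b_to_a


def hausdorff_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Hausdorff Distance: the worst-case nearest-neighbor distance."""
    a_to_b = max(_nearest_sq_dist(p, b) for p in a)
    b_to_a = max(_nearest_sq_dist(q, a) for q in b)
    return math.sqrt(max(a_to_b, b_to_a))


# Toy example: a two-point "sparse" cloud and a three-point "dense" cloud.
sparse = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
dense = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]
cd = chamfer_distance(sparse, dense)
hd = hausdorff_distance(sparse, dense)
```

This brute-force version is quadratic in the number of points; it is meant only to make the definitions concrete, and a practical evaluation on dense clouds would typically use a spatial index or a GPU implementation for the nearest-neighbor search.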
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 40th ACM/SIGAPP Symposium on Applied Computing |
| Publication date | 2025 |
| Pages | 1223-1230 |
| Publication status | Published - 2025 |
| Event | 40th Annual ACM Symposium on Applied Computing, SAC 2025, Catania, Italy (31 Mar 2025 → 4 Apr 2025) |
Conference
| Conference | 40th Annual ACM Symposium on Applied Computing, SAC 2025 |
|---|---|
| Country/Territory | Italy |
| City | Catania |
| Period | 31/03/2025 → 04/04/2025 |
| Sponsor | ACM Special Interest Group on Applied Computing |
Keywords
- 3D computer vision
- Feature aware translator (FAT)
- Large language models (LLMs)
- Multimodal learning
- Point cloud upsampling
Title
PULLM: A Multimodal Framework for Enhanced 3D Point Cloud Upsampling Using Large Language Models