https://github.com/invictus717/pointlanguage
- Host: GitHub
- URL: https://github.com/invictus717/pointlanguage
- Owner: invictus717
- License: apache-2.0
- Created: 2023-07-31T07:16:35.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2023-07-31T07:20:17.000Z (over 1 year ago)
- Last Synced: 2023-07-31T08:29:24.506Z (over 1 year ago)
- Size: 1.21 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
1 Multimedia Lab, The Chinese University of Hong Kong
2 OpenGVLab, Shanghai AI Laboratory
* Equal Contribution
† Corresponding Author

-----------------
## Point as A Foreign Language, Let Large Language Models (LLMs) Perceive 3D Physical World as Reading Articles!
## 🌟 News
* **2023.7.31:** GitHub repository initialized. The paper will be released very soon.

## Motivation
We propose using pretrained language models for point cloud understanding. Unlike existing methods that use images as an intermediate representation, we find that language models can read point clouds as a foreign language. Benefiting from pretraining on large-scale corpora, language models perform better on long-tailed and out-of-distribution tasks in 3D vision.

### A Brief Summary
- 💡 **For multimodal research**, our method explores the **underlying representation relationship between different modalities**, specifically, language and 3D point cloud, and demonstrates that models pretrained on natural language can read 3D point clouds.
- 💡 **For 3D vision research**, our method performs **end-to-end point cloud understanding without hand-crafted structure designs**. It also demonstrates the feasibility of using **natural language corpora as pretraining data for 3D vision**.
- 💡 **For the vision-language area**, our method experimentally validates that **3D point clouds and text can be encoded by the same parameters**. This opens a promising direction for tasks involving modality alignment between text and point clouds.
- 💡 With outstanding performance across benchmarks including ModelNet-40, S3DIS, and ShapeNetPart, our method demonstrates its effectiveness on both coarse-grained and fine-grained 3D point cloud tasks.

# 🕙 ToDo
- [ ] Support Billion-scale Large Language Models.
- [ ] Large Language Model with More Modalities.
- [ ] Support Outdoor LiDAR Scenes.

# ✉️ Contact
If you are interested in this project, you are welcome to contribute! To contact us, send an email to `[email protected]`, `[email protected]`, or `[email protected]`.
# License
This project is released under the [Apache 2.0 license](LICENSE).
# Acknowledgement
This code builds on the excellent open-source project [OpenPoints](https://github.com/guochengqian/openpoints).