{"id":27962387,"url":"https://github.com/anttwo/matcha","last_synced_at":"2025-05-07T19:21:27.430Z","repository":{"id":267378240,"uuid":"901064067","full_name":"Anttwo/MAtCha","owner":"Anttwo","description":"(CVPR 2025) Official PyTorch implementation of MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views","archived":false,"fork":false,"pushed_at":"2025-04-03T10:24:26.000Z","size":50610,"stargazers_count":87,"open_issues_count":2,"forks_count":2,"subscribers_count":22,"default_branch":"main","last_synced_at":"2025-04-03T11:30:59.081Z","etag":null,"topics":["differentiable-rendering","gaussian-splatting","meshes","sparse-view","surface-reconstruction"],"latest_commit_sha":null,"homepage":"https://anttwo.github.io/matcha/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Anttwo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-10T01:17:25.000Z","updated_at":"2025-04-03T11:29:27.000Z","dependencies_parsed_at":null,"dependency_job_id":"69db1df3-e1bf-423d-b4ee-c01b465cd806","html_url":"https://github.com/Anttwo/MAtCha","commit_stats":null,"previous_names":["anttwo/matcha"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Anttwo%2FMAtCha","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Anttwo%2FMAtCha/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Anttwo%2FMAtCha/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/Anttwo%2FMAtCha/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Anttwo","download_url":"https://codeload.github.com/Anttwo/MAtCha/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252941392,"owners_count":21828868,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["differentiable-rendering","gaussian-splatting","meshes","sparse-view","surface-reconstruction"],"created_at":"2025-05-07T19:21:26.750Z","updated_at":"2025-05-07T19:21:27.410Z","avatar_url":"https://github.com/Anttwo.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views\n\n\u003cfont size=\"4\"\u003e\nCVPR 2025\n\u003c/font\u003e\n\u003cbr\u003e\n\n\u003cfont size=\"4\"\u003e\n\u003ca href=\"https://anttwo.github.io/\" style=\"font-size:100%;\"\u003eAntoine Guédon\u003csup\u003e1\u003c/sup\u003e\u003c/a\u003e\u0026emsp;\n\u003ca href=\"https://scholar.google.com/citations?user=-LzEJVwAAAAJ\u0026hl=en\" style=\"font-size:100%;\"\u003eTomoki Ichikawa\u003csup\u003e2\u003c/sup\u003e\u003c/a\u003e\u0026emsp;\n\u003ca href=\"https://kyamashita5.github.io/\" style=\"font-size:100%;\"\u003eKohei Yamashita\u003csup\u003e2\u003c/sup\u003e\u003c/a\u003e\u0026emsp;\n\u003ca href=\"https://vision.ist.i.kyoto-u.ac.jp/\" style=\"font-size:100%;\"\u003eKo 
Nishino\u003csup\u003e2\u003c/sup\u003e\u003c/a\u003e\u0026emsp;\n\u003c/font\u003e\n\u003cbr\u003e\n\n\u003cfont size=\"4\"\u003e\n\u003csup\u003e1\u003c/sup\u003eLIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS\u003cbr\u003e\n\u003csup\u003e2\u003c/sup\u003eGraduate School of Informatics, Kyoto University, Japan\n\u003c/font\u003e\n\n| \u003ca href=\"https://anttwo.github.io/matcha/\"\u003eWebpage\u003c/a\u003e | \u003ca href=\"https://arxiv.org/abs/2412.06767\"\u003earXiv\u003c/a\u003e |\n\n\u003c!-- \u003cimg src=\"./media/gifs/sirius1.gif\" alt=\"sirius1.gif\" width=\"267\"/\u003e\u003cimg src=\"./media/gifs/buzz_dancing_sh.gif\" alt=\"buzz_dancing_sh.gif\" width=\"267\"/\u003e\u003cimg src=\"./media/gifs/knight_attacking_hq.gif\" alt=\"knight_attacking_hq.gif\" width=\"267\"/\u003e \u003cbr\u003e --\u003e\n\u003cimg src=\"./media/gifs/garden.gif\" alt=\"garden.gif\" width=\"800\"/\u003e \u003cbr\u003e\n\u003cb\u003e\nMAtCha Gaussians reconstruction from 10 input views.\u003cbr\u003e\u003cbr\u003e\n\u003c/b\u003e\n\u003cb\u003e We propose MAtCha Gaussians, a novel surface representation for reconstructing high-quality 3D meshes with photorealistic rendering from sparse-view (or dense-view) images. Our key idea is to model the underlying scene geometry as an Atlas of Charts which we refine with \u003ca href=\"https://surfsplatting.github.io/\"\u003e2D Gaussian surfels\u003c/a\u003e. We initialize the charts with a \u003ca href=\"https://depth-anything-v2.github.io/\"\u003emonocular depth estimation model\u003c/a\u003e and refine them using differentiable Gaussian rendering and a lightweight neural chart deformation model. 
Combined with a sparse-view SfM model like \u003ca href=\"https://europe.naverlabs.com/research/publications/mast3r-sfm-a-fully-integrated-solution-for-unconstrained-structure-from-motion/\"\u003eMASt3R-SfM\u003c/a\u003e, MAtCha can recover sharp and accurate surface meshes of both foreground and background objects in unbounded scenes within minutes, only from a few unposed RGB images.\u003c/b\u003e\n\u003c/div\u003e\n\nThis repository proposes the following key elements, compatible with \u003ca href=\"https://surfsplatting.github.io/\"\u003e2D Gaussian Splatting\u003c/a\u003e:\n- **Optimization robust to sparse-view inputs:** Our novel initialization/optimization pipeline is robust to sparse-view inputs (as few as 3 to 10 images) but also scales to dense-view scenarios (hundreds of views). No more choosing between sparse or dense methods!\n- **Scalable mesh extraction method:** Inspired by \u003ca href=\"https://niujinshuchong.github.io/gaussian-opacity-fields/\"\u003eGaussian Opacity Fields\u003c/a\u003e, we developed a novel mesh extraction method for 2DGS that properly handles both foreground and background geometry while being lightweight (only 150-350MB), without any post-processing mesh decimation.\n- **Novel depth regularization:** We also introduce a novel *\"depth-order\"* regularization loss that leverages depth maps estimated with a monocular depth estimator (which can be multi-view inconsistent or have inaccurate scale) to achieve smooth, detailed background geometry while preserving very sharp foreground details.\n\n\u003c!-- ## Abstract\n\n_We present a novel appearance model that simultaneously realizes explicit high-quality 3D surface mesh recovery and photorealistic novel view synthesis from sparse view samples.\nOur key idea is to model the underlying scene geometry Mesh as an Atlas of Charts which we render with 2D Gaussian surfels (MAtCha Gaussians).\nMAtCha distills high-frequency scene surface details from an off-the-shelf monocular depth 
estimator and refines it through \u003ca href=\"https://surfsplatting.github.io/\"\u003e2D Gaussian surfel rendering\u003c/a\u003e. The Gaussian surfels are attached to the charts on the fly, satisfying photorealism of neural volumetric rendering and crisp geometry of a mesh model, i.e., two seemingly contradicting goals in a single model.\nAt the core of MAtCha lies a novel neural deformation model and a structure loss that preserve the fine surface details distilled from learned monocular depths while addressing their fundamental scale ambiguities.\nResults of extensive experimental validation demonstrate MAtCha's state-of-the-art quality of surface reconstruction and photorealism on-par with top contenders but with dramatic reduction in the number of input views and computational time.\nWe believe MAtCha will serve as a foundational tool for any visual application in vision, graphics, and robotics that require explicit geometry in addition to photorealism._ --\u003e\n\n\n## BibTeX\n\n```\n@article{guedon2025matcha,\n title={MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views},\n author={Gu{\\'e}don, Antoine and Ichikawa, Tomoki and Yamashita, Kohei and Nishino, Ko},\n journal={CVPR},\n year={2025}\n}\n```\n\n## Updates and To-do list\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eUpdates\u003c/span\u003e\u003c/summary\u003e\n\u003cul\u003e\n  \u003cli\u003e\u003cb\u003e[04/02/2025]\u003c/b\u003e Code release.\u003c/li\u003e\n\u003c/ul\u003e\n\u003c/details\u003e\u003cbr\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eTo-do list (Bugs and new features)\u003c/span\u003e\u003c/summary\u003e\n\u003cul\u003e\n  \u003cli\u003e\u003cb\u003e(Bug) If input images are too small:\u003c/b\u003e If the largest side of the input images is less than 512 pixels, the code currently returns an error. 
This should be fixed.\u003c/li\u003e\n  \u003cli\u003e\u003cb\u003e(New feature) Initial charts selection:\u003c/b\u003e For dense-view reconstruction, add a script to automatically select a good subset of images to convert into initial charts. The initial charts just need to cover the scene well, so a simple greedy approach should be sufficient.\u003c/li\u003e\n  \u003cli\u003e\u003cb\u003e(New feature) Evaluation:\u003c/b\u003e Add evaluation scripts.\u003c/li\u003e\n  \u003cli\u003e\u003cb\u003e(New feature) gsplat support:\u003c/b\u003e Make the code compatible with gsplat's rasterizer from the Nerfstudio team.\u003c/li\u003e\n\u003c/ul\u003e\n\u003c/details\u003e\n\n## Overview\n\nThe full MAtCha pipeline consists of 4 steps:\n1. **Scene initialization**: Camera poses are estimated from a sparse set of images using \u003ca href=\"https://europe.naverlabs.com/research/publications/mast3r-sfm-a-fully-integrated-solution-for-unconstrained-structure-from-motion/\"\u003eMASt3R-SfM\u003c/a\u003e. Images can be unposed or posed. For posed images, a COLMAP dataset with ground truth camera poses can be provided to the pipeline. For COLMAP datasets, MAtCha can use either sparse or dense viewpoints.\n2. **Chart Alignment**: Each input image is converted into an optimizable chart. We first initialize the charts with \u003ca href=\"https://depth-anything-v2.github.io/\"\u003eDepthAnythingV2\u003c/a\u003e. Then, we use a novel neural deformation model to align the chart with the scene geometry and produce a multi-view consistent, coherent manifold.\n3. **Chart refinement**: We further refine and densify the geometry with a photometric optimization. To this end, we instantiate \u003ca href=\"https://surfsplatting.github.io/\"\u003e2D Gaussians\u003c/a\u003e aligned with the manifold and optimize the representation with differentiable rendering. Gaussians can densify along the manifold to better capture the fine details of the scene.\n4. 
**Mesh extraction**: We propose a novel mesh extraction method that relies on a depth fusion algorithm and an adaptive tetrahedralization of the scene, similar to \u003ca href=\"https://niujinshuchong.github.io/gaussian-opacity-fields/\"\u003eGaussian Opacity Fields\u003c/a\u003e. Our mesh extraction method scales to large scenes with background objects, allowing for a much more flexible and adaptive mesh with detailed foreground objects and smooth background geometry.\n\nWe provide a dedicated script for each of these steps, as well as a script `train.py` that runs the entire pipeline. We explain how to use this script in the next sections. \u003cbr\u003e\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"./media/imgs/buzz1_gt_rgb.jpg\" alt=\"buzz1_gt_rgb.jpg\" width=\"400\"/\u003e\n\u003cimg src=\"./media/imgs/buzz1_matcha_mesh.png\" alt=\"buzz1_matcha_mesh.png\" width=\"400\"/\u003e\u003cbr\u003e\n\u003cimg src=\"./media/imgs/re_gt_rgb.jpg\" alt=\"re_gt_rgb.jpg\" width=\"400\"/\u003e\n\u003cimg src=\"./media/imgs/re_matcha_mesh.png\" alt=\"re_matcha_mesh.png\" width=\"400\"/\u003e\n\u003cbr\u003e\u003cb\u003eExamples of MAtCha reconstructions from 10 input views. 
MAtCha can recover sharp, accurate and complete 3D surface meshes from sparse views, including both foreground and background objects (left: GT images, right: MAtCha mesh).\u003c/b\u003e\u003cbr\u003e\n\u003c/div\u003e\u003cbr\u003e\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"./media/imgs/workbench/re_dense_color.JPG\" alt=\"re_dense_color.JPG\" width=\"400\"/\u003e\n\u003cimg src=\"./media/imgs/workbench/re_dense_mesh.JPG\" alt=\"re_dense_mesh.JPG\" width=\"400\"/\u003e\u003cbr\u003e\n\u003cimg src=\"./media/imgs/workbench/garden_dense_color.JPG\" alt=\"garden_dense_color.JPG\" width=\"400\"/\u003e\n\u003cimg src=\"./media/imgs/workbench/garden_dense_mesh.JPG\" alt=\"garden_dense_mesh.JPG\" width=\"400\"/\u003e\u003cbr\u003e\n\u003cimg src=\"./media/imgs/workbench/gundam_dense_color.JPG\" alt=\"gundam_dense_color.JPG\" width=\"400\"/\u003e\n\u003cimg src=\"./media/imgs/workbench/gundam_dense_mesh.JPG\" alt=\"gundam_dense_mesh.JPG\" width=\"400\"/\u003e\u003cbr\u003e\n\u003cimg src=\"./media/imgs/workbench/buzz_dense_color.JPG\" alt=\"buzz_dense_color.JPG\" width=\"400\"/\u003e\n\u003cimg src=\"./media/imgs/workbench/buzz_dense_mesh.JPG\" alt=\"buzz_dense_mesh.JPG\" width=\"400\"/\u003e\n\u003cbr\u003e\u003cb\u003eExamples of MAtCha reconstructions from dense input views (150+ training views). 
MAtCha can fully leverage priors from monocular depth estimation models, even when the predicted depth maps are not multi-view consistent, and recover sharp and complete  meshes including both foreground and background objects (left: textured mesh, right: untextured mesh).\u003c/b\u003e\u003cbr\u003e\n\u003c/div\u003e\u003cbr\u003e\n\n## License\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eClick here to see content.\u003c/span\u003e\u003c/summary\u003e\n\nThis project builds on existing open-source implementations of the following projects:\n- \u003ca href=\"https://github.com/naver/mast3r/tree/mast3r_sfm\"\u003eMASt3R-SfM\u003c/a\u003e\n- \u003ca href=\"https://github.com/DepthAnything/Depth-Anything-V2\"\u003eDepthAnythingV2\u003c/a\u003e\n- \u003ca href=\"https://github.com/hbb1/2d-gaussian-splatting\"\u003e2D Gaussian Splatting\u003c/a\u003e\n- \u003ca href=\"https://github.com/autonomousvision/gaussian-opacity-fields\"\u003eGaussian Opacity Fields\u003c/a\u003e\n\nAs a consequence, this project contains some code from the above projects, specifically in the `./mast3r/`, `./Depth-Anything-V2/`, and `./2d-gaussian-splatting/` directories.\n\nPlease refer to the LICENSE files in the respective directories for more details about the license of these specific parts of the code, most of them being incompatible with commercial use.\n\nApart from these parts, the rest of the code was entirely written by ourselves and is licensed under the MIT license (see the LICENSE file in the root directory). 
As a consequence, you are free to use this code for any purpose, commercial or non-commercial.\n\n**Note for commercial use:** If you intend to use this project for commercial purposes, you would need to replace the components with non-commercial licenses with alternatives that permit commercial use.\n\n\u003c/details\u003e\n\n## Installation\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eClick here to see content.\u003c/span\u003e\u003c/summary\u003e\n\n### 0. Requirements\n\nThe software requirements are the following:\n- Conda (recommended for easy setup)\n- C++ Compiler for PyTorch extensions\n- CUDA toolkit 11.8 for PyTorch extensions\n- C++ Compiler and CUDA SDK must be compatible\n\nPlease refer to the original \u003ca href=\"https://surfsplatting.github.io/\"\u003e2D Gaussian Splatting repository\u003c/a\u003e for more details about requirements.\n\n### 1. Quick install\n\nPlease start by cloning the repository:\n\n```shell\ngit clone https://github.com/anttwo/MAtCha.git\ncd MAtCha\n```\n\nThen, we provide a script to install all the dependencies for the MAtCha pipeline, as well as a script to download the weights of the pretrained models needed for running the full MAtCha pipeline (\u003ca href=\"https://github.com/naver/mast3r/tree/mast3r_sfm\"\u003eMASt3R-SfM\u003c/a\u003e and \u003ca href=\"https://github.com/DepthAnything/Depth-Anything-V2\"\u003eDepthAnythingV2\u003c/a\u003e).\n\nTo create the conda environment and install all the dependencies, run:\n\n```shell\npython install.py\n```\n\nBy default, the environment will be named `matcha`. You can change the name of the environment with the `--env_name` argument.\n\nTo download all the pretrained models, run:\n\n```shell\npython download_checkpoints.py\n```\n\nIf you encounter any issues when running the installation script, please refer to the following section for detailed instructions.\n\n### 2. 
Detailed installation (if quick install fails)\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eClick here to see content.\u003c/span\u003e\u003c/summary\u003e\n\n#### 2.1. Install dependencies\n\nPlease follow the instructions below to install the dependencies manually:\n\n```shell\nconda create --name matcha -y python=3.9\nconda activate matcha\n# Choose the right CUDA version for your system\nconda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia\nconda install -c fvcore -c iopath -c conda-forge fvcore iopath\nconda install pytorch3d==0.7.4 -c pytorch3d\nconda install -c plotly plotly\nconda install -c conda-forge rich\nconda install -c conda-forge plyfile==0.8.1\nconda install -c conda-forge jupyterlab\nconda install -c conda-forge nodejs\nconda install -c conda-forge ipywidgets\nconda install cmake\nconda install conda-forge::gmp\nconda install conda-forge::cgal\npip install roma==1.5.0\npip install open3d==0.18.0\npip install opencv-python==4.11.0.86\npip install scipy==1.13.1\npip install einops==0.8.1\npip install trimesh==4.6.4\npip install pyglet==1.5.29\npip install tensorboard\npip install scikit-learn==1.6.1\npip install cython==3.0.12\n# Choose the right CUDA version for your system\npip install faiss-gpu-cu11\npip install tqdm==4.67.1\npip install matplotlib==3.9.4\npip install huggingface-hub[torch]\npip install gradio\n```\n\nThen, install the 2D Gaussian splatting and adaptive tetrahedralization dependencies:\n\n```shell\ncd 2d-gaussian-splatting/submodules/diff-surfel-rasterization\npip install -e .\ncd ../simple-knn\npip install -e .\ncd ../tetra-triangulation\ncmake .\n# you can specify your own cuda path\nexport CPATH=/usr/local/cuda-11.8/targets/x86_64-linux/include:$CPATH\nexport LD_LIBRARY_PATH=/usr/local/cuda-11.8/targets/x86_64-linux/lib:$LD_LIBRARY_PATH\nexport PATH=/usr/local/cuda-11.8/bin:$PATH\nmake \npip install -e .\ncd 
../../../\n```\n\nFinally, install the MASt3R-SfM dependencies:\n\n```shell\ncd mast3r/asmk/cython\ncythonize *.pyx\ncd ..\npip install .\ncd ..\ncd ../dust3r/croco/models/curope/\npython setup.py build_ext --inplace\ncd ../../../../../\n```\n\n\n#### 2.2. Download pretrained models\n\nStart by downloading a pretrained checkpoint for DepthAnythingV2. Several encoder sizes are available; we recommend using the `large` encoder:\n\n```shell\nmkdir -p ./Depth-Anything-V2/checkpoints/\nwget https://huggingface.co/depth-anything/Depth-Anything-V2-Large/resolve/main/depth_anything_v2_vitl.pth -P ./Depth-Anything-V2/checkpoints/\n```\n\nThen, download the MASt3R-SfM checkpoint:\n\n```shell\nmkdir -p ./mast3r/checkpoints/\nwget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth -P ./mast3r/checkpoints/\n```\n\nAnd finally, download the MASt3R-SfM retrieval checkpoint:\n\n```shell\nwget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth -P ./mast3r/checkpoints/\nwget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_codebook.pkl -P ./mast3r/checkpoints/\n```\n\n\u003c/details\u003e\n\n\u003c/details\u003e\n\n## Quick Start\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eClick here to see content.\u003c/span\u003e\u003c/summary\u003e\n\nThis section describes how to run the full MAtCha pipeline on a set of unposed or posed images, with either sparse or dense supervision. For running only specific steps of the pipeline, please refer to the following section. 
Please make sure to first activate the conda environment created by the installation script:\n\n```shell\nconda activate matcha\n```\n\n\u003c!-- ### Training on unposed images --\u003e\n\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eTraining on unposed images with sparse supervision\u003c/span\u003e\u003c/summary\u003e\n\nYou can run the following single script to optimize a full MAtCha model using a set of unposed images. By default, all images in the directory will be used:\n\n```shell\npython train.py -s \u003cPATH TO IMAGE DIRECTORY\u003e -o \u003cPATH TO OUTPUT DIRECTORY\u003e --sfm_config unposed\n```\n\nYou can select a subset of images to use for the optimization. To this end, you can either use the `--image_idx` argument to select a specific subset of images by index, or the `--n_images` argument to select a fixed number of images. If using the `--n_images` argument, the images will be sampled with constant spacing; You can add the `--randomize_images` argument to shuffle the images before sampling.\n\nTo use a specific subset of images (such as the first 5 images in the directory), you can run:\n```shell\npython train.py -s \u003cPATH TO IMAGE DIRECTORY\u003e -o \u003cPATH TO OUTPUT DIRECTORY\u003e --sfm_config unposed --image_idx 0 1 2 3 4\n```\n\nTo use 10 images sampled with constant spacing in the directory, you can run:\n```shell\npython train.py -s \u003cPATH TO IMAGE DIRECTORY\u003e -o \u003cPATH TO OUTPUT DIRECTORY\u003e --sfm_config unposed --n_images 10\n```\n\nTo use 10 images randomly sampled in the directory, you can run:\n```shell\npython train.py -s \u003cPATH TO IMAGE DIRECTORY\u003e -o \u003cPATH TO OUTPUT DIRECTORY\u003e --sfm_config unposed --n_images 10 --randomize_images\n```\n\n\u003c/details\u003e\n\n\u003c!-- ### Training on posed images (COLMAP dataset) with sparse supervision --\u003e\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: 
bold;\"\u003eTraining on COLMAP dataset with sparse supervision\u003c/span\u003e\u003c/summary\u003e\n\nYou can also provide a usual COLMAP dataset with ground truth camera poses to MAtCha by using the `--sfm_config posed` argument. In this case, as COLMAP datasets may contain a large number of images, make sure to provide the indices of the images to use for the optimization with the `--image_idx` argument, or to use the `--n_images` argument. For instance, to use only the first 5 images for sparse-view reconstruction, you can run:\n\n```shell\npython train.py -s \u003cPATH TO COLMAP DATASET\u003e -o \u003cPATH TO OUTPUT DIRECTORY\u003e --sfm_config posed --image_idx 0 1 2 3 4\n```\n\nYou can create a COLMAP dataset from a set of images using the script `2d-gaussian-splatting/convert.py`. Please refer to the repo of our previous work \u003ca href=\"https://github.com/Anttwo/SuGaR\"\u003eSuGaR\u003c/a\u003e or the \u003ca href=\"https://github.com/graphdeco-inria/gaussian-splatting\"\u003eoriginal 3DGS repo\u003c/a\u003e for more details on how to do this.\n\n\u003c/details\u003e\n\n\u003c!-- ### Training on posed images (COLMAP dataset) with dense supervision --\u003e\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eTraining on COLMAP dataset with dense supervision\u003c/span\u003e\u003c/summary\u003e\n\nYou can also provide a usual COLMAP dataset with ground truth camera poses and dense viewpoints to MAtCha by using both the `--sfm_config posed` and `--dense_supervision` arguments. \nIn this case, a subset of images will first be converted into charts for building an initial manifold; The manifold will then be used as a scaffold for optimizing the full model with dense supervision, using all the images in the dataset. 
We recommend selecting a subset of images with good coverage of the scene for building the initial charts.\n\nWe also provide a novel loss function for regularizing the representation using depth maps for all viewpoints obtained with a monocular depth estimator. Our novel loss function encourages the rendered depth maps to preserve the same depth order as the supervision depth maps; as a result, it does not require supervision depth maps to be multi-view consistent and does not require any additional alignment or rescaling of the depth maps. \n\nYou do not need to provide the supervision depth maps to the pipeline, as our code will automatically use DepthAnythingV2 to generate them.\n\nMake sure to provide the indices of the images to use as initial charts with the `--image_idx` argument, or to use the `--n_images` argument. For instance, to use 10 images with constant spacing as initial charts for dense-view reconstruction, you can run:\n\n```shell\npython train.py -s \u003cPATH TO COLMAP DATASET\u003e -o \u003cPATH TO OUTPUT DIRECTORY\u003e --sfm_config posed --dense_supervision --n_images 10\n```\n\nBy default, the `train.py` script will optimize the model for 30,000 iterations when using dense supervision, instead of the default 7,000 iterations used for sparse-view reconstruction. You can change this behavior by using the argument `--free_gaussians_config default` to optimize for 7,000 iterations, or `--free_gaussians_config long` to optimize for 30,000 iterations. \nWhile optimizing for 30,000 iterations will take approximately 50 minutes in total, optimizing for 7,000 iterations will take only a few minutes. 
However, for dense-view reconstruction, we recommend optimizing for 30,000 iterations to ensure optimal quality.\n\n\u003c/details\u003e\n\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eRunning specific steps of the pipeline\u003c/span\u003e\u003c/summary\u003e\n\nYou can run only one specific step of the pipeline by using one of the following arguments with the `train.py` script:\n\n- `--sfm_only`: Only run the scene initialization using MASt3R-SfM.\n- `--alignment_only`: Only run the chart alignment using our novel neural deformation model.\n- `--refinement_only`: Only run the chart refinement using 2D Gaussians.\n- `--mesh_only`: Only run the mesh extraction relying on our custom and scalable depth fusion algorithm.\n\nRunning a specific step can be useful for experimenting with different hyperparameters, adjusting the strength of the regularization during chart alignment and refinement, trying different resolutions for extracting the final mesh, etc.\n\nYou can also combine several of these arguments to run several specific steps in a single run.\n\nThe full training script `train.py` is just a wrapper around individual scripts located in the `./scripts/` directory; please refer to these scripts for more details on how to use them and on their arguments.\n\n\u003c/details\u003e\n\n\u003c!-- ### Stronger regularization for removing floaters --\u003e\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eStronger regularization for removing floaters\u003c/span\u003e\u003c/summary\u003e\n\nIf some artifacts or floaters are present in the final mesh, you can try to increase the strength of the chart alignment with `--alignment_config strong`:\n\n```shell\npython train.py -s \u003cPATH TO IMAGE DIRECTORY\u003e -o \u003cPATH TO OUTPUT DIRECTORY\u003e --sfm_config unposed --n_images 10 --alignment_config strong\n```\n\nIf using a COLMAP dataset with dense 
supervision, you can also adjust the strength of the dense depth regularization with the `--dense_regul` argument: \n\n- Use `--dense_regul default` for the default regularization.\n- Use `--dense_regul strong` for a stronger regularization.\n- Use `--dense_regul weak` for a weaker regularization.\n- Use `--dense_regul none` to disable the dense regularization.\n\nIncreasing the strength of the dense depth regularization can help remove floaters. For some specific scenes where GT camera poses or monocular depth maps could be inaccurate and contain outliers, the dense depth regularization could be detrimental; in this case, you can also try to weaken or disable the dense depth regularization.\n\nFor instance, to use 10 initial charts generated from 10 images with constant spacing and perform dense-view reconstruction with a stronger dense depth regularization, you can run:\n\n```shell\npython train.py -s \u003cPATH TO COLMAP DATASET\u003e -o \u003cPATH TO OUTPUT DIRECTORY\u003e --sfm_config posed --dense_supervision --n_images 10 --dense_regul strong\n```\n\n\u003c/details\u003e\n\n\u003c!-- ### Mesh extraction --\u003e\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eImprove Mesh Quality\u003c/span\u003e\u003c/summary\u003e\n\nOur novel mesh extraction method relies on a custom depth fusion algorithm. To scale our method to large scenes with background objects, we adapted the adaptive tetrahedralization method from \u003ca href=\"https://niujinshuchong.github.io/gaussian-opacity-fields/\"\u003eGaussian Opacity Fields (GOF)\u003c/a\u003e: we partition the scene into a set of tetrahedra, such that the local number of tetrahedra directly depends on the local distribution of Gaussians in the scene. 
This allows for a much more flexible and adaptive mesh with both detailed foreground objects and smooth background geometry.\n\nFor dense-view reconstruction, we iterate over all training viewpoints to extract the final mesh.\n\nFor sparse-view reconstruction, we linearly interpolate pseudo-viewpoints between the neighboring sparse training viewpoints to extract the final mesh. However, in some cases, interpolating pseudo-viewpoints could lead to artifacts in the mesh, especially if interpolated viewpoints end up being inside the geometry.\n\nIf you encounter such artifacts, you can try to disable the interpolation with the `--no_interpolated_views` argument:\n\n```shell\npython train.py -s \u003cPATH TO IMAGE DIRECTORY\u003e -o \u003cPATH TO OUTPUT DIRECTORY\u003e --sfm_config unposed --n_images 10 --no_interpolated_views\n```\n\nPlease note that we also propose a multi-resolution TSDF fusion method that can be used instead of the adaptive tetrahedralization method. While this method is closer to concurrent works, it may produce artifacts or large holes in the mesh; as a result, we strongly recommend using the adaptive tetrahedralization method for better-quality meshes. To use the multi-resolution TSDF fusion method, you can use the `--use_multires_tsdf` argument:\n\n```shell\npython train.py -s \u003cPATH TO IMAGE DIRECTORY\u003e -o \u003cPATH TO OUTPUT DIRECTORY\u003e --sfm_config unposed --n_images 10 --use_multires_tsdf\n```\n\n\u003c/details\u003e\n\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eControl the resolution of the final mesh\u003c/span\u003e\u003c/summary\u003e\n\nWhen using the adaptive tetrahedralization method, you can control the downsampling ratio of the tetrahedra set with the `--tetra_downsample_ratio` argument. 
This parameter directly controls the number of vertices in the final mesh.\n\nWe recommend starting with the default value of `--tetra_downsample_ratio 0.5` and then decreasing to `0.25` if the mesh is too dense, or increasing to `1.0` if the mesh is too sparse.\n\nFor example, to use a downsampling ratio of `0.25` with a dataset of unposed images, you can run:\n\n```shell\npython train.py -s \u003cPATH TO IMAGE DIRECTORY\u003e -o \u003cPATH TO OUTPUT DIRECTORY\u003e --sfm_config unposed --tetra_downsample_ratio 0.25\n```\n\n\u003c/details\u003e\n\n\u003cbr\u003e\n\nPlease refer to the `train.py` script as well as individual scripts in the `./scripts/` directory for more details on the command line arguments. Feel free to modify the config files in the `./configs/` directory or create your own to suit your needs.\n\n\u003c!-- \u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003ePlease click here to see the most important arguments for the `train_full_pipeline.py` script.\u003c/span\u003e\u003c/summary\u003e\n\n| Parameter | Type | Description |\n| :-------: | :--: | :---------: |\n| `--scene_path` / `-s`   | `str` | Path to the source directory containing a COLMAP dataset.|\n| `--gs_output_dir` | `str` | Path to the checkpoint directory of a vanilla 3D Gaussian Splatting model. If no path is provided, the script will start from scratch and first optimize a vanilla 3DGS model. |\n| `--eval` | `bool` | If True, performs an evaluation split of the training images. Default is `False`. |\n| `--regularization_type` / `-r` | `str` | Type of regularization to use for optimizing a Frosting. Can be `\"dn_consistency\"`, `\"density\"` or `\"sdf\"`. We recommend using the newer `\"dn_consistency\"` regularization for best quality meshes. |\n| `--gaussians_in_frosting` | `int` | Number of Gaussians to use in the Frosting layer. Default is `2_000_000`. You can try with `5_000_000` Gaussians for optimal quality. 
|\n| `--use_occlusion_culling` | `bool` | If True, uses occlusion culling for accelerating optimization and rendering. Slightly impacts the quality of the rendering. Default is `False`. |\n|`--poisson_depth` | `int` | Depth of the Poisson reconstruction for the mesh extraction. If `-1`, the depth is automatically computed using our heuristic described in the paper. Default is `-1`. You can try to reduce the depth if your mesh has holes or too many ellipsoidal bumps. |\n| `--cleaning_quantile` | `float` | Quantile used for cleaning the mesh after Poisson reconstruction. Default is `0.1`. We recommend `0.1` for real scenes and `0.0` for single-object synthetic scenes. |\n| `--connected_components_vis_th` | `int` | Threshold to use for removing non-visible connected components in the mesh. We recommend using `0.001` for real scenes and `0.5` for single-object synthetic scenes. Default is `0.001`. |\n| `--low_poly` | `bool` | If True, uses the standard config for a low poly mesh, with `200_000` vertices. |\n| `--high_poly` | `bool` | If True, uses the standard config for a high poly mesh, with `1_000_000` vertices. |\n| `--refinement_time` | `str` | Default configs for time to spend on refinement. Can be `\"short\"` (2k iterations), `\"medium\"` (7k iterations) or `\"long\"` (15k iterations). |\n| `--export_ply` | `bool` | If True, export a `.ply` file with the refined Frosting 3D Gaussians at the end of the training. This file can be large (+/- 500MB), but is needed for using a 3DGS viewer. Default is `True`. |\n| `--export_obj` | `bool` | If True, will optimize and export a traditional textured mesh as an `.obj` file from the Frosting model. Computing a traditional color UV texture should just take a few seconds with Nvdiffrast. Default is `True`. |\n| `--texture_square_size` | `int` | Size of the square allocated to each pair of triangles in the UV texture. Increase for higher texture resolution. Please decrease if you encounter memory issues. Default is `8`. 
|\n|`--white_background` | `bool` | If True, the background of the images will be set to white. Default is `False`. |\n\n\u003c/details\u003e --\u003e\n\n\n\u003c/details\u003e\n\n\n## Evaluation\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cspan style=\"font-weight: bold;\"\u003eClick here to see content.\u003c/span\u003e\u003c/summary\u003e\u003cbr\u003e\n\nWe will release the evaluation scripts soon, including:\n- Surface reconstruction on DTU\n- Surface reconstruction on Tanks and Temples\n- Novel view synthesis on Mip-NeRF 360, Tanks and Temples and DeepBlending\n\nWe still need to clean up the codebase a little and merge some scripts to make the full evaluation pipeline easier to run.\n\n\u003c/details\u003e\n\n\n## Acknowledgements\n\nWe would like to thank the authors of the following projects for their awesome work, and for providing their code; this work would not have been possible without them.\n\n- \u003ca href=\"https://europe.naverlabs.com/research/publications/mast3r-sfm-a-fully-integrated-solution-for-unconstrained-structure-from-motion/\"\u003eMASt3R-SfM\u003c/a\u003e: We use the MASt3R-SfM pipeline to estimate camera poses and initial geometry from a set of images.\n- \u003ca href=\"https://depth-anything-v2.github.io/\"\u003eDepthAnythingV2\u003c/a\u003e: We use DepthAnythingV2 to initialize the charts.\n- \u003ca href=\"https://surfsplatting.github.io/\"\u003e2D Gaussian Splatting\u003c/a\u003e: We use 2D Gaussian Splatting to refine the charts with a differentiable renderer.\n- \u003ca href=\"https://niujinshuchong.github.io/gaussian-opacity-fields/\"\u003eGaussian Opacity Fields\u003c/a\u003e: We adapted the tetrahedralization method from Gaussian Opacity Fields to extract a mesh from our 
representation.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanttwo%2Fmatcha","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fanttwo%2Fmatcha","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanttwo%2Fmatcha/lists"}