{"id":30897435,"url":"https://github.com/sharpiless/l2m","last_synced_at":"2025-09-09T00:09:19.356Z","repository":{"id":302029910,"uuid":"1010918100","full_name":"Sharpiless/L2M","owner":"Sharpiless","description":"Official implementation of our ICCV'25 paper \"Learning Dense Feature Matching via Lifting Single 2D Image to 3D Space\"","archived":false,"fork":false,"pushed_at":"2025-06-30T07:21:23.000Z","size":0,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-30T07:46:06.236Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Sharpiless.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-30T03:20:34.000Z","updated_at":"2025-06-30T07:33:03.000Z","dependencies_parsed_at":"2025-06-30T07:56:21.951Z","dependency_job_id":null,"html_url":"https://github.com/Sharpiless/L2M","commit_stats":null,"previous_names":["sharpiless/l2m"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/Sharpiless/L2M","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sharpiless%2FL2M","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sharpiless%2FL2M/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sharpiless%2FL2M/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sharpiless%2FL2M/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
owners/Sharpiless","download_url":"https://codeload.github.com/Sharpiless/L2M/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sharpiless%2FL2M/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274231115,"owners_count":25245685,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-08T02:00:09.813Z","response_time":121,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-09T00:09:08.630Z","updated_at":"2025-09-09T00:09:19.083Z","avatar_url":"https://github.com/Sharpiless.png","language":"Python","readme":"# Learning Dense Feature Matching via Lifting Single 2D Image to 3D Space\n\n![L2M Logo](https://img.shields.io/badge/L2M-Official%20Implementation-blue)\n\nWelcome to the **L2M** repository! This is the official implementation of our ICCV'25 paper titled \"Learning Dense Feature Matching via Lifting Single 2D Image to 3D Space\".\n\n*Accepted to ICCV 2025 Conference*\n\n---\n\n\u003e 🚨 **Important Notice:**  \n\u003e This repository is the **official implementation** of the ICCV 2025 paper authored by Sharpiless.  \n\u003e  \n\u003e Please be aware that the repository at [https://github.com/chelseaaxy/L2M](https://github.com/chelseaaxy/L2M) is **NOT an official implementation** and is not authorized by the original authors.  
\n\u003e  \n\u003e Always refer to this repository for the authentic and up-to-date code.\n\n\n## 🧠 Overview\n\n**Lift to Match (L2M)** is a two-stage framework for **dense feature matching** that lifts 2D images into 3D space to enhance feature generalization and robustness. Unlike traditional methods that depend on multi-view image pairs, L2M is trained on large-scale, diverse single-view image collections.\n\n- **Stage 1:** Learn a **3D-aware ViT-based encoder** using multi-view image synthesis and 3D Gaussian feature representation.\n- **Stage 2:** Learn a **feature decoder** through novel-view rendering and synthetic data, enabling robust matching across diverse scenarios.\n\n\u003e 🚧 Code is still under construction.\n\n---\n\n## 🧪 Feature Visualization\n\nWe compare the 3D-aware ViT encoder from L2M (Stage 1) with other recent methods:\n\n- **DINOv2**: Learning Robust Visual Features without Supervision\n- **FiT3D**: Improving 2D Feature Representations by 3D-Aware Fine-Tuning\n- **Ours: L2M Encoder**\n\nYou can download them from the [Releases](https://github.com/Sharpiless/L2M/releases/tag/checkpoints) page.\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"./assets/sacre_coeur_A_compare.png\" width=\"90%\"\u003e\n  \u003cbr/\u003e\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"./assets/sacre_coeur_B_compare.png\" width=\"90%\"\u003e\n  \u003cbr/\u003e\n\u003c/div\u003e\n\n---\n\nTo get the results, make sure your checkpoints and image files are in the correct paths, then run:\n```\npython vis_feats.py \\\n  --img_paths assets/sacre_coeur_A.jpg assets/sacre_coeur_B.jpg \\\n  --ckpt_dino ckpts/dinov2.pth \\\n  --ckpt_fit3d ckpts/fit3d.pth \\\n  --ckpt_L2M ckpts/l2m_vit_base.pth \\\n  --save_dir outputs_vis_feat\n```\n\n## 🏗️ Data Generation\n\nTo enable training from single-view images, we simulate diverse multi-view observations and their corresponding dense correspondence labels in a fully automatic manner.\n\n#### Stage 
#### Stage 2.1: Novel View Synthesis
We lift a single-view image to a coarse 3D structure and then render novel views from different camera poses. These synthesized multi-view images are used to supervise the feature encoder with dense matching consistency.

Run the following to generate novel-view images with ground-truth dense correspondences:
```
python get_data.py \
  --output_path [PATH-to-SAVE] \
  --data_path [PATH-to-IMAGES] \
  --disp_path [PATH-to-MONO-DEPTH]
```

This script provides an example of novel-view generation with dense matching ground truth.

The `disp_path` directory should contain grayscale disparity maps predicted by Depth Anything V2 or another monocular depth estimator.

Below are examples of synthesized novel views with ground-truth dense correspondences, generated in Stage 2.1:

<div align="center"> <img src="./assets/0_d_00d1ae6aab6ccd59.jpg" width="45%"> <img src="./assets/2_a_02a270519bdb90dd.jpg" width="45%"> </div> <br/>

![test_000002809](https://github.com/user-attachments/assets/a9c62860-b153-40ab-95cb-fa14cb59490c)

These examples demonstrate both the geometric diversity and the high-quality pixel-level correspondence labels used for supervision.

For novel-view inpainting, we also provide an improved inpainting model fine-tuned from Stable-Diffusion-2.0-Inpainting:

```
from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image, make_image_grid
import torch

# Path to the model weights (replace with your own local path if needed)
model_path = "Liangyingping/L2M-Inpainting"

# Load the pipeline and move it to the GPU if one is available
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    model_path, torch_dtype=torch.float16
)
pipe.to("cuda")

init_image = load_image("assets/debug_masked_image.png")
mask_image = load_image("assets/debug_mask.png")
W, H = init_image.size

prompt = "a photo of a person"
image = pipe(
    prompt=prompt,
    image=init_image,
    mask_image=mask_image,
    height=512, width=512
).images[0].resize((W, H))

print(image.size, init_image.size)

image2save = make_image_grid([init_image, mask_image, image], rows=1, cols=3)
image2save.save("image2save_ours.png")
```

Or you can manually download the model from [hugging-face](https://huggingface.co/Liangyingping/L2M-Inpainting).

<img width="701" alt="novel-view-sup" src="https://github.com/user-attachments/assets/32bf30f3-7ad4-4e9c-aa94-c1a28c866ddd" />

<img width="702" alt="novel-view-mpi" src="https://github.com/user-attachments/assets/25398b47-e61f-4dad-a90e-d30fbda2233f" />

#### Stage 2.2: Relighting for Appearance Diversity
To improve feature robustness under varying lighting conditions, we apply a physics-inspired relighting pipeline to the synthesized 3D scenes.

Run the following to generate relit image pairs for training the decoder:
```
python relight.py
```
All outputs are saved under the configured output directory, including the original view, the novel views, and their camera parameters with dense depth.

<img width="565" alt="demo-data" src="https://github.com/user-attachments/assets/a9f29fd8-6616-44de-9325-409708560525" />
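As a rough illustration of the kind of photometric variation relighting introduces, here is a minimal, hypothetical augmentation (a global gamma change plus a color-temperature gain); it is a toy stand-in for intuition, not the physics-inspired pipeline in `relight.py`:

```python
import numpy as np

def relight(img, gamma=1.4, rgb_gain=(1.1, 1.0, 0.85)):
    """Apply a simple global relighting to an RGB uint8 image.

    gamma > 1 simulates a darker overall illumination; rgb_gain boosts
    red and suppresses blue to mimic a warmer light source. Both are
    hypothetical stand-ins for the physics-inspired pipeline.
    """
    x = img.astype(np.float32) / 255.0
    x = np.power(x, gamma) * np.asarray(rgb_gain, dtype=np.float32)
    return (np.clip(x, 0.0, 1.0) * 255.0).astype(np.uint8)

# A mid-gray image becomes darker and warmer under these settings
img = np.full((8, 8, 3), 128, dtype=np.uint8)
out = relight(img)
```

Pairing each relit rendering with the original view under the same correspondence labels is what exposes the decoder to appearance changes without altering the geometry.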
#### Stage 2.3: Sky Masking (Optional)

If desired, you can run `sky_seg.py` to mask out sky regions, which are typically textureless and not useful for matching. This can help reduce noise and focus training on geometrically meaningful regions.

```
python sky_seg.py
```

![ADE_train_00000971](https://github.com/user-attachments/assets/ef34c52c-7bff-4be6-94dd-9aeedeef0f60)

## 🙋‍♂️ Acknowledgements

We build upon recent advances in [ROMA](https://github.com/Parskatt/RoMa), [GIM](https://github.com/xuelunshen/gim), and [FiT3D](https://github.com/ywyue/FiT3D).