{"id":29647225,"url":"https://github.com/moften/metadata-web-extractor-foca-lite","last_synced_at":"2025-07-22T03:06:59.233Z","repository":{"id":304848043,"uuid":"1020242884","full_name":"moften/Metadata-Web-Extractor-Foca-Lite","owner":"moften","description":"FOCA-LITE – Metadata \u0026 Passive Recon Toolkit","archived":false,"fork":false,"pushed_at":"2025-07-15T15:21:08.000Z","size":59,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-16T09:23:19.291Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/moften.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-15T15:02:13.000Z","updated_at":"2025-07-15T15:21:11.000Z","dependencies_parsed_at":"2025-07-16T13:50:24.430Z","dependency_job_id":"a484c6bf-9f64-4bdf-a669-4fceeca89ca9","html_url":"https://github.com/moften/Metadata-Web-Extractor-Foca-Lite","commit_stats":null,"previous_names":["moften/metadata-web-extractor-foca-lite"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/moften/Metadata-Web-Extractor-Foca-Lite","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/moften%2FMetadata-Web-Extractor-Foca-Lite","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/moften%2FMetadata-Web-Extractor-Foca-Lite/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/moften%2FMetadata-Web-Extractor-Foca-Lite/releases","man
ifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/moften%2FMetadata-Web-Extractor-Foca-Lite/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/moften","download_url":"https://codeload.github.com/moften/Metadata-Web-Extractor-Foca-Lite/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/moften%2FMetadata-Web-Extractor-Foca-Lite/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266417122,"owners_count":23925301,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-22T02:00:09.085Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-22T03:06:56.710Z","updated_at":"2025-07-22T03:06:59.225Z","avatar_url":"https://github.com/moften.png","language":"Python","funding_links":["https://www.paypal.com/paypalme/moften"],"categories":[],"sub_categories":[],"readme":"# 🕵️‍♂️ FOCA-LITE – Metadata \u0026 Passive Recon Toolkit\n\n**FOCA-LITE** is a passive metadata analysis tool inspired by Chema Alonso's legendary FOCA, reimagined in Python (no Windows required) by [m10sec](mailto:m10sec@proton.me). It scans documents downloaded from public sources, analyzes their metadata, and generates detailed HTML/CSV reports. 
Ideal for OSINT, pentesting, and digital audits.\n\n![Banner](docs/banner-focalite.png)\n\n---\n\n## 🚀 Features\n\n- 🧠 Crawling of public documents (`site:` + `filetype:` + `index of`)\n- 🔍 In-depth metadata analysis (PDF, DOCX, XLSX, ZIP, JPG/PNG, TXT, CSV)\n- 🧽 Safe removal of sensitive metadata, [reusing the module from my other cleaning tool](https://github.com/moften/Metadata-File-Analizer).\n- 📊 Automatic reports in **HTML** and **CSV**\n- 🛡️ Ideal for red teams, OSINT, and bug bounty hunters\n\n---\n\n## 🧰 Installation\n\n### 🔧 Requirements\n\n- Python 3.8+\n- `exiftool` installed on the system:\n  - macOS: `brew install exiftool`\n  - Ubuntu/Debian: `sudo apt install libimage-exiftool-perl`\n\n### 📦 Install dependencies\n\n```bash\ngit clone https://github.com/moften/Metadata-Web-Extractor-Foca-Lite.git\ncd Metadata-Web-Extractor-Foca-Lite\npython3 -m venv venv\nsource venv/bin/activate\npip install -r requirements.txt\n```\n\n---\n\n## ⚙️ Usage\n\n### 📁 1. Collecting files with dorks\n\n```bash\npython foca_lite.py --domain gobierno.mx --filetype pdf\n```\n\nDownloads public documents and stores them in the `output/` folder.\n\n### 🔬 2. Metadata scan\n\n```bash\npython foca_lite.py --analyze output/\n```\n\nDisplays the metadata and lets you generate reports.\n\n### 🧽 3. 
Metadata cleaning\n\n```bash\npython foca_lite.py --clean output/\n```\n\nCreates cleaned versions of all documents.\n\n---\n\n## 📊 Generated reports\n\n- `report.csv` – a format suited to analysis with Excel, Google Sheets, or pandas.\n- `report.html` – a readable view with tables that can be exported or embedded.\n\n---\n\n## 📷 Screenshots\n\n| CLI analysis | HTML report |\n|--------------|--------------|\n| ![CLI](docs/demo-cli.png) | ![HTML](docs/demo-report.png) |\n\n---\n\n## 🤝 Contributing\n\nFound a bug, or want to contribute new features such as analysis of OLE, RTF, and legacy Word files, or Shodan integration?\n\n**Pull requests, forks, and improvements are welcome.**\n\n---\n\n## 🙌 Support me\n\nIf this tool has been useful to you or you want to support future development, you can buy me a coffee ☕ or make a donation. Every bit of support counts!\n\n[![Donate with PayPal](https://img.shields.io/badge/PayPal-Donate-blue.svg)](https://www.paypal.com/paypalme/moften)\n\n---\n\n## 👾 Author\n\n**m10sec**  \nPentester, Red Team Specialist, developer of offensive tools.\n\n- 💌 Email: [m10sec@proton.me](mailto:m10sec@proton.me)\n- 🌐 Blog: [https://m10.com.mx](https://m10.com.mx)\n- 🐦 Twitter: [@hack4lifemx](https://twitter.com/hack4lifemx)\n- 💼 LinkedIn: [Francisco Santibañez](https://www.linkedin.com/in/franciscosantibanez)\n- 🐙 GitHub: [github.com/m10sec](https://github.com/moften)\n\n---\n\n\u003e “Metadata doesn't lie. Only those who don't know what they're looking for ignore it.” – m10sec","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmoften%2Fmetadata-web-extractor-foca-lite","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmoften%2Fmetadata-web-extractor-foca-lite","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmoften%2Fmetadata-web-extractor-foca-lite/lists"}