{"id":13814367,"url":"https://github.com/dadoonet/fscrawler","last_synced_at":"2026-03-17T22:12:47.736Z","repository":{"id":3540128,"uuid":"4600110","full_name":"dadoonet/fscrawler","owner":"dadoonet","description":"Elasticsearch File System Crawler (FS Crawler)","archived":false,"fork":false,"pushed_at":"2024-10-29T07:51:25.000Z","size":15604,"stargazers_count":1353,"open_issues_count":139,"forks_count":300,"subscribers_count":73,"default_branch":"master","last_synced_at":"2024-10-29T15:11:04.219Z","etag":null,"topics":["crawler","elasticsearch","java","tika"],"latest_commit_sha":null,"homepage":"https://fscrawler.readthedocs.io/","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dadoonet.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2012-06-08T17:23:03.000Z","updated_at":"2024-10-29T14:33:14.000Z","dependencies_parsed_at":"2023-10-11T06:41:42.532Z","dependency_job_id":"7595c692-4150-4475-a0c6-5fd658307e51","html_url":"https://github.com/dadoonet/fscrawler","commit_stats":{"total_commits":1579,"total_committers":55,"mean_commits":28.70909090909091,"dds":"0.29385687143761874","last_synced_commit":"2344c30d2690d03da43712c5eef78eab3ef1b020"},"previous_names":["dadoonet/fsriver"],"tags_count":22,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dadoonet%2Ffscrawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dadoonet%2Ffscrawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/dadoonet%2Ffscrawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dadoonet%2Ffscrawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dadoonet","download_url":"https://codeload.github.com/dadoonet/fscrawler/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248201993,"owners_count":21064242,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","elasticsearch","java","tika"],"created_at":"2024-08-04T04:01:55.165Z","updated_at":"2026-03-17T22:12:47.722Z","avatar_url":"https://github.com/dadoonet.png","language":"Java","readme":"# File System Crawler for Elasticsearch\n\nWelcome to the FS Crawler for [Elasticsearch](https://elastic.co/).\n\nThis crawler helps to index binary documents such as PDF, Open Office, and MS Office files.\n\n![FSCrawler Explained - Generated with Gemini](fscrawler-explained.png)\n\n**Main features**:\n\n* Crawls a local file system (or a mounted drive) to index new files, update existing ones, and remove old ones.\n* Crawls remote file systems over SSH/FTP.\n* Provides a REST interface that lets you \"upload\" your binary documents to Elasticsearch.\n\n## Latest versions\n\nCurrent \"most stable\" versions are:\n\n| Elasticsearch | FS Crawler    | Released   | Docs                                                                          |\n|---------------|---------------|------------|-------------------------------------------------------------------------------|\n| 7.x, 8.x, 9.x | 2.10-SNAPSHOT |            | 
[2.10-SNAPSHOT](https://fscrawler.readthedocs.io/en/latest/)                  |\n\n[![Maven Central](https://img.shields.io/maven-central/v/fr.pilato.elasticsearch.crawler/fscrawler-distribution)](https://repo1.maven.org/maven2/fr/pilato/elasticsearch/crawler/fscrawler-distribution/)\n![GitHub Release Date](https://img.shields.io/github/release-date/dadoonet/fscrawler)\n[![Maven metadata URL](https://img.shields.io/maven-metadata/v?metadataUrl=https%3A%2F%2Fs01.oss.sonatype.org%2Fcontent%2Frepositories%2Fsnapshots%2Ffr%2Fpilato%2Felasticsearch%2Fcrawler%2Ffscrawler-distribution%2Fmaven-metadata.xml\u0026label=Latest%20SNAPSHOT\u0026link=https%3A%2F%2Fs01.oss.sonatype.org%2Fcontent%2Frepositories%2Fsnapshots%2Ffr%2Fpilato%2Felasticsearch%2Fcrawler%2Ffscrawler-distribution%2F)](https://s01.oss.sonatype.org/content/repositories/snapshots/fr/pilato/elasticsearch/crawler/fscrawler-distribution/)\n![GitHub last commit](https://img.shields.io/github/last-commit/dadoonet/fscrawler)\n\n![Docker Pulls](https://img.shields.io/docker/pulls/dadoonet/fscrawler)\n![Docker Image Size (tag)](https://img.shields.io/docker/image-size/dadoonet/fscrawler/2.10-SNAPSHOT?label=Docker%20image%20size)\n![Docker Image Version (latest semver)](https://img.shields.io/docker/v/dadoonet/fscrawler)\n\n## Build and Quality Status\n\n[![Build](https://github.com/dadoonet/fscrawler/actions/workflows/maven.yml/badge.svg)](https://github.com/dadoonet/fscrawler/actions/workflows/maven.yml)\n[![Documentation Status](https://readthedocs.org/projects/fscrawler/badge/?version=latest)](https://fscrawler.readthedocs.io/en/latest/?badge=latest)\n\n[![Lines of Code](https://sonarcloud.io/api/project_badges/measure?project=dadoonet_fscrawler\u0026metric=ncloc)](https://sonarcloud.io/summary/new_code?id=dadoonet_fscrawler)\n[![Duplicated Lines 
(%)](https://sonarcloud.io/api/project_badges/measure?project=dadoonet_fscrawler\u0026metric=duplicated_lines_density)](https://sonarcloud.io/summary/new_code?id=dadoonet_fscrawler)\n[![Maintainability Rating](https://sonarcloud.io/api/project_badges/measure?project=dadoonet_fscrawler\u0026metric=sqale_rating)](https://sonarcloud.io/summary/new_code?id=dadoonet_fscrawler)\n[![Technical Debt](https://sonarcloud.io/api/project_badges/measure?project=dadoonet_fscrawler\u0026metric=sqale_index)](https://sonarcloud.io/summary/new_code?id=dadoonet_fscrawler)\n[![Reliability Rating](https://sonarcloud.io/api/project_badges/measure?project=dadoonet_fscrawler\u0026metric=reliability_rating)](https://sonarcloud.io/summary/new_code?id=dadoonet_fscrawler)\n\n[![Vulnerabilities](https://sonarcloud.io/api/project_badges/measure?project=dadoonet_fscrawler\u0026metric=vulnerabilities)](https://sonarcloud.io/summary/new_code?id=dadoonet_fscrawler)\n[![Bugs](https://sonarcloud.io/api/project_badges/measure?project=dadoonet_fscrawler\u0026metric=bugs)](https://sonarcloud.io/summary/new_code?id=dadoonet_fscrawler)\n[![Quality Gate Status](https://sonarcloud.io/api/project_badges/measure?project=dadoonet_fscrawler\u0026metric=alert_status)](https://sonarcloud.io/summary/new_code?id=dadoonet_fscrawler)\n[![Code Smells](https://sonarcloud.io/api/project_badges/measure?project=dadoonet_fscrawler\u0026metric=code_smells)](https://sonarcloud.io/summary/new_code?id=dadoonet_fscrawler)\n[![Security Rating](https://sonarcloud.io/api/project_badges/measure?project=dadoonet_fscrawler\u0026metric=security_rating)](https://sonarcloud.io/summary/new_code?id=dadoonet_fscrawler)\n\n## GitHub stats\n\n![GitHub commits since latest release (by SemVer including pre-releases)](https://img.shields.io/github/commits-since/dadoonet/fscrawler/latest/master)\n![GitHub commit activity (branch)](https://img.shields.io/github/commit-activity/t/dadoonet/fscrawler)\n![GitHub 
contributors](https://img.shields.io/github/contributors/dadoonet/fscrawler)\n\n![GitHub issues](https://img.shields.io/github/issues/dadoonet/fscrawler)\n![GitHub pull requests](https://img.shields.io/github/issues-pr/dadoonet/fscrawler)\n\n## Documentation\n\nThe guide has been moved to [ReadTheDocs](https://fscrawler.readthedocs.io/en/latest/).\n\n![X (formerly Twitter) Follow](https://img.shields.io/twitter/follow/dadoonet)\n\n## Contribute\n\nWorks on my machine - and yours! Spin up a pre-configured, standardized dev environment for this repository by clicking the button below.\n\n[![Open in Gitpod](https://gitpod.io/button/open-in-gitpod.svg)](https://gitpod.io/#/https://github.com/dadoonet/fscrawler)\n\n## License\n\n![GitHub](https://img.shields.io/github/license/dadoonet/fscrawler)\n\nRead more about the [Apache2 License](https://fscrawler.readthedocs.io/en/latest/index.html#license).\n\n## Thanks\n\nThanks to [JetBrains](https://www.jetbrains.com/?from=FSCrawler) for the IntelliJ IDEA License!\n\nThanks to SonarCloud for the free analysis!\n\n[![SonarCloud](https://sonarcloud.io/images/project_badges/sonarcloud-white.svg)](https://sonarcloud.io/summary/new_code?id=dadoonet_fscrawler)\n","funding_links":[],"categories":["Java"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdadoonet%2Ffscrawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdadoonet%2Ffscrawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdadoonet%2Ffscrawler/lists"}