{"id":20007569,"url":"https://github.com/kevm/tikaondotnet","last_synced_at":"2026-01-27T03:36:41.866Z","repository":{"id":963593,"uuid":"753747","full_name":"KevM/tikaondotnet","owner":"KevM","description":"Use the Java Tika text extraction library on the .NET platform","archived":false,"fork":false,"pushed_at":"2024-04-13T16:01:21.000Z","size":162634,"stargazers_count":193,"open_issues_count":33,"forks_count":73,"subscribers_count":23,"default_branch":"master","last_synced_at":"2024-04-29T01:43:13.538Z","etag":null,"topics":["extract-text","tika"],"latest_commit_sha":null,"homepage":"http://kevm.github.io/tikaondotnet/","language":"Rich Text Format","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KevM.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":"Contributing.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2010-07-02T16:15:05.000Z","updated_at":"2024-06-18T13:40:22.834Z","dependencies_parsed_at":"2024-06-18T13:40:04.870Z","dependency_job_id":"8e4c0e35-5b78-4c9c-8cc6-f4b6fb92f929","html_url":"https://github.com/KevM/tikaondotnet","commit_stats":{"total_commits":160,"total_committers":16,"mean_commits":10.0,"dds":"0.18125000000000002","last_synced_commit":"5f7225c8e40d2f38a6f9faed134201b2f191ab5b"},"previous_names":[],"tags_count":21,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KevM%2Ftikaondotnet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KevM%2Ftikaondotnet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KevM%2Ftikaondotnet/releases"
,"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KevM%2Ftikaondotnet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KevM","download_url":"https://codeload.github.com/KevM/tikaondotnet/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241444907,"owners_count":19963891,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["extract-text","tika"],"created_at":"2024-11-13T06:22:21.051Z","updated_at":"2026-01-27T03:36:41.841Z","avatar_url":"https://github.com/KevM.png","language":"Rich Text Format","funding_links":[],"categories":[],"sub_categories":[],"readme":"Tika on .NET\r\n============\r\n\r\n[![Build status](https://ci.appveyor.com/api/projects/status/ofc68okbo9s75okr?svg=true)](https://ci.appveyor.com/project/KevM/tikaondotnet) [![NuGet version](https://badge.fury.io/nu/TikaOnDotNet.TextExtractor.svg)](https://badge.fury.io/nu/TikaOnDotNet.TextExtractor)\r\n\r\nThis project is a simple wrapper around the very excellent and robust\r\n[Tika](http://tika.apache.org/) text extraction Java library. 
This project produces two NuGet packages:\r\n- TikaOnDotNet - A straight [IKVM](http://www.ikvm.net/userguide/ikvmc.html) hosted port of the Java Tika project.\r\n\r\n[![Install-Package TikaOnDotNet](https://cldup.com/H-IdGdU75T.png)](https://www.nuget.org/packages/TikaOnDotnet/)\r\n\r\n- TikaOnDotNet.TextExtractor - Use Tika to extract text from rich documents.\r\n\r\n[![Install-Package TikaOnDotNet.TextExtractor](https://cldup.com/_BM0b5jVjU.png)](https://www.nuget.org/packages/TikaOnDotNet.TextExtractor/)\r\n\r\n## Getting Started \r\n\r\nThe best way to get started is to:\r\n- Add a NuGet dependency to [TikaOnDotNet.TextExtractor](https://www.nuget.org/packages/TikaOnDotNet.TextExtractor/).\r\n- Instantiate a new `TextExtractor` object and call one of the `Extract` methods.\r\n\r\n### Usage \r\n```cs\r\n// using TikaOnDotNet.TextExtraction;\r\n\r\nvar textExtractor = new TextExtractor();\r\n\r\nvar wordDocContents = textExtractor.Extract(@\".\\path\\to\\my favorite word.docx\");\r\nvar webPageContents = textExtractor.Extract(new Uri(\"https://google.com\"));\r\n```\r\n\r\nTake a look at [our tests](https://github.com/KevM/tikaondotnet/tree/master/src/TikaOnDotNet.Tests) for more usage examples. \r\n\r\n## How To Contribute\r\n\r\nHave an idea to make this project better? Great! Start out by taking a look at our [Contributing Guide](https://github.com/KevM/tikaondotnet/blob/master/Contributing.md).\r\n\r\n## Having A Problem?\r\n\r\nSearch in the [Issues](https://github.com/KevM/tikaondotnet/issues?q=is%3Aopen+is%3Aissue)\r\nas your problem may be a common one. If you don't find your problem, please [create an\r\nissue](https://github.com/KevM/tikaondotnet/issues/new). 
Contributors here will\r\nchime in when they can.\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkevm%2Ftikaondotnet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkevm%2Ftikaondotnet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkevm%2Ftikaondotnet/lists"}