https://github.com/kevm/tikaondotnet
Use the Java Tika text extraction library on the .NET platform
- Host: GitHub
- URL: https://github.com/kevm/tikaondotnet
- Owner: KevM
- License: apache-2.0
- Created: 2010-07-02T16:15:05.000Z (over 15 years ago)
- Default Branch: master
- Last Pushed: 2024-04-13T16:01:21.000Z (over 1 year ago)
- Last Synced: 2024-04-29T01:43:13.538Z (over 1 year ago)
- Topics: extract-text, tika
- Language: Rich Text Format
- Homepage: http://kevm.github.io/tikaondotnet/
- Size: 155 MB
- Stars: 193
- Watchers: 23
- Forks: 73
- Open Issues: 33
Metadata Files:
- Readme: Readme.md
- Contributing: Contributing.md
- License: LICENSE
README
Tika on .NET
============

[AppVeyor build status](https://ci.appveyor.com/project/KevM/tikaondotnet) [NuGet package version](https://badge.fury.io/nu/TikaOnDotNet.TextExtractor)

This project is a simple wrapper around the excellent and robust
[Tika](http://tika.apache.org/) text extraction Java library. It produces two NuGet packages:
- [TikaOnDotNet](https://www.nuget.org/packages/TikaOnDotnet/) - A straight [IKVM](http://www.ikvm.net/userguide/ikvmc.html) hosted port of the Java Tika project.
- [TikaOnDotNet.TextExtractor](https://www.nuget.org/packages/TikaOnDotNet.TextExtractor/) - Uses Tika to extract text from rich documents.
## Getting Started
The best way to get started is to:
- Add a NuGet dependency on [TikaOnDotNet.TextExtractor](https://www.nuget.org/packages/TikaOnDotNet.TextExtractor/).
- Instantiate a new `TextExtractor` object and call one of the `Extract` methods.

### Usage
```cs
// using TikaOnDotNet.TextExtraction;

var textExtractor = new TextExtractor();
var wordDocContents = textExtractor.Extract(@".\path\to\my favorite word.docx");
var webPageContents = textExtractor.Extract(new Uri("https://google.com"));
```

Take a look at [our tests](https://github.com/KevM/tikaondotnet/tree/master/src/TikaOnDotNet.Tests) for more usage examples.
## How To Contribute
Have an idea to make this project better? Great! Start out by taking a look at our [Contributing Guide](https://github.com/KevM/tikaondotnet/blob/master/Contributing.md).
## Having A Problem?
Search the [Issues](https://github.com/KevM/tikaondotnet/issues?q=is%3Aopen+is%3Aissue),
as your problem may be a common one. If you don't find your problem, please [create an
issue](https://github.com/KevM/tikaondotnet/issues/new). Contributors here will
chime in when they can.