https://github.com/asutosh11/documentreader
This library reads word documents (.doc and .docx), txt and PDF files, and gives the output content of the document as a String.
- Host: GitHub
- URL: https://github.com/asutosh11/documentreader
- Owner: Asutosh11
- License: mit
- Created: 2020-07-26T17:18:18.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2023-01-27T18:07:33.000Z (almost 2 years ago)
- Last Synced: 2025-01-08T22:22:25.451Z (16 days ago)
- Topics: android-library, docparser, docx, filereader, kotlin, kotlin-android, pdf, pdf-document, pdfreader, txtreader
- Language: Kotlin
- Homepage:
- Size: 171 KB
- Stars: 97
- Watchers: 5
- Forks: 16
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
[![](https://jitpack.io/v/Asutosh11/DocumentReader.svg)](https://jitpack.io/#Asutosh11/DocumentReader)
[![API](https://img.shields.io/badge/API-5%2B-orange.svg?style=flat)](https://android-arsenal.com/api?level=5)
[![Android Arsenal](https://img.shields.io/badge/Android%20Arsenal-DocumentReader-blue.svg?style=flat)](https://android-arsenal.com/details/1/8136)
# DocumentReader
This library reads Word documents (.doc and .docx), txt and PDF files, and returns the content of the document as a String. If you have ever tried to read the contents of a PDF or MS Word document on Android, you know how painful it is.
This library makes your work easy.
Dependency for build.gradle (Project level)
```
repositories {
...
maven { url 'https://jitpack.io' }
}
```
Dependency for build.gradle (Module: app)
```
dependencies {
....
implementation 'com.github.Asutosh11:DocumentReader:0.12'
// NOTE: use this only if you get a multidex exception
implementation "androidx.multidex:multidex:2.0.1"
}
```
```
// NOTE: add this inside the android { } block, and only if you get an error like "More than one file was found with OS independent path"
packagingOptions {
exclude 'META-INF/DEPENDENCIES'
exclude 'META-INF/INDEX.LIST'
exclude 'META-INF/spring.handlers'
exclude 'META-INF/spring.schemas'
exclude 'META-INF/cxf/bus-extensions.txt'
}
```
```
// NOTE: use this only if you get a multidex exception
defaultConfig {
...
multiDexEnabled true
}
```
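If your project uses the Gradle Kotlin DSL (`build.gradle.kts`) instead of Groovy, the same setup might look roughly like the sketch below. This is an assumed translation of the Groovy snippets above, not something taken from the library's own documentation. The `packagingOptions` exclusions are left out because their Kotlin DSL syntax differs between Android Gradle Plugin versions.

```kotlin
// Project level (build.gradle.kts or settings.gradle.kts, depending on your setup)
repositories {
    // JitPack hosts the DocumentReader artifact
    maven { url = uri("https://jitpack.io") }
}

// Module level (app/build.gradle.kts)
android {
    defaultConfig {
        // Only needed if you get a multidex exception
        multiDexEnabled = true
    }
}

dependencies {
    implementation("com.github.Asutosh11:DocumentReader:0.12")
    // Only needed if you get a multidex exception
    implementation("androidx.multidex:multidex:2.0.1")
}
```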
How to use it?
```
// Read a PDF file from a Uri
val pdfTextFromUri : String = DocumentReaderUtil.readPdfFromUri(fileUri, applicationContext)
// Read a PDF file from a File
val pdfTextFromFile : String = DocumentReaderUtil.readPdfFromFile(file, applicationContext)
```
```
// Read a .doc file from a Uri
val docTextFromUri : String = DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
// Read a .doc file from a File
val docTextFromFile : String = DocumentReaderUtil.readWordDocFromFile(file, applicationContext)
```
```
// Read a .docx file from a Uri (same calls as for .doc)
val docxTextFromUri : String = DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
// Read a .docx file from a File
val docxTextFromFile : String = DocumentReaderUtil.readWordDocFromFile(file, applicationContext)
```
```
// read a txt file from Uri
val docString : String = DocumentReaderUtil.readTxtFromUri(fileUri, applicationContext)
```
```
/*
Even if you don't know your file type,
this library detects the file mime type and gives you the content of the file as a String
*/
val docString : String = when (DocumentReaderUtil.getMimeType(fileUri, applicationContext)) {
"text/plain" -> DocumentReaderUtil.readTxtFromUri(fileUri, applicationContext)
"application/pdf" -> DocumentReaderUtil.readPdfFromUri(fileUri, applicationContext)
"application/msword" -> DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
"application/vnd.openxmlformats-officedocument.wordprocessingml.document" ->
DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
else -> ""
}
```
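The `when` expression above returns an empty string for unknown MIME types, which makes an unsupported file indistinguishable from an empty document. One way to keep the two cases apart is to classify the MIME type first. The sketch below is plain Kotlin with no Android or library dependency. `DocKind` and `docKindFor` are hypothetical names introduced here for illustration and are not part of DocumentReader. The parameter is nullable as a precaution, since the return type of `getMimeType` is not documented above.

```kotlin
/** The kinds of document the snippets above have a reader call for. (Hypothetical helper type.) */
enum class DocKind { TXT, PDF, WORD, UNSUPPORTED }

/**
 * Maps a MIME type string to a DocKind.
 * Both .doc and .docx map to WORD because the snippets above read them with the same call.
 */
fun docKindFor(mimeType: String?): DocKind = when (mimeType) {
    "text/plain" -> DocKind.TXT
    "application/pdf" -> DocKind.PDF
    "application/msword",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document" -> DocKind.WORD
    else -> DocKind.UNSUPPORTED // also covers a null MIME type
}

// Possible call site (commented out because it needs Android and the library):
// val text: String? = when (docKindFor(DocumentReaderUtil.getMimeType(fileUri, applicationContext))) {
//     DocKind.TXT -> DocumentReaderUtil.readTxtFromUri(fileUri, applicationContext)
//     DocKind.PDF -> DocumentReaderUtil.readPdfFromUri(fileUri, applicationContext)
//     DocKind.WORD -> DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
//     DocKind.UNSUPPORTED -> null // caller can show an "unsupported file" message
// }
```

Returning `null` for unsupported files, rather than `""`, lets the caller decide how to report the problem.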
Thanks
- The Apache Tika project
- Apache's PdfBox port by TomRoush