https://github.com/bridgeconn/anyfile_to_text

# AnyFile_to_Text

This script converts files of the supported types to plain-text (.txt) format.

### Here are some of the formats supported:
1. Microsoft Office OLE 2 and Office Open XML Formats (.doc, .docx, .xls, .xlsx, .ppt, .pptx)
2. OpenOffice.org OpenDocument Formats (.odt, .ods, .odp)
3. Apple iWork Formats
4. Rich Text Format (.rtf)
5. Portable Document Format (.pdf)
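
As a rough illustration of the list above, the following Ruby sketch checks whether a file name carries one of the listed extensions. The constant `SUPPORTED_EXTENSIONS` and the helper `listed_format?` are hypothetical names, not part of the script, and the Apple iWork extensions are omitted because the list does not name them.

```ruby
# Extensions copied from the format list above. Apple iWork extensions are
# not named there, so they are left out rather than guessed.
SUPPORTED_EXTENSIONS = %w[
  .doc .docx .xls .xlsx .ppt .pptx
  .odt .ods .odp
  .rtf .pdf
].freeze

# Hypothetical helper: true if the file name ends in a listed extension.
# The comparison is case-insensitive.
def listed_format?(path)
  SUPPORTED_EXTENSIONS.include?(File.extname(path).downcase)
end

puts listed_format?("report.DOCX")  # true
puts listed_format?("notes.md")     # false
```

A check like this only looks at the file name; it says nothing about whether the file content is actually valid for that format.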

## DEPENDENCIES
Requires a working Java 7 JRE. Install it before running the script: [JRE Download Link](http://openjdk.java.net/install/)
1. **Debian, Ubuntu, etc.: run from a terminal $ sudo apt-get install openjdk-7-jre**
2. **Fedora, Oracle Linux, Red Hat Enterprise Linux, etc.: $ su -c "yum install java-1.7.0-openjdk"**
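
To confirm that a Java runtime is installed before running the script, a small check along these lines may help. This is a minimal sketch, not part of the script: `java_major_version` and `installed_java_major` are hypothetical helper names, and the parsing assumes the usual `java -version` output such as `java version "1.7.0_79"`.

```ruby
require "open3"

# Hypothetical helper: extract the major Java version from `java -version` text.
# Handles the legacy "1.7.0_79" scheme and the newer "11.0.2" scheme.
def java_major_version(version_text)
  match = version_text.match(/version "(\d+)(?:\.(\d+))?/)
  return nil unless match

  first = match[1].to_i
  first == 1 && match[2] ? match[2].to_i : first
end

# Hypothetical helper: ask the `java` binary on the PATH for its version.
# `java -version` prints to stderr, so both output streams are captured.
def installed_java_major
  output, _status = Open3.capture2e("java", "-version")
  java_major_version(output)
rescue Errno::ENOENT
  nil # java is not on the PATH
end

major = installed_java_major
if major.nil?
  puts "No Java runtime found; install a JRE first."
else
  puts "Found Java #{major}."
end
```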

### Follow these steps to convert files
1. Put your file in the folder.
2. You can convert a single file or multiple files.
3. The output is written to the current folder, where the script is located.
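
The steps above can be pictured with the following Ruby sketch, which lists the files in an input folder and derives a `.txt` output path for each. It is an illustration under assumptions, not the code in `doc_to_text.rb`: the helpers `input_files` and `output_path_for` are hypothetical, and defaulting the output folder to the current working directory is one reading of step 3.

```ruby
# Hypothetical helper: list the regular files directly inside the input folder.
def input_files(folder)
  Dir.children(folder).sort
     .map { |name| File.join(folder, name) }
     .select { |path| File.file?(path) }
end

# Hypothetical helper: map an input file to a .txt path in the output folder.
# Assumption: the output folder defaults to the current working directory.
def output_path_for(input_path, output_dir = Dir.pwd)
  base = File.basename(input_path, File.extname(input_path))
  File.join(output_dir, "#{base}.txt")
end

puts output_path_for("input/report.docx", "/tmp/out")  # /tmp/out/report.txt
```

Note that two inputs sharing a base name, such as `report.docx` and `report.pdf`, would map to the same output path in this sketch.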

### Make the script executable with the following command from a console/terminal
$ chmod +x doc_to_text.rb

### Try the example file with the following command:
$ ./doc_to_text.rb

Developed by Uday Kumar, [Bridge Connectivity Solutions](http://bridgeconn.com/)