https://github.com/anvie/dotext
Simple Document File Text Extraction Library for Rust
- Host: GitHub
- URL: https://github.com/anvie/dotext
- Owner: anvie
- License: mit
- Created: 2017-11-24T13:23:04.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2024-04-24T13:14:39.000Z (9 months ago)
- Last Synced: 2025-01-09T15:52:46.131Z (12 days ago)
- Language: Rust
- Homepage:
- Size: 1.35 MB
- Stars: 55
- Watchers: 4
- Forks: 14
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Document File Text Extractor
=============================

[![Build Status](https://travis-ci.org/anvie/dotext.svg?branch=master)](https://travis-ci.org/anvie/dotext)
[![Build status](https://ci.appveyor.com/api/projects/status/rghm59ie4ax9655t?svg=true)](https://ci.appveyor.com/project/anvie/dotext)
[![Crates.io](https://img.shields.io/crates/v/dotext.svg)](https://crates.io/crates/dotext)

A simple Rust library for extracting readable text from specific document formats, such as Word documents (docx).
Currently only a few formats are supported; others are coming soon.

Supported Documents
--------------------------

- [x] Microsoft Word (docx)
- [x] Microsoft Excel (xlsx)
- [x] Microsoft PowerPoint (pptx)
- [x] OpenOffice Writer (odt)
- [x] OpenOffice Spreadsheet (ods)
- [x] OpenDocument Presentation (odp)

Usage
------

```rust
// Assumption: the glob import brings `Docx` and the trait providing `open` into scope.
use dotext::*;
use std::io::Read;

fn main() {
    let mut file = Docx::open("samples/sample.docx").unwrap();
    let mut isi = String::new();
    let _ = file.read_to_string(&mut isi);
    println!("CONTENT:");
    println!("----------BEGIN----------");
    println!("{}", isi);
    println!("----------EOF----------");
}
```

Test
-----

```bash
$ cargo test
```

Or run the example:

```bash
$ cargo run --example readdocx data/sample.docx
```

[] Robin Sy.
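
Since the library handles several formats, a caller will usually need to decide which one a given file is before opening it. The following is a minimal sketch of one way to do that from the file extension, using only the standard library. `DocKind` and `detect_kind` are hypothetical names introduced for illustration and are not part of dotext; the sketch only classifies a path and does not call the crate.

```rust
use std::path::Path;

/// Hypothetical enum mirroring the formats in the supported list (not a dotext type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum DocKind {
    Docx,
    Xlsx,
    Pptx,
    Odt,
    Ods,
    Odp,
}

/// Map a path's extension (case-insensitive) to a supported format.
/// Returns `None` when the extension is missing or not in the list.
fn detect_kind(path: &str) -> Option<DocKind> {
    let ext = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "docx" => Some(DocKind::Docx),
        "xlsx" => Some(DocKind::Xlsx),
        "pptx" => Some(DocKind::Pptx),
        "odt" => Some(DocKind::Odt),
        "ods" => Some(DocKind::Ods),
        "odp" => Some(DocKind::Odp),
        _ => None,
    }
}

fn main() {
    for p in ["samples/sample.docx", "report.ODS", "notes.txt"] {
        match detect_kind(p) {
            Some(kind) => println!("{}: {:?}", p, kind),
            None => println!("{}: unsupported", p),
        }
    }
}
```

Matching on the extension is only a heuristic: it does not confirm that the file really is a valid document of that type, so errors from opening the file still need to be handled.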