https://github.com/gnames/gndoc
- Host: GitHub
- URL: https://github.com/gnames/gndoc
- Owner: gnames
- License: MIT
- Created: 2021-04-28T14:00:01.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2022-10-13T15:36:48.000Z (over 3 years ago)
- Last Synced: 2023-11-30T11:05:31.770Z (over 2 years ago)
- Language: Go
- Size: 4.8 MB
- Stars: 1
- Watchers: 5
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# GNdoc
GNdoc is a Go library that extracts the content of a wide variety of file
formats as UTF-8 encoded text.
## Install
```bash
go get github.com/gnames/gndoc
```
## Usage
```go
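package gndoc_test

import (
	"fmt"
	"log"
	"path/filepath"
	"strings"

	"github.com/gnames/gndoc"
)

// tikaURL is not defined in the original snippet. The value below is an
// assumption: an Apache Tika server listening on its default port.
const tikaURL = "http://localhost:9998"

func Example() {
	gnd := gndoc.New(tikaURL)

	// Extract text from a PDF. The second argument is false here and true
	// for the UTF-8 text file below.
	path := filepath.Join("testdata", "file.pdf")
	txt, _, err := gnd.TextFromFile(path, false)
	if err != nil {
		log.Fatal(err)
	}
	hasText := strings.Contains(txt, "sabana de Bogotá")
	fmt.Printf("%v\n", hasText)

	// Read a file that is already UTF-8 text.
	path = filepath.Join("testdata", "utf8.txt")
	txt, _, err = gnd.TextFromFile(path, true)
	if err != nil {
		log.Fatal(err)
	}
	hasText = strings.Contains(txt, "Holarctic genus")
	fmt.Printf("%v\n", hasText)

	// Fetch a URL and extract its text.
	url := "https://example.org"
	txt, _, err = gnd.TextFromURL(url)
	if err != nil {
		log.Fatal(err)
	}
	hasText = strings.Contains(txt, "Example")
	fmt.Printf("%v\n", hasText)

	// Output:
	// true
	// true
	// true
}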
```
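
In the example above, the last argument to `TextFromFile` is `false` for the PDF and `true` for `utf8.txt`, which suggests it marks input that is already plain text. If that reading is right, a caller may want to choose the flag automatically. The sketch below is one possible way to do so using only the standard library; `looksLikePlainText` is a hypothetical helper, not part of GNdoc, and the heuristic (valid UTF-8 with no NUL bytes) is an assumption rather than the library's own detection logic.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// looksLikePlainText is a hypothetical helper (not part of gndoc).
// It reports whether data is valid UTF-8 and contains no NUL bytes,
// which is a rough signal that no format conversion is needed.
func looksLikePlainText(data []byte) bool {
	return utf8.Valid(data) && bytes.IndexByte(data, 0) < 0
}

func main() {
	// Valid UTF-8 text, including a non-ASCII character.
	fmt.Println(looksLikePlainText([]byte("sabana de Bogotá"))) // true

	// The same word in Latin-1: the lone 0xE1 byte is not valid UTF-8.
	fmt.Println(looksLikePlainText([]byte("Bogot\xe1"))) // false

	// A PDF-like header followed by binary bytes.
	fmt.Println(looksLikePlainText([]byte("%PDF-1.4\x00\xff"))) // false
}
```

The result could then be passed as the second argument to `TextFromFile` after reading the first few kilobytes of a file. Text in other encodings would fail this check, so it still needs conversion.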