https://github.com/v-braun/hero-scrape
Find the hero (main) image of an URL
crawler fastimage hero hero-image opengraph webscraping
- Host: GitHub
- URL: https://github.com/v-braun/hero-scrape
- Owner: v-braun
- License: mit
- Created: 2018-12-01T19:52:59.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2018-12-24T22:53:10.000Z (almost 6 years ago)
- Last Synced: 2024-06-20T08:10:49.784Z (5 months ago)
- Topics: crawler, fastimage, hero, hero-image, opengraph, webscraping
- Language: Go
- Homepage:
- Size: 80.1 KB
- Stars: 3
- Watchers: 3
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# hero-scrape
> Find the hero (main) image of an URL

[![Build Status](https://travis-ci.org/v-braun/hero-scrape.svg?branch=master)](https://travis-ci.org/v-braun/hero-scrape)
[![codecov](https://codecov.io/gh/v-braun/hero-scrape/branch/master/graph/badge.svg)](https://codecov.io/gh/v-braun/hero-scrape)

By [v-braun - viktor-braun.de](https://viktor-braun.de).
## Demo
See a demo on https://hero-scrape.viktor-braun.de

## Description
hero-scrape extracts the main image of a webpage.
It uses different strategies to find the main image (OpenGraph HTML tags and heuristic search).
You can use the existing strategies or implement your own.

To find the "biggest" image, the candidate images have to be downloaded. [fastimage](https://github.com/rubenfonseca/fastimage/) is well suited to that job because it fetches as little of each image as needed to determine its size.
## Installation
```bash
go get github.com/v-braun/hero-scrape
```

## Usage
**With preconfigured strategies**
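```go
// Fetch the page whose hero image you want to find (error handling omitted for brevity).
pageUrl, _ := url.Parse("https://github.com/v-braun/hero-scrape")
res, _ := http.Get(pageUrl.String())
defer res.Body.Close()

// Run the built-in strategies against the response body.
result, _ := heroscrape.Scrape(pageUrl, res.Body)
fmt.Println(result.Image)
```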
**With custom strategies**
```go
pageUrl, _ := url.Parse("https://github.com/v-braun/hero-scrape")
res, _ := http.Get(pageUrl.String())
defer res.Body.Close()

// Pass the strategies to try; YourOwnStrategy stands for your own implementation.
result, _ := heroscrape.ScrapeWithStrategy(pageUrl, res.Body, NewOgStrategy(), NewHeuristicStrategy(), YourOwnStrategy())
fmt.Println(result.Image)
```
## Related Projects
- [hero-scrape-web](https://github.com/v-braun/hero-scrape-web) Demo for this lib
- [fastimage](https://github.com/rubenfonseca/fastimage/) Finds the type and/or size of a remote image given its uri, by fetching as little as needed.
- [goquery](https://github.com/PuerkitoBio/goquery) A little like that j-thing, only in Go.

## Known Issues
If you discover any bugs, feel free to create an issue on GitHub, or fork the repository and send me a pull request.

[Issues List](https://github.com/v-braun/hero-scrape/issues).
## Authors
![image](https://avatars3.githubusercontent.com/u/4738210?v=3&s=50)
[v-braun](https://github.com/v-braun/)

## Contributing
1. Fork it
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create new Pull Request

## License
See [LICENSE](https://github.com/v-braun/hero-scrape/blob/master/LICENSE).