https://github.com/bfontaine/summarify
Python library to get a title & description for a URL
- Host: GitHub
- URL: https://github.com/bfontaine/summarify
- Owner: bfontaine
- License: mit
- Created: 2018-01-14T19:10:50.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2023-08-16T18:23:17.000Z (almost 3 years ago)
- Last Synced: 2025-04-15T03:15:36.452Z (about 1 year ago)
- Topics: library, python3
- Language: Python
- Homepage:
- Size: 25.4 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# Summarify
**Summarify** is a small Python library to extract a title and description from a Web page.
```python
import summarify
summary = summarify.from_url("https://github.com/")
print(summary.title)
print(summary.description)
print(summary.picture)
```
Output:
```text
The world's leading software development platform · GitHub
GitHub is where people build software. More than 27 million people use GitHub to discover, fork, and contribute to over 75 million projects.
https://assets-cdn.github.com/images/modules/open_graph/github-octocat.png
```
## Install
    pip3 install summarify
## Usage
```python
import summarify
summary = summarify.from_url("https://...")
# If you already have the HTML:
# summary = summarify.from_html("...")
```
The `Summary` object returned from `summarify.from_url` has the following
attributes:
* `title` (`str` or `None`)
* `description` (`str` or `None`)
* `url` (`str` or `None`): The URL you passed as an argument. If you used
  `summarify.from_html`, the library tries to guess the URL from the markup.
* `picture` (`str` or `None`): Picture URL
* `author` (`str` or `None`)
* `publisher` (`str` or `None`)
* `excerpt`: Always `None` for now
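
Since every attribute may be `None`, calling code usually needs a fallback. Here is a minimal sketch of one way to do that. The `display_title` helper is hypothetical (it is not part of Summarify), and a `SimpleNamespace` with made-up values stands in for a real `Summary` so the snippet runs without network access:

```python
from types import SimpleNamespace


def display_title(summary, default="(untitled)"):
    """Return the title, falling back to the URL, then to a default."""
    return summary.title or summary.url or default


# Stand-in object with made-up values; a real Summary would come from
# summarify.from_url(...) or summarify.from_html(...).
page = SimpleNamespace(
    title=None,
    description=None,
    url="https://example.com/",
    picture=None,
)

label = display_title(page)
print(label)
```

The helper only reads the `title` and `url` attributes listed above, so it should work unchanged with a real `Summary` object.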
You can also export a summary as a `dict` for e.g. JSON serialization:
```python
dict(my_summary) # -> {"url": "...", "title": "..."}
```
Be aware that only the non-`None` attributes are included in that dictionary.
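
Because keys for `None` attributes are absent, it is safer to read the exported dictionary with `.get()` than with indexing. The sketch below uses a hand-written dictionary with made-up values as a stand-in for the result of `dict(my_summary)`, and serializes it with the standard `json` module:

```python
import json

# Stand-in for dict(my_summary); the values are made up for illustration.
summary_dict = {"url": "https://example.com/", "title": "Example Domain"}

# "description" was None, so its key is missing: .get() returns None
# where summary_dict["description"] would raise KeyError.
description = summary_dict.get("description")

# The dictionary holds only strings, so it serializes directly.
payload = json.dumps(summary_dict, sort_keys=True)
print(payload)
```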