https://github.com/rushter/find_domains
Library to search for domain names in text data
- Host: GitHub
- URL: https://github.com/rushter/find_domains
- Owner: rushter
- Created: 2020-08-14T10:35:29.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2020-09-14T13:12:37.000Z (over 4 years ago)
- Last Synced: 2025-04-18T02:47:54.208Z (18 days ago)
- Language: Python
- Size: 12.7 KB
- Stars: 4
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## find_domains Documentation
This library searches for domain names in raw text data. First, it finds domain-like strings
using a simple regular expression. Then it uses a list of top-level domains (TLDs) to remove
strings that cannot be domain names, i.e. those whose last segment is not a top-level domain.
The TLD list is provided by the [tldextract](https://github.com/john-kurkowski/tldextract)
library. Technically, this means that the first time you use `find_domains`, it will download
the list of top-level domains (this is tldextract behaviour).
## Installation
`pip install -U find_domains`
## Usage
```
from find_domains import find_domains

data = """
foo bar google.com foo.bar.com domain.info
превед-медвед.рф
"""

for domain in find_domains(data):
    print(domain)
```
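The two-step approach described in the documentation (a simple regexp followed by a TLD check) can be illustrated with a rough sketch. This is not the library's actual implementation: the pattern `DOMAIN_LIKE`, the hard-coded `KNOWN_TLDS` set, and the function name `find_domain_candidates` are all assumptions made for illustration. The real library takes its TLD list from tldextract instead of a fixed set.

```python
import re

# Assumption: a tiny hard-coded TLD set standing in for the full list
# that the real library obtains from tldextract.
KNOWN_TLDS = {"com", "info", "org", "net", "рф"}

# Assumption: a simple pattern for domain-like strings, i.e. two or more
# dot-separated labels made of word characters and hyphens.
DOMAIN_LIKE = re.compile(r"[\w-]+(?:\.[\w-]+)+")


def find_domain_candidates(text, tlds=KNOWN_TLDS):
    """Yield domain-like strings whose last segment is a known TLD."""
    for match in DOMAIN_LIKE.finditer(text):
        candidate = match.group(0).lower()
        # Step 2: keep the candidate only if its last segment is a TLD.
        last_segment = candidate.rsplit(".", 1)[-1]
        if last_segment in tlds:
            yield candidate
```

With this sketch, a string such as `file.txt` matches the pattern but is dropped in the second step, because `txt` is not in the TLD set, while `google.com` and `превед-медвед.рф` are kept.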