https://github.com/yangxuhui/urlp
A command line url parser, written in Python
command-line-tool url-parser
- Host: GitHub
- URL: https://github.com/yangxuhui/urlp
- Owner: yangxuhui
- Created: 2019-09-19T04:06:40.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2019-10-14T12:26:07.000Z (over 6 years ago)
- Last Synced: 2025-07-29T18:23:44.329Z (11 months ago)
- Topics: command-line-tool, url-parser
- Language: Python
- Size: 8.79 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# urlp
A simple command-line utility for parsing URLs, written in Python. Inspired by [urlp](https://github.com/clayallsopp/urlp).
```bash
$ urlp --host "http://www.cnn.com/service/alert.jsp?s=cnn&v=a"
www.cnn.com
$ urlp --registered_domain "http://www.cnn.com/service/alert.jsp?s=cnn&v=a"
cnn.com
$ urlp --path "http://www.cnn.com/service/alert.jsp?s=cnn&v=a"
/service/alert.jsp
$ urlp --path -i 0 "http://www.cnn.com/service/alert.jsp?s=cnn&v=a"
service
$ urlp --query "http://www.cnn.com/service/alert.jsp?s=cnn&v=a"
s=cnn&v=a
$ urlp --query --query_field=s "http://www.cnn.com/service/alert.jsp?s=cnn&v=a"
cnn
```
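For readers who want the same fields inside a Python script, the following is a minimal sketch built on the standard library's `urllib.parse`. It is not urlp's own code: the function name `url_parts` and the dictionary keys are hypothetical, and the registered-domain logic is a naive "last two labels" heuristic that gives wrong answers for suffixes such as `co.uk` (a public suffix list is needed for correct results).

```python
from urllib.parse import urlsplit, parse_qs


def url_parts(url):
    """Split a URL into the fields shown in the urlp examples (illustrative only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Path segments, dropping the empty strings produced by leading/trailing "/".
    segments = [s for s in parts.path.split("/") if s]
    # ASSUMPTION: naive heuristic, last two host labels. Not public-suffix aware.
    registered_domain = ".".join(host.split(".")[-2:])
    return {
        "host": host,
        "registered_domain": registered_domain,
        "path": parts.path,
        "path_segments": segments,
        "query": parts.query,
        # parse_qs maps each field name to a list of its values.
        "query_fields": parse_qs(parts.query),
    }


parts = url_parts("http://www.cnn.com/service/alert.jsp?s=cnn&v=a")
print(parts["host"])                  # www.cnn.com
print(parts["registered_domain"])     # cnn.com
print(parts["path"])                  # /service/alert.jsp
print(parts["path_segments"][0])      # service
print(parts["query"])                 # s=cnn&v=a
print(parts["query_fields"]["s"][0])  # cnn
```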
urlp reads URLs from standard input, so it combines well with other Unix command-line tools. For example:
* Count the hosts in a file of URLs, most frequent first.
```bash
cat urlfile | urlp --host | sort | uniq -c | sort -nr -k1,1
```
* Count the URL path segments (the words separated by "/"), most frequent first.
```bash
cat urlfile | urlp --path | tr / \\n | awk '$1!=""' | sort | uniq -c | sort -nr -k1,1
```
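The two pipelines above can also be approximated in pure Python. This is a rough sketch under stated assumptions, not part of urlp: the helper names `count_hosts` and `count_path_words` are hypothetical, and the input is assumed to be an iterable of URL strings, one per line, like the contents of `urlfile`.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_hosts(urls):
    """Return (host, count) pairs, most frequent first."""
    counts = Counter()
    for url in urls:
        host = urlsplit(url.strip()).hostname
        if host:  # skip blank lines and URLs without a host
            counts[host] += 1
    return counts.most_common()


def count_path_words(urls):
    """Return (path segment, count) pairs, most frequent first."""
    counts = Counter()
    for url in urls:
        path = urlsplit(url.strip()).path
        # Drop empty segments, like awk '$1!=""' in the shell pipeline.
        counts.update(s for s in path.split("/") if s)
    return counts.most_common()


sample = [
    "http://www.cnn.com/service/alert.jsp?s=cnn&v=a",
    "http://www.cnn.com/service/news",
    "http://example.org/service/",
]
print(count_hosts(sample))
print(count_path_words(sample))
```

To run it over a file, pass an open file object, for example `count_hosts(open("urlfile"))`.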
## Install
```bash
pip install urlp
```
## Usage
```
$ urlp --help
usage: urlp [-h] [--host] [-p] [-i path_index] [-q] [-k query_field] [-r]
            [urls [urls ...]]

A command line url parser

positional arguments:
  urls                  URLs to parse

optional arguments:
  -h, --help            show this help message and exit
  --host                hostname
  -p, --path            Path
  -i path_index, --path_index path_index
                        filter parsed path by index
  -q, --query           query string
  -k query_field, --query_field query_field
                        value for the specified query field
  -r, --registered_domain
                        registered domain
```
```
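An interface with the same flags could be declared with the standard library's `argparse` roughly as follows. This is a sketch reconstructed from the help text above, not urlp's actual source: the `build_parser` name, the choice of `store_true` for the flag options, and `type=int` for the path index are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options listed in `urlp --help` (illustrative)."""
    parser = argparse.ArgumentParser(
        prog="urlp", description="A command line url parser"
    )
    # Zero or more URLs; an empty list would mean "read from standard input".
    parser.add_argument("urls", nargs="*", help="URLs to parse")
    parser.add_argument("--host", action="store_true", help="hostname")
    parser.add_argument("-p", "--path", action="store_true", help="Path")
    # ASSUMPTION: the index is parsed as an integer.
    parser.add_argument("-i", "--path_index", type=int, metavar="path_index",
                        help="filter parsed path by index")
    parser.add_argument("-q", "--query", action="store_true", help="query string")
    parser.add_argument("-k", "--query_field", metavar="query_field",
                        help="value for the specified query field")
    parser.add_argument("-r", "--registered_domain", action="store_true",
                        help="registered domain")
    return parser


args = build_parser().parse_args(
    ["--path", "-i", "0", "http://www.cnn.com/service/alert.jsp?s=cnn&v=a"]
)
print(args.path, args.path_index, args.urls)
```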