https://github.com/internetarchive/surt
Sort-friendly URI Reordering Transform (SURT) python module
- Host: GitHub
- URL: https://github.com/internetarchive/surt
- Owner: internetarchive
- License: agpl-3.0
- Created: 2012-07-17T23:06:58.000Z (over 13 years ago)
- Default Branch: master
- Last Pushed: 2024-07-30T22:01:09.000Z (over 1 year ago)
- Last Synced: 2025-06-25T08:14:52.133Z (7 months ago)
- Language: Python
- Homepage: http://www.archive.org
- Size: 120 KB
- Stars: 42
- Watchers: 20
- Forks: 16
- Open Issues: 15
Metadata Files:
- Readme: README.rst
- License: LICENSE
README
Sort-friendly URI Reordering Transform (SURT) Python package.
Usage:

::

    >>> from surt import surt
    >>> surt("http://archive.org/goo/?a=2&b&a=1")
    'org,archive)/goo?a=1&a=2&b'
    >>> surt("http://archive.org/goo/?a=2&b&a=1", trailing_comma=True)
    'org,archive,)/goo?a=1&a=2&b'
    >>> surt("http://123.456.78.910/goo/?a=2&b&a=1", reverse_ipaddr=False)
    '123.456.78.910)/goo?a=1&a=2&b'
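
To illustrate what the transform in the usage example does, here is a minimal standard-library sketch. ``simple_surt`` is a hypothetical helper, not part of the ``surt`` package. It covers only the steps visible in the outputs above (reversing the host labels, dropping a trailing slash, sorting query parameters). It is assumed to skip everything else the real canonicalizer does, including the IP address handling controlled by ``reverse_ipaddr``.

```python
from urllib.parse import urlsplit


def simple_surt(url, trailing_comma=False):
    """Hypothetical, simplified SURT-like key (NOT the surt package's algorithm)."""
    parts = urlsplit(url)
    # urlsplit lowercases the hostname; reverse its labels and join with commas.
    host = parts.hostname or ""
    key = ",".join(reversed(host.split(".")))
    if trailing_comma:
        key += ","
    # Drop trailing slashes, but keep a bare "/" for the root path.
    path = parts.path.rstrip("/") or "/"
    # Sort the raw query parameters so equivalent URLs yield the same key.
    query = "&".join(sorted(parts.query.split("&"))) if parts.query else ""
    return key + ")" + path + ("?" + query if query else "")


# Sorting the keys groups hosts of the same domain next to each other.
urls = ["http://b.example.com/x", "http://archive.org/a", "http://a.example.com/y"]
for key in sorted(simple_surt(u) for u in urls):
    print(key)
```

Because the most significant part of the host comes first in the key, a plain lexicographic sort places ``a.example.com`` and ``b.example.com`` side by side, which is the "sort-friendly" property the name refers to.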
Installation:

::

    pip install surt

Or install the dev version from git:

::

    pip install git+https://github.com/internetarchive/surt.git#egg=surt
More information about SURTs:
http://crawler.archive.org/articles/user_manual/glossary.html#surt
This is mostly a Python port of the webarchive-commons org.archive.url
package. The original Java version of the org.archive.url package is
here:
https://github.com/iipc/webarchive-commons/tree/master/src/main/java/org/archive/url
This module depends on the ``tldextract`` module to query the Public
Suffix List. ``tldextract`` can be installed via ``pip``.
|Build Status|

.. |Build Status| image:: https://travis-ci.org/internetarchive/surt.svg
   :target: https://travis-ci.org/internetarchive/surt