https://github.com/moskytw/zipcodetw
Find Taiwan ZIP code by address fuzzily.
- Host: GitHub
- URL: https://github.com/moskytw/zipcodetw
- Owner: moskytw
- License: mit
- Created: 2014-01-31T14:08:06.000Z (almost 11 years ago)
- Default Branch: dev
- Last Pushed: 2023-02-15T23:00:08.000Z (almost 2 years ago)
- Last Synced: 2024-09-21T18:39:48.154Z (4 months ago)
- Language: Python
- Homepage:
- Size: 11.7 MB
- Stars: 281
- Watchers: 17
- Forks: 67
- Open Issues: 7
Metadata Files:
- Readme: README.rst
- License: LICENSE.txt
Awesome Lists containing this project
- awesome-tw-foss - Zipcodetw - Find Taiwan ZIP code by address fuzzily. (Lib/Framework/API/Dev Tool)
README
The ZIP Code Finder for Taiwan
==============================

This package lets you find ZIP codes by address in Taiwan.

The main features:

1. Fast. It builds a ZIP code index by tokenization.
2. Gradual. It returns a partial ZIP code rather than nothing when the address
   is not detailed enough.
3. Stand-alone. It depends on nothing.

Usage
-----

Find ZIP code gradually:

.. code-block:: python

    >>> import zipcodetw
    >>> zipcodetw.find('臺北市')
    u'1'
    >>> zipcodetw.find('臺北市信義區')
    u'110'
    >>> zipcodetw.find('臺北市信義區市府路')
    u'110'
    >>> zipcodetw.find('臺北市信義區市府路1號')
    u'11008'

After v0.3, you can even find ZIP codes like:

.. code-block:: python

    >>> zipcodetw.find('松山區')
    u'105'
    >>> zipcodetw.find('秀山街')
    u''
    >>> zipcodetw.find('台北市秀山街')
    u'10042'

Installation
------------

It is available on PyPI:

.. code-block:: bash

    $ sudo pip install zipcodetw

Just install it and have fun. :)
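
Because ``zipcodetw.find`` returns an empty string when nothing matches, a
small wrapper can make misses explicit when looking up several addresses at
once. The sketch below is only an illustration: ``lookup_all`` is a
hypothetical helper, not part of the package, and it accepts any ``find``-like
callable so it can be tried without the package installed.

.. code-block:: python

    def lookup_all(addresses, find):
        """Map each address to its ZIP code, or to None when ``find`` returns ''.

        ``find`` is any callable that behaves like ``zipcodetw.find``
        (address in, possibly partial ZIP code string out).
        """
        return {address: (find(address) or None) for address in addresses}

With the package installed, ``lookup_all(addresses, zipcodetw.find)`` should
behave the same way, returning partial codes for less detailed addresses.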

Build Index Manually
--------------------

If you install it by ``pip`` or ``python setup.py install``, a ZIP code index
will be built automatically. But if you want to use it from source code, you
have to build an index manually:

.. code-block:: bash

    $ python -m zipcodetw.builder
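
As a rough illustration of what a tokenized, gradual index could look like,
here is a toy sketch. It is not the package's actual builder or storage
format: ``ROWS``, ``build_index`` and ``find`` are hypothetical names, and the
placeholder tokens stand in for real address tokens. Every leading run of
tokens is mapped to the ZIP code prefix shared by all entries below it, so a
less detailed address yields a shorter code.

.. code-block:: python

    from os.path import commonprefix

    # Hypothetical toy directory: (address tokens, ZIP code).
    ROWS = [
        (('city-a', 'district-x', 'road-1', 'no-1'), '11008'),
        (('city-a', 'district-x', 'road-1', 'no-2'), '11010'),
        (('city-a', 'district-y', 'road-2'), '10042'),
    ]


    def build_index(rows):
        """Map each leading token sequence to the ZIP prefix shared below it."""
        groups = {}
        for tokens, zipcode in rows:
            tokens = tuple(tokens)
            for end in range(1, len(tokens) + 1):
                groups.setdefault(tokens[:end], []).append(zipcode)
        # commonprefix compares strings character by character.
        return {key: commonprefix(codes) for key, codes in groups.items()}


    def find(index, tokens):
        """Return the ZIP prefix for the longest known leading token sequence."""
        tokens = tuple(tokens)
        for end in range(len(tokens), 0, -1):
            if tokens[:end] in index:
                return index[tokens[:end]]
        return ''

In this toy version, an address whose first token is unknown simply returns an
empty string; the real package also indexes middle tokens (see v0.3 in the
changelog), which this sketch does not attempt.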

Data
----

The ZIP code directory is provided by Chunghwa Post, and is available from:
http://www.post.gov.tw/post/internet/Download/all_list.jsp?ID=2201#dl_txt_s_A0206

Changelog
---------

v0.6.5
~~~~~~

1. Updated to the 3+3 v2102.01 data.
2. Fixed a Python 3 bug, maybe.

v0.6.2–0.6.4
~~~~~~~~~~~~

1. A black hole ate the logs.

v0.6.1
~~~~~~

1. Fixed the py2/py3 compatibility. Thanks to Poren Chiang and Ryan for the
   contributions.

v0.6
~~~~

1. Updated the data to 2014/12.

v0.5.7
~~~~~~

1. Fixed a rare issue that caused an IndexError.

v0.5.6
~~~~~~

1. Reverted the removal of insignificant tokens introduced in v0.5.4.
2. It now handles insignificant tokens; and
3. redundant units in the finding logic (``directory.find``).
4. Allowed a number token to end without a unit.
5. Now ``address.tokens`` is a list.

v0.5.5
~~~~~~

1. Fixed a gradual matching issue causing some wrong results.

v0.5.4
~~~~~~

1. Automatically removed tokens whose unit is insignificant.

v0.5.3
~~~~~~

1. Fixed and simplified the matching logic for the address tail.
2. Refined the index building logic.

v0.5.2
~~~~~~

1. Fixed an issue that occurred when running in a multi-threaded environment.
2. Added a new argument, ``keep_alive``, for the ``Directory`` class.

v0.5.1
~~~~~~

1. Refined the code slightly.

v0.5
~~~~

1. It now builds a ZIP code index when you install it; so
2. the package size is 12.5x smaller.
3. The internal API is better now.

v0.4
~~~~

1. It now ships with an index compiled in SQLite; so
2. initialization time is ~680x faster, i.e. ~30ms each import; and
3. ``zipcodetw.find`` is ~1.9x slower, i.e. ~2ms each call; and
4. it has a bigger package size.
5. All code was moved into the ``zipcodetw`` package.
6. ``zipcodetw.find`` now returns unicode instead of string.

v0.3
~~~~

1. It builds a full index for middle tokens; and
2. also normalizes Chinese numerals now!
3. ``zipcodetw.find`` is ~1.06x faster.
4. But initialization time increases to ~1.7x.

v0.2
~~~~

1. ``zipcodetw.find`` is 8x faster now!
2. It has a better tokenizing logic; and
3. a better matching logic for sub-number now.
4. ``zipcodetw.find_zipcodes`` was removed.
5. Internal API was changed a lot.
6. The tests are better now.