{"id":25861230,"url":"https://github.com/mu373/tailestim","last_synced_at":"2026-04-02T15:28:20.908Z","repository":{"id":280036436,"uuid":"940830655","full_name":"mu373/tailestim","owner":"mu373","description":"Estimate tail parameters of heavy-tailed distributions (including power law exponent gamma) in Python","archived":false,"fork":false,"pushed_at":"2025-10-03T00:19:32.000Z","size":1340,"stargazers_count":1,"open_issues_count":4,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-03T02:35:24.928Z","etag":null,"topics":["conda-forge","extreme-value-theory","heavy-tailed-distributions","network-science","power-law","powerlaw","python","scale-free","scale-free-networks"],"latest_commit_sha":null,"homepage":"https://tailestim.readthedocs.io/en/latest/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mu373.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-02-28T21:52:13.000Z","updated_at":"2025-10-03T00:19:36.000Z","dependencies_parsed_at":null,"dependency_job_id":"94370b54-e6f7-4c25-99ab-2da933b312c6","html_url":"https://github.com/mu373/tailestim","commit_stats":null,"previous_names":["mu373/tailestim"],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/mu373/tailestim","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mu373%2Ftailestim","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mu373%2Ft
ailestim/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mu373%2Ftailestim/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mu373%2Ftailestim/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mu373","download_url":"https://codeload.github.com/mu373/tailestim/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mu373%2Ftailestim/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29708363,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-22T05:59:28.568Z","status":"ssl_error","status_checked_at":"2026-02-22T05:58:46.208Z","response_time":110,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["conda-forge","extreme-value-theory","heavy-tailed-distributions","network-science","power-law","powerlaw","python","scale-free","scale-free-networks"],"created_at":"2025-03-01T23:33:00.508Z","updated_at":"2026-02-22T09:41:34.579Z","avatar_url":"https://github.com/mu373.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# tailestim\n\n[GitHub](https://github.com/mu373/tailestim) | [PyPI](https://pypi.org/project/tailestim/) | [conda-forge](https://anaconda.org/conda-forge/tailestim) | 
[Documentation](https://tailestim.readthedocs.io/en/latest/)\n\n[![PyPI version](https://img.shields.io/pypi/v/tailestim)](https://pypi.org/project/tailestim/) [![Conda Version](https://img.shields.io/conda/vn/conda-forge/tailestim.svg)](https://anaconda.org/conda-forge/tailestim) [![PyPI status](https://img.shields.io/pypi/status/tailestim)](https://pypi.org/project/tailestim/)  [![Test CI status](https://github.com/mu373/tailestim/actions/workflows/test.yml/badge.svg)](https://github.com/mu373/tailestim/actions/workflows/test.yml) [![conda-forge build status](https://dev.azure.com/conda-forge/feedstock-builds/_apis/build/status/tailestim-feedstock?branchName=main)](https://dev.azure.com/conda-forge/feedstock-builds/_build/latest?definitionId=25102\u0026branchName=main) [![GitHub license](https://img.shields.io/github/license/mu373/tailestim)](https://github.com/mu373/tailestim/blob/main/LICENSE.txt)\n\n\nA Python package for estimating tail parameters of heavy-tailed distributions, including the power-law exponent. Please note that the package is still in **alpha**, so *breaking changes* may be introduced in future updates. For the changelog, please refer to the [releases page](https://github.com/mu373/tailestim/releases).\n\n\u003e [!NOTE]\nThe original estimation implementations are from [ivanvoitalov/tail-estimation](https://github.com/ivanvoitalov/tail-estimation), which is based on the paper [\"Scale-free networks well done\"](https://doi.org/10.1103/PhysRevResearch.1.033034) (Voitalov et al. 2019). 
`tailestim` is a wrapper package that provides a more convenient/modern interface and logging, installable through `pip` and `conda`.\n\n## Features\n- Multiple estimation methods including Hill, Moments, Kernel, Pickands, and Smooth Hill estimators\n- Double-bootstrap procedure for optimal threshold selection\n- Built-in example datasets\n\n## Installation\nThe package can be installed from [PyPI](https://pypi.org/project/tailestim/) and [conda-forge](https://anaconda.org/conda-forge/tailestim).\n```bash\npip install tailestim\nconda install conda-forge::tailestim\n```\n\n## Quick Start\n\n### Using Built-in Datasets\n```python\nfrom tailestim import TailData\nfrom tailestim import HillEstimator, KernelTypeEstimator, MomentsEstimator\n\n# Load a sample dataset\ndata = TailData(name='CAIDA_KONECT').data\n\n# Initialize and fit the Hill estimator\nestimator = HillEstimator()\nestimator.fit(data)\n\n# Get the estimated results\nresult = estimator.get_result()\n\n# Get the power law exponent\ngamma = result.gamma_\n\n# Print full results\nprint(result)\n```\n\n### Using degree sequence from networkx graphs\n```python\nimport networkx as nx\nfrom tailestim import HillEstimator, KernelTypeEstimator, MomentsEstimator\n\n# Create or load your network\nG = nx.barabasi_albert_graph(10000, 2)\ndegree = list(dict(G.degree()).values()) # Degree sequence\n\n# Initialize and fit the Hill estimator\nestimator = HillEstimator()\nestimator.fit(degree)\n\n# Get the estimated results\nresult = estimator.get_result()\n\n# Get the power law exponent\ngamma = result.gamma_\n\n# Print full results\nprint(result)\n```\n\n## Available Estimators\nThe package provides several estimators for tail estimation. 
For details on the parameters each estimator accepts, please refer to the original repository [ivanvoitalov/tail-estimation](https://github.com/ivanvoitalov/tail-estimation), the [original paper](https://doi.org/10.1103/PhysRevResearch.1.033034), or the [source code](https://github.com/mu373/tailestim/blob/main/src/tailestim/tail_methods.py).\n\n1. **Hill Estimator** (`HillEstimator`)\n   - Classical Hill estimator with double-bootstrap for optimal threshold selection\n   - Generally recommended for power-law analysis\n2. **Moments Estimator** (`MomentsEstimator`)\n   - Moments-based estimation with double-bootstrap\n   - More robust to certain types of deviations from a pure power law\n3. **Kernel-type Estimator** (`KernelTypeEstimator`)\n   - Kernel-based estimation with double-bootstrap and bandwidth selection\n4. **Pickands Estimator** (`PickandsEstimator`)\n   - Pickands-based estimation (no bootstrap)\n   - Provides arrays of estimates across different thresholds\n5. **Smooth Hill Estimator** (`SmoothHillEstimator`)\n   - Smoothed version of the Hill estimator (no bootstrap)\n\n## Results\nThe full result can be obtained by `estimator.get_result()`, which returns a `TailEstimatorResult` object. 
This includes attributes such as:\n- `gamma_`: Power law exponent (γ = 1 + 1/ξ)\n- `xi_star_`: Tail index (ξ)\n- `k_star_`: Optimal order statistic\n- Bootstrap results (when applicable):\n  - First and second bootstrap AMSE values\n  - Optimal bandwidths or minimum AMSE fractions\n\n## Example Output\nWhen you `print(result)` after fitting, you will get the following output.\n```\n--------------------------------------------------\nResult\n--------------------------------------------------\nOrder statistics: Array of shape (200,) [1.0000, 1.0000, 1.0000, ...]\nTail index estimates: Array of shape (200,) [1614487461647431761920.0000, 1249994621547387551744.0000, 967791073562264862720.0000, ...]\nOptimal order statistic (k*): 25153\nTail index (ξ): 0.5942\nPower law exponent (γ): 2.6828\nBootstrap Results: \n  First Bootstrap: \n    Fraction of order statistics: None\n    AMSE values: None\n    H Min: 0.9059\n    Maximum index: None\n  Second Bootstrap: \n    Fraction of order statistics: None\n    AMSE values: None\n    H Min: 0.9090\n    Maximum index: None\n```\n\n## Built-in Datasets\n\nThe package includes several example datasets:\n- `CAIDA_KONECT`\n- `Libimseti_in_KONECT`\n- `Pareto` (Follows power-law with $\\gamma=2.5$)\n\nLoad any example dataset using:\n```python\nfrom tailestim import TailData\ndata = TailData(name='dataset_name').data\n```\n\n## Testing\n\nThe package includes comprehensive test suites to ensure correctness and numerical accuracy.\n\n### Running Tests\n\nRun the test suite using pytest:\n```bash\npytest tests/\n```\n\nFor verbose output:\n```bash\npytest tests/ -v\n```\n\n### Test Structure\n\n#### Unit Tests\nLocated in `tests/test_*.py`, these tests verify:\n- Individual estimator functionality (Hill, Moments, Kernel, Pickands)\n- Noise generation and random seed handling\n- Edge cases and error handling\n- Result data structures and attributes\n\n#### Validation Tests\n`tests/test_tailestimation_validation.py` provides cross-package 
validation against the original [tail-estimation](https://github.com/ivanvoitalov/tail-estimation) implementation:\n- Validates numerical equivalence for each estimator (Hill, Moments, Kernel, Pickands)\n- Comprehensive multi-dataset validation across all estimators\n- Reproducibility tests with various random seeds\n- Plot data comparison (PDF, CCDF, bootstrap AMSE)\n\nThe validation tests ensure that `tailestim` produces **identical results** to the original implementation when using the same `base_seed` parameter.\n\n**Example datasets tested:**\n- CAIDA_KONECT (26,475 samples)\n- Libimseti_in_KONECT (168,791 samples)\n- Pareto distributions (synthetic, various sizes)\n- Complete graphs (synthetic): both implementations raise an error, as intended\n\nRun validation tests:\n```bash\npytest tests/test_tailestimation_validation.py -v\n```\n\nRun quick validation (smaller datasets):\n```bash\npytest tests/test_tailestimation_validation.py -k \"quick\" -v\n```\n\n### Interactive Validation\n\nThe [`examples/validation.ipynb`](examples/validation.ipynb) notebook provides an interactive demonstration of the validation process with visualizations comparing `tailestim` and `tail-estimation` outputs side-by-side.\n\n## References\n- I. Voitalov, P. van der Hoorn, R. van der Hofstad, and D. Krioukov. Scale-free networks well done. *Phys. Rev. Res.*, Oct. 2019, doi: [10.1103/PhysRevResearch.1.033034](https://doi.org/10.1103/PhysRevResearch.1.033034).\n- I. Voitalov. `ivanvoitalov/tail-estimation`, GitHub. Mar. 2018. [https://github.com/ivanvoitalov/tail-estimation](https://github.com/ivanvoitalov/tail-estimation).\n\n## Citations\nIf you use `tailestim` in your research or projects, I would greatly appreciate it if you could cite this package, the original implementation, and the original paper (Voitalov et al. 
2019).\n\n```bibtex\n@article{voitalov2019scalefree,\n  title = {Scale-free networks well done},\n  author = {Voitalov, Ivan and van der Hoorn, Pim and van der Hofstad, Remco and Krioukov, Dmitri},\n  journal = {Phys. Rev. Res.},\n  volume = {1},\n  issue = {3},\n  pages = {033034},\n  numpages = {30},\n  year = {2019},\n  month = {Oct},\n  publisher = {American Physical Society},\n  doi = {10.1103/PhysRevResearch.1.033034},\n  url = {https://link.aps.org/doi/10.1103/PhysRevResearch.1.033034}\n}\n\n@software{voitalov2018tailestimation,\n  author       = {Voitalov, Ivan},\n  title        = {tail-estimation},\n  month        = mar,\n  year         = 2018,\n  publisher    = {GitHub},\n  url          = {https://github.com/ivanvoitalov/tail-estimation}\n}\n\n@software{ueda2025tailestim,\n  author       = {Ueda, Minami},\n  title        = {tailestim: A Python package for estimating tail parameters of heavy-tailed distributions},\n  month        = mar,\n  year         = 2025,\n  publisher    = {GitHub},\n  url          = {https://github.com/mu373/tailestim}\n}\n```\n\n## License\n`tailestim` is distributed under the terms of the [MIT license](https://github.com/mu373/tailestim/blob/main/LICENSE.txt).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmu373%2Ftailestim","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmu373%2Ftailestim","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmu373%2Ftailestim/lists"}