{"id":13600690,"url":"https://github.com/xhochy/fletcher","last_synced_at":"2025-10-21T19:54:13.314Z","repository":{"id":54431319,"uuid":"123808274","full_name":"xhochy/fletcher","owner":"xhochy","description":"Pandas ExtensionDType/Array backed by Apache Arrow","archived":true,"fork":false,"pushed_at":"2023-02-22T15:17:01.000Z","size":562,"stargazers_count":229,"open_issues_count":0,"forks_count":33,"subscribers_count":16,"default_branch":"master","last_synced_at":"2025-03-14T04:19:11.330Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://fletcher.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xhochy.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2018-03-04T16:44:22.000Z","updated_at":"2024-07-13T08:49:51.000Z","dependencies_parsed_at":"2024-01-14T01:07:27.834Z","dependency_job_id":"dfa555c0-c6c6-45aa-ae10-765b627e4eff","html_url":"https://github.com/xhochy/fletcher","commit_stats":{"total_commits":402,"total_committers":27,"mean_commits":14.88888888888889,"dds":0.7164179104477613,"last_synced_commit":"afb2395e4ee764ee5f9dee9c797463f56a55a59d"},"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xhochy%2Ffletcher","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xhochy%2Ffletcher/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xhochy%2Ffletcher/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xhochy
%2Ffletcher/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xhochy","download_url":"https://codeload.github.com/xhochy/fletcher/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248322410,"owners_count":21084334,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T18:00:46.681Z","updated_at":"2025-10-21T19:54:12.974Z","avatar_url":"https://github.com/xhochy.png","language":"Python","funding_links":[],"categories":["数据容器和结构","Feature Extraction","Libraries"],"sub_categories":["Text/NLP"],"readme":"# fletcher\n\n![CI](https://github.com/xhochy/fletcher/workflows/CI/badge.svg)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)\n[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/xhochy/fletcher/master)\n\nA library that provides a generic set of Pandas ExtensionDType/Array\nimplementations backed by Apache Arrow. 
They support a wider range of types\nthan Pandas natively supports and also bring a different set of constraints and\nbehaviours that are beneficial in many situations.\n\n# 🗃️ Archived successfully 🤘\n\nThis project has been archived, as development ceased around 2021.\nWith the support of [Apache Arrow-backed extension arrays in `pandas`](https://github.com/pandas-dev/pandas/pull/35259), the major goal of this project has been fulfilled.\nAs Marc Garcia outlines in his blog post [\"pandas 2.0 and the Arrow revolution (part I)\"](https://datapythonista.me/blog/pandas-20-and-the-arrow-revolution-part-i), Apache Arrow support in `pandas` is now generally available and here to stay.\n`fletcher` has hopefully uncovered some bugs along the way and given inspiration to the implementation that is now in `pandas`.\n\n## Usage\n\nTo use `fletcher` in Pandas DataFrames, all you need to do is wrap your data\nin a `FletcherChunkedArray` or `FletcherContinuousArray` object. Your data can\nbe either a `pyarrow.Array`, a `pyarrow.ChunkedArray`, or a type that can be passed\nto `pyarrow.array(…)`.\n\n\n```\nimport fletcher as fr\nimport pandas as pd\n\ndf = pd.DataFrame({\n    'str_chunked': fr.FletcherChunkedArray(['a', 'b', 'c']),\n    'str_continuous': fr.FletcherContinuousArray(['a', 'b', 'c']),\n})\n\ndf.info()\n\n# \u003cclass 'pandas.core.frame.DataFrame'\u003e\n# RangeIndex: 3 entries, 0 to 2\n# Data columns (total 2 columns):\n#  #   Column          Non-Null Count  Dtype                      \n# ---  ------          --------------  -----                      \n#  0   str_chunked     3 non-null      fletcher_chunked[string]   \n#  1   str_continuous  3 non-null      fletcher_continuous[string]\n# dtypes: fletcher_chunked[string](1), fletcher_continuous[string](1)\n# memory usage: 166.0 bytes\n```\n\n## Development\n\nWhile you can use `fletcher` in pip-based environments, we strongly recommend\nusing a `conda`-based development setup with packages from 
`conda-forge`.\n\n```\n# Create the conda environment with all necessary dependencies\nconda env create\n\n# Activate the newly created environment\nconda activate fletcher\n\n# Install fletcher into the current environment\npython -m pip install -e . --no-build-isolation --no-use-pep517\n\n# Run the unit tests (you should do this several times during development)\npy.test -nauto\n\n# Install pre-commit hooks\n# These will then be automatically run on every commit and ensure that files\n# are black formatted, have no flake8 issues and mypy checks the type consistency.\npre-commit install\n```\n\nCode formatting is done using black. This should keep everything in a\nconsistent styling and the formatting is automatically adjusted via the\npre-commit hooks.\n\n### Using pandas in development mode\n\nTo test and develop against pandas' master or your local fixes, you can install a development version of pandas using:\n\n```\ngit clone https://github.com/pandas-dev/pandas\ncd pandas\n\n# Install additional pandas dependencies\nconda install -y cython\n\n# Build and install pandas\npython setup.py build_ext --inplace -j 4\npython -m pip install -e . 
--no-build-isolation --no-use-pep517\n```\n\nThis links the development version of `pandas` into your `fletcher` conda environment.\nIf you change any Python code in pandas, the change is directly reflected in your environment.\nIf you change any Cython code in pandas, you need to re-execute `python setup.py build_ext --inplace -j 4`.\n\n### Using (py)arrow nightlies\n\nTo test and develop against the latest development version of Apache Arrow (`pyarrow`), you can install it from the `arrow-nightlies` conda channel:\n\n```\nconda install -c arrow-nightlies arrow-cpp pyarrow\n```\n\n### Benchmarks\n\nIn `benchmarks/` we provide a set of benchmarks to compare the performance of\n`fletcher` against `pandas` and ensure that `fletcher` itself stays performant.\nThe benchmarks are written using\n[airspeed velocity](https://asv.readthedocs.io/en/stable/). When developing\nthe benchmarks, you can use `asv dev` to run each of them only once (use\n`-b \u003cpattern\u003e` to run only a selection of them). To get real benchmark values, use\n`asv run --python=same`, which runs the benchmarks multiple times and yields\nmeaningful average runtimes.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxhochy%2Ffletcher","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxhochy%2Ffletcher","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxhochy%2Ffletcher/lists"}