{"id":19865560,"url":"https://github.com/revelc/pyaccumulo","last_synced_at":"2025-05-02T05:31:42.752Z","repository":{"id":7464771,"uuid":"8811560","full_name":"revelc/pyaccumulo","owner":"revelc","description":"Python Client Library for Apache Accumulo","archived":false,"fork":false,"pushed_at":"2020-08-01T05:29:36.000Z","size":129,"stargazers_count":26,"open_issues_count":12,"forks_count":22,"subscribers_count":16,"default_branch":"main","last_synced_at":"2025-04-19T19:53:40.392Z","etag":null,"topics":["hacktoberfest"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/revelc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-03-16T00:03:42.000Z","updated_at":"2020-10-05T20:31:06.000Z","dependencies_parsed_at":"2022-09-09T02:01:13.165Z","dependency_job_id":null,"html_url":"https://github.com/revelc/pyaccumulo","commit_stats":null,"previous_names":["accumulo/pyaccumulo"],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/revelc%2Fpyaccumulo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/revelc%2Fpyaccumulo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/revelc%2Fpyaccumulo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/revelc%2Fpyaccumulo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/revelc","download_url":"https://codeload.github.com/revelc/pyaccumulo/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","reposit
ories_count":251992906,"owners_count":21677022,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hacktoberfest"],"created_at":"2024-11-12T15:23:12.233Z","updated_at":"2025-05-02T05:31:37.741Z","avatar_url":"https://github.com/revelc.png","language":"Python","readme":"pyaccumulo\n==========\n\nA Python client library for Apache Accumulo\n\nLicensed under the Apache 2.0 License\n\nThis is still a work in progress.  Pull requests are welcome.\n\n## Requirements\n\n1. A running Accumulo cluster\n2. A running Accumulo Thrift Proxy (https://issues.apache.org/jira/browse/ACCUMULO-482).  See https://github.com/accumulo/pyaccumulo/wiki/pyaccumulo-Tutorial for setup details.\n3. The Thrift Python library installed\n\n## Installation\n\n    pip install thrift\n    git clone git@github.com:accumulo/pyaccumulo.git\n\n## Basic Usage\n\n### Creating a connection\n\n    from pyaccumulo import Accumulo, Mutation, Range\n    conn = Accumulo(host=\"my.proxy.hostname\", port=50096, user=\"root\", password=\"secret\")\n\n### Basic Table Operations\n\n    table = \"mytable\"\n    if not conn.table_exists(table):\n        conn.create_table(table)\n\n### Writing Mutations with a BatchWriter (batched and optimized for throughput)\n\n    wr = conn.create_batch_writer(table)\n    for num in range(0, 1000):\n        m = Mutation(\"row_%d\"%num)\n        m.put(cf=\"cf1\", cq=\"cq1\", val=\"%d\"%num)\n        m.put(cf=\"cf2\", cq=\"cq2\", val=\"%d\"%num)\n        wr.add_mutation(m)\n    wr.close()\n\n### Simple writes (immediate and synchronous)\n\n    for num in range(0, 1000):\n        m = Mutation(\"row_%d\"%num)\n        m.put(cf=\"cf1\", cq=\"cq1\", val=\"%d\"%num)\n        m.put(cf=\"cf2\", cq=\"cq2\", val=\"%d\"%num)\n        conn.write(table, m)\n\n### Scanning a Table\n    \n    # scan the entire table\n    for entry in conn.scan(table):\n        print entry.row, entry.cf, entry.cq, entry.cv, entry.ts, entry.val\n\n    # each entry returned by scan() and batch_scan() is a named tuple of (row, cf, cq, cv, ts, val)\n\n    # scan only a portion of the table\n    for entry in conn.scan(table, scanrange=Range(srow='row_1', erow='row_2'), cols=[[\"cf1\"]]):\n        print entry.row, entry.cf, entry.cq, entry.cv, entry.ts, entry.val\n\n### Using a Batch Scanner\n\n    # scan the entire table with 10 threads\n    for entry in conn.batch_scan(table, numthreads=10):\n        print entry.row, entry.cf, entry.cq, entry.cv, entry.ts, entry.val\n    \n## Running the Examples\n\nRun these commands once before running any of the examples.  
\n\n    cd pyaccumulo\n    vi settings.py # change these settings to match your proxy HOST/PORT and USER/PASSWORD\n    export PYTHONPATH=\".\"\n    \nExample of simple ingest and scanning\n\n    python examples/simple.py    \n    \nExample use of Combiners for analytics\n    \n    python examples/analytics.py    \n\nExample use of the Intersecting Iterator for search\n    \n    # index all the files in the pyaccumulo directory\n    $ python examples/intersecting_iterator/ingest.py ii_file_search *\n    Creating table: ii_file_search\n    indexing file examples/analytics.py\n    indexing file examples/regex_search.py\n    indexing file examples/simple.py\n    indexing file examples/indexed_doc_iterator/ingest.py\n    ...\n\n    # Now search the \"ii_file_search\" table for files that contain \"assert_called_with\" and \"assertEquals\"\n    $ python examples/intersecting_iterator/search.py ii_file_search assert_called_with assertEquals\n    tests/core_tests.py\n    tests/iterator_tests.py\n\nExample use of the Document Intersecting Iterator for search.  This indexes the data slightly differently so that the iterator returns the document value itself, rather than requiring a separate fetch.\n    \n    # index all the files in the pyaccumulo directory\n    $ python examples/indexed_doc_iterator/ingest.py dociter_file_search *\n    Creating table: dociter_file_search\n    indexing file examples/analytics.py\n    indexing file examples/regex_search.py\n    indexing file examples/simple.py\n    indexing file examples/indexed_doc_iterator/ingest.py\n    ...\n\n    # Now search the \"dociter_file_search\" table for files that contain \"hashlib\" and \"search_terms\"\n    $ python examples/indexed_doc_iterator/search.py dociter_file_search hashlib search_terms\n    examples/indexed_doc_iterator/search.py\n    examples/intersecting_iterator/search.py\n    \nExample use of the Regex Filter for regex-based searching\n\n    python examples/regex_search.py\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frevelc%2Fpyaccumulo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frevelc%2Fpyaccumulo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frevelc%2Fpyaccumulo/lists"}