{"id":24973291,"url":"https://github.com/bluedynamics/souper","last_synced_at":"2025-10-13T11:41:08.726Z","repository":{"id":3759585,"uuid":"4835646","full_name":"bluedynamics/souper","owner":"bluedynamics","description":"Generic Indexed Storage based the Zope Object Database (ZODB) and repoze.catalog","archived":false,"fork":false,"pushed_at":"2022-12-05T11:56:51.000Z","size":66,"stargazers_count":5,"open_issues_count":2,"forks_count":4,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-09-24T20:51:53.025Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://pypi.org/project/souper/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bluedynamics.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGES.rst","contributing":null,"funding":null,"license":"LICENSE.rst","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2012-06-29T18:24:35.000Z","updated_at":"2024-06-25T09:57:23.000Z","dependencies_parsed_at":"2022-09-04T02:20:46.826Z","dependency_job_id":null,"html_url":"https://github.com/bluedynamics/souper","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/bluedynamics/souper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bluedynamics%2Fsouper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bluedynamics%2Fsouper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bluedynamics%2Fsouper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bluedynamics%2Fsouper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bluedynamics","download_url":"https://codeload.github.com/blue
dynamics/souper/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bluedynamics%2Fsouper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279014804,"owners_count":26085594,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-13T02:00:06.723Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-03T18:27:07.874Z","updated_at":"2025-10-13T11:41:08.705Z","avatar_url":"https://github.com/bluedynamics.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n.. image:: https://travis-ci.org/bluedynamics/souper.svg?branch=master\n    :target: https://travis-ci.org/bluedynamics/souper\n\nZODB Storage for lots of (light weight) data.\n\nUtilizes:\n\n- `ZODB \u003chttp://www.zodb.org/\u003e`_ and its `BTrees \u003chttp://www.zodb.org/documentation/guide/modules.html#btrees-package\u003e`_,\n- `node \u003chttp://pypi.python.org/pypi/node\u003e`_ (and `node.ext.zodb \u003chttp://pypi.python.org/pypi/node.ext.zodb\u003e`_).\n- `repoze.catalog \u003chttp://pypi.python.org/pypi/repoze.catalog\u003e`_,\n\n.. image:: https://raw.githubusercontent.com/bluedynamics/souper/master/docs/Souper-64.png\n\nSouper is a tool for programmers. 
It offers an integrated storage tied together with indexes in a catalog.\nThe records in the storage are generic.\nAny data can be stored on a record as long as it can be pickled and persisted in the ZODB.\n\nSouper can be used in any Python application, either standalone using the pure ZODB or with `Pyramid \u003chttp://docs.pylonsproject.org/en/latest/docs/pyramid.html\u003e`_, `Zope \u003chttps://www.zope.org/\u003e`_ or `Plone \u003chttp://plone.org\u003e`_.\n\n\nUsing Souper\n============\n\nProviding a Locator\n-------------------\n\nSoups are looked up by adapting some context to ``souper.interfaces.IStorageLocator``.\nSouper does not provide any default locator.\nSo one has to be provided first. Let's assume ``context`` is some persistent dict-like instance:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e from zope.interface import implementer\n    \u003e\u003e\u003e from zope.interface import Interface\n    \u003e\u003e\u003e from zope.component import provideAdapter\n    \u003e\u003e\u003e from souper.interfaces import IStorageLocator\n    \u003e\u003e\u003e from souper.soup import SoupData\n    \u003e\u003e\u003e @implementer(IStorageLocator)\n    ... class StorageLocator(object):\n    ...\n    ...     def __init__(self, context):\n    ...         self.context = context\n    ...\n    ...     def storage(self, soup_name):\n    ...         if soup_name not in self.context:\n    ...             self.context[soup_name] = SoupData()\n    ...         return self.context[soup_name]\n\n    \u003e\u003e\u003e provideAdapter(StorageLocator, adapts=[Interface])\n\nSo we have a locator creating soups by name on the fly. Now it's easy to get a soup by name:\n\n.. 
code-block:: pycon\n\n    \u003e\u003e\u003e from souper.soup import get_soup\n    \u003e\u003e\u003e soup = get_soup('mysoup', context)\n    \u003e\u003e\u003e soup\n    \u003csouper.soup.Soup object at 0x...\u003e\n\n\nProviding a Catalog Factory\n---------------------------\n\nDepending on your needs the catalog and its indexes may look different from use-case to use-case.\nThe catalog factory is responsible for creating a catalog for a soup. The factory is a named utility implementing ``souper.interfaces.ICatalogFactory``.\nThe name of the utility has to be the same as the name of the soup.\n\nHere ``repoze.catalog`` is used. To let the indexes access the data on the records by key, the ``NodeAttributeIndexer`` is used.\nFor special cases one may write custom indexers, but the default one is fine most of the time:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e from souper.interfaces import ICatalogFactory\n    \u003e\u003e\u003e from souper.soup import NodeAttributeIndexer\n    \u003e\u003e\u003e from souper.soup import NodeTextIndexer\n    \u003e\u003e\u003e from zope.component import provideUtility\n    \u003e\u003e\u003e from repoze.catalog.catalog import Catalog\n    \u003e\u003e\u003e from repoze.catalog.indexes.field import CatalogFieldIndex\n    \u003e\u003e\u003e from repoze.catalog.indexes.text import CatalogTextIndex\n    \u003e\u003e\u003e from repoze.catalog.indexes.keyword import CatalogKeywordIndex\n\n    \u003e\u003e\u003e @implementer(ICatalogFactory)\n    ... class MySoupCatalogFactory(object):\n    ...\n    ...     def __call__(self, context=None):\n    ...         catalog = Catalog()\n    ...         userindexer = NodeAttributeIndexer('user')\n    ...         catalog[u'user'] = CatalogFieldIndex(userindexer)\n    ...         textindexer = NodeTextIndexer(['text', 'user'])\n    ...         catalog[u'text'] = CatalogTextIndex(textindexer)\n    ...         keywordindexer = NodeAttributeIndexer('keywords')\n    ...         
catalog[u'keywords'] = CatalogKeywordIndex(keywordindexer)\n    ...         return catalog\n\n    \u003e\u003e\u003e provideUtility(MySoupCatalogFactory(), name=\"mysoup\")\n\nThe catalog factory is only used internally by the soup, but one may want to check that it works:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e from zope.component import getUtility\n    \u003e\u003e\u003e catalogfactory = getUtility(ICatalogFactory, name='mysoup')\n    \u003e\u003e\u003e catalogfactory\n    \u003cMySoupCatalogFactory object at 0x...\u003e\n\n    \u003e\u003e\u003e catalog = catalogfactory()\n    \u003e\u003e\u003e sorted(catalog.items())\n    [(u'keywords', \u003crepoze.catalog.indexes.keyword.CatalogKeywordIndex object at 0x...\u003e),\n    (u'text', \u003crepoze.catalog.indexes.text.CatalogTextIndex object at 0x...\u003e),\n    (u'user', \u003crepoze.catalog.indexes.field.CatalogFieldIndex object at 0x...\u003e)]\n\n\nAdding records\n--------------\n\nAs mentioned above, a ``souper.soup.Record`` is the only kind of data added to a soup.\nA record has attributes containing the data:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e from souper.soup import get_soup\n    \u003e\u003e\u003e from souper.soup import Record\n    \u003e\u003e\u003e soup = get_soup('mysoup', context)\n    \u003e\u003e\u003e record = Record()\n    \u003e\u003e\u003e record.attrs['user'] = 'user1'\n    \u003e\u003e\u003e record.attrs['text'] = u'foo bar baz'\n    \u003e\u003e\u003e record.attrs['keywords'] = [u'1', u'2', u'ü']\n    \u003e\u003e\u003e record_id = soup.add(record)\n\nA record may contain other records. But to index them one would need a custom indexer.\nSo contained records are usually useful for later display, not for searching:\n\n.. 
code-block:: pycon\n\n    \u003e\u003e\u003e record['homeaddress'] = Record()\n    \u003e\u003e\u003e record['homeaddress'].attrs['zip'] = '6020'\n    \u003e\u003e\u003e record['homeaddress'].attrs['town'] = 'Innsbruck'\n    \u003e\u003e\u003e record['homeaddress'].attrs['country'] = 'Austria'\n\n\nAccess data\n-----------\n\nEven without any query a record can be fetched by id:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e from souper.soup import get_soup\n    \u003e\u003e\u003e soup = get_soup('mysoup', context)\n    \u003e\u003e\u003e record = soup.get(record_id)\n\nAll records can be accessed through the container BTree:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e soup.data.keys()[0] == record_id\n    True\n\n\nQuery data\n----------\n\n`How to query a repoze catalog is documented well. \u003chttp://docs.repoze.org/catalog/usage.html#searching\u003e`_\nSorting works the same way.\nQueries are passed to the soup's ``query`` method (which then uses the repoze catalog).\nIt returns a generator:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e from repoze.catalog.query import Eq\n    \u003e\u003e\u003e [r for r in soup.query(Eq('user', 'user1'))]\n    [\u003cRecord object 'None' at ...\u003e]\n\n    \u003e\u003e\u003e [r for r in soup.query(Eq('user', 'nonexists'))]\n    []\n\nTo also get the size of the result set, pass ``with_size=True`` to the query.\nThe first item returned by the generator is the size:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e [r for r in soup.query(Eq('user', 'user1'), with_size=True)]\n    [1, \u003cRecord object 'None' at ...\u003e]\n\n\nTo optimize handling of large result sets, one may fetch lightweight objects from a generator instead of the records themselves. The records are fetched on call:\n\n.. 
code-block:: pycon\n\n    \u003e\u003e\u003e lazy = [l for l in soup.lazy(Eq('user', 'user1'))]\n    \u003e\u003e\u003e lazy\n    [\u003csouper.soup.LazyRecord object at ...\u003e]\n\n    \u003e\u003e\u003e lazy[0]()\n    \u003cRecord object 'None' at ...\u003e\n\nHere too, the size is yielded as the first value of the generator if ``with_size=True`` is passed.\n\n\nDelete a record\n---------------\n\nTo remove a record from the soup, Python's ``del`` is used, as one would do on\nany dict:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e del soup[record]\n\n\nReindex\n-------\n\nAfter a record's data has changed, it needs to be reindexed:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e record.attrs['user'] = 'user1'\n    \u003e\u003e\u003e soup.reindex(records=[record])\n\nSometimes one may want to reindex all data. Then ``reindex`` has to be called without parameters.\nIt may take a while:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e soup.reindex()\n\n\nRebuild catalog\n---------------\n\nUsually after the catalog factory has been changed - e.g. an index was added - a rebuild of the catalog is needed.\nIt replaces the current catalog with a new one created by the catalog factory and reindexes all data.\nIt may take a while:\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e soup.rebuild()\n\n\nReset (or clear) the soup\n-------------------------\n\nTo remove all data from the soup and to empty and rebuild the catalog, call ``clear``.\n\n**Attention**: *All data is lost!*\n\n.. code-block:: pycon\n\n    \u003e\u003e\u003e soup.clear()\n\n\nSource Code\n===========\n\nThe sources are in a Git repository with its main branches at `GitHub \u003chttp://github.com/bluedynamics/souper\u003e`_.\n\nWe'd be happy to see many forks and pull-requests to make souper even better.\n\n\nContributors\n============\n\n- Robert Niederreiter \u003crnix [at] squarewave [dot] at\u003e\n\n- Jens W. 
Klein \u003cjk [at] kleinundpartner [dot] at\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbluedynamics%2Fsouper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbluedynamics%2Fsouper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbluedynamics%2Fsouper/lists"}