{"id":29016507,"url":"https://github.com/memgraph/mgcxx","last_synced_at":"2025-07-11T13:43:36.566Z","repository":{"id":208161009,"uuid":"720923321","full_name":"memgraph/mgcxx","owner":"memgraph","description":null,"archived":false,"fork":false,"pushed_at":"2024-09-06T19:20:41.000Z","size":55,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-09-06T22:56:05.922Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/memgraph.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-20T01:20:49.000Z","updated_at":"2024-09-06T19:20:45.000Z","dependencies_parsed_at":"2023-11-20T04:29:34.458Z","dependency_job_id":"30420216-e39e-4e5c-96d4-f98df1062d5c","html_url":"https://github.com/memgraph/mgcxx","commit_stats":null,"previous_names":["memgraph/cxxtantivy","memgraph/mgcxx"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/memgraph/mgcxx","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/memgraph%2Fmgcxx","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/memgraph%2Fmgcxx/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/memgraph%2Fmgcxx/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/memgraph%2Fmgcxx/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/memgraph","download_url":"https://codeload.github.com/memgraph/mgcxx/
tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/memgraph%2Fmgcxx/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261962050,"owners_count":23236859,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-25T22:30:31.344Z","updated_at":"2025-06-25T22:30:32.010Z","avatar_url":"https://github.com/memgraph.png","language":"Rust","readme":"# mgcxx (experimental)\n\nA collection of C++ wrappers around non-C++ libraries.\nThe list includes:\n  * full-text search enabled by [tantivy](https://github.com/quickwit-oss/tantivy)\n\nRequirements:\n  * cmake 3.15+\n  * rustup toolchain 1.75.0+\n\n## How to build and test?\n\n```\nmkdir build \u0026\u0026 cd build\ncmake ..\nmake \u0026\u0026 ctest\n```\n\n## text_search\n\n### TODOs\n\n- [ ] Polish \u0026 test all error messages\n- [ ] Write unit / integration test to compare STRING vs JSON fields search query syntax.\n- [ ] Figure out what's the right search syntax for a property graph\n- [ ] Add some notion of pagination\n- [ ] Add some notion of backwards compatibility -\u003e some help to the user\n- [ ] How to:\n    - [ ] search all properties\n    - [ ] fuzzy search\n          ```\n          // let term = Term::from_field_text(data_field, \u0026input.search_query);\n          // let query = FuzzyTermQuery::new(term, 2, true);\n          ```\n- [ ] Add GitHub Actions\n- [ ] Add benchmarks:\n    - [ ] Test what's the tradeoff between searching STRING vs JSON TEXT, what does the query look like?\n    - [ ] Search direct field vs 
JSON, FAST vs SLOW, String vs CxxString\n    - [ ] MATCH (n) RETURN count(n), n.deleted;\n    - [ ] search of a specific property value\n    - [ ] benchmark (add|retrieve simple|complex, filtering, aggregations).\n    - [ ] search of all properties\n    - [ ] Benchmark (search by GID to get document_id + fetch document by document_id) vs (fetch document by document_id) on 100M nodes + 100M edges\n        - [ ] Note [DocAddress](https://docs.rs/tantivy/latest/tantivy/struct.DocAddress.html) is composed of 2 u32 but the `SegmentOrdinal` is tied to the `Searcher` -\u003e is it possible/wise to cache the address (`SegmentId` is UUID)\n            - [ ] A [searcher](https://docs.rs/tantivy/latest/tantivy/struct.IndexReader.html#method.searcher) per transaction -\u003e cache `DocAddress` inside Memgraph's `ElementAccessors`?\n- [ ] Implement the stress test by adding \u0026 searching to the same index concurrently + large dataset generator.\n- [ ] Consider implementing a panic! handler that prevents the outside process from crashing (optional).\n\n### NOTEs\n\n* if a field doesn't get specified in the schema, it's ignored\n* `TEXT` means the field will be tokenized and indexed (required to be able to\n  search)\n* Tantivy add_json_object accepts serde_json::map::Map\u003cString, serde_json::value::Value\u003e\n* C++ text-search API is snake case because it's implemented in Rust\n* Writing each document and then committing (writing to disk) will be\n  expensive. 
In a standard OLTP workload that's a common case -\u003e introduce some\n  form of batching.\n\n## Resources\n\n* https://fulmicoton.com/posts/behold-tantivy-part2\n* https://stackoverflow.com/questions/37924383/combining-several-static-libraries-into-one-using-cmake\n    --\u003e decided to have 2 separate libraries user code has to link\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmemgraph%2Fmgcxx","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmemgraph%2Fmgcxx","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmemgraph%2Fmgcxx/lists"}