{"id":27101555,"url":"https://github.com/treeverse/langchain-lakefs","last_synced_at":"2025-10-27T15:48:15.410Z","repository":{"id":282882631,"uuid":"949941894","full_name":"treeverse/langchain-lakefs","owner":"treeverse","description":null,"archived":false,"fork":false,"pushed_at":"2025-06-19T02:02:55.000Z","size":231,"stargazers_count":1,"open_issues_count":3,"forks_count":0,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-09-22T22:45:54.736Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/treeverse.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-17T11:37:38.000Z","updated_at":"2025-04-22T12:26:31.000Z","dependencies_parsed_at":"2025-04-01T16:35:16.184Z","dependency_job_id":"cfb92cec-4559-4a4e-8414-e444cb6e7b73","html_url":"https://github.com/treeverse/langchain-lakefs","commit_stats":null,"previous_names":["treeverse/langchain-lakefs"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/treeverse/langchain-lakefs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/treeverse%2Flangchain-lakefs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/treeverse%2Flangchain-lakefs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/treeverse%2Flangchain-lakefs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/treeverse%2Flangchain-lakefs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/treeverse","download_url":"https://codeload.github.com/treeverse/langchain-lakefs/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/treeverse%2Flangchain-lakefs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281295816,"owners_count":26476759,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-27T02:00:05.855Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-06T14:39:11.456Z","updated_at":"2025-10-27T15:48:15.379Z","avatar_url":"https://github.com/treeverse.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# langchain-lakefs\n\nThis package provides a LangChain integration with [lakeFS](https://lakefs.io/), allowing you to load documents from lakeFS repositories into your LangChain workflows.\n\n## Features\n\n- Load documents from lakeFS repositories using the official lakeFS Python SDK\n- Support for user metadata retrieval\n- Configurable repository, reference, and path specifications\n- Integration with LangChain's document loading infrastructure\n\n## Installation\n\n```bash\npip install -U langchain-lakefs\n```\n\n## Configuration\n\nYou can configure the `LakeFSLoader` in three ways:\n\n### 1. Direct Initialization\n\nProvide the access key, secret key, and endpoint during initialization:\n\n```python\nfrom langchain_lakefs.document_loaders import LakeFSLoader\n\nlakefs_loader = LakeFSLoader(\n    lakefs_access_key='your_access_key',\n    lakefs_secret_key='your_secret_key',\n    lakefs_endpoint='https://path-to.lakefs.com',\n    repo='your_repo',\n    ref='main',\n    path='path/to/files'\n)\n```\n\n### 2. Configuration File\n\nThe package will automatically read credentials from the `~/.lakectl.yaml` file if available.\n\n### 3. Environment Variables\n\nSet the following environment variables to configure the loader:\n\n```bash\nexport LAKECTL_CREDENTIALS_ACCESS_KEY_ID='your_access_key'\nexport LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY='your_secret_key'\nexport LAKECTL_SERVER_ENDPOINT_URL='https://path-to.lakefs.com'\n```\n\n## Usage\n\n### Document Loader\n\nThe `LakeFSLoader` class allows you to load documents from lakeFS. You need to specify:\n\n- The repository (`repo`)\n- The reference (`ref`) - branch, commit or tag\n- The path to the files you want to load\n\nIf you would like to load the metadata of the files, you can set the `user_metadata` parameter to `True`:\n\n```python\nfrom langchain_lakefs.document_loaders import LakeFSLoader\n\n# Initialize the loader\nlakefs_loader = LakeFSLoader(\n    lakefs_access_key='your_access_key',\n    lakefs_secret_key='your_secret_key',\n    lakefs_endpoint='https://path-to.lakefs.com',\n    repo='your_repo',\n    ref='main',\n    path='path/to/files',\n    user_metadata=True\n)\n\n# Load documents from lakeFS\ndocuments = lakefs_loader.load()\n\n# Process the documents\nfor doc in documents:\n    print(f\"Content: {doc.page_content}\")\n    print(f\"Metadata: {doc.metadata}\")\n```\n\n### Modifying Loader Settings\n\nYou can modify the loader settings after initialization:\n\n```python\n# Change the repository\nlakefs_loader.set_repo(\"another-repo\")\n\n# Change the reference (branch or commit)\nlakefs_loader.set_ref(\"feature-branch\")\n\n# Change the path\nlakefs_loader.set_path(\"another/path\")\n\n# Toggle user metadata retrieval\nlakefs_loader.set_user_metadata(True)\n```\n\n## Examples\n\n### Loading Documents from a Specific Path\n\n```python\nfrom langchain_lakefs.document_loaders import LakeFSLoader\n\nloader = LakeFSLoader(\n    lakefs_endpoint=\"https://example.my-lakefs.com\",\n    lakefs_access_key=\"your-access-key\",\n    lakefs_secret_key=\"your-secret-key\",\n    repo=\"my-repo\",\n    ref=\"main\",\n    path=\"data/documents\"\n)\n\ndocuments = loader.load()\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftreeverse%2Flangchain-lakefs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftreeverse%2Flangchain-lakefs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftreeverse%2Flangchain-lakefs/lists"}