{"id":16241902,"url":"https://github.com/mplogas/kernelmemory.filewatcher","last_synced_at":"2025-03-19T17:30:54.793Z","repository":{"id":227946739,"uuid":"770103200","full_name":"mplogas/KernelMemory.FileWatcher","owner":"mplogas","description":"A service for automating document ingestion for Semantic Kernel's KernelMemory service","archived":false,"fork":false,"pushed_at":"2024-06-30T15:54:10.000Z","size":56,"stargazers_count":7,"open_issues_count":1,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-10-11T14:08:35.644Z","etag":null,"topics":["csharp","filewatcher","kernel-memory","llm","rag","semantic-kernel"],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mplogas.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-10T23:02:49.000Z","updated_at":"2024-09-28T08:37:59.000Z","dependencies_parsed_at":"2024-03-30T08:46:19.277Z","dependency_job_id":null,"html_url":"https://github.com/mplogas/KernelMemory.FileWatcher","commit_stats":null,"previous_names":["mplogas/kernelmemory.filewatcher"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mplogas%2FKernelMemory.FileWatcher","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mplogas%2FKernelMemory.FileWatcher/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mplogas%2FKernelMemory.FileWatcher/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/mplogas%2FKernelMemory.FileWatcher/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mplogas","download_url":"https://codeload.github.com/mplogas/KernelMemory.FileWatcher/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221729383,"owners_count":16871007,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csharp","filewatcher","kernel-memory","llm","rag","semantic-kernel"],"created_at":"2024-10-10T14:08:57.258Z","updated_at":"2024-10-27T20:25:59.237Z","avatar_url":"https://github.com/mplogas.png","language":"C#","readme":"# KernelMemory.FileWatcher \n## Automated document ingestion for Semantic Kernel's KernelMemory service\n\n\n## Overview\n\nKernelMemory File Watcher is a service designed to automate the document ingestion process for Semantic Kernel's KernelMemory service. It monitors specified directories for file changes and sends these changes to the KernelMemory service for processing. This enables the automatic creation of embeddings for Retrieval Augmented Generation (RAG) whenever a file is modified. The service is designed to run on the edge or wherever your files reside, and can be deployed as a standalone service or a Docker container.\n\n## Main Components\n\n### MessageStore\n\nThe `MessageStore` is responsible for storing and managing file events. It implements the `IMessageStore` interface which defines methods for adding a file event, retrieving the next file event, and checking if there are any file events in the store. 
The `MessageStore` uses a `ConcurrentDictionary` to store file events, ensuring thread-safety.\n\n### FileWatcherService\n\nThe `FileWatcherService` is responsible for watching specified directories for file changes. It uses the `FileSystemWatcher` class to monitor directories and raises events when files are created, deleted, or modified. These events are then added to the `MessageStore`.\n\n### HttpWorker\n\nThe `HttpWorker` is a hosted service that periodically checks the `MessageStore` for new file events and sends them to the KernelMemory service. It uses an `HttpClient` to send HTTP requests and includes logic for handling different types of file events (e.g., upserts and deletes).\n\n## How It Works\n\n1. The `FileWatcherService` starts watching the specified directories for file changes.\n2. When a file change is detected, a file event is created and added to the `MessageStore`.\n3. The `HttpWorker` periodically checks the `MessageStore` for new file events.\n4. When a new file event is found, the `HttpWorker` sends it to the KernelMemory service for processing.\n\n## Configuration\n\nThe service's configuration is defined in the `appsettings.json` file. 
Here you can specify the directories to watch, the KernelMemory service's endpoint and API key, and other options.\n\n```json\n{\n  \"FileWatcher\": {\n    \"Directories\": [\n      {\n        \"Path\": \"/tmp/folder_01\",\n        \"Filter\": \"*.md\", // single filter\n        \"Index\": \"folder-01\",\n        \"IncludeSubdirectories\": true\n      },\n      {\n        \"Path\": \"/tmp/folder_02\",\n        \"Filters\": [   // multiple filters\n            \"*.md\",\n            \"*.pdf\"\n        ],\n        \"Index\": \"folder-02\",\n        \"IncludeSubdirectories\": true\n      },\n      // More directories...\n    ]\n  },\n  \"KernelMemory\": {\n    \"Endpoint\": \"http://127.0.0.1:9001\",\n    \"ApiKey\": \"\", // not required\n    \"Schedule\":  \"00:00:30\"\n  }\n}\n\n```\n\nIn the `FileWatcher` section, you can specify multiple directories to watch. For each directory, you can specify a path, either a single filter (`Filter`) or several filters (`Filters`) for the types of files to watch, an index, and whether to include subdirectories.\n\nIn the `KernelMemory` section, you can specify the endpoint of the KernelMemory service, an optional API key, and the schedule on which the `HttpWorker` checks for new file events.\n\n## Running the Service\n\nTo run the service, you can either run the `KernelMemory.FileWatcher` project directly or run the Docker container.\n\n### Running as a Standalone Service\n\nTo run the service as a standalone service, build and run the `KernelMemory.FileWatcher` project with the following command:\n\n```bash\ndotnet run --project KernelMemory.FileWatcher\n```\n\n### Running as a Docker Container\n\n```sh\ndocker run -v /path/to/your/appsettings.json:/config/appsettings.json -v /path/to/your/documents-01:/data/documents-01 
mplogas/km-filewatcher:latest\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmplogas%2Fkernelmemory.filewatcher","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmplogas%2Fkernelmemory.filewatcher","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmplogas%2Fkernelmemory.filewatcher/lists"}