https://github.com/aborroy/alfresco-content-lake-ui
UI extension for Alfresco Content Lake to be used with ACA or ADW
- Host: GitHub
- URL: https://github.com/aborroy/alfresco-content-lake-ui
- Owner: aborroy
- License: apache-2.0
- Created: 2026-02-17T11:54:20.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-03-09T17:49:31.000Z (3 months ago)
- Last Synced: 2026-03-09T20:54:18.077Z (3 months ago)
- Topics: ai, alfresco, angular, content-lake, docker, rag
- Language: TypeScript
- Homepage:
- Size: 61.5 KB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Alfresco Content Lake UI
UI extension for [alfresco-content-lake](https://github.com/aborroy/alfresco-content-lake) that adds **semantic search** and **RAG question-answering** to Alfresco Content Application (ACA) and Alfresco Digital Workspace (ADW).
## Features
* Semantic search panel: free-text query with configurable `topK` / `minScore`, results grouped by document with similarity scores and expandable chunk snippets
* Chat-style Q&A: natural language questions answered via RAG, displaying the generated answer, model used, timing breakdown, and referenced source documents with chunks
* Document-scoped mode: right-click any document and choose *"Ask AI about this document"* to open the chat pre-scoped to that file
* Folder-scoped mode: right-click any folder and choose *"Ask AI about this folder"* to scope retrieval to that folder subtree
* Sidebar tab: compact chat panel in the info-drawer, automatically scoped to the selected document or folder
* Conversation sessions: local session list with restore and *New conversation* support across route changes
* Content Lake scope controls: right-click a folder to add or remove the `cl:indexed` aspect, or use the dedicated *Content Lake* sidebar tab
* Document override: set `cl:excludeFromLake` on a document from the *Content Lake* sidebar to opt it out of an indexed folder subtree
* Visual scope indicators: badges show when a folder or document is in Content Lake scope, and when a document is explicitly excluded
* Document preview: result links open the ACA document viewer; closing the preview returns to the RAG Assistant page
* Zero custom auth code: authentication is forwarded transparently via the ADF HTTP interceptor and a shared gateway
## Prerequisites
* A running [alfresco-content-lake](https://github.com/aborroy/alfresco-content-lake) deployment with `rag-service` available
* The `content-lake-repo-model` module deployed in Alfresco Repository so `cl:indexed` and `cl:excludeFromLake` exist
* ACA (Alfresco Content App) source checkout, or ADW (Alfresco Digital Workspace) source
* Node.js 18+
## Install into ACA
### 1. Copy the extension
```bash
# From the root of your alfresco-content-app clone:
cp -r /path/to/alfresco-content-lake-ui/ext-rag projects/ext-rag
```
### 2. Register the module
Edit `app/src/app/extensions.module.ts`:
```typescript
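// Provider and EnvironmentProviders come from Angular core (skip if already imported).
import { EnvironmentProviders, Provider } from '@angular/core';
import { provideRagExtension } from 'projects/ext-rag/src/public-api';

export function provideApplicationExtensions(): (Provider | EnvironmentProviders)[] {
  return [
    ...provideRagExtension(),
    // keep other extensions here
  ];
}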
```
### 3. Add extension assets to the build
Edit `app/project.json` and add to the `build.options.assets` array:
```json
{
  "glob": "ext-rag.plugin.json",
  "input": "projects/ext-rag/src/assets",
  "output": "./assets/plugins"
}
```
### 4. Configure the RAG service URL
Add to `app/src/app.config.json` (see [`config/app.config.snippet.json`](config/app.config.snippet.json)):
```json
{
  "plugins": {
    "ragService": {
      "baseUrl": "/api/rag",
      "searchPath": "/search/semantic",
      "promptPath": "/prompt",
      "streamPath": "/chat/stream"
    },
    "contentLakeService": {
      "baseUrl": "/api/content-lake"
    }
  }
}
```
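As a rough illustration of how these settings fit together, the sketch below joins `baseUrl` with each path to produce the endpoint URLs listed in the API contract. The helper names (`joinUrl`, `resolveRagEndpoints`) and the `RagServiceConfig` interface are hypothetical and are not taken from the extension's source; the extension may resolve its URLs differently.

```typescript
// Shape of the "plugins.ragService" block shown above (interface name is an assumption).
interface RagServiceConfig {
  baseUrl: string;
  searchPath: string;
  promptPath: string;
  streamPath: string;
}

// Hypothetical helper: joins a base URL and a path with exactly one slash between them.
function joinUrl(baseUrl: string, path: string): string {
  return `${baseUrl.replace(/\/+$/, '')}/${path.replace(/^\/+/, '')}`;
}

// Hypothetical helper: derives the three endpoint URLs from the configuration.
function resolveRagEndpoints(config: RagServiceConfig) {
  return {
    search: joinUrl(config.baseUrl, config.searchPath),
    prompt: joinUrl(config.baseUrl, config.promptPath),
    stream: joinUrl(config.baseUrl, config.streamPath),
  };
}

const endpoints = resolveRagEndpoints({
  baseUrl: '/api/rag',
  searchPath: '/search/semantic',
  promptPath: '/prompt',
  streamPath: '/chat/stream',
});
```

With the values from the snippet, `endpoints.search` resolves to `/api/rag/search/semantic`, which the dev proxy or gateway then forwards to `rag-service`.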
### 5. Configure the dev proxy
Add to `app/proxy.conf.js` (see [`config/proxy.conf.snippet.js`](config/proxy.conf.snippet.js)):
```javascript
'/api/rag': {
  target: 'http://localhost:9091',
  changeOrigin: true,
  secure: false,
  logLevel: 'debug'
},
'/api/content-lake': {
  target: 'http://localhost:9090',
  changeOrigin: true,
  secure: false,
  logLevel: 'debug'
}
```
### 6. Run
```bash
npm start
```
Open `http://localhost:4200`, log in, and find the *RAG Assistant* entry in the left navigation.
## Accessing sidebar features
Use this flow to access all sidebar-driven features from the document list:
1. Open any library or folder in ACA and select a node.
2. Click the info drawer toggle (right-side panel icon in the top toolbar) to open the right panel.
3. In the panel header, switch to:
   * *Ask AI* tab for compact document-scoped chat.
   * *Content Lake* tab for scope and ingestion controls.

What you can do from each tab:
* *Ask AI* (document selected):
  * Ask questions about the selected document directly from the sidebar.
  * Keep context while browsing files without leaving the current page.
* *Content Lake* (folder selected):
  * Enable or disable Content Lake inclusion for the folder subtree.
* *Content Lake* (document selected):
  * Exclude or include the document from inherited Content Lake scope.
  * Check *Ingestion status* and refresh it using the refresh icon.
Notes:
* Sidebar tabs appear only when the extension is correctly registered (`provideRagExtension`) and `ext-rag.plugin.json` is included in build assets.
* *Ingestion status* requires `/api/content-lake/*` to be proxied to `batch-ingester` (see proxy/nginx sections below).
## Content Lake scope controls
The extension also exposes the repository scope model introduced by `alfresco-content-lake`:
* Right-click a folder and use *Enable Content Lake for this folder* or *Disable Content Lake for this folder* to add or remove `cl:indexed`
* Open the *Content Lake* tab in the ACA info drawer to manage the same folder toggle without leaving the current view
* Select a document inside an indexed subtree and use *Exclude this document from Content Lake* to set `cl:excludeFromLake=true`
* Look for the `offline_bolt` badge on nodes that are currently in Content Lake scope, and the `block` badge on documents explicitly excluded from ingestion
These controls call the standard Alfresco Repository nodes API directly. No extra UI-specific backend service is required.
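To make the scope toggles more concrete, here is a minimal sketch of request bodies that could be sent to the Alfresco REST API node update endpoint (`PUT .../nodes/{nodeId}`). It is not the extension's actual code: all helper names are hypothetical, and it assumes `cl:indexed` is an aspect and `cl:excludeFromLake` is a boolean property, as the list above suggests.

```typescript
// Minimal subset of the node update body accepted by the Alfresco REST API v1.
interface NodeBodyUpdate {
  aspectNames?: string[];
  properties?: Record<string, unknown>;
}

const INDEXED_ASPECT = 'cl:indexed';
const EXCLUDE_PROPERTY = 'cl:excludeFromLake';

// Updating aspectNames replaces the whole list, so start from the node's current aspects.
function enableContentLake(currentAspects: string[]): NodeBodyUpdate {
  const alreadyIndexed = currentAspects.indexOf(INDEXED_ASPECT) >= 0;
  return {
    aspectNames: alreadyIndexed ? [...currentAspects] : [...currentAspects, INDEXED_ASPECT],
  };
}

// Removes cl:indexed while keeping every other aspect on the folder.
function disableContentLake(currentAspects: string[]): NodeBodyUpdate {
  return { aspectNames: currentAspects.filter((aspect) => aspect !== INDEXED_ASPECT) };
}

// Assumption: the exclusion flag is a plain boolean property on the document.
function setExcluded(excluded: boolean): NodeBodyUpdate {
  return { properties: { [EXCLUDE_PROPERTY]: excluded } };
}

// Hypothetical helper: URL of the node to update through the public REST API.
function nodeUrl(nodeId: string): string {
  return `/alfresco/api/-default-/public/alfresco/versions/1/nodes/${encodeURIComponent(nodeId)}`;
}
```

Because the update replaces the full aspect list, a caller would typically read the node first and pass its current `aspectNames` into `enableContentLake` or `disableContentLake`.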
## Install into ADW
The mechanism is identical; only the paths change, since ADW uses Nx:
1. Place the extension under `libs/ext-rag/` (or generate a new Nx lib and copy the source)
2. Update `tsconfig.base.json` to map `@myorg/ext-rag` to `libs/ext-rag/src/index.ts`
3. Import `ExtRagModule` in `apps/content-ee/src/app/extension.module.ts`
4. Add the plugin JSON asset in `angular.json` under `content-ee` build assets
5. Configure proxy / gateway the same way
## Docker deployment
The `docker` directory contains a production-ready Dockerfile that builds ACA with the ext-rag extension pre-installed:
```bash
docker build -t alfresco-content-lake-ui -f docker/Dockerfile .
```
See [`docker`](docker/) for details on the nginx template and runtime configuration hook.
## Production deployment (nginx)
Add this block inside the `server { }` in your existing `nginx.conf` (see [`config/nginx.snippet.conf`](config/nginx.snippet.conf)):
```nginx
location /api/rag/ {
    proxy_pass http://rag-service:9091/api/rag/;
}
location /api/content-lake/ {
    proxy_pass http://batch-ingester:9090/api/content-lake/;
}
```
This ensures:
* Requests from the browser go to the same origin (no CORS issues)
* The ADF HTTP interceptor attaches the Alfresco auth ticket automatically
* `rag-service` receives the ticket and can validate it against the Alfresco authentication API
* `batch-ingester` serves ingestion-status lookups used by the Content Lake sidebar
## Authentication flow
```
Browser (ACA/ADW)
│
│ POST /api/rag/prompt
│ Header: Authorization: Basic
│
▼
nginx / gateway
│
│ proxy_pass → rag-service:9091
│
▼
rag-service
│
│ Validates ticket via:
│ GET /alfresco/api/-default-/public/authentication/versions/1/tickets/-me-
│
▼
Alfresco Repository
```
No custom authentication code is needed in the Angular module. ADF installs an HTTP interceptor that adds the ticket to every same-origin request. The gateway makes `rag-service` reachable on the same origin as Alfresco.
## API contract
The extension expects the following endpoints from [alfresco-content-lake](https://github.com/aborroy/alfresco-content-lake) `rag-service`:
### `POST /api/rag/search/semantic`
Semantic search across indexed content-lake chunks.
**Request:**
```json
{
  "query": "a girl falls in a crater",
  "topK": 5,
  "minScore": 0.5
}
```
**Response:**
```json
{
  "query": "a girl falls in a crater",
  "model": "OpenAiEmbeddingModel",
  "vectorDimension": 1024,
  "resultCount": 2,
  "totalCount": 2,
  "searchTimeMs": 739,
  "results": [
    {
      "rank": 1,
      "score": 0.5760,
      "chunkText": "found herself falling down a very deep well…",
      "sourceDocument": {
        "documentId": "c225f4d5-882b-4b99-81d1-3226af2560a4",
        "nodeId": "e0f2943f-5e11-4f78-b294-3f5e116f7823",
        "name": "down-the-rabbit-hole.pdf",
        "path": "/Company Home/Sites/private/documentLibrary",
        "mimeType": "application/pdf"
      },
      "chunkMetadata": {
        "embeddingId": "334f91ec-4ed1-41b8-a1aa-bca6c2b1431e",
        "embeddingType": "default",
        "page": 0,
        "paragraph": 3,
        "chunkLength": 773
      }
    }
  ]
}
```
The UI groups results by `sourceDocument.nodeId`, showing one entry per document with all matching chunks listed underneath.
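The grouping step could look roughly like the sketch below. It is an illustration under stated assumptions, not the extension's code: the interface and function names are hypothetical, only the response fields needed for grouping are modelled, and keeping groups in first-seen order with a `bestScore` per document is a design choice made here.

```typescript
// Subset of the response fields shown above; names of the interfaces are assumptions.
interface SourceDocument {
  documentId: string;
  nodeId: string;
  name: string;
  path: string;
  mimeType: string;
}

interface SearchResult {
  rank: number;
  score: number;
  chunkText: string;
  sourceDocument: SourceDocument;
}

interface DocumentGroup {
  document: SourceDocument;
  bestScore: number;
  chunks: SearchResult[];
}

// Groups chunk-level results by sourceDocument.nodeId, keeping first-seen document order.
function groupByDocument(results: SearchResult[]): DocumentGroup[] {
  const groups = new Map<string, DocumentGroup>();
  for (const result of results) {
    const key = result.sourceDocument.nodeId;
    const existing = groups.get(key);
    if (existing) {
      existing.chunks.push(result);
      existing.bestScore = Math.max(existing.bestScore, result.score);
    } else {
      groups.set(key, {
        document: result.sourceDocument,
        bestScore: result.score,
        chunks: [result],
      });
    }
  }
  return Array.from(groups.values());
}
```

Since the service returns results ranked by score, first-seen order also places the document with the strongest match first.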
### `POST /api/rag/prompt`
RAG question-answering with optional conversation session controls.
For document scope, the UI translates `nodeId` into a backend `filter` expression (`cin_id = '<nodeId>'`).
**Request:**
```json
{
  "question": "Why the girl fell in the hole?",
  "sessionId": "demo-session-1",
  "resetSession": false,
  "filter": "cin_id = 'e0f2943f-5e11-4f78-b294-3f5e116f7823'"
}
```
**Response:**
```json
{
  "answer": "She fell because the rabbit-hole dipped suddenly downward…",
  "question": "Why the girl fell in the hole?",
  "sessionId": "demo-session-1",
  "retrievalQuery": "why the girl fell in the hole",
  "historyTurnsUsed": 2,
  "model": "model.gguf",
  "tokenCount": 672,
  "searchTimeMs": 454,
  "generationTimeMs": 7084,
  "totalTimeMs": 7539,
  "sourcesUsed": 5,
  "sources": [
    {
      "documentId": "c225f4d5-882b-4b99-81d1-3226af2560a4",
      "nodeId": "e0f2943f-5e11-4f78-b294-3f5e116f7823",
      "name": "down-the-rabbit-hole.pdf",
      "path": "/Company Home/Sites/private/documentLibrary",
      "chunkText": "found herself falling down a very deep well…",
      "score": 0.6628
    }
  ]
}
```
The UI groups sources by `nodeId`, showing one entry per document with all source chunks.
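For the document-scoped request shown in this section, the body could be assembled along these lines. This is a sketch, not the extension's implementation: `documentFilter` and `buildPromptRequest` are hypothetical names, and the node id check is a precaution added here because the escaping rules of the backend filter language are not documented in this README.

```typescript
// Request fields taken from the example request above; interface name is an assumption.
interface RagPromptRequest {
  question: string;
  sessionId?: string;
  resetSession?: boolean;
  filter?: string;
}

// Builds the document-scope filter. Rejects ids with unexpected characters
// instead of guessing how the backend filter language escapes quotes.
function documentFilter(nodeId: string): string {
  if (!/^[A-Za-z0-9-]+$/.test(nodeId)) {
    throw new Error(`Unexpected node id: ${nodeId}`);
  }
  return `cin_id = '${nodeId}'`;
}

// Builds a prompt request; the filter is only added when a document is in scope.
function buildPromptRequest(question: string, sessionId: string, nodeId?: string): RagPromptRequest {
  const request: RagPromptRequest = { question, sessionId, resetSession: false };
  if (nodeId) {
    request.filter = documentFilter(nodeId);
  }
  return request;
}
```

Calling `buildPromptRequest` without a `nodeId` yields an unscoped request, which matches the repository-wide chat mode.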
### `POST /api/rag/chat/stream`
Streaming RAG endpoint consumed by the chat panel for progressive rendering.
Expected SSE events:
- `event: token` with incremental token payload (`token`, `delta`, `text`, or `content`)
- `event: metadata` with final response payload (`RagPromptResponse`)
- `event: done` to close the stream
- `event: error` for terminal stream errors
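
A client could consume these events roughly as sketched below. This is an illustration only: the function names are hypothetical, and it assumes the `token` payload is either a JSON object carrying one of the listed fields or plain text. The README does not specify the wire format beyond the event names.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Splits buffered SSE text into complete events and returns any trailing partial event.
function parseSseBuffer(buffer: string): { events: SseEvent[]; rest: string } {
  const blocks = buffer.replace(/\r\n/g, '\n').split('\n\n');
  const rest = blocks.pop() ?? '';
  const events: SseEvent[] = [];
  for (const block of blocks) {
    let event = 'message'; // default event name when no "event:" line is present
    const dataLines: string[] = [];
    for (const line of block.split('\n')) {
      if (line.startsWith('event:')) {
        event = line.slice('event:'.length).trim();
      } else if (line.startsWith('data:')) {
        dataLines.push(line.slice('data:'.length).replace(/^ /, ''));
      }
    }
    if (dataLines.length > 0 || event !== 'message') {
      events.push({ event, data: dataLines.join('\n') });
    }
  }
  return { events, rest };
}

// Assumption: a token payload is JSON with one of the listed fields, else raw text.
function extractTokenText(data: string): string {
  try {
    const payload = JSON.parse(data);
    if (typeof payload === 'string') return payload;
    if (payload === null || typeof payload !== 'object') return data;
    for (const key of ['token', 'delta', 'text', 'content']) {
      if (typeof payload[key] === 'string') return payload[key];
    }
    return '';
  } catch {
    return data; // not JSON: treat the payload as the token text itself
  }
}
```

A chat panel would append `extractTokenText(e.data)` for each `token` event, replace the draft with the `metadata` payload when it arrives, and stop reading on `done` or `error`. The `rest` value is carried over and prepended to the next network chunk.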