{"id":17722602,"url":"https://github.com/bobbyiliev/materialize-emulator-subscribe-poc","last_synced_at":"2026-05-09T17:33:21.641Z","repository":{"id":259446383,"uuid":"877881142","full_name":"bobbyiliev/materialize-emulator-subscribe-poc","owner":"bobbyiliev","description":null,"archived":false,"fork":false,"pushed_at":"2024-10-29T09:39:22.000Z","size":13,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-28T11:43:04.265Z","etag":null,"topics":["docker","kafka","materialize","python","redpanda","sql","streaming-data"],"latest_commit_sha":null,"homepage":"https://materialize.com/docs","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bobbyiliev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-24T12:05:56.000Z","updated_at":"2024-10-29T09:39:25.000Z","dependencies_parsed_at":"2024-12-13T15:30:06.348Z","dependency_job_id":"480aeaf4-1e1d-4b0d-a216-8e41df0f4096","html_url":"https://github.com/bobbyiliev/materialize-emulator-subscribe-poc","commit_stats":null,"previous_names":["bobbyiliev/materialize-emulator-subscribe-poc"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bobbyiliev%2Fmaterialize-emulator-subscribe-poc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bobbyiliev%2Fmaterialize-emulator-subscribe-poc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bobbyiliev%2Fmaterialize-emulator-subs
cribe-poc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bobbyiliev%2Fmaterialize-emulator-subscribe-poc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bobbyiliev","download_url":"https://codeload.github.com/bobbyiliev/materialize-emulator-subscribe-poc/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246473273,"owners_count":20783236,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","kafka","materialize","python","redpanda","sql","streaming-data"],"created_at":"2024-10-25T15:38:46.461Z","updated_at":"2026-05-09T17:33:16.583Z","avatar_url":"https://github.com/bobbyiliev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Materialize Kafka Data Generation Demo\n\nThis demo sets up a pipeline using Materialize, Redpanda (Kafka), and the Materialize datagen tool to generate and process streaming data.\n\n## Setup\n\n1. Clone the repository:\n```bash\ngit clone git@github.com:bobbyiliev/materialize-emulator-subscribe-poc.git\ncd materialize-emulator-subscribe-poc\n```\n\n2. Start the services:\n```bash\ndocker compose up -d\n```\n\n3. Wait for all services to be healthy (usually takes about 30-45 seconds)\n```bash\ndocker compose ps\n```\n\n## Connect to Materialize\n\nConnect to the Materialize instance:\n```bash\npsql -h localhost -p 6877 -U mz_system materialize\n```\n\n## Create Required Objects\n\n1. 
**Create a Secret for PostgreSQL Password:**\n    ```sql\n    CREATE SECRET pgpass AS 'postgres';\n    ```\n\n1. **Create PostgreSQL Connection:**\n    ```sql\n    CREATE CONNECTION pg_connection TO POSTGRES (\n        HOST 'postgres',\n        PORT 5432,\n        USER 'postgres',\n        PASSWORD SECRET pgpass,\n        SSL MODE 'disable',\n        DATABASE 'postgres'\n    );\n    ```\n\n1. **Create the Source from PostgreSQL:**\n    ```sql\n    CREATE SOURCE mz_source\n    FROM POSTGRES CONNECTION pg_connection (PUBLICATION 'mz_source')\n    FOR ALL TABLES;\n    ```\n\n1. **Create a Materialized View:**\n    ```sql\n    CREATE MATERIALIZED VIEW datagen_view AS\n        SELECT * FROM products;\n    ```\n\n1. **Verify Data Flow:**\n    ```sql\n    SELECT * FROM datagen_view LIMIT 5;\n    ```\n\n\u003e [!TIP]\n\u003e `CREATE INDEX` creates an in-memory index on a source, view, or materialized view. For more information, see the [Materialize documentation](https://materialize.com/docs/sql/create-index/).\n\u003e In Materialize, indexes store query results in memory within a [cluster](https://materialize.com/docs/concepts/clusters/), and keep these results incrementally updated as new data arrives. By making up-to-date results available in memory, indexes can help [optimize query performance](https://materialize.com/docs/transform-data/optimization/), both when serving results and maintaining resource-heavy operations like joins.\n\n4. 
Create an index to optimize queries (optional):\n```sql\nCREATE INDEX datagen_view_idx ON datagen_view (id);\n```\n\n\u003e [!TIP]\n\u003e Running `SUBSCRIBE` on an unindexed view can be slow and resource-intensive as it requires a full scan of the view/table/source.\n\n## Verify Setup\n\nCheck that data is flowing:\n```sql\nSELECT * FROM datagen_view LIMIT 5;\n```\n\nCheck the number of records:\n```sql\nSELECT count(*) FROM datagen_view;\n```\n\n## Configuration Details\n\n- The datagen service will generate:\n  - 10,024 records (`-n 10024`)\n  - With a write interval of 2000ms (`-w 2000`)\n  - In Postgres format (`-f postgres`)\n  - Using the schema defined in `/schemas/products.sql`\n\n## Troubleshooting\n\nIf you don't see data flowing:\n1. Check service health:\n```bash\ndocker compose ps\n```\n\n2. Check Postgres logs:\n```bash\ndocker compose logs postgres\n```\n\n3. Check datagen logs:\n```bash\ndocker compose logs datagen\n```\n\n4. Verify the source:\n```sql\nSHOW SOURCES;\n```\n\n5. 
Check the source status:\n```sql\nSELECT * FROM mz_internal.mz_source_statuses;\n```\n   If the source status is not `running`, the `SUBSCRIBE` command will not return any data because no new data is being received.\n\n## Using `SUBSCRIBE`\n\nYou can use the `SUBSCRIBE` command to see the data as it flows in:\n```sql\nCOPY (SUBSCRIBE datagen_view) TO STDOUT;\n-- Subscribe with no snapshot:\nCOPY (SUBSCRIBE datagen_view WITH(SNAPSHOT FALSE)) TO STDOUT;\n```\n\n## Using `SUBSCRIBE` with Python and psycopg2\n\nSet up a virtual environment:\n```bash\npython3 -m venv venv\nsource venv/bin/activate\npip install psycopg2-binary python-dotenv\n```\n\nRun the `subscribe.py` script:\n```bash\npython subscribe.py\n```\n\nOutput:\n\n```py\nWaiting for updates...\n(Decimal('1729786668000'), False, 1, 90817, 'Alec89', '53188', '41846', 1, None)\n(Decimal('1729786670000'), False, 1, 96566, 'Christina_Herzog2', '52301', '71868', 1, None)\n(Decimal('1729786672000'), False, 1, 19028, 'Hudson_Heller76', '66648', '24064', 1, None)\n(Decimal('1729786674000'), False, 1, 21478, 'Earnest_Ernser', '49801', '48775', 0, None)\n(Decimal('1729786676000'), False, 1, 20213, 'Thora_Schamberger', '62960', '67815', 1, None)\n```\n\n![simple-example-subscribe](https://github.com/user-attachments/assets/6c92dc54-3cae-4605-ab2a-2885dad0bb86)\n\n## Debugging Materialize\n\n### Check Container Status\n```bash\n# Check all containers status\ndocker compose ps\n\n# Check container health\ndocker compose ps materialized\n```\n\n### View Logs\n```bash\n# View Materialize logs\ndocker compose logs materialized\n\n# Follow logs in real-time\ndocker compose logs -f materialized | grep -v 'No such file or directory'\n\n# View last 100 lines\ndocker compose logs --tail=100 materialized\n\n# Check all services logs\ndocker compose logs\n```\n\n### Resource Management\n1. 
Docker Desktop Resource Limits\n   - Open Docker Desktop → Settings → Resources\n   - Recommended minimum settings:\n     - CPUs: 6\n     - Memory: 12GB\n     - Swap: 1GB\n   - Apply \u0026 Restart Docker Desktop\n\n2. Check Resource Usage\n```bash\n# View container resource usage\ndocker stats materialized\n```\n\n### Check Source Status\n\nIf Materialize is not receiving data from the source, you will not see any data when running `SUBSCRIBE` without a snapshot. To check the source status, run the following queries:\n\n```sql\n-- Get the source name:\nSHOW SOURCES;\n\n-- Check source status:\nSELECT * FROM mz_internal.mz_source_statuses\nWHERE name = 'SOURCE_NAME';\n```\n\n## Cleanup\n\nStop the services:\n```bash\ndocker compose down -v\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbobbyiliev%2Fmaterialize-emulator-subscribe-poc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbobbyiliev%2Fmaterialize-emulator-subscribe-poc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbobbyiliev%2Fmaterialize-emulator-subscribe-poc/lists"}