{"id":21365462,"url":"https://github.com/frameable/log-parse","last_synced_at":"2025-03-16T07:41:17.105Z","repository":{"id":186204867,"uuid":"673071131","full_name":"frameable/log-parse","owner":"frameable","description":"A library of things for turning log lines into sqlite rows","archived":false,"fork":false,"pushed_at":"2023-11-27T21:11:47.000Z","size":179,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-02-25T02:06:59.421Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/frameable.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-07-31T20:02:01.000Z","updated_at":"2023-08-04T21:33:39.000Z","dependencies_parsed_at":null,"dependency_job_id":"b6f735b2-2d68-4950-b750-6fc14e0cdaf5","html_url":"https://github.com/frameable/log-parse","commit_stats":null,"previous_names":["frameable/log-parse"],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frameable%2Flog-parse","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frameable%2Flog-parse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frameable%2Flog-parse/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frameable%2Flog-parse/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/frameable","download_url":"https://codeload.github.com/frameable/log-parse/tar.gz/refs/heads/main","host":{"name":"GitHub","
url":"https://github.com","kind":"github","repositories_count":243841204,"owners_count":20356441,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-22T07:11:19.471Z","updated_at":"2025-03-16T07:41:17.086Z","avatar_url":"https://github.com/frameable.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"## log-parse\n\nlog-parse is a library for parsing fluentd logs and getting them into SQLite databases.\n\n## Getting started\n\nTo add log-parse to your project, run `npm install --save @frameable/log-parse`.\n\nExample code can be found in [`examples`](examples/).\n\n\n#### The ctx function\n\n#### Reading logs\n\nGiven a directory with some logs, you can open a log iterator there. It will iterate all of the files in that directory, in order of most recently dated (the date being determined by the filename; I should write more docs for this), and yield each log line. 
It will decompress gzipped files.\n\nFor example, consider the following directory that contains logs\n```shell\n$ pwd \n/var/log/my-app/\n\n$ head current.log\n2023-08-04T00:10:26+00:00 {\"status\": \"OK\"}\n2023-08-04T00:10:27+00:00 {\"status\": \"OK\"}\n2023-08-04T00:10:28+00:00 {\"status\": \"OK\"}\n2023-08-04T00:10:29+00:00 {\"status\": \"NOT OK\", \"message\": \"the server tripped and fell!\"}\n\n$ head file.20230803.log.gz | gunzip\n2023-08-03T00:10:26+00:00 {\"status\": \"OK\"}\n2023-08-03T00:10:27+00:00 {\"status\": \"DEGRADED\", \"message\": \"the server is getting sleepy...\"}\n2023-08-03T00:10:28+00:00 {\"status\": \"DEGRADED\", \"pending_events\": 42}\n2023-08-03T00:10:29+00:00 {\"status\": \"ASLEEP\", \"message\": \"the server is conked out!\"}\n```\n\nWe can iterate those logs as they are. We're using the `ctx` function to specify a default context but with `logfileRoot` set to our target directory. The `Context` struct is going to be the interface to most of the API\n\n```ts\nfor await (const log of iterLogs(ctx({logfileRoot: \"/var/log/my-app\"}))) {\n  console.log(log.content) \n}\n```\n\n#### Parsing logs\n\nIf we want to parse out the JSON body, we can. The built-in generator `entriesKV` is perfect for this - it will regex match for a `body`. By default, it uses the expression `/^(?\u003ctimestamp\u003e[^\\t ]+)[\\t ](?\u003cbody\u003e.+)$/`, but any expression that matches a `body` and `timestamp` can be used with `entryRegex`\n\n```ts\nfor await (const entry of entriesKV(\"my-app\", iterLogs(ctx({logfileRoot: \"/var/log/my-app\"})), ctx())) {\n  console.log(entry.body.status)\n}\n```\n\nThe function `chunkEntries` can be used to read and parse chunks of log lines at once, where those chunks are then yielded.\n\n#### Putting logs in SQL\n\nTo actually get this stuff in the database we need to have a database. 
We can also make it on the disk at `sqliteRoot`.\n\n```ts\nconst database = makeDatabase(\"year-digits\", ctx({sqliteInMemory: true}))\n```\n\nWe can either use `insert` with the database and some data directly:\n\n```ts\nconst entries = await chunkEntries(entriesKV(\"year-digit\", iterLogs(ctx({logfileRoot: \"/var/log/my-app\"})), ctx()), 4, 0).next() // the first chunk of 4\ninsert(entries.value, database, ctx({sqliteInMemory: true, entryFields: new Set([\"status\", \"message\"])}))\n```\n\nWe can also create an `insertFunc` to call later on other collections of entries:\n\n```ts\nconst insFunc = insertFunc(database, ctx({sqliteInMemory: true, entryFields: new Set([\"status\", \"message\"])}))\nfor await (const chunk of chunkEntries(entriesKV(\"year-digit\", iterLogs(ctx({logfileRoot: \"/var/log/my-app\"})), ctx()), 4, 0)) {\n  insFunc(chunk)\n}\n```\n\n`entryFields` describes what fields to create columns for. log-parse also creates meta fields:\n  - `identifier` uniquely identifies a single log entry in the scope of all of the files in its `logfileRoot`. Identifiers are in chronological order\n  - `timestamp` comes from the value captured by `entryRegex`, which is then parsed into a `Date`\n  - `data` has a JSON blob with any entry key-value pairs that weren't an `entryField`\n\nNow we have a database that we can interface with regularly. 
By default our table is `logs`; that can be changed with `sqliteTable`.\n\n```sql\nselect * from logs;\n-- 1691021426000|2023-08-03T00:10:26.000Z|{}|OK|\n-- 1691021427001|2023-08-03T00:10:27.000Z|{}|DEGRADED|the server is getting sleepy...\n-- 1691021428002|2023-08-03T00:10:28.000Z|{\"pending_events\":42}|DEGRADED|\n-- 1691021429003|2023-08-03T00:10:29.000Z|{}|ASLEEP|the server is conked out!\n-- 1691107826004|2023-08-04T00:10:26.000Z|{}|OK|\n-- 1691107827005|2023-08-04T00:10:27.000Z|{}|OK|\n-- 1691107828006|2023-08-04T00:10:28.000Z|{}|OK|\n-- 1691107829007|2023-08-04T00:10:29.000Z|{}|NOT OK|the server tripped and fell!\n\nselect count(status), status from logs group by status;\n-- 1|ASLEEP\n-- 2|DEGRADED\n-- 1|NOT OK\n-- 4|OK\n\nselect json(data) from logs where data != '{}';\n-- {\"pending_events\":42}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fframeable%2Flog-parse","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fframeable%2Flog-parse","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fframeable%2Flog-parse/lists"}