{"id":28560167,"url":"https://github.com/dtstack/jfilebeat","last_synced_at":"2025-06-10T09:07:46.378Z","repository":{"id":104050750,"uuid":"142132962","full_name":"DTStack/jfilebeat","owner":"DTStack","description":"类filebeat的轻量级日志采集工具","archived":false,"fork":false,"pushed_at":"2019-05-17T09:38:49.000Z","size":121,"stargazers_count":68,"open_issues_count":0,"forks_count":39,"subscribers_count":16,"default_branch":"master","last_synced_at":"2024-02-25T12:37:52.659Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DTStack.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2018-07-24T09:02:47.000Z","updated_at":"2023-09-07T02:48:52.000Z","dependencies_parsed_at":"2023-06-29T11:45:51.754Z","dependency_job_id":null,"html_url":"https://github.com/DTStack/jfilebeat","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DTStack%2Fjfilebeat","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DTStack%2Fjfilebeat/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DTStack%2Fjfilebeat/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DTStack%2Fjfilebeat/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DTStack","download_url":"https://codeload.github.com/DTStack/jfilebeat/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DTS
tack%2Fjfilebeat/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259043767,"owners_count":22797163,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-10T09:07:42.924Z","updated_at":"2025-06-10T09:07:46.363Z","avatar_url":"https://github.com/DTStack.png","language":"Java","readme":"### Overview\n- A lightweight, filebeat-like log collection tool written in Java. It runs on AIX, Linux and similar machines and requires Java 5 or later. Compared with filebeat, it adds transaction merging: non-contiguous lines can be combined into one line through a correlation id.\n\n### Configuration Parameters\n- output.logstash.net_max: Rate limit in bits; 0 disables rate limiting. Defaults to 0\n- output.logstash.hosts: List of ip:port addresses of logstash or jlogstash; each element looks like localhost:8635\n- logging.to_files: Whether to write logs to a file; defaults to true. When false, logs go to standard output\n- logging.level: Log level; defaults to info\n- logging.files.keepfiles: Number of log files to keep\n- logging.files.path: Log file path\n- logging.files.name: Log file name\n- logging.files.max_size: Log file size threshold\n- filebeat.idle_timeout: Interval at which files are checked for changes; defaults to 1s\n- filebeat.buffer_size: Maximum number of bytes buffered for a single log line; defaults to 1024*1024, i.e. 1MB\n- filebeat.network_timeout: Network connection timeout; defaults to 60s\n- filebeat.prospectors[x].tail_files: Whether to start reading from the most recent log entries; defaults to true.\n- filebeat.prospectors[x].paths: File paths to watch; supports wildcard and recursive matching, e.g. /home/user/*/*.log\n- filebeat.prospectors[x].encoding: Encoding; defaults to utf-8\n- filebeat.prospectors[x].exclude_files: Files to exclude; supports regular expressions\n- filebeat.prospectors[x].exclude_lines: Lines to exclude; supports regular expressions\n- filebeat.prospectors[x].include_lines: Lines to include; supports regular expressions\n- filebeat.prospectors[x].fields: Key-value pairs that are sent to logstash/jlogstash together with the message. keeptype, logtype, appname, tag, user_token and uuid are required. keeptype is usually set to business and marks how long the log is retained; appname classifies logs; tag classifies logs at a finer granularity; user_token is the encrypted tenant information string; uuid serves as the unique id of this jfilebeat instance.\n- filebeat.prospectors[x].multiline.negate: 
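Whether to negate the pattern match result; defaults to true, and the default suits most scenarios.\n- filebeat.prospectors[x].multiline.pattern: The match expression; supports regular expressions.\n- filebeat.prospectors[x].multiline.match: Either before or after; defaults to after. after can be understood as matching the first line of an event, before as matching the last line.\n- filebeat.prospectors[x].multiline.timeout: If matching does not complete within the given time, a timeout fires and the log is emitted directly without further matching; defaults to 10s.\n- filebeat.prospectors[x].multiline.max_lines: Maximum number of lines merged into one event; once the limit is reached, the log is emitted directly without further matching.\n- filebeat.prospectors[x].transactionline.patterns: List of transaction-merge match expressions. Each element looks like (?\u003cname\u003e^\\w+).*begin -\u003e ^${name} -\u003e .. -\u003e ^${name}.*end. This means that the lines from the one containing begin through the one containing end, all of which contain the value of the name variable, are merged into one transaction log. Here -\u003e is the transaction arrow that links one rule to the next. Rules support regular expressions. (?\u003ckey\u003evalue) is a named capture, and a captured key can be used in downstream rules via ${key}. .. means the previous rule, here ^${name}, is reused any number of times. ^${name}.*end is the terminating rule; it is tried before .., and once it matches, the transaction merge ends and .. is no longer matched. Note that -\u003e must have a space on both sides; otherwise it is treated as part of the regular expression.\n- filebeat.prospectors[x].transactionline.timeout: Timeout for transaction merging, similar to the multiline timeout.\n- filebeat.prospectors[x].transactionline.max_lines: Upper limit on the number of lines in a transaction merge, similar to the multiline max_lines.\n- filebeat.prospectors[x].transactionline.end_flag: Either include or exclude; defaults to include. include means the line matched by the terminating rule is part of the transaction; exclude means it is not.\n\n### Log Merging Examples\n#### Log 1\n```\n[2018-04-07 09:25:09] INFO   - 交易[0426]--\u003c上送报文\u003e------------组包开始-----\n[2018-04-07 09:30:39] INFO   - [_ZKH],值=[622369**********]\n[2018-04-07 09:25:09] INFO   - [_errmsg],值=[交易成功]\n[2018-04-07 09:25:09] INFO   - [_hostcode],值=[0000]\n[2018-04-07 09:25:09] INFO   - --\u003c下传报文\u003e------------解包结束-----\n[2018-04-07 09:25:09] INFO   - 与主机通信耗时:[210]ms\n[2018-04-07 09:25609] INFO   - 交易[0617]--\u003c上送报文\u003e------------组包开始-----\n```\nThe log is contiguous, with no concurrent writes from multiple threads, and it has a clear terminating statement, \"与主机通信耗时\" (time spent communicating with the host), so multiline merging alone is enough. Because \"与主机通信耗时\" appears on the last line of each event, match is set to before, pattern is set to '与主机通信耗时', and negate is simply set to true. timeout and max_lines can be generous so that \"与主机通信耗时\" is eventually matched. The matching rule is therefore:\n\n```\nmultiline:\n      negate: true\n      pattern: '与主机通信耗时'\n      match: \"before\"\n      timeout: \"1800s\"\n      max_lines: 1500\n```\n\n#### Log 2\n```\n0502:155243:481|T1234|L5|routeIn.cpp:289|转发交易请求[WFM:Ncs2pl:ncs2AcctValid]  \n上传数据:\nT1234/名字空间::\nT1234/  域名|类型|长度|数据值\nT1234/DEFAULT::\nT1234/  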
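A162|S|4|0.00\n0502:155243:483|T1234|L8|COrbCli.cpp:814|Send to server: \n0502:155244:245|T1234|L8|COrbCli.cpp:861|Server response:\nT1234/名字空间::\nT1234/  域名|类型|长度|数据值\nT6048/  C601|S|8|验证成功\nT1234/DEFAULT::\nT1234/  C180|S|18|201805020068913050\nT1234/  C601|S|8|验证成功\nT1234/  I010|S|1|1\nT1234/  S100|S|0|\nT1234/  WFMCode|I|1|0\nT1234/  WFMMsg|S|7|Success\nT1234/  _errmsg|S|8|交易成功\nT1234/  _hostcode|S|4|9***\n```\nHere multiple threads write concurrently, so the log is not contiguous, and each logical line spans several physical lines (for example, a logical line starts with 0502:155243 and ends when the next line of the same format appears). The requirement is to merge the lines of a single thread from \"转发交易请求\" (forward transaction request) through \"Server response\". First, multiline merging assembles the logical lines. Every logical line starts with four digits followed by a colon, so pattern is ^\\d{4}:. Because this matches the first line, match is after; negate stays true; timeout and max_lines just need to be as large as practical. The multiline part is therefore:\n\n```\nmultiline:\n  negate: true\n  pattern: '^\\d{4}:'\n  match: \"after\"\n  timeout: \"1800s\"\n  max_lines: 5000\n```\n\nIf only multiline merging is configured, without transaction merging, the log above is split into 3 lines:\n\u003cbr\u003e(1)\n\n```\n0502:155243:481|T1234|L5|routeIn.cpp:289|转发交易请求[WFM:Ncs2pl:ncs2AcctValid]  \n上传数据:\nT1234/名字空间::\nT1234/  域名|类型|长度|数据值\nT1234/DEFAULT::\nT1234/  A162|S|4|0.00\n```\n\n(2)\n\n```\n0502:155243:483|T1234|L8|COrbCli.cpp:814|Send to server: \n```\n\n(3)\n\n```\n0502:155244:245|T1234|L8|COrbCli.cpp:861|Server response:\nT1234/名字空间::\nT1234/  域名|类型|长度|数据值\nT1234/DEFAULT::\nT1234/  C180|S|18|201805020068913050\nT1234/  C601|S|8|验证成功\nT1234/  I010|S|1|1\nT1234/  S100|S|0|\nT1234/  WFMCode|I|1|0\nT1234/  WFMMsg|S|7|Success\nT1234/  _errmsg|S|8|交易成功\nT1234/  _hostcode|S|4|9***\n```\n\nTo merge 1, 2 and 3 into a single line, transaction merging is applied on top, producing a transaction line. A transaction line can skip over lines while merging (for example, lines left by other threads between 1 and 2 are skipped). The steps are:\n\n1. Determine the start rule. The transaction starts with a line containing \"转发交易请求\", so the start rule is: 转发交易请求\n2. Determine the terminating rule. The last line contains \"Server response\", so the terminating rule is: Server\\s+response\n3. Determine the intermediate rule. Intermediate lines contain the same thread id as the line matched by the start rule. To express this, capture the thread id upstream with (?\u003ckey\u003evalue) and pass it down, then write ${key} in the intermediate rule to use the upstream variable; the intermediate rule thus becomes: ${thread}. Note that the intermediate rule may need to match several lines\n4. Revise the start rule. Because of step 3, the rule must capture the thread id with (?\u003cthread\u003eT\\d+), and the string before the thread id, \"0502:155243:481|\", is matched with ^\\d+:\\d+:\\d+\\|, so the start rule becomes: ^\\d+:\\d+:\\d+\\|(?\u003cthread\u003eT\\d+).*转发交易请求\n5. Revise the terminating rule. It must use the same thread id as the start rule from step 4, so it becomes: ${thread}.*Server\\s+response\n6. 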
Join the rules from steps 3, 4 and 5 with -\u003e, which gives the pattern ^\\d+:\\d+:\\d+\\|(?\u003cthread\u003eT\\d+).*转发交易请求 -\u003e ${thread} -\u003e .. -\u003e ${thread}.*Server\\s+response. Note: because the pattern contains special characters, wrap it in single quotes when writing it in the yaml file; double quotes would require extra escaping of the special characters.\n\nThe final configuration is:\n\n```\nmultiline:\n  negate: true\n  pattern: '^\\d{4}:'\n  match: \"after\"\n  timeout: \"1800s\"\n  max_lines: 5000\ntransactionline:\n  patterns:\n    - '^\\d+:\\d+:\\d+\\|(?\u003cthread\u003eT\\d+).*转发交易请求 -\u003e ${thread} -\u003e .. -\u003e ${thread}.*Server\\s+response'\n  timeout: \"60s\"\n  max_lines: 1000\n```\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdtstack%2Fjfilebeat","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdtstack%2Fjfilebeat","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdtstack%2Fjfilebeat/lists"}