{"id":18817547,"url":"https://github.com/sebpuetz/lumberjack","last_synced_at":"2025-09-01T01:33:49.374Z","repository":{"id":57635354,"uuid":"178012000","full_name":"sebpuetz/lumberjack","owner":"sebpuetz","description":"Read and modify constituency trees in Rust.","archived":false,"fork":false,"pushed_at":"2020-05-05T12:29:29.000Z","size":363,"stargazers_count":10,"open_issues_count":1,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-08-24T16:49:28.623Z","etag":null,"topics":["constituency","constituency-tree","negra","nlp","ptb","rust","rust-crate","tree"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sebpuetz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-03-27T14:31:59.000Z","updated_at":"2022-04-28T17:34:14.000Z","dependencies_parsed_at":"2022-09-13T04:24:01.428Z","dependency_job_id":null,"html_url":"https://github.com/sebpuetz/lumberjack","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/sebpuetz/lumberjack","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sebpuetz%2Flumberjack","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sebpuetz%2Flumberjack/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sebpuetz%2Flumberjack/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sebpuetz%2Flumberjack/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sebpuetz","download_url":"https://codeload.github.com/sebpuetz
/lumberjack/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sebpuetz%2Flumberjack/sbom","scorecard":{"id":808946,"data":{"date":"2025-08-11","repo":{"name":"github.com/sebpuetz/lumberjack","commit":"aa23c10700643546e684da2060c1189b2d502c8b"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":2.6,"checks":[{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily 
update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has 
a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":4,"reason":"6 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: RUSTSEC-2021-0139","Warn: Project is vulnerable to: RUSTSEC-2021-0145 / GHSA-g98v-hv3f-hcfr","Warn: Project is vulnerable to: RUSTSEC-2024-0375","Warn: Project is vulnerable to: RUSTSEC-2019-0036 / RUSTSEC-2020-0036 / GHSA-jq66-xh47-j9f3 / GHSA-r98r-j25q-rmpr","Warn: Project is vulnerable to: RUSTSEC-2020-0146 / GHSA-3358-4f7f-p4j4","Warn: Project is vulnerable to: RUSTSEC-2021-0004 / GHSA-w47j-hqpf-qw9w"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 23 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses 
static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-23T12:35:10.854Z","repository_id":57635354,"created_at":"2025-08-23T12:35:10.854Z","updated_at":"2025-08-23T12:35:10.854Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273064387,"owners_count":25039259,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-31T02:00:09.071Z","response_time":79,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["constituency","constituency-tree","negra","nlp","ptb","rust","rust-crate","tree"],"created_at":"2024-11-08T00:12:00.437Z","updated_at":"2025-09-01T01:33:49.347Z","avatar_url":"https://github.com/sebpuetz.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Crate](https://img.shields.io/crates/v/lumberjack.svg)](https://crates.io/crates/lumberjack)\n[![Build Status](https://travis-ci.org/sebpuetz/lumberjack.svg?branch=master)](https://travis-ci.org/sebpuetz/lumberjack)\n\n# lumberjack\nRead and process constituency trees in various formats.\n\n## Install:\n* From crates.io:\n```bash\ncargo install lumberjack-utils\n```\n* From GitHub:\n```bash\ncargo install --git https://github.com/sebpuetz/lumberjack\n```\n\n## Usage as standalone:\n\n* Convert treebank in NEGRA export 4 format to bracketed TueBa V2 
format\n```bash\nlumberjack-conversion --input_file treebank.negra --input_format negra \\\n    --output_format tueba --output_file treebank.tueba --projectivize\n```\n* Retain only root node, `NP`s and `PP`s and print to simple bracketed format:\n```bash\necho \"NP PP\" \u003e filter_set.txt\nlumberjack-conversion --input_file treebank.simple --input_format simple \\\n    --output_format tueba --output_file treebank.filtered \\\n    --filter filter_set.txt\n```\n* Convert from treebank in simple bracketed to CONLLX format and annotate\nparent tags of terminals as features.\n```bash\nlumberjack-conversion --input_file treebank.simple --input_format simple \\\n    --output_format conllx --output_file treebank.conll --parent\n```\n* Modifications in the following order:\n\n1. Reattach all terminals with part-of-speech starting with `$` to the\nroot node\n2. Remove all nonterminals except the root, `S`s, `NP`s, `PP`s and `VP`s\n3. Assign unique identifiers based on the closest `S` to terminals\n4. Insert nodes with label `label` above terminals that aren't dominated by `NP` or `PP`\n5. Annotate label of parent node on terminals.\n6. 
Print to CONLLX format with annotations.\n\n```bash\necho \"S VP NP PP\" \u003e filter_set.txt\necho \"NP PP\" \u003e insert_set.txt\necho \"S\" \u003e id_set.txt\nlumberjack-conversion --input_file treebank.simple --input_format simple \\\n    --output_format conllx --insertion_set insert_set.txt \\\n    --insertion_label label --id_set id_set.txt --reattach $ \\\n    --parent parent --output_file treebank.conllx\n```\n\n## Usage as Rust library:\n* read and projectivize trees from NEGRA format and print to simple\n bracketed format\n```rust\nuse std::fs::File;\nuse std::io::BufReader;\n\nuse lumberjack::io::{NegraReader, PTBFormat};\nuse lumberjack::Projectivize;\n\nfn print_negra(path: \u0026str) {\n    let file = File::open(path).unwrap();\n    let reader = NegraReader::new(BufReader::new(file));\n    for tree in reader {\n        let mut tree = tree.unwrap();\n        tree.projectivize();\n        println!(\"{}\", PTBFormat::Simple.tree_to_string(\u0026tree).unwrap());\n    }\n}\n```\n* filter non-terminal nodes from trees in a treebank and print to\n simple bracketed format:\n```rust\nuse lumberjack::{io::PTBFormat, Tree, TreeOps, util::LabelSet};\n\nfn filter_nodes(iter: impl Iterator\u003cItem=Tree\u003e, set: LabelSet) {\n    for mut tree in iter {\n        tree.filter_nonterminals(|tree, nt| set.matches(tree[nt].label())).unwrap();\n        println!(\"{}\", PTBFormat::Simple.tree_to_string(\u0026tree).unwrap());\n    }\n}\n```\n* convert treebank in simple bracketed format to CONLLX with constituency structure\nencoded in the features field\n```rust\nuse conllx::graph::Sentence;\nuse lumberjack::io::Encode;\nuse lumberjack::{Tree, TreeOps, UnaryChains};\n\nfn to_conllx(iter: impl Iterator\u003cItem=Tree\u003e) {\n    for mut tree in iter {\n        tree.collapse_unary_chains().unwrap();\n        tree.annotate_absolute().unwrap();\n        println!(\"{}\", Sentence::from(\u0026tree));\n    
}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsebpuetz%2Flumberjack","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsebpuetz%2Flumberjack","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsebpuetz%2Flumberjack/lists"}