{"id":36604376,"url":"https://github.com/commitd/embedded-baleen","last_synced_at":"2026-01-12T08:42:16.193Z","repository":{"id":57738708,"uuid":"148156858","full_name":"commitd/embedded-baleen","owner":"commitd","description":"Library to allow embedding of Baleen in other applications","archived":false,"fork":false,"pushed_at":"2020-10-28T16:00:48.000Z","size":39,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-07-06T15:16:30.551Z","etag":null,"topics":["baleen"],"latest_commit_sha":null,"homepage":"http://github.com/dstl/baleen","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/commitd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-09-10T13:07:26.000Z","updated_at":"2022-01-10T08:50:29.000Z","dependencies_parsed_at":"2022-08-24T17:51:15.684Z","dependency_job_id":null,"html_url":"https://github.com/commitd/embedded-baleen","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/commitd/embedded-baleen","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/commitd%2Fembedded-baleen","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/commitd%2Fembedded-baleen/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/commitd%2Fembedded-baleen/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/commitd%2Fembedded-baleen/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/commitd","download_url":"https://codeload.github.com/commitd/embe
dded-baleen/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/commitd%2Fembedded-baleen/sbom","scorecard":{"id":300920,"data":{"date":"2025-08-11","repo":{"name":"github.com/commitd/embedded-baleen","commit":"40fba5ba0483281b4e21ecd0bcfabc0e2c96c24a"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.9,"checks":[{"name":"Code-Review","score":0,"reason":"Found 0/17 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the 
project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs 
release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Security-Policy","score":10,"reason":"security policy file detected","details":["Info: security policy file detected: github.com/commitd/.github/SECURITY.md:1","Info: Found linked content: github.com/commitd/.github/SECURITY.md:1","Info: Found disclosure, vulnerability, and/or timelines in security policy: github.com/commitd/.github/SECURITY.md:1","Info: Found text in security policy: github.com/commitd/.github/SECURITY.md:1"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 10 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed 
vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}}]},"last_synced_at":"2025-08-17T20:36:05.641Z","repository_id":57738708,"created_at":"2025-08-17T20:36:05.641Z","updated_at":"2025-08-17T20:36:05.641Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28337599,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-12T06:09:07.588Z","status":"ssl_error","status_checked_at":"2026-01-12T06:05:18.301Z","response_time":98,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["baleen"],"created_at":"2026-01-12T08:42:16.115Z","updated_at":"2026-01-12T08:42:16.182Z","avatar_url":"https://github.com/commitd.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# embedded-baleen\n\nLibrary to allow embedding of Baleen in other applications (http://github.com/dstl/baleen)\n\nThis includes both a single-threaded Baleen and a pool of Baleens. \n\nThe pool option is useful when you are servicing multiple requests (e.g. through a web interface) and wish to have \n\nEach Baleen in the pool has its own copy of all resources, etc. There is no shared knowledge. This also means that a pool of N Baleens takes roughly N times more memory than a single Baleen. 
\n\n## Usage example\n\n```\n// A consumer which will return the text from Baleen\n// Use consumer to convert the jCas into something you want to deal with in your application\nBaleenOutputConverter\u003cString\u003e consumer = (context, jCas) -\u003e Optional.of(jCas.getDocumentText());\n\n// Provide Baleen Pipeline YAML as a string  \nString yaml = ...; // read YAML from file or as a constant\n\n// Initialise Baleen, a poolSize of 1 means it will be standalone\n// any number higher will create multiple Baleen instances\nint poolSize = 1;\nEmbeddableBaleen baleen = EmbeddableBaleen.create(\"my-baleen\", poolSize);\nbaleen.setup(yaml);\n\n// Push documents through\n// You must provide a source (string),\n// the InputStream\n// and the consumer as above\nInputStream is = ...; \nOptional\u003cString\u003e text = baleen.process(\"file.txt\", is, consumer);\n\n// If the Optional is present then access its value\nif(text.isPresent()) {\n   System.out.println(text.get());\n}\n\n// NOTE: handle BaleenException to process errors\n\n```\n\n## Health warning\n\nBaleen was designed to run as a standalone application.\n\nBaleen has an *enormous* set of dependencies, including artifacts such as Elasticsearch and Tika which have an enormous number of dependencies in their own right.\n\nWhilst this library will allow you to use Baleen in another application, you might find that you have version clashes.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcommitd%2Fembedded-baleen","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcommitd%2Fembedded-baleen","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcommitd%2Fembedded-baleen/lists"}