{"id":13413969,"url":"https://github.com/avito-tech/normalize","last_synced_at":"2026-04-03T18:10:28.305Z","repository":{"id":53580381,"uuid":"350282009","full_name":"avito-tech/normalize","owner":"avito-tech","description":null,"archived":false,"fork":false,"pushed_at":"2021-04-01T08:47:45.000Z","size":12,"stargazers_count":46,"open_issues_count":0,"forks_count":2,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-07-31T20:53:12.255Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/avito-tech.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-03-22T09:25:14.000Z","updated_at":"2024-05-10T20:25:53.000Z","dependencies_parsed_at":"2022-08-24T11:01:19.110Z","dependency_job_id":null,"html_url":"https://github.com/avito-tech/normalize","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/avito-tech/normalize","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avito-tech%2Fnormalize","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avito-tech%2Fnormalize/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avito-tech%2Fnormalize/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/avito-tech%2Fnormalize/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/avito-tech","download_url":"https://codeload.github.com/avito-tech/normalize/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/avito-tech%2Fnormalize/sbom","scorecard":{"id":218383,"data":{"date":"2025-08-11","repo":{"name":"github.com/avito-tech/normalize","commit":"07914ec46c8de8f84c87f3d54e888f701f6946d3"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.8,"checks":[{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Code-Review","score":0,"reason":"Found 0/3 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:17: update your workflow using 
https://app.stepsecurity.io/secureworkflow/avito-tech/normalize/ci.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:23: update your workflow using https://app.stepsecurity.io/secureworkflow/avito-tech/normalize/ci.yml/master?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/ci.yml:32: update your workflow using https://app.stepsecurity.io/secureworkflow/avito-tech/normalize/ci.yml/master?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/ci.yml:40: update your workflow using https://app.stepsecurity.io/secureworkflow/avito-tech/normalize/ci.yml/master?enable=pin","Info:   0 out of   2 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of   2 third-party GitHubAction dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/ci.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge 
detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection 
settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 4 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}}]},"last_synced_at":"2025-08-17T02:03:50.461Z","repository_id":53580381,"created_at":"2025-08-17T02:03:50.461Z","updated_at":"2025-08-17T02:03:50.461Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31368160,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-03T17:53:18.093Z","status":"ssl_error","status_checked_at":"2026-04-03T17:53:17.617Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-30T20:01:54.049Z","updated_at":"2026-04-03T18:10:28.276Z","avatar_url":"https://github.com/avito-tech.png","language":"Go","readme":"# normalize\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Go Reference](https://pkg.go.dev/badge/github.com/avito-tech/normalize.svg)](https://pkg.go.dev/github.com/avito-tech/normalize)\n[![ci](https://github.com/avito-tech/normalize/actions/workflows/ci.yml/badge.svg)](https://github.com/avito-tech/normalize/actions/workflows/ci.yml)\n[![codecov](https://codecov.io/gh/avito-tech/normalize/branch/master/graph/badge.svg?token=DJMFEBX8H7)](https://codecov.io/gh/avito-tech/normalize)\n[![Go Report Card](https://goreportcard.com/badge/github.com/avito-tech/normalize?style=flat)](https://goreportcard.com/report/github.com/avito-tech/normalize)\n\nSimple library for fuzzy text sanitizing, normalizing and comparison.\n\n## Why\nPeople type differently. This may be a problem if you need to associate user input with some internal entity or compare two inputs of different users. Say `abc-01` and `ABC 01` must be considered to be the same strings in your system. 
There are many heuristics we can apply to make this work:\n\n* Remove special characters.\n* Convert everything to lowercase.\n* etc.\n\nThis library is essentially an easily configurable set of useful helpers implementing all these transformations.\n## Installation\n```bash\ngo get -u github.com/avito-tech/normalize \n```\n## Features\n### Normalize fuzzy text \n```go\npackage main \n\nimport (\n\t\"fmt\"\n\t\"github.com/avito-tech/normalize\"\n)\n\nfunc main() {\n\tfuzzy := \"VAG-1101\"\n\tclean := normalize.Normalize(fuzzy)\n\tfmt.Print(clean) // vag1101\n\n\tmanyFuzzy := []string{\"VAG-1101\", \"VAG-1102\"}\n\tmanyClean := normalize.Many(manyFuzzy)\n\tfmt.Print(manyClean) // {\"vag1101\", \"vag1102\"}\n}\n```\n\n#### Default rules (in order of actual application):\n* Any character other than Latin/Cyrillic letters, German umlauts (`ä`, `ö`, `ü`) and digits is removed.\n* Rare Cyrillic letters `ё` and `й` are replaced with their common equivalents `е` and `и`.\n* Latin/Cyrillic look-alike pairs are normalized to Latin letters, so `В (в)` becomes `B (b)`. See the `WithCyrillicToLatinLookAlike` normalizer in `normalizers.go` for the full list of replacement pairs.\n* German umlauts `ä`, `ö`, `ü` are converted to Latin `a`, `o`, `u`.\n* The whole string is lowercased.\n\n### Compare fuzzy texts\nCompare two strings with all normalizations described above applied. Provide a `threshold` parameter to set how similar the strings must be for the function to return `true`. 
\n`threshold` is a relative value, so `0.5` roughly means *\"the strings are 50% different after all normalizations are applied\"*.\n\n[Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance) is used under the hood to compute the distance between strings.\n\n```go\npackage main\n\nimport (\n    \"fmt\"\n    \"github.com/avito-tech/normalize\"\n)\n\nfunc main() {\n\tfuzzy := \"Hyundai-Kia\"\n\totherFuzzy := \"HYUNDAI\"\n\tsimilarityThreshold := 0.3\n\tresult := normalize.AreStringsSimilar(fuzzy, otherFuzzy, similarityThreshold)\n\n\t// distance(hyundaikia, hyundai) = 3\n\t// 3 / len(hyundaikia) = 0.3 \n\tfmt.Print(result) // true\n}\n```\n\n#### Default rules\n* Apply default normalization (described above).\n* Calculate the Levenshtein distance and return `true` if `distance / strlen \u003c= threshold`.\n\n\n### Configuration\nBoth `AreStringsSimilar` and `Normalize` accept an arbitrary number of normalizers as optional parameters.\nA normalizer is any function that accepts a string and returns a string.\n\nFor example, the following option leaves the string unchanged.\n\n```go\npackage main\n\nimport \"github.com/avito-tech/normalize\"\n\nfunc WithNoNormalization() normalize.Option {\n\treturn func(str string) string {\n\t\treturn str\n\t}\n}\n```\n\nYou can configure normalization to use only the options you need. For example, you can use only lowercasing and cyr2lat conversion during normalization. 
Note that the order of options matters.\n```go\npackage main\n\nimport (\n\t\"fmt\"\n\t\"github.com/avito-tech/normalize\"\n)\n\nfunc main() {\n\tfuzzy := \"АВ-123\"\n\tclean := normalize.Normalize(fuzzy, normalize.WithLowerCase(), normalize.WithCyrillicToLatinLookAlike())\n\tfmt.Print(clean) // ab-123\n}\n```\n","funding_links":[],"categories":["Text Processing","文本处理","Bot Building","Specific Formats","Template Engines"],"sub_categories":["Parsers/Encoders/Decoders","解析 器/Encoders/Decoders","HTTP Clients"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Favito-tech%2Fnormalize","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Favito-tech%2Fnormalize","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Favito-tech%2Fnormalize/lists"}