{"id":51261628,"url":"https://github.com/manticore-projects/i18-extractor","last_synced_at":"2026-06-29T12:32:12.425Z","repository":{"id":366273944,"uuid":"1234236814","full_name":"manticore-projects/i18-extractor","owner":"manticore-projects","description":"Extract String from Java Sources into Language Resource Bundles","archived":false,"fork":false,"pushed_at":"2026-06-21T03:19:13.000Z","size":83,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-21T05:10:40.860Z","etag":null,"topics":["i18n","java","localization","string","text","tool","translate"],"latest_commit_sha":null,"homepage":"https://manticore-projects.com","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/manticore-projects.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-09T23:22:12.000Z","updated_at":"2026-06-21T03:19:17.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/manticore-projects/i18-extractor","commit_stats":null,"previous_names":["manticore-projects/i18-extractor"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/manticore-projects/i18-extractor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manticore-projects%2Fi18-extractor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manticore-projects%2Fi18-extractor/tags","releases_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manticore-projects%2Fi18-extractor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manticore-projects%2Fi18-extractor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/manticore-projects","download_url":"https://codeload.github.com/manticore-projects/i18-extractor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manticore-projects%2Fi18-extractor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34927675,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-29T02:00:05.398Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["i18n","java","localization","string","text","tool","translate"],"created_at":"2026-06-29T12:32:10.642Z","updated_at":"2026-06-29T12:32:12.420Z","avatar_url":"https://github.com/manticore-projects.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# i18n-extractor\n\nExternalises hardcoded user-facing strings from a Java/Swing codebase into a\n`messages_\u003clocale\u003e.properties` ResourceBundle and rewrites call sites to use a\nsmall `I18n.tr(...)` helper.\n\n## Supported input shapes\n\n- String literal: `setText(\"Hello\")`\n- `+` concatenation: `setText(\"Found \" + count + \" 
records\")`\n- `String.format(...)` (incl. positional `%1$s`, `%d`, `%f`, `%n`, `%%`)\n- `MessageFormat.format(...)` (passthrough)\n- Inline builder: `new StringBuilder().append(...).toString()`\n- Multi-statement local builder used linearly across statements\n- Methods returning a substantial String pattern (e.g. `getWelcomeText()`,\n  `getLicenseNotice()`, `getAboutText()`) — covers helpers whose UI sink is in\n  another class. Triggered when the pattern has ≥ 2 newlines or ≥ 100 chars.\n- **Constructor arguments to UI types**: `new JButton(\"Save\")`,\n  `new JLabel(\"Username:\")`, `new JMenu(\"File\")`, etc. Built-in coverage of\n  standard `javax.swing` classes:\n\n  | Class | Position(s) |\n    |-------|-------------|\n  | `JButton`, `JCheckBox`, `JCheckBoxMenuItem`, `JRadioButton`, `JRadioButtonMenuItem`, `JToggleButton` | 0 |\n  | `JLabel`, `JMenu`, `JMenuItem`, `JPopupMenu` | 0 |\n  | `JFrame`, `JInternalFrame`, `JOptionPane`, `JToolBar` | 0 |\n  | `JDialog` | 0, 1 (handles both `(String)` and `(owner, String)` overloads) |\n  | `AbstractAction` | 0 |\n  | `TitledBorder` | 0, 1 (handles both `(String)` and `(Border, String)` overloads) |\n  | `ProgressMonitor` | 1, 2 (message + note) |\n\n  Excluded from defaults (data-content rather than UI labels, register\n  per-project if needed): `JTextField`, `JTextArea`, `JEditorPane`,\n  `JPasswordField`, `JComboBox`, `JList`, `JTable`, `JFileChooser`. 
AWT\n  classes also excluded — name collisions with user-defined classes are\n  too easy.\n\n  Custom classes registered via `--ui-constructor ClassName:pos1,pos2,...`.\n\n- **Exception messages** — `throw new Exception(\"message\")` and friends.\n  Built-in coverage:\n\n  | Class | Position |\n    |-------|----------|\n  | `Exception`, `RuntimeException` | 0 |\n  | `IllegalArgumentException`, `IllegalStateException`, `UnsupportedOperationException` | 0 |\n  | `IOException`, `SQLException` | 0 |\n\n  Concatenation chains like\n  `new Exception(\"User \" + uid + \" not allowed to write\")` are flattened to a\n  MessageFormat pattern with the dynamic parts as positional arguments — same\n  treatment as `String.format(...)`.\n\n  Deliberately excluded: `NullPointerException`, `ClassCastException`,\n  `NumberFormatException`, `ConcurrentModificationException` — typically\n  programmatic, not user-facing prose. Add via `--ui-constructor\n  YourException:0` for project-specific exception classes.\n\n- **Constraint-method strings** — for helper classes like `GridBagPane.add(c,\n  \"label=Account:,tooltip=The user account.\")` that take a comma-separated\n  `key=value` configuration string. Register the method name with\n  `--constraint-method add` and the extractor:\n\n    - parses the string via `split(\",+\")` (matching `GridBagPane`'s own parser);\n    - for each `label=`, `tooltip=`, or `setToolTipText=` value, strips the\n      `GridBagPane` prefix markers (`!`, `*`, `?`, `+`) and adds a bundle entry\n      keyed by the value's slug;\n    - **does not modify the call site** — the runtime helper looks up the\n      translation via `I18n.localize(value)` (which computes the same slug) and\n      falls back to the original literal on miss.\n\n  This pairs with the `localize` / `slugify` methods on `I18n.java`. 
See the\n  patched `GridBagPane.java` deliverable for the runtime side.\n- **Static `Object[][]` menu/toolbar definitions**: rows of\n  `{categoryLabel, new SomeAction[]{...}}` — the row's first String literal is\n  extracted *if* the row also contains an array creation of a registered UI\n  constructor type (so plain data tables don't get caught).\n- **`@I18N` comment directive on any 2D array** (field or local variable):\n  forces extraction by column without requiring a registered UI constructor.\n  Two syntaxes:\n\n  ```java\n  // @I18N(1, 3)              \u003c-- numeric: extract columns 1 and 3 of every row\n  // @I18N(\"KEY, TEXT, SKIP, TEXT\")  \u003c-- role-based:\n  //   KEY  = identifier column; its value becomes the bundle-key prefix\n  //   TEXT = user-facing text; extract and translate\n  //   SKIP = ignored\n  ```\n\n  Example — extracting button labels and tooltips with semantic keys:\n\n  ```java\n  // @I18N(\"KEY,TEXT,TEXT\")\n  buttonsDef = new String[][] {\n      {\"etlButton\",    \"ETL\",            \"Manage ETL Files\"},\n      {\"reportButton\", \"Reports\",        \"Build Reports\"},\n      {\"adminButton\",  \"Administration\", \"Configuration and Administration\"}\n  };\n  ```\n\n  Yields bundle keys `\u003cClassName\u003e.etlButton = ETL` for the first TEXT cell of\n  each row (using the KEY value verbatim) and `\u003cClassName\u003e.etlButton.manage.etl.files\n  = Manage ETL Files` for subsequent TEXT cells (KEY value + slug of the text).\n  Without a KEY column, falls back to the normal slug-from-text key generation.\n\n  For the menu-array pattern with category labels and inner UI constructors,\n  combine `@I18N(0)` (or `@I18N(\"TEXT\")`) on the field — to extract category\n  strings — with `--ui-constructor Action:1,3` for the inner Action constructor\n  arguments. 
The two mechanisms are complementary: one for the outer array, one\n  for the nested constructors.\n- Arbitrary nesting of all the above\n\n## Project layout expected\n\nThe tool walks `\u003cfolder\u003e/src/main/java` for `.java` sources and writes the\nbundle to `\u003cfolder\u003e/src/main/resources/messages_\u003clocale\u003e.properties` — the\nstandard Maven/Gradle layout. So your target project should look like:\n\n    IFRSBox/\n      src/\n        main/\n          java/\n            com/manticore/ifrsbox/MyPanel.java\n            ...\n          resources/\n\nFor non-standard layouts, override with `--src` and `--res`:\n\n    gradle run --args=\"myproject --src src/swing --res src/i18n\"\n\n## Build\n\n    gradle build\n\nJava 21 toolchain. JavaParser 3.26.x.\n\n## Run\n\nTwo invocation styles:\n\n    # explicit args\n    gradle run --args=\"/home/are/Documents/src/VBox/IFRSBox\"\n    gradle run --args=\"/home/are/Documents/src/VBox/IFRSBox --dry-run --locale en --bundle messages\"\n\n    # property style\n    gradle -Pfolder=/home/are/Documents/src/VBox/IFRSBox run\n    gradle -Pfolder=/home/are/Documents/src/VBox/IFRSBox -PdryRun -Plocale=en run\n\n### Options\n\n| Option              | Property            | Default                   | Description                              |\n|---------------------|---------------------|---------------------------|------------------------------------------|\n| `--bundle \u003cname\u003e`   | `-Pbundle=`         | `messages`                | Bundle base name                         |\n| `--locale \u003ccode\u003e`   | `-Plocale=`         | `en`                      | Locale suffix                            |\n| `--helper-package`  | `-PhelperPackage=`  | `com.manticore.i18n`      | Package of the `I18n.tr` helper          |\n| `--src \u003cpath\u003e`      | `-Psrc=`            | `src/main/java`           | Source root relative to project          |\n| `--res \u003cpath\u003e`      | `-Pres=`            | 
`src/main/resources`      | Resources root relative to project       |\n| `--dry-run`         | `-PdryRun`          | off                       | Analyse only, no file writes             |\n| `--emit-helper`     | `-PemitHelper`      | off                       | Write the `I18n.java` helper into project|\n| `--check`           | `-Pcheck`           | off                       | CI mode: dry-run + non-zero exit on issues |\n\n## Recommended workflow\n\n1. **Commit a clean baseline** — the tool rewrites source files in place; you\n   want a clean diff to review.\n2. **Dry-run first** to see scope and key naming:\n\n       gradle run --args=\"/path/to/project --dry-run --emit-helper\"\n\n3. **Run for real** on one module/package at a time. Narrow the `--src`\n   path to scope the run if needed.\n4. **Review the diff.** Common things to look for:\n    - log messages, SQL strings, or property keys that got extracted\n      (heuristics catch most but not all)\n    - concatenations that produced ugly keys — rename in both source and\n      bundle before translation\n    - multi-statement builder cleanup that left a comment orphaned\n5. **Add the I18n helper** if not using `--emit-helper`. Place\n   `I18n.java` in the configured helper package on the classpath.\n6. **Translate** the bundle. For Bahasa Indonesia, copy\n   `messages_en.properties` → `messages_id.properties` and translate values\n   only (never keys).\n\n## Method-return extraction\n\nWhen a method returns a `String` and the pattern is \"substantial\" (≥ 2 newlines\nOR ≥ 100 chars), the extractor will pull the whole returned expression into the\nbundle. 
The key is `\u003cClassName\u003e.\u003cmethodName\u003e` directly, e.g.\n`AccountPanel.getWelcomeText`, ignoring the slug heuristic.\n\nRefusal cases (left untouched):\n\n- Methods named `toString`, `hashCode`, `equals`, `clone`, `getClass`,\n  `getName`, `getId`, `getKey`, `getCode`\n- Returns inside nested lambdas (return target isn't the enclosing method)\n- Patterns under the substantial-text threshold (most exception messages,\n  toString helpers, and getters fall into this bucket)\n- Any pattern that fails `shouldExtract` (SQL, URLs, property keys, etc.)\n\nIf a method is extracted by mistake (e.g. a multi-line debug helper), add its name to\n`Extractor.SKIP_METHOD_NAMES` or raise the heuristic threshold.\n\n## Re-running on a live codebase\n\nThe tool is **idempotent and safe to re-run**. After the first extraction:\n\n- Existing `messages_en.properties` is loaded and preserved.\n- Already-extracted call sites (`I18n.tr(...)`) are skipped — `tr` isn't a UI sink.\n- New literals added by developers are extracted on the next run; identical\n  patterns reuse existing keys via reverse-bundle dedup.\n\nRecommended workflow for ongoing maintenance:\n\n1. **Prevention (new code):** Enable IntelliJ's *Hardcoded strings* inspection\n   project-wide and ship the inspection profile in `.idea/inspectionProfiles/`.\n   Wire `idea.sh inspect` into CI to fail PRs that introduce new literals.\n2. **Periodic sweep:** Run `gradle run --args=\"\u003cfolder\u003e\"` against each module\n   monthly (or on demand). Review the diff and commit.\n3. **CI guard:** `gradle run --args=\"\u003cfolder\u003e --check\"` exits non-zero if there\n   are unextracted strings, orphan keys, or missing translations. 
Wire it into\n   pipelines as a gating step.\n\n## Maintenance reports (run on every invocation)\n\nThe tool emits three reports after every run:\n\n**Orphan keys** — bundle entries with no `I18n.tr(...)` reference in source.\nUsually caused by a UI element being deleted without removing the bundle entry.\nReviewed manually; not auto-pruned (too risky).\n\nFor dynamic keys (computed at runtime) that legitimately have no static\nreference, annotate with `//$NLS-KEEP$` to silence the warning:\n\n```java\n//$NLS-KEEP$ Status.READY, Status.ERROR, Status.PENDING\nString key = \"Status.\" + status.name();\nreturn I18n.tr(key);\n```\n\n**Unresolved keys** — the mirror image of orphan keys: a key passed to\n`I18n.tr(\"literal\", ...)` that has **no entry in the bundle yet**. This happens\nwhen a developer hand-writes the key ahead of (or instead of) extraction, e.g.\n\n```java\nJOptionPane.showMessageDialog(this, I18n.tr(\"DataCaptureUploadPane.success\"), ...);\n```\n\nwith no `DataCaptureUploadPane.success` line in `messages_en.properties`. At\nruntime the `I18n.tr` helper would fall back to rendering the raw key string to\nthe user, so these are real defects.\n\nUnlike orphans, unresolved keys are **auto-filled** (outside `--dry-run`): the\nkey is added to the source bundle with a best-effort English value derived from\nthe key itself —\n\n- a leading `ClassName.` segment is dropped, the remaining dot-words are\n  title-cased and space-joined (`DataCaptureUploadPane.select.file` → `Select File`);\n- one `{0..n-1}` `MessageFormat` placeholder is appended per call argument, using\n  the **maximum** arity seen across all call sites\n  (`I18n.tr(\"…failure.on.step\", id)` → `Failure On Step {0}`).\n\nThe values are deliberately rough — a sensible starting point to refine in the\nbundle, not a finished translation. They show up as ordinary additions in the\nbundle diff for review. 
Only string-literal first arguments are considered;\ncomputed keys are skipped (use `//$NLS-KEEP$` for those). In `--dry-run` /\n`--check` the keys are reported but not written, and `--check` exits non-zero\nwhen any remain.\n\n**Missing translations** — keys present in `messages_en.properties` but absent\nor blank in any sibling `messages_*.properties` file. Gives the translator a\nclear worklist after each extraction sweep. The tool reports only — translation\nitself is out of scope. (Auto-filled unresolved keys are folded into this report\ntoo, so a freshly hand-written key surfaces as a translation gap in the other\nlocales on the same run.)\n\n## When English text changes\n\nOnce a string is extracted, the bundle is the source of truth. Edits go there,\nnot in source code. There's no automatic way to flag the corresponding\nnon-source-locale entries as stale when the English value changes — this is an\ninherent property of key-based ResourceBundle and not specific to this tool.\n\nTwo practical options for handling this:\n\n1. **Rename the key** when the meaning changes substantially. The tool will\n   report the old key as orphan and the new key as missing in non-source\n   locales. Translators see both signals on the next run.\n2. **Edit the value in place** when the meaning is unchanged. Document this in\n   the commit message so translators can choose to update or skip.\n\nFor audit-grade tracking of which translations correspond to which English\nversions, you'd need a TMS (Crowdin, Lokalise, POEditor) — they hash source\nstrings and flag stale translations automatically. 
Worth the cost only if you\nhave many languages or strict regulatory translation requirements.\n\n## Heuristics\n\nA literal is considered user-facing if it:\n- contains at least 2 letters, length ≥ 2, not blank\n- is not a property-key shape (`lowercase.dot.notation`)\n- is not a SCREAMING_CONSTANT\n- doesn't contain SQL keywords (SELECT, FROM, INSERT INTO, …)\n- doesn't start with `/`, `http://`, `https://`, `file://`, `jdbc:`\n\nUI sinks recognised: `setText`, `setTitle`, `setToolTipText`, `setLabel`,\n`showMessageDialog`, `showConfirmDialog`, `showInputDialog`, `showOptionDialog`,\n`createTitledBorder`, `createEtchedBorder`, `addTab`, `insertTab`,\n`setTitleAt`. Edit `Extractor.UI_METHODS` to add more.\n\n## Multi-statement StringBuilder safety\n\nThe dataflow detector refuses extraction (leaves source untouched) when:\n- the variable isn't a local declared in the same method\n- the initialiser isn't `new StringBuilder()` / `new StringBuffer()`\n- any reference is something other than `.append()` / `.toString()`\n- any reference lives in a different `BlockStmt` (so loops, ifs, try blocks\n  all bail out)\n- a second `.toString()` exists (the variable is consumed twice)\n\nFailed extractions are silent — review the file to see what didn't transform.\n\n## Limitations / known gaps\n\n- Cross-method builders (passing a StringBuilder to a helper that appends to\n  it) — not detected, refuses extraction.\n- Aliasing (`StringBuilder ref = sb; ref.append(...)`) — appends via the\n  alias are silently missed; review by hand.\n- Pre-existing `import` of a different `I18n` class will collide; the\n  generated import goes to `\u003chelper-package\u003e.I18n` regardless.\n- Width / precision in printf specs (`%5.2f`) is dropped; hand-edit the\n  bundle to `{0,number,#,##0.00}` if needed.\n- Plurals: stdlib `ChoiceFormat` (the `{0,choice,...}` format type in\n  `MessageFormat`) works but is ugly. 
ICU\n  MessageFormat support is out of scope.\n\n## Re-running\n\nRe-running on already-extracted code is safe: existing bundle entries are\npreserved and reused (deduplication by pattern). New strings get appended.\nTranslations in `messages_\u003cother-locale\u003e.properties` are not touched.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmanticore-projects%2Fi18-extractor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmanticore-projects%2Fi18-extractor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmanticore-projects%2Fi18-extractor/lists"}