{"id":13743275,"url":"https://github.com/asottile/tokenize-rt","last_synced_at":"2025-04-08T12:08:55.363Z","repository":{"id":21530823,"uuid":"93190454","full_name":"asottile/tokenize-rt","owner":"asottile","description":"A wrapper around the stdlib `tokenize` which roundtrips.","archived":false,"fork":false,"pushed_at":"2025-03-31T21:00:01.000Z","size":273,"stargazers_count":52,"open_issues_count":0,"forks_count":5,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-01T11:02:08.941Z","etag":null,"topics":["python","refactoring"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/asottile.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"asottile"}},"created_at":"2017-06-02T17:49:02.000Z","updated_at":"2025-03-31T21:00:05.000Z","dependencies_parsed_at":"2023-11-14T03:28:54.360Z","dependency_job_id":"c1089472-9403-4102-997e-75733b9eca3e","html_url":"https://github.com/asottile/tokenize-rt","commit_stats":{"total_commits":164,"total_committers":4,"mean_commits":41.0,"dds":0.5060975609756098,"last_synced_commit":"36bb087a60ac2a4132ab58ab1f4833312de945e5"},"previous_names":[],"tags_count":18,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asottile%2Ftokenize-rt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asottile%2Ftokenize-rt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asottile%2Ftokenize-rt/releases","manifests_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asottile%2Ftokenize-rt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/asottile","download_url":"https://codeload.github.com/asottile/tokenize-rt/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247838444,"owners_count":21004580,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["python","refactoring"],"created_at":"2024-08-03T05:00:43.782Z","updated_at":"2025-04-08T12:08:55.320Z","avatar_url":"https://github.com/asottile.png","language":"Python","funding_links":["https://github.com/sponsors/asottile"],"categories":["Tools"],"sub_categories":[],"readme":"[![build status](https://github.com/asottile/tokenize-rt/actions/workflows/main.yml/badge.svg)](https://github.com/asottile/tokenize-rt/actions/workflows/main.yml)\n[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/asottile/tokenize-rt/main.svg)](https://results.pre-commit.ci/latest/github/asottile/tokenize-rt/main)\n\ntokenize-rt\n===========\n\nThe stdlib `tokenize` module does not properly roundtrip.  This wrapper\naround the stdlib provides two additional tokens `ESCAPED_NL` and\n`UNIMPORTANT_WS`, and a `Token` data type.  
Use `src_to_tokens` and\n`tokens_to_src` to roundtrip.\n\nThis library is useful if you're writing a refactoring tool based on\nPython's tokenization.\n\n## Installation\n\n```bash\npip install tokenize-rt\n```\n\n## Usage\n\n### datastructures\n\n#### `tokenize_rt.Offset(line=None, utf8_byte_offset=None)`\n\nA token offset, useful as a key when cross-referencing the `ast` and the\ntokenized source.\n\n#### `tokenize_rt.Token(name, src, line=None, utf8_byte_offset=None)`\n\nConstruct a token.\n\n- `name`: one of the token names listed in `token.tok_name` or\n  `ESCAPED_NL` or `UNIMPORTANT_WS`\n- `src`: token's source as text\n- `line`: the line number that this token appears on.\n- `utf8_byte_offset`: the utf8 byte offset at which this token appears in the\n  line.\n\n#### `tokenize_rt.Token.offset`\n\nRetrieves an `Offset` for this token.\n\n### converting to and from `Token` representations\n\n#### `tokenize_rt.src_to_tokens(text: str) -\u003e List[Token]`\n\n#### `tokenize_rt.tokens_to_src(Iterable[Token]) -\u003e str`\n\n### additional tokens added by `tokenize-rt`\n\n#### `tokenize_rt.ESCAPED_NL`\n\n#### `tokenize_rt.UNIMPORTANT_WS`\n\n### helpers\n\n#### `tokenize_rt.NON_CODING_TOKENS`\n\nA `frozenset` containing tokens which may appear between others while not\naffecting control flow or code:\n- `COMMENT`\n- `ESCAPED_NL`\n- `NL`\n- `UNIMPORTANT_WS`\n\n#### `tokenize_rt.parse_string_literal(text: str) -\u003e Tuple[str, str]`\n\nparse a string literal into its prefix and string content\n\n```pycon\n\u003e\u003e\u003e parse_string_literal('f\"foo\"')\n('f', '\"foo\"')\n```\n\n#### `tokenize_rt.reversed_enumerate(Sequence[Token]) -\u003e Iterator[Tuple[int, Token]]`\n\nyields `(index, token)` pairs in reverse order.  
Useful for rewriting source.\n\n#### `tokenize_rt.rfind_string_parts(Sequence[Token], i) -\u003e Tuple[int, ...]`\n\nfind the indices of the string parts of a (joined) string literal\n\n- `i` should start at the end of the string literal\n- returns `()` (an empty tuple) for things which are not string literals\n\n```pycon\n\u003e\u003e\u003e tokens = src_to_tokens('\"foo\" \"bar\".capitalize()')\n\u003e\u003e\u003e rfind_string_parts(tokens, 2)\n(0, 2)\n\u003e\u003e\u003e tokens = src_to_tokens('(\"foo\" \"bar\").capitalize()')\n\u003e\u003e\u003e rfind_string_parts(tokens, 4)\n(1, 3)\n```\n\n## Differences from `tokenize`\n\n- `tokenize-rt` adds `ESCAPED_NL` for a backslash-escaped newline \"token\"\n- `tokenize-rt` adds `UNIMPORTANT_WS` for whitespace (discarded in `tokenize`)\n- `tokenize-rt` normalizes string prefixes, even if they are not parsed -- for\n  instance, this means you'll see `Token('STRING', \"f'foo'\", ...)` even in\n  python 2.\n- `tokenize-rt` normalizes python 2 long literals (`4l` / `4L`) and octal\n  literals (`0755`) in python 3 (for easier rewriting of python 2 code while\n  running python 3).\n\n## Sample usage\n\n- https://github.com/asottile/add-trailing-comma\n- https://github.com/asottile/future-annotations\n- https://github.com/asottile/future-fstrings\n- https://github.com/asottile/pyupgrade\n- https://github.com/asottile/yesqa\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasottile%2Ftokenize-rt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fasottile%2Ftokenize-rt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasottile%2Ftokenize-rt/lists"}