[![build status](https://github.com/asottile/tokenize-rt/actions/workflows/main.yml/badge.svg)](https://github.com/asottile/tokenize-rt/actions/workflows/main.yml)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/asottile/tokenize-rt/main.svg)](https://results.pre-commit.ci/latest/github/asottile/tokenize-rt/main)

tokenize-rt
===========

The stdlib `tokenize` module does not properly roundtrip. This wrapper
around the stdlib provides two additional tokens `ESCAPED_NL` and
`UNIMPORTANT_WS`, and a `Token` data type. Use `src_to_tokens` and
`tokens_to_src` to roundtrip.

This library is useful if you're writing a refactoring tool based on Python
tokenization.

## Installation

```bash
pip install tokenize-rt
```

## Usage

### datastructures

#### `tokenize_rt.Offset(line=None, utf8_byte_offset=None)`

A token offset, useful as a key when cross-referencing the `ast` and the
tokenized source.
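
For example, since `ast` column offsets are utf-8 byte offsets, an `Offset`
can be built directly from an `ast` node (a minimal sketch; the repr shown
assumes the usual named-tuple formatting):

```pycon
>>> import ast
>>> from tokenize_rt import Offset
>>> node = ast.parse('x = 1\n').body[0].targets[0]
>>> Offset(node.lineno, node.col_offset)
Offset(line=1, utf8_byte_offset=0)
```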

#### `tokenize_rt.Token(name, src, line=None, utf8_byte_offset=None)`

Construct a token.

- `name`: one of the token names listed in `token.tok_name`, or `ESCAPED_NL`
  or `UNIMPORTANT_WS`
- `src`: the token's source as text
- `line`: the line number this token appears on
- `utf8_byte_offset`: the utf8 byte offset of this token within its line

#### `tokenize_rt.Token.offset`

Retrieves an `Offset` for this token.
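
For example (a minimal sketch; the output repr is illustrative):

```pycon
>>> from tokenize_rt import Token
>>> tok = Token('NAME', 'x', line=1, utf8_byte_offset=0)
>>> tok.offset
Offset(line=1, utf8_byte_offset=0)
```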

### converting to and from `Token` representations

#### `tokenize_rt.src_to_tokens(text: str) -> List[Token]`

#### `tokenize_rt.tokens_to_src(Iterable[Token]) -> str`
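
A minimal roundtrip sketch:

```pycon
>>> from tokenize_rt import src_to_tokens, tokens_to_src
>>> src = 'x = 5\n'
>>> tokens_to_src(src_to_tokens(src)) == src
True
```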

### additional tokens added by `tokenize-rt`

#### `tokenize_rt.ESCAPED_NL`

#### `tokenize_rt.UNIMPORTANT_WS`
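
Both are token names that show up in `src_to_tokens` output, for example
(a sketch; the exact list is illustrative):

```pycon
>>> from tokenize_rt import ESCAPED_NL, UNIMPORTANT_WS, src_to_tokens
>>> tokens = src_to_tokens('x = \\\n    5\n')
>>> [t.name for t in tokens if t.name in (ESCAPED_NL, UNIMPORTANT_WS)]
['UNIMPORTANT_WS', 'UNIMPORTANT_WS', 'ESCAPED_NL', 'UNIMPORTANT_WS']
```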

### helpers

#### `tokenize_rt.NON_CODING_TOKENS`

A `frozenset` containing tokens which may appear between others while not
affecting control flow or code:
- `COMMENT`
- `ESCAPED_NL`
- `NL`
- `UNIMPORTANT_WS`
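
A common pattern is skipping past these to find the next token that matters,
for example (a sketch under the assumption that the comment and whitespace
tokens all appear in `NON_CODING_TOKENS`):

```pycon
>>> from tokenize_rt import NON_CODING_TOKENS, src_to_tokens
>>> tokens = src_to_tokens('x = (  # comment\n    5)\n')
>>> i = next(i for i, t in enumerate(tokens) if t.src == '(')
>>> while tokens[i + 1].name in NON_CODING_TOKENS:
...     i += 1
...
>>> tokens[i + 1].src
'5'
```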

#### `tokenize_rt.parse_string_literal(text: str) -> Tuple[str, str]`

parse a string literal into its prefix and string content

```pycon
>>> parse_string_literal('f"foo"')
('f', '"foo"')
```

#### `tokenize_rt.reversed_enumerate(Sequence[Token]) -> Iterator[Tuple[int, Token]]`

yields `(index, token)` pairs. Useful for rewriting source.
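
For example, rewriting from the end keeps earlier indices valid while the
list is mutated (a sketch; `_replace` is available because `Token` is a
named tuple):

```pycon
>>> from tokenize_rt import reversed_enumerate, src_to_tokens, tokens_to_src
>>> tokens = src_to_tokens('x = 1\ny = 2\n')
>>> for i, token in reversed_enumerate(tokens):
...     if token.name == 'NUMBER':
...         tokens[i] = token._replace(src='0')
...
>>> tokens_to_src(tokens)
'x = 0\ny = 0\n'
```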

#### `tokenize_rt.rfind_string_parts(Sequence[Token], i) -> Tuple[int, ...]`

find the indices of the string parts of a (joined) string literal

- `i` should start at the end of the string literal
- returns `()` (an empty tuple) for things which are not string literals

```pycon
>>> tokens = src_to_tokens('"foo" "bar".capitalize()')
>>> rfind_string_parts(tokens, 2)
(0, 2)
>>> tokens = src_to_tokens('("foo" "bar").capitalize()')
>>> rfind_string_parts(tokens, 4)
(1, 3)
```

## Differences from `tokenize`

- `tokenize-rt` adds `ESCAPED_NL` for a backslash-escaped newline "token"
- `tokenize-rt` adds `UNIMPORTANT_WS` for whitespace (discarded in `tokenize`)
- `tokenize-rt` normalizes string prefixes, even if they are not parsed -- for
instance, this means you'll see `Token('STRING', "f'foo'", ...)` even in
python 2.
- `tokenize-rt` normalizes python 2 long literals (`4l` / `4L`) and octal
literals (`0755`) in python 3 (for easier rewriting of python 2 code while
running python 3).

## Sample usage

- https://github.com/asottile/add-trailing-comma
- https://github.com/asottile/future-annotations
- https://github.com/asottile/future-fstrings
- https://github.com/asottile/pyupgrade
- https://github.com/asottile/yesqa