{"id":26163503,"url":"https://github.com/jlevy/chopdiff","last_synced_at":"2025-12-25T01:35:28.683Z","repository":{"id":280225555,"uuid":"941280462","full_name":"jlevy/chopdiff","owner":"jlevy","description":"Parsing, chunking, diffing, and diff filtering, and windowed transforms of text to support LLM applications","archived":false,"fork":false,"pushed_at":"2025-03-02T04:23:56.000Z","size":146,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-02T05:22:55.614Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jlevy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-01T22:59:08.000Z","updated_at":"2025-03-02T04:23:59.000Z","dependencies_parsed_at":"2025-03-02T05:23:13.677Z","dependency_job_id":null,"html_url":"https://github.com/jlevy/chopdiff","commit_stats":null,"previous_names":["jlevy/chopdiff"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jlevy%2Fchopdiff","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jlevy%2Fchopdiff/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jlevy%2Fchopdiff/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jlevy%2Fchopdiff/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jlevy","download_url":"https://codeload.github.com/jlevy/cho
pdiff/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243050937,"owners_count":20228101,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-11T14:27:23.210Z","updated_at":"2025-12-25T01:35:28.660Z","avatar_url":"https://github.com/jlevy.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# chopdiff\n\n`chopdiff` is a small library of tools I've developed to make it easier to do fairly\ncomplex transformations of text documents, especially for LLM applications, where you\nwant to manipulate text, Markdown, and HTML documents in a clean way.\n\nBasically, it lets you parse, diff, and transform text at the level of words, sentences,\nparagraphs, and \"chunks\" (paragraphs grouped in an HTML tag like a `\u003cdiv\u003e`). It aims to\nhave minimal dependencies.\n\nExample use cases:\n\n- **Filter diffs:** Diff two documents and only accept changes that fit a specific\n  filter. For example, you can ask an LLM to edit a transcript, only inserting paragraph\n  breaks but enforcing that the LLM can't do anything except insert whitespace.\n  Or let it only edit punctuation, whitespace, and lemma variants of words.\n  Or only change one word at a time (e.g. 
for spell checking).\n\n- **Backfill information:** Match edited text against a previous version of a document\n  (using a word-level LCS diff), then pull information from one doc to another.\n  For example, say you have a timestamped transcript and an edited summary.\n  You can then backfill timestamps of each paragraph into the edited text.\n\n- **Windowed transforms:** Walk through a large document N paragraphs, N sentences, or N\n  tokens at a time, processing the results with an LLM call, then \"stitching together\"\n  the results, even if the chunks overlap.\n\n## Installation\n\nDrop the `extras` if you don't want the dependency on `simplemma` (it's about 70MB).\n\nFull deps:\n\n```shell\n# Using uv (recommended)\nuv add chopdiff[extras]\n# Using poetry\npoetry add chopdiff -E extras\n# Using pip\npip install chopdiff[extras]\n```\n\nBasic deps:\n\n```shell\n# Using uv (recommended)\nuv add chopdiff\n# Using poetry\npoetry add chopdiff\n# Using pip\npip install chopdiff\n```\n\n## Comparison to Alternatives\n\nThere are full-blown Markdown and HTML parsing libs (such as\n[Marko](https://github.com/frostming/marko) and\n[BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)) but these tend\nto focus specifically on fully parsing documents as parse trees.\nOn the other end of the spectrum, there are NLP libraries (like\n[spaCy](https://github.com/explosion/spaCy)) that do more expensive, full language\nparsing and sentence segmentation.\n\nThis is a lightweight alternative to those approaches when you are just focusing on\nprocessing text, don't want a big dependency (like a full XML parser or NLP toolkit) and\nalso want full control over the original source format (since the original text is\nexactly preserved, even whitespace—every sentence, paragraph, and token is mapped back\nto the original text).\n\nNote you may wish to also use this in conjunction with a Markdown parser or\nauto-formatter, as it can make documents and diffs more 
readable.\nYou may wish to use [**flowmark**](https://github.com/jlevy/flowmark) for this alongside\nchopdiff.\n\n## Overview\n\nMore on what's here:\n\n- The [`TextDoc`](src/chopdiff/docs/text_doc.py) class allows parsing of documents into\n  sentences and paragraphs.\n  By default, this uses only regex heuristics for speed and simplicity, but optionally\n  you can use a sentence splitter of your choice, like spaCy.\n\n- Tokenization using [\"wordtoks\"](src/chopdiff/docs/wordtoks.py) that lets you measure\n  size and extract subdocs via arbitrary units of paragraphs, sentences, words, chars,\n  or tokens, with mappings between each, e.g. mapping sentence 3 of paragraph 2 to its\n  corresponding character or token offset.\n  The tokenization is simple but flexible, including whitespace (sentence or paragraph\n  breaks) and HTML tags as single tokens.\n  It also maintains exact offsets of each token in the original document text.\n\n- [Word-level diffs](src/chopdiff/docs/token_diffs.py) that don't work at the line level\n  (like usual git-style diffs) but rather treat whitespace, sentence, and paragraph\n  breaks as individual tokens.\n  Diffs are computed LCS-style at the token level with\n  [cydifflib](https://github.com/rapidfuzz/cydifflib), which is significantly faster\n  than Python's built-in [difflib](https://docs.python.org/3.10/library/difflib.html).\n\n- [Filtering](src/chopdiff/transforms/diff_filters.py) of these text-based diffs based\n  on specific criteria.\n  For example, only adding or removing words, only changing whitespace, only changing\n  word lemmas, etc.\n\n- The [`TokenMapping`](src/chopdiff/docs/token_mapping.py) class offers word-based\n  mappings between docs, allowing you to find what part of a doc corresponds with\n  another doc as token index mappings.\n\n- [`search_tokens`](src/chopdiff/docs/search_tokens.py) gives a simple way to search back\n  and forth among the tokens of a document.\n  That is, you can seek forward or backward to any 
desired token (HTML tag, word,\n  punctuation, or sentence or paragraph break matching a predicate) from any given\n  position.\n\n- Lightweight \"chunking\" of documents by wrapping paragraphs in `\u003cdiv\u003e`s to indicate\n  chunks. [`TextNode`](src/chopdiff/divs/text_node.py) offers simple recursive parsing\n  around `\u003cdiv\u003e` tags.\n  This is not a general HTML parser, but rather a way to chunk documents into named\n  pieces. Unlike more complex parsers, the `TextNode` parser operates on character\n  offsets, so it maintains the original document exactly and allows exact reassembly if\n  desired.\n\n- The word-based token mapping allows\n  [transformation](src/chopdiff/transforms/sliding_transforms.py) of documents via\n  sliding windows, transforming text (e.g. via an LLM call one window at a time, with\n  overlapping windows), then re-stitching the results back together with best\n  alignments.\n\nAll this is done in memory, with only regex or basic Markdown parsing, to keep\nthings simple and dependencies few.\n\n`chopdiff` has no heavier dependencies like full XML or BeautifulSoup parsing or spaCy\nor NLTK for sentence splitting (though you can use these as custom sentence parsers if\nyou like).\n\n## Examples\n\nHere are a couple of examples to illustrate how all this works, with verbose logging to\nsee the output. 
See the [examples/](examples/) directory.\n\n### Inserting Paragraph Breaks\n\nThis is an example of diff filtering (see\n[insert_para_breaks.py](examples/insert_para_breaks.py) for full code):\n\n```python\nimport argparse\nimport logging\nfrom textwrap import dedent\n\nimport openai\nfrom flowmark import fill_text\n\nfrom chopdiff.docs import TextDoc\nfrom chopdiff.transforms import changes_whitespace, filtered_transform, WINDOW_2K_WORDTOKS\n\n\ndef llm_insert_para_breaks(input_text: str) -\u003e str:\n    \"\"\"\n    Call OpenAI to insert paragraph breaks on a chunk of text.\n    Note there is no guarantee the model won't also make other,\n    non-whitespace changes.\n    \"\"\"\n    client = openai.OpenAI()\n\n    response = client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\"role\": \"system\", \"content\": \"You are a careful and precise editor.\"},\n            {\n                \"role\": \"user\",\n                \"content\": dedent(\n                    f\"\"\"\n                    Break the following text into paragraphs.\n\n                    Original text:\n\n                    {input_text}\n\n                    Formatted text:\n                    \"\"\"\n                ),\n            },\n        ],\n        temperature=0.0,\n    )\n\n    return response.choices[0].message.content or \"\"\n\n\ndef insert_paragraph_breaks(text: str) -\u003e str:\n    # Create a TextDoc from the input text\n    doc = TextDoc.from_text(text)\n    print(f\"Input document: {doc.size_summary()}\")\n\n    # Define the transformation function.\n    # Note in this case we run the LLM on strings, but you could also work directly\n    # on the TextDoc if appropriate.\n    def transform(doc: TextDoc) -\u003e TextDoc:\n        return TextDoc.from_text(llm_insert_para_breaks(doc.reassemble()))\n\n    # Apply the transformation with windowing and filtering.\n    #\n    # This will walk along the document in approximately 2K \"wordtok\" 
chunks\n    # (~1000 words) and apply the transformation to each chunk. Chunks can\n    # slightly overlap to make this more robust.\n    #\n    # The change on each chunk will then be filtered to only include whitespace\n    # changes.\n    #\n    # Finally each change will be \"stitched back\" to form the original document,\n    # by looking for the right alignment of words between the original and the\n    # transformed chunk.\n    #\n    # (Turn on logging to see these details.)\n    result_doc = filtered_transform(\n        doc, transform, windowing=WINDOW_2K_WORDTOKS, diff_filter=changes_whitespace\n    )\n\n    print(f\"Output document: {result_doc.size_summary()}\")\n\n    # Return the transformed text\n    return result_doc.reassemble()\n```\n\nRunning this shows how it works.\nNote GPT-4o-mini makes a typo correction, even though it wasn't requested.\nBut the diff filter enforces that the output exactly contains only paragraph breaks:\n\n```\n$ uv run examples/insert_para_breaks.py examples/gettysberg.txt \n\n--- Original --------------------------------------------------------------\n\nfour score and seven years ago our fathers brought forth on this continent, a new\nnation, conceived in Liberty, and dedicated to the proposition that all men are created\nequal. Now we are engaged in a great civil war, testing whether that nation, or any\nnation so conceived and so dedicated, can long endure. We are met on a great\nbattle-field of that war. We have come to dedicate a portion of that field, as a final\nresting place for those who here gave their lives that that nation might live. It is\naltogether fitting and proper that we should do this. But, in a larger sense, we can not\ndedicate—we can not consecrate—we can not hallow—this ground. The brave men, living and\ndead, who struggled here, have consecrated it, far above our poor power to add or\ndetract. 
The world will little note, nor long remember what we say here, but it can\nnever forget what they did here. It is for us the living, rather, to be dedicated here\nto the unfinished work which they who fought here have thus far so nobly advanced. It is\nrather for us to be here dedicated to the great task remaining before us—that from these\nhonored dead we take increased devotion to that cause for which they gave the last full\nmeasure of devotion—that we here highly resolve that these dead shall not have died in\nvain—that this nation, under God, shall have a new birth of freedom—and that government\nof the people, by the people, for the people, shall not perish from the earth.\n\nInput document: 1466 bytes (17 lines, 1 paragraphs, 10 sentences, 264 words, 311 tiktokens)\n\nINFO:chopdiff.docs.sliding_transforms:Sliding word transform: Begin on doc: total 575 wordtoks, 1466 bytes, 1 windows, windowing size=2048, shift=1792, min_overlap=8 wordtoks\nINFO:chopdiff.docs.sliding_transforms:Sliding word transform window 1/1 (575 wordtoks, 1466 bytes), at 0 wordtoks so far\nINFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\nINFO:chopdiff.docs.sliding_transforms:Accepted transform changes:\n    TextDiff: add/remove +3/-3 out of 575 total:\n    at pos    0 keep    1 toks:   ⎪four⎪\n    at pos    1 keep   62 toks:   ⎪ score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.⎪\n    at pos   63 repl    1 toks: - ⎪\u003c-SENT-BR-\u003e⎪\n                repl    1 toks: + ⎪\u003c-PARA-BR-\u003e⎪\n    at pos   64 keep  153 toks:   ⎪Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure.\u003c-SENT-BR-\u003eWe are met on a great battle-field of that war.\u003c-SENT-BR-\u003eWe have come to dedicate a portion of that field, as a final resting place for 
those who here gave their lives that that nation might live.\u003c-SENT-BR-\u003eIt is altogether fitting and proper that we should do this.⎪\n    at pos  217 repl    1 toks: - ⎪\u003c-SENT-BR-\u003e⎪\n                repl    1 toks: + ⎪\u003c-PARA-BR-\u003e⎪\n    at pos  218 keep  132 toks:   ⎪But, in a larger sense, we can not dedicate—we can not consecrate—we can not hallow—this ground.\u003c-SENT-BR-\u003eThe brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract.\u003c-SENT-BR-\u003eThe world will little note, nor long remember what we say here, but it can never forget what they did here.⎪\n    at pos  350 repl    1 toks: - ⎪\u003c-SENT-BR-\u003e⎪\n                repl    1 toks: + ⎪\u003c-PARA-BR-\u003e⎪\n    at pos  351 keep  224 toks:   ⎪It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced.\u003c-SENT-BR-\u003eIt is rather for us to be here dedicated to the great task remaining before us—that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion—that we here highly resolve that these dead shall not have died in vain—that this nation, under God, shall have a new birth of freedom—and that government of the people, by the people, for the people, shall not perish from the earth.⎪\nINFO:chopdiff.docs.sliding_transforms:Filtering extraneous changes:\n    TextDiff: add/remove +1/-1 out of 575 total:\n    at pos    0 repl    1 toks: - ⎪four⎪\n                repl    1 toks: + ⎪Four⎪\nINFO:chopdiff.docs.sliding_transforms:Word token changes:\n    Accepted: add/remove +3/-3 out of 575 total\n    Rejected: add/remove +1/-1 out of 575 total\nINFO:chopdiff.docs.sliding_transforms:Sliding word transform: Done, output total 575 wordtoks\n\nOutput document: 1469 bytes (7 lines, 4 paragraphs, 10 sentences, 264 words, 311 tiktokens)\n\n--- With Paragraph Breaks 
-------------------------------------------------\n\nfour score and seven years ago our fathers brought forth on this continent, a new\nnation, conceived in Liberty, and dedicated to the proposition that all men are created\nequal.\n\nNow we are engaged in a great civil war, testing whether that nation, or any nation so\nconceived and so dedicated, can long endure. We are met on a great battle-field of that\nwar. We have come to dedicate a portion of that field, as a final resting place for\nthose who here gave their lives that that nation might live. It is altogether fitting\nand proper that we should do this.\n\nBut, in a larger sense, we can not dedicate—we can not consecrate—we can not hallow—this\nground. The brave men, living and dead, who struggled here, have consecrated it, far\nabove our poor power to add or detract. The world will little note, nor long remember\nwhat we say here, but it can never forget what they did here.\n\nIt is for us the living, rather, to be dedicated here to the unfinished work which they\nwho fought here have thus far so nobly advanced. It is rather for us to be here\ndedicated to the great task remaining before us—that from these honored dead we take\nincreased devotion to that cause for which they gave the last full measure of\ndevotion—that we here highly resolve that these dead shall not have died in vain—that\nthis nation, under God, shall have a new birth of freedom—and that government of the\npeople, by the people, for the people, shall not perish from the earth.\n$\n```\n\n### Backfilling Timestamps\n\nHere is an example of backfilling data from one text file to another similar but not\nidentical text file (see [backfill_timestamps.py](examples/backfill_timestamps.py) for\ncode). 
As you can see, the text is aligned by mapping the words, and then the timestamps are\ninserted at the end of each paragraph based on the first sentence of each paragraph:\n\n```\n$ uv run examples/backfill_timestamps.py \n\n--- Source text (with timestamps) -----------------------------------------\n\n\n\u003cspan data-timestamp=\"0.0\"\u003eWelcome to this um ... video about Python programming.\u003c/span\u003e\n\u003cspan data-timestamp=\"15.5\"\u003eFirst, we'll talk about variables. Variables are containers for storing data values.\u003c/span\u003e\n\u003cspan data-timestamp=\"25.2\"\u003eThen let's look at functions. Functions help us organize and reuse code.\u003c/span\u003e\n\n\n--- Target text (without timestamps) --------------------------------------\n\n\n## Introduction\n\nWelcome to this video about Python programming.\n\nFirst, we'll talk about variables. Next, let's look at functions. Functions help us organize and reuse code.\n\n\n--- Diff ------------------------------------------------------------------\n\nTextDiff: add/remove +9/-32 out of 87 total:\nat pos    0 keep    1 toks:   ⎪\u003c-BOF-\u003e⎪\nat pos    1 add     2 toks: + ⎪##⎪\nat pos    1 keep    1 toks:   ⎪ ⎪\nat pos    2 repl    1 toks: - ⎪\u003cspan data-timestamp=\"0.0\"\u003e⎪\n            repl    2 toks: + ⎪Introduction\u003c-PARA-BR-\u003e⎪\nat pos    3 keep    5 toks:   ⎪Welcome to this⎪\nat pos    8 del     6 toks: - ⎪ um ...⎪\nat pos   14 keep    9 toks:   ⎪ video about Python programming.⎪\nat pos   23 repl    3 toks: - ⎪\u003c/span\u003e \u003cspan data-timestamp=\"15.5\"\u003e⎪\n            repl    1 toks: + ⎪\u003c-PARA-BR-\u003e⎪\nat pos   26 keep   13 toks:   ⎪First, we'll talk about variables.⎪\nat pos   39 repl   19 toks: - ⎪ Variables are containers for storing data values.\u003c/span\u003e \u003cspan data-timestamp=\"25.2\"\u003eThen⎪\n            repl    3 toks: + ⎪\u003c-SENT-BR-\u003eNext,⎪\nat pos   58 keep   11 toks:   ⎪ let's look at functions.⎪\nat pos   69 repl    
1 toks: - ⎪ ⎪\n            repl    1 toks: + ⎪\u003c-SENT-BR-\u003e⎪\nat pos   70 keep   14 toks:   ⎪Functions help us organize and reuse code.⎪\nat pos   84 del     2 toks: - ⎪\u003c/span\u003e ⎪\nat pos   86 keep    1 toks:   ⎪\u003c-EOF-\u003e⎪\n\n--- Token mapping ---------------------------------------------------------\n\n0 ⎪\u003c-BOF-\u003e⎪ -\u003e 0 ⎪\u003c-BOF-\u003e⎪\n1 ⎪#⎪ -\u003e 0 ⎪\u003c-BOF-\u003e⎪\n2 ⎪#⎪ -\u003e 0 ⎪\u003c-BOF-\u003e⎪\n3 ⎪ ⎪ -\u003e 1 ⎪ ⎪\n4 ⎪Introduction⎪ -\u003e 2 ⎪\u003cspan data-timestamp=\"0.0\"\u003e⎪\n5 ⎪\u003c-PARA-BR-\u003e⎪ -\u003e 2 ⎪\u003cspan data-timestamp=\"0.0\"\u003e⎪\n6 ⎪Welcome⎪ -\u003e 3 ⎪Welcome⎪\n7 ⎪ ⎪ -\u003e 4 ⎪ ⎪\n8 ⎪to⎪ -\u003e 5 ⎪to⎪\n9 ⎪ ⎪ -\u003e 6 ⎪ ⎪\n10 ⎪this⎪ -\u003e 7 ⎪this⎪\n11 ⎪ ⎪ -\u003e 14 ⎪ ⎪\n12 ⎪video⎪ -\u003e 15 ⎪video⎪\n13 ⎪ ⎪ -\u003e 16 ⎪ ⎪\n14 ⎪about⎪ -\u003e 17 ⎪about⎪\n15 ⎪ ⎪ -\u003e 18 ⎪ ⎪\n16 ⎪Python⎪ -\u003e 19 ⎪Python⎪\n17 ⎪ ⎪ -\u003e 20 ⎪ ⎪\n18 ⎪programming⎪ -\u003e 21 ⎪programming⎪\n19 ⎪.⎪ -\u003e 22 ⎪.⎪\n20 ⎪\u003c-PARA-BR-\u003e⎪ -\u003e 25 ⎪\u003cspan data-timestamp=\"15.5\"\u003e⎪\n21 ⎪First⎪ -\u003e 26 ⎪First⎪\n22 ⎪,⎪ -\u003e 27 ⎪,⎪\n23 ⎪ ⎪ -\u003e 28 ⎪ ⎪\n24 ⎪we⎪ -\u003e 29 ⎪we⎪\n25 ⎪'⎪ -\u003e 30 ⎪'⎪\n26 ⎪ll⎪ -\u003e 31 ⎪ll⎪\n27 ⎪ ⎪ -\u003e 32 ⎪ ⎪\n28 ⎪talk⎪ -\u003e 33 ⎪talk⎪\n29 ⎪ ⎪ -\u003e 34 ⎪ ⎪\n30 ⎪about⎪ -\u003e 35 ⎪about⎪\n31 ⎪ ⎪ -\u003e 36 ⎪ ⎪\n32 ⎪variables⎪ -\u003e 37 ⎪variables⎪\n33 ⎪.⎪ -\u003e 38 ⎪.⎪\n34 ⎪\u003c-SENT-BR-\u003e⎪ -\u003e 57 ⎪Then⎪\n35 ⎪Next⎪ -\u003e 57 ⎪Then⎪\n36 ⎪,⎪ -\u003e 57 ⎪Then⎪\n37 ⎪ ⎪ -\u003e 58 ⎪ ⎪\n38 ⎪let⎪ -\u003e 59 ⎪let⎪\n39 ⎪'⎪ -\u003e 60 ⎪'⎪\n40 ⎪s⎪ -\u003e 61 ⎪s⎪\n41 ⎪ ⎪ -\u003e 62 ⎪ ⎪\n42 ⎪look⎪ -\u003e 63 ⎪look⎪\n43 ⎪ ⎪ -\u003e 64 ⎪ ⎪\n44 ⎪at⎪ -\u003e 65 ⎪at⎪\n45 ⎪ ⎪ -\u003e 66 ⎪ ⎪\n46 ⎪functions⎪ -\u003e 67 ⎪functions⎪\n47 ⎪.⎪ -\u003e 68 ⎪.⎪\n48 ⎪\u003c-SENT-BR-\u003e⎪ -\u003e 69 ⎪ ⎪\n49 ⎪Functions⎪ -\u003e 70 ⎪Functions⎪\n50 ⎪ ⎪ -\u003e 71 ⎪ ⎪\n51 ⎪help⎪ -\u003e 72 ⎪help⎪\n52 ⎪ ⎪ -\u003e 73 ⎪ ⎪\n53 ⎪us⎪ -\u003e 74 ⎪us⎪\n54 ⎪ ⎪ 
-\u003e 75 ⎪ ⎪\n55 ⎪organize⎪ -\u003e 76 ⎪organize⎪\n56 ⎪ ⎪ -\u003e 77 ⎪ ⎪\n57 ⎪and⎪ -\u003e 78 ⎪and⎪\n58 ⎪ ⎪ -\u003e 79 ⎪ ⎪\n59 ⎪reuse⎪ -\u003e 80 ⎪reuse⎪\n60 ⎪ ⎪ -\u003e 81 ⎪ ⎪\n61 ⎪code⎪ -\u003e 82 ⎪code⎪\n62 ⎪.⎪ -\u003e 83 ⎪.⎪\n63 ⎪\u003c-EOF-\u003e⎪ -\u003e 86 ⎪\u003c-EOF-\u003e⎪\n\u003e\u003e Seeking back tok 1 (\u003c-PARA-BR-\u003e) to para start tok 1 (#), map back to source tok 0 (\u003c-BOF-\u003e)\n\u003e\u003e Failed to extract timestamp at doc token 1 (\u003c-PARA-BR-\u003e) -\u003e source token 0 (\u003c-BOF-\u003e): ¶0,S0\n\u003e\u003e Seeking back tok 6 (\u003c-PARA-BR-\u003e) to para start tok 6 (Welcome), map back to source tok 3 (Welcome)\n\u003e\u003e Adding timestamp to sentence: 'Welcome to this video about Python programming.'\n\u003e\u003e Seeking back tok 21 (\u003c-EOF-\u003e) to para start tok 21 (First), map back to source tok 26 (First)\n\u003e\u003e Adding timestamp to sentence: 'Functions help us organize and reuse code.'\n\n--- Result (with backfilled timestamps) -----------------------------------\n\n## Introduction\n\nWelcome to this video about Python programming. \u003cspan class=\"timestamp\"\u003e⏱️00:00\u003c/span\u003e \n\nFirst, we'll talk about variables. Next, let's look at functions. Functions help us organize and reuse code. \u003cspan class=\"timestamp\"\u003e⏱️00:15\u003c/span\u003e \n$\n```\n\n* * *\n\n## Project Docs\n\nFor how to install uv and Python, see [installation.md](installation.md).\n\nFor development workflows, see [development.md](development.md).\n\nFor instructions on publishing to PyPI, see [publishing.md](publishing.md).\n\n* * *\n\n*This project was built from\n[simple-modern-uv](https://github.com/jlevy/simple-modern-uv).*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjlevy%2Fchopdiff","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjlevy%2Fchopdiff","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjlevy%2Fchopdiff/lists"}