https://github.com/sayanarijit/matchingsplit
Split a string or group a collection of words into a list by matching another list of similar words, to create accurate subtitles from the actual script and inaccurate (generated) subtitles.
- Host: GitHub
- URL: https://github.com/sayanarijit/matchingsplit
- Owner: sayanarijit
- License: mit
- Created: 2023-09-20T13:30:30.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-12-04T16:17:49.000Z (almost 2 years ago)
- Last Synced: 2025-03-12T13:37:59.953Z (7 months ago)
- Language: Python
- Homepage: https://pypi.org/project/matchingsplit/
- Size: 13.7 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## matchingsplit
Split a string or group a collection of words into a list by matching another list of similar words, to create accurate subtitles from the actual script and inaccurate (generated) subtitles.
### Example
```python
>>> from matchingsplit import split

>>> split("this must be a good thing", reference=["this", "is", "a", "good", "thing"])
['this', 'must be', 'a', 'good', 'thing']

>>> split("this is a good thing", reference=["this", "must", "be", "a", "good", "thing"])
['this', '', 'is', 'a', 'good', 'thing']

>>> split("a big foo bar", ["a", "big", "ff"])
['a', 'big', 'foo bar']

>>> split("line1.\n\nline2.\nline3.", reference=["1", "2", "3"], preserve_newlines=True)
['line1.\n\n', 'line2.\n', 'line3.']
```
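
In each example above, the result has one entry per `reference` item, which suggests how the output could feed a subtitle workflow: pair every chunk of the real script with the timing of the generated word it matched. The sketch below is only an illustration of that idea. The `Cue` type, the `retime` helper, and the timings are hypothetical and not part of `matchingsplit`, and the `chunks` list is copied from the first example rather than computed, so the snippet runs without the package installed.

```python
from typing import NamedTuple


class Cue(NamedTuple):
    """Hypothetical timed subtitle fragment (not part of matchingsplit)."""
    start: float  # seconds
    end: float    # seconds
    text: str


def retime(chunks: list[str], timed_words: list[Cue]) -> list[Cue]:
    """Attach each script chunk to the timing of its matching generated word."""
    if len(chunks) != len(timed_words):
        raise ValueError("expected one chunk per reference word")
    # Skip empty chunks: the script had no text for that generated word.
    return [
        Cue(word.start, word.end, chunk)
        for chunk, word in zip(chunks, timed_words)
        if chunk
    ]


# Made-up word timings, as a speech recognizer might produce them.
generated = [
    Cue(0.0, 0.4, "this"),
    Cue(0.4, 0.7, "is"),
    Cue(0.7, 0.8, "a"),
    Cue(0.8, 1.2, "good"),
    Cue(1.2, 1.6, "thing"),
]

# Output of split("this must be a good thing",
#                 reference=[w.text for w in generated]),
# copied from the first example above.
chunks = ["this", "must be", "a", "good", "thing"]

corrected = retime(chunks, generated)
```

Here the second cue keeps the recognizer's timing but carries the script text `"must be"` in place of the misheard `"is"`. Dropping empty chunks is one possible policy; merging the timing of an empty chunk into a neighbouring cue would be another.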