https://github.com/breezedeus/pix2text-mac

Pix2Text MacOS App: A Mac Desktop App for Mathematical Formula Recognition and Text Recognition. A local tool for mathematical formula recognition and text recognition on Mac.

Host: GitHub
URL: https://github.com/breezedeus/pix2text-mac
Owner: breezedeus
License: mit
Created: 2024-03-15T14:55:38.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-06-19T03:31:45.000Z (12 months ago)
Last Synced: 2024-11-18T17:44:09.923Z (7 months ago)
Language: Python
Homepage: https://p2t.breezedeus.com
Size: 3.17 MB
Stars: 38
Watchers: 2
Forks: 3
Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE


[![Discord](https://img.shields.io/discord/1200765964434821260?label=Discord)](https://discord.gg/GgD87WM8Tf)
[![Visitors](https://api.visitorbadge.io/api/visitors?path=https%3A%2F%2Fgithub.com%2Fbreezedeus%2FPix2Text-Mac&label=Visitors&countColor=%23ff8a65&style=flat&labelStyle=none)](https://visitorbadge.io/status?path=https%3A%2F%2Fgithub.com%2Fbreezedeus%2FPix2Text-Mac)
[![license](https://img.shields.io/github/license/breezedeus/pix2text)](./LICENSE)
[![stars](https://img.shields.io/github/stars/breezedeus/pix2text-mac)](https://github.com/breezedeus/Pix2Text-Mac)
![last-commit](https://img.shields.io/github/last-commit/breezedeus/Pix2Text-Mac)
[![Twitter](https://img.shields.io/twitter/url?url=https%3A%2F%2Ftwitter.com%2Fbreezedeus)](https://twitter.com/breezedeus)

[👩🏻‍💻 Pix2Text Online Service](https://p2t.breezedeus.com) |
[👨🏻‍💻 Pix2Text Online Demo](https://huggingface.co/spaces/breezedeus/Pix2Text-Demo) |
[📖 Online Doc](https://pix2text.readthedocs.io) |
[💬 Contact](https://www.breezedeus.com/article/join-group)

[中文](./README_cn.md) | English

# Pix2Text-Mac: A Mac desktop application for recognizing mathematical formulas

This project is a local OCR application for Mac based on [**Pix2Text**](https://github.com/breezedeus/Pix2Text); it runs without an internet connection. It recognizes mathematical formula images from the clipboard and converts them to LaTeX, which is then copied back to the clipboard. It also supports text recognition (Text OCR) on general images.

> Note ⚠️: This application is only available for macOS.

The initial code of this project was forked from [horennel/LaTex-OCR_for_macOS](https://github.com/horennel/LaTex-OCR_for_macOS). Special thanks to the author of that project.

## Features

After you open the application, the Pix2Text icon appears in the Mac menu bar. It offers four OCR modes.

### 1. `Text_Formula OCR`: Recognizing images with both formulas and text
This mode can recognize images containing both mathematical formulas and text. The recognition result is in Markdown format, which can be pasted into the [Pix2Text Online Service](https://p2t.breezedeus.com) to view the rendered result.

For example, it can recognize the following image ([assets/mixed-en.jpg](./assets/mixed-en.jpg)):

### 2. `Formula OCR`: Recognizing images with pure formulas
This mode can recognize images containing only mathematical formulas. The recognition result is in LaTeX format, which can be pasted into the [Pix2Text Online Service](https://p2t.breezedeus.com) to view the rendered result.

For example, it can recognize the following image ([assets/math-formula-42.png](./assets/math-formula-42.png)):

### 3. `Text OCR`: Recognizing images with pure text
This mode can recognize images containing only text. The recognition result is in plain text.

For example, it can recognize the following image ([assets/text.jpg](./assets/text.jpg)):

### 4. `Page OCR`: Recognizing Page Screenshots with Complex Layouts
If an image has a complex layout, such as multiple columns or embedded tables, use this mode. It additionally loads the **Layout Analysis** and **Table Recognition** models from `pix2text~=1.1`, recognizes all content in the image, and merges the results into Markdown. You can paste the results into the [Pix2Text web version](https://p2t.breezedeus.com) to view the rendered output.

The recognition results are also saved to a local folder set by the `output_md_root_dir` variable in the configuration file [config.yaml](./config.yaml); the default is `/tmp/output_mds`. The parsing results are saved to a separate folder set by the `output_debug_dir` variable in the same file; the default is `/tmp/output_debugs`. Edit these two variables to change where the files are stored.
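The way the two settings fall back to their defaults can be sketched as follows. This is a minimal illustration, assuming `config.yaml` has already been parsed into a dictionary and that both keys sit at its top level (the actual nesting in the file may differ). `resolve_output_dirs` is a hypothetical helper, not part of Pix2Text.

```python
from pathlib import Path
from typing import Tuple

# Defaults quoted from the paragraph above.
DEFAULT_MD_DIR = "/tmp/output_mds"
DEFAULT_DEBUG_DIR = "/tmp/output_debugs"


def resolve_output_dirs(config: dict) -> Tuple[Path, Path]:
    """Return (markdown_dir, debug_dir), falling back to the defaults.

    Assumption: `config` is the already-parsed content of config.yaml with
    both keys at the top level. Hypothetical helper for illustration only.
    """
    md_dir = Path(config.get("output_md_root_dir") or DEFAULT_MD_DIR)
    debug_dir = Path(config.get("output_debug_dir") or DEFAULT_DEBUG_DIR)
    return md_dir, debug_dir


if __name__ == "__main__":
    # Only the Markdown folder is overridden; the debug folder keeps its default.
    md_dir, debug_dir = resolve_output_dirs({"output_md_root_dir": "/data/p2t_mds"})
    print(md_dir, debug_dir)
```

Since `/tmp` is typically cleared by the system, pointing both variables at a persistent folder may be worthwhile if you want to keep the results.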

For example, it can recognize the following image ([assets/page.png](./assets/page.png)):

## Installation

#### 1. Clone the repository:

```bash
git clone https://github.com/breezedeus/Pix2Text-Mac
```

#### 2. Install dependencies:

```bash
pip install -r requirements.txt
```

If you want to recognize text images in languages other than **Simplified Chinese and English**, please run the following command to install additional dependencies:

```bash
# Quote the requirement so the shell does not treat ">" as a redirect or "[...]" as a glob.
pip install "pix2text[multilingual]>=1.1.0.1"
```

#### 3. Verify the installation is working correctly

Use the following command to verify if the installed [Pix2Text](https://github.com/breezedeus/Pix2Text) is working normally:

```bash
p2t predict -l en,ch_sim --resized-shape 768 --file-type page -i assets/page.png -o output-page --save-debug-res output-debug-page
```

#### 4. Package the application:

```bash
python setup.py py2app -A
```

- You can find the application `Pix2Text.app` in the generated `dist` folder. Double-click to open it, or move it to the `Applications` folder.

## How to Use

- Launch the application
  - Start `Pix2Text.app`; the Pix2Text icon appears in the menu bar.
  - Click the `On / Off` button in the menu bar icon and make sure the OCR mode buttons (`Text_Formula OCR`, `Formula OCR`, `Text OCR`, and `Page OCR`) are lit up.
- Take a screenshot
  - Use any screenshot software, such as `Snipaste`, to capture an image and copy it to the clipboard.
- Recognition
  - Recognize images with both mathematical formulas and text
    - Click the `Text_Formula OCR` button.
    - After successful recognition, you will receive a notification in the notification center.
  - Recognize images with pure mathematical formulas
    - Click the `Formula OCR` button.
    - After successful recognition, you will receive a notification in the notification center.
  - Recognize images with pure text
    - Click the `Text OCR` button.
    - After successful recognition, you will receive a notification in the notification center.
  - Recognize screenshots of pages with complex layouts
    - Click the `Page OCR` button.
    - After successful recognition, you will receive a notification in the notification center.
  - If you do not want to receive notifications, you can turn them off in the system settings.
  - After receiving a notification, you can paste the result into the [Pix2Text Online Service](https://p2t.breezedeus.com) to view the rendered result.
- You can change how Pix2Text is initialized, such as which model to use and the path to the model, by editing the configuration file [config.yaml](./config.yaml). If you have purchased the [premium models](https://www.breezedeus.com/article/pix2text) (which provide better results), refer to [pro-config.yaml](./pro-config.yaml) when modifying [config.yaml](./config.yaml).
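
The steps above amount to a small pipeline: take the image from the clipboard, run the recognizer for the chosen mode, and put the result back on the clipboard. The sketch below illustrates that flow only and is not the application's actual code. `run_ocr`, the `recognizers` mapping, and the stub recognizers are hypothetical names introduced for this example.

```python
from typing import Any, Callable, Dict, Optional

# A recognizer takes an image object and returns the recognized text.
Recognizer = Callable[[Any], str]


def run_ocr(
    mode: str,
    image: Optional[Any],
    recognizers: Dict[str, Recognizer],
    copy_text: Callable[[str], None],
) -> Optional[str]:
    """Recognize `image` with the recognizer registered for `mode`.

    Returns the recognized text after passing it to `copy_text`, or None
    when no image was available. All names here are hypothetical.
    """
    if image is None:
        return None  # nothing was copied, so there is nothing to recognize
    if mode not in recognizers:
        raise ValueError(f"unknown OCR mode: {mode!r}")
    result = recognizers[mode](image)
    copy_text(result)  # make the result available for pasting
    return result


if __name__ == "__main__":
    # Stub recognizers stand in for the real models.
    stubs = {
        "Formula OCR": lambda img: r"E = mc^2",
        "Text OCR": lambda img: "plain text",
    }
    copied = []  # stands in for the system clipboard
    print(run_ocr("Formula OCR", object(), stubs, copied.append))
```

In a real setup, `image` could come from Pillow's `ImageGrab.grabclipboard()` and `copy_text` could be `pyperclip.copy`. How the application itself wires these together is not described here.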

## Notes

- The first time you start the application, it will download models and configuration files, resulting in a long startup time. Subsequent startups will return to normal speed.
- Downloaded models and configuration files are stored in `~/.cnstd`, `~/.cnocr`, and `~/.pix2text`.
- The application depends on the Python environment used during packaging. If the Python environment changes (e.g., the virtual environment used for packaging is deleted, the dependencies in the environment used for packaging are deleted or modified, or the Python environment on the computer is completely uninstalled), the application may not work properly and needs to be repackaged.
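
To check whether the first-run download has already happened, you can inspect the three model directories listed above. The snippet below is a minimal sketch using only the standard library; `model_dir_sizes` is a hypothetical helper, and it only reports what is on disk without validating the model files.

```python
from pathlib import Path
from typing import Dict, Optional

# Directory names quoted from the notes above.
MODEL_DIR_NAMES = (".cnstd", ".cnocr", ".pix2text")


def model_dir_sizes(home: Optional[Path] = None) -> Dict[str, Optional[int]]:
    """Map each model directory name to its total size in bytes, or None if absent."""
    home = home or Path.home()
    sizes: Dict[str, Optional[int]] = {}
    for name in MODEL_DIR_NAMES:
        directory = home / name
        if directory.is_dir():
            # Sum the sizes of all regular files below the directory.
            sizes[name] = sum(
                f.stat().st_size for f in directory.rglob("*") if f.is_file()
            )
        else:
            sizes[name] = None
    return sizes


if __name__ == "__main__":
    for name, size in model_dir_sizes().items():
        status = "missing" if size is None else f"{size / 1e6:.1f} MB"
        print(f"~/{name}: {status}")
```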

## Acknowledgments

- The initial code of this project was forked from: [horennel/LaTex-OCR_for_macOS](https://github.com/horennel/LaTex-OCR_for_macOS). Special thanks to the author of this project.
- [Pix2Text](https://github.com/breezedeus/Pix2Text)
- [pyperclip](https://github.com/asweigart/pyperclip)
- [rumps](https://github.com/jaredks/rumps)
- [py2app](https://github.com/ronaldoussoren/py2app)
