Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/katanaml/streamlit-sparrow-labeling-comp
Streamlit component for invoice document labeling
https://github.com/katanaml/streamlit-sparrow-labeling-comp
python streamlit
Last synced: 4 months ago
JSON representation
Streamlit component for invoice document labeling
- Host: GitHub
- URL: https://github.com/katanaml/streamlit-sparrow-labeling-comp
- Owner: katanaml
- License: mit
- Created: 2022-11-27T15:59:55.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2022-12-19T16:16:06.000Z (about 2 years ago)
- Last Synced: 2024-09-28T13:41:25.000Z (4 months ago)
- Topics: python, streamlit
- Language: TypeScript
- Homepage: https://pypi.org/project/streamlit-sparrow-labeling/
- Size: 4.74 MB
- Stars: 51
- Watchers: 3
- Forks: 16
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Streamlit - Labeling Component
Streamlit component which provides labeling functionality for documents and invoices. Based on [streamlit-drawable-canvas](https://github.com/andfanilo/streamlit-drawable-canvas).
![](./img/demo.png)
## Features
- Option to switch between rectangle selection/resize and new rectangle creation
- Returning info about currently selected rectangle, this allows to provide label for this rectangle
- Labeling data is stored in simple/readable JSON format. This data is auto converted to Fabric.js format by the component
- Canvas resizing and rectangle data resizing. Coordinates are saved in original dimension## Installation
```shell script
pip install streamlit-sparrow-labeling
```## Example Usage
Copy this code snippet:
```python
from PIL import Image
import streamlit as st
import streamlit_nested_layout
import streamlit_javascript as st_js
from streamlit_sparrow_labeling import st_sparrow_labeling
from streamlit_sparrow_labeling import DataProcessor
import json
import mathst.set_page_config(
page_title="Sparrow Labeling",
layout="wide"
)def run(img_file, rects_file, labels):
ui_width = st_js.st_javascript("window.innerWidth")docImg = Image.open(img_file)
if 'saved_state' not in st.session_state:
with open(rects_file, "r") as f:
saved_state = json.load(f)
st.session_state['saved_state'] = saved_state
else:
saved_state = st.session_state['saved_state']assign_labels = st.checkbox("Assign Labels", True)
mode = "transform" if assign_labels else "rect"data_processor = DataProcessor()
col1, col2 = st.columns([4, 6])
with col1:
height = 1296
width = 864doc_height = saved_state['meta']['image_size']['height']
doc_width = saved_state['meta']['image_size']['width']canvas_width = canvas_available_width(ui_width)
result_rects = st_sparrow_labeling(
fill_color="rgba(0, 151, 255, 0.3)",
stroke_width=2,
stroke_color="rgba(0, 50, 255, 0.7)",
background_image=docImg,
initial_rects=saved_state,
height=height,
width=width,
drawing_mode=mode,
display_toolbar=True,
update_streamlit=True,
canvas_width=canvas_width,
doc_height=doc_height,
doc_width=doc_width,
image_rescale=True,
key="doc_annotation"
)st.caption("Check 'Assign Labels' to enable editing of labels and values, move and resize the boxes to "
"annotate the document.")
st.caption("Add annotations by clicking and dragging on the document, when 'Assign Labels' is unchecked.")with col2:
if result_rects is not None:
with st.form(key="fields_form"):
if result_rects.current_rect_index is not None and result_rects.current_rect_index != -1:
st.write("Selected Field: ",
result_rects.rects_data['words'][result_rects.current_rect_index]['value'])
st.markdown("---")if ui_width > 1500:
render_form_wide(result_rects.rects_data['words'], labels, result_rects, data_processor)
elif ui_width > 1000:
render_form_avg(result_rects.rects_data['words'], labels, result_rects, data_processor)
elif ui_width > 500:
render_form_narrow(result_rects.rects_data['words'], labels, result_rects, data_processor)
else:
render_form_mobile(result_rects.rects_data['words'], labels, result_rects, data_processor)submit = st.form_submit_button("Save", type="primary")
if submit:
with open(rects_file, "w") as f:
json.dump(result_rects.rects_data, f, indent=2)
with open(rects_file, "r") as f:
saved_state = json.load(f)
st.session_state['saved_state'] = saved_state
st.write("Saved!")def render_form_wide(words, labels, result_rects, data_processor):
col1_form, col2_form, col3_form, col4_form = st.columns([1, 1, 1, 1])
num_rows = math.ceil(len(words) / 4)for i, rect in enumerate(words):
if i < num_rows:
with col1_form:
render_form_element(rect, labels, i, result_rects, data_processor)
elif i < num_rows * 2:
with col2_form:
render_form_element(rect, labels, i, result_rects, data_processor)
elif i < num_rows * 3:
with col3_form:
render_form_element(rect, labels, i, result_rects, data_processor)
else:
with col4_form:
render_form_element(rect, labels, i, result_rects, data_processor)def render_form_avg(words, labels, result_rects, data_processor):
col1_form, col2_form, col3_form = st.columns([1, 1, 1])
num_rows = math.ceil(len(words) / 3)for i, rect in enumerate(words):
if i < num_rows:
with col1_form:
render_form_element(rect, labels, i, result_rects, data_processor)
elif i < num_rows * 2:
with col2_form:
render_form_element(rect, labels, i, result_rects, data_processor)
else:
with col3_form:
render_form_element(rect, labels, i, result_rects, data_processor)def render_form_narrow(words, labels, result_rects, data_processor):
col1_form, col2_form = st.columns([1, 1])
num_rows = math.ceil(len(words) / 2)for i, rect in enumerate(words):
if i < num_rows:
with col1_form:
render_form_element(rect, labels, i, result_rects, data_processor)
else:
with col2_form:
render_form_element(rect, labels, i, result_rects, data_processor)def render_form_mobile(words, labels, result_rects, data_processor):
for i, rect in enumerate(words):
render_form_element(rect, labels, i, result_rects, data_processor)def render_form_element(rect, labels, i, result_rects, data_processor):
default_index = 0
if rect['label']:
default_index = labels.index(rect['label'])value = st.text_input("Value", rect['value'], key=f"field_value_{i}",
disabled=False if i == result_rects.current_rect_index else True)
label = st.selectbox("Label", labels, key=f"label_{i}", index=default_index,
disabled=False if i == result_rects.current_rect_index else True)
st.markdown("---")data_processor.update_rect_data(result_rects.rects_data, i, value, label)
def canvas_available_width(ui_width):
# Get ~40% of the available width, if the UI is wider than 500px
if ui_width > 500:
return math.floor(38 * ui_width / 100)
else:
return ui_widthif __name__ == "__main__":
custom_labels = ["", "item", "item_price", "subtotal", "tax", "total"]
run("docs/image/receipt_00001.png", "docs/json/receipt_00001.json", custom_labels)
```## API
```
result_rects = st_sparrow_labeling(
fill_color="rgba(0, 151, 255, 0.3)",
stroke_width=2,
stroke_color="rgba(0, 50, 255, 0.7)",
background_image=docImg,
initial_rects=saved_state,
height=height,
width=width,
drawing_mode=mode,
display_toolbar=True,
update_streamlit=True,
canvas_width=canvas_width,
doc_height=doc_height,
doc_width=doc_width,
image_rescale=True,
key="doc_annotation"
)
```- **fill_color** : Color of fill for Rect in CSS color property. Defaults to "#eee".
- **stroke_width** : Width of drawing brush in CSS color property. Defaults to 20.
- **stroke_color** : Color of drawing brush in hex. Defaults to "black".
- **background_image** : Pillow Image to display behind canvas. Automatically resized to canvas dimensions. Being behind the canvas, it is not sent back to Streamlit on mouse event. Overrides background_color. Changes to this will reset canvas contents.
- **initial_rects** : Initial Rects to display on canvas. Defaults to empty list.
- **update_streamlit** : Whenever True, send canvas data to Streamlit when object/selection is updated or mouse up.
- **height** : Height of canvas in pixels. Defaults to 400.
- **width** : Width of canvas in pixels. Defaults to 600.
- **drawing_mode** : Enable free drawing when "freedraw", object manipulation when "transform", otherwise create new objects with "line", "rect", "circle" and "polygon". Defaults to "freedraw".
- On "polygon" mode, double-clicking will remove the latest point and right-clicking will close the polygon.
- **initial_drawing** : Initialize canvas with drawings from here. Should be the `json_data` output from other canvas. Beware: if you try to import a drawing from a bigger/smaller canvas, no rescaling is done in the canvas and the import could fail.
- **display_toolbar** : If `False`, don't display the undo/redo/delete toolbar.
- **canvas_width** : Width of the canvas in pixels.
- **doc_height** : Original height of the document in pixels.
- **doc_width** : Original width of the document in pixels.
- **image_rescale** : If `True`, rescale the image to fit the canvas. Defaults to `False`.## Development
### Install
- JS side
```shell script
cd frontend
npm install
```- Python side
```shell script
conda create -n streamlit-sparrow-labeling python=3.11
conda activate streamlit-sparrow-labeling
pip install -e .
```### Run
Both webpack dev server and Streamlit should run at the same time.
- JS side
```shell script
cd frontend
npm run start
```- Python side
```shell script
streamlit run app.py
```