https://github.com/llabres/library-dataset
Dataset and Code from the paper: Library Dataset: Automatic Inventory as a Many to Many Matching Task
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/llabres/library-dataset
- Owner: llabres
- License: mit
- Created: 2024-01-25T08:28:39.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-08-26T15:18:17.000Z (5 months ago)
- Last Synced: 2024-08-27T16:46:38.717Z (5 months ago)
- Size: 38.6 MB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- Awesome-Segment-Anything - [code
README
# Library Dataset: Image-text matching for large-scale book collections
The Library Dataset consists of 285 high-resolution images of bookshelves containing a total of 7,536 books. We also provide two book catalogues for matching: the true library inventory (closed-set scenario) and a large-scale catalogue (open-set scenario).
One Shelf Example | Two Shelves Example
:-------------------------:|:-------------------------:
![](sample_images/one_shelf_example.jpg) | ![](sample_images/two_shelves_example.jpg)

## Dataset
### Target Lists
- [True Library Inventory](data/target_lists/library_books.csv) - `data/target_lists/library_books.csv`: List of 15,229 books in the library. Contains the following columns:
- `title`: Title of the book.
- `author`: Author of the book.
- `tag`: Non-unique identifier used by the library. It appears on every book spine as black text on a white background.
- `series`: For books that belong to a series, the name of the series.
- `ISBN`: International Standard Book Number.

The Large-Scale Catalogue must be downloaded from [here](https://cvcuab-my.sharepoint.com/:u:/g/personal/allabres_cvc_uab_cat/ETEdq8Q803tAiRmwypHSnPUBv0tlF_B9pQsJJVB61lRyqw?e=B6gDqb).

- [Large-Scale Catalogue](data/target_lists/all_books.csv) - `data/target_lists/all_books.csv`: Contains the following columns:
- `title`: Title of the book.
- `author`: Author of the book.
- `series`: For books that belong to a series, the name of the series.
- `language`: Language of the book.
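
As a minimal sketch of how a target list might be read, the snippet below loads a catalogue with Python's standard `csv` module and groups the rows by `tag`, since the tag is not unique. The column names come from the lists above. The helper names `load_catalogue` and `group_by_tag` are hypothetical, and the sketch assumes the file is UTF-8 encoded with a header row, which the README does not confirm.

```python
import csv
from collections import defaultdict


def load_catalogue(path):
    """Read a target-list CSV into a list of dicts keyed by column name.

    Assumes a header row and UTF-8 encoding (not confirmed by the README).
    """
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def group_by_tag(rows):
    """Map each library tag to every row that carries it (tags are non-unique)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["tag"]].append(row)
    return dict(groups)
```

For example, `group_by_tag(load_catalogue("data/target_lists/library_books.csv"))` would return a dictionary from each tag to the list of inventory rows sharing it. The Large-Scale Catalogue has no `tag` column, so only `load_catalogue` applies to it.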
### Annotations
The main annotation for this dataset is the list of books that appear in each image. The annotations are stored in the following file:

- [Annotations](data/annotations/library_books_annotations.csv) - `data/annotations/library_books_annotations.csv`
### Images
There are two sets of images: original and split. The split images were created by cutting the original images into smaller pieces so that they fit within the OCR API size limits.
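
The README does not describe how the cuts were made, so the following is only an illustrative sketch of one way to plan such a split: it computes crop boxes for an even, non-overlapping grid whose tiles stay within a maximum side length. The function name `tile_boxes`, the grid layout, and the `max_side` parameter are assumptions, not the procedure used to produce the released split images.

```python
import math


def tile_boxes(width, height, max_side):
    """Return (left, upper, right, lower) boxes covering a width x height image
    with tiles whose sides do not exceed max_side pixels.

    Hypothetical helper: an even grid without overlap is an assumption,
    not the method used for the released split images.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("width, height and max_side must be positive integers")
    cols = math.ceil(width / max_side)
    rows = math.ceil(height / max_side)
    tile_w = math.ceil(width / cols)
    tile_h = math.ceil(height / rows)
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left, upper = c * tile_w, r * tile_h
            # Clamp the last column/row to the image border.
            right = min(left + tile_w, width)
            lower = min(upper + tile_h, height)
            boxes.append((left, upper, right, lower))
    return boxes
```

The boxes use the `(left, upper, right, lower)` convention, so each one could be passed to an image library's crop function, for instance Pillow's `Image.crop`.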
Both sets of images can be downloaded from the following links:
- [Original Images](https://cvcuab-my.sharepoint.com/:u:/g/personal/allabres_cvc_uab_cat/EeLjfNfMHItDps97t7xZ7UgBW-xBnuewRbHGEUmGsMpEFg?e=Fnn2Eq)
- [Split Images](https://cvcuab-my.sharepoint.com/:u:/g/personal/allabres_cvc_uab_cat/ES2oBS5DuhROlKbHQIGS0akBQuG3KO_8c5QNd27QhZaYOg?e=Y6XTyz)

Unzip the images in `data/images/`.
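
The unzip step can also be scripted. The sketch below uses Python's standard `zipfile` module and assumes each download is an ordinary zip archive. The function name `extract_images` is hypothetical, and the archive file name is left as a parameter because the README does not state it.

```python
import zipfile
from pathlib import Path


def extract_images(archive_path, dest_dir="data/images"):
    """Extract a downloaded image archive into dest_dir and return the file count.

    archive_path is wherever the download was saved; its name is not given
    in the README, so it must be supplied by the caller.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)
        # Count regular files only, skipping directory entries.
        return sum(1 for info in zf.infolist() if not info.is_dir())
```

Calling it once per archive (original and split) places both sets under `data/images/`. Whether each archive contains its own top-level folder is not stated, so the resulting layout should be checked after extraction.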
## Demos
Check out the demos at `demo/`:
Book Identification Demo | Book Search Demo
:-------------------------:|:-------------------------:
![](demo/book_identification_demo/result_example.jpg) | ![](demo/search_demo/demo_ui.png)

- [Book Identification Demo](demo/book_identification_demo/README.md): Given an image of a bookshelf, this demo identifies the books present in the image.
- [Book Search Demo](demo/search_demo/README.md): This demo prompts the user to input a book title or author name, and returns the image where the book has been detected.