Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/idea-fasoc/datasheet-scrubber
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/idea-fasoc/datasheet-scrubber
- Owner: idea-fasoc
- License: mit
- Created: 2019-01-09T23:49:04.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2024-01-09T16:43:47.000Z (6 months ago)
- Last Synced: 2024-01-17T12:57:52.339Z (6 months ago)
- Language: Python
- Size: 72.2 MB
- Stars: 45
- Watchers: 3
- Forks: 8
- Open Issues: 12
Metadata Files:
- Readme: README.md
- License: LICENSE
Lists
- awesome-opensource-hardware - datasheet-scrubber
- awesome-eda - University of Michigan (Intent Driven Analog Design)
README
# FASoC Datasheet-Scrubber
The FASoC Datasheet Scrubber is a utility that scrubs through large sets of PDF datasheets and documents to extract key circuit information. The information gathered is used to build a database of commercial off-the-shelf (COTS) IP that can be used to build larger SoCs in the FASoC design. More information is available [here](https://fasoc.engin.umich.edu/datasheet-scrubber).
To get more details about the datasheet scrubber, please refer to our [IEEE TCAD](https://ieeexplore.ieee.org/document/9733041) paper.
If you find this tool useful in your research, please cite our papers below:
- M. Fayazi et al., "FASCINET: A Fully Automated Single-Board Computer Generator Using Neural Networks," in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
- Z. Colter et al., "Tablext: A combined neural network and heuristic based table extractor," in *Array*, vol. 15, 2022, pp. 100220.
### Setup instructions
1. Ensure your machine has a supported Python version and all of the Python modules required to run the datasheet scrubber.
   - Requirements: [Anaconda](https://www.anaconda.com) and [Tensorflow](https://docs.anaconda.com/free/anaconda/applications/tensorflow/). Python versions below 3.6 are not supported.
1. Ensure you have SSH keys set up for GitHub. Instructions for generating and adding SSH keys can be found [here](https://help.github.com/en/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent).
1. Clone the Datasheet Scrubber repository:
```bash
git clone git@github.com:idea-fasoc/datasheet-scrubber.git
```

# Database
The FASoC database contains more than 700,000 records of integrated circuit (IC) components collected from [Digikey](https://www.digikey.com/products/ics/en).
### Database Web Application
To access a sample of this collection, visit our [web application](https://fasoc.herokuapp.com/) or see its repository [here](https://github.com/idea-fasoc/fasoc-webapp).
### Raw Database
To access the entire collection of components, please visit [here](https://github.com/idea-fasoc/datasheet-scrubber/tree/master/src/database).

# Datasheet-Scrubber
The datasheet scrubber consists of three steps: [category recognition](https://github.com/idea-fasoc/datasheet-scrubber/tree/master/src/category_recognition), [table extraction](https://github.com/idea-fasoc/datasheet-scrubber/tree/master/src/table_extraction), and [text extraction](https://github.com/idea-fasoc/datasheet-scrubber/tree/master/src/text_extraction).
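
As a rough illustration of how the three steps might be chained, here is a minimal Python sketch. Everything in it is hypothetical: `ScrubResult`, `scrub_datasheet`, and the three stage callables are placeholder names rather than the repository's actual API, and passing the recognized category on to the later steps is an assumption. Refer to the linked source directories for the real entry points.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A table is modeled here as rows of cell strings (an assumption for illustration).
Table = List[List[str]]


@dataclass
class ScrubResult:
    """Hypothetical container for the output of the three steps."""
    category: str
    tables: List[Table] = field(default_factory=list)
    text_fields: Dict[str, str] = field(default_factory=dict)


def scrub_datasheet(
    pdf_path: str,
    recognize_category: Callable[[str], str],
    extract_tables: Callable[[str, str], List[Table]],
    extract_text: Callable[[str, str], Dict[str, str]],
) -> ScrubResult:
    """Run three placeholder stages in order and collect their outputs."""
    # Step 1: category recognition (which kind of component is this datasheet for?).
    category = recognize_category(pdf_path)
    # Step 2: table extraction. Handing over the category is an assumption.
    tables = extract_tables(pdf_path, category)
    # Step 3: text extraction, again with the category as assumed context.
    text_fields = extract_text(pdf_path, category)
    return ScrubResult(category=category, tables=tables, text_fields=text_fields)


# Usage with stub stages standing in for the real implementations.
result = scrub_datasheet(
    "example.pdf",
    recognize_category=lambda path: "ADC",
    extract_tables=lambda path, cat: [[["Parameter", "Value"]]],
    extract_text=lambda path, cat: {"category": cat},
)
```

Passing the stages in as callables keeps the sketch independent of any particular implementation, so each step can be replaced or tested on its own.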
### Test
Examples of how to use the [category recognition](https://github.com/idea-fasoc/datasheet-scrubber/tree/master/tests/category_recognition), [table extractor](https://github.com/idea-fasoc/datasheet-scrubber/tree/master/tests/table_extraction), and [web application database](https://github.com/idea-fasoc/datasheet-scrubber/tree/master/tests/web-app-db) are provided.