Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mozilla/fathom
A framework for extracting meaning from web pages
Last synced: 3 days ago
- Host: GitHub
- URL: https://github.com/mozilla/fathom
- Owner: mozilla
- License: mpl-2.0
- Created: 2016-03-18T20:03:05.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2023-11-25T13:34:18.000Z (11 months ago)
- Last Synced: 2024-10-29T10:00:38.652Z (6 days ago)
- Language: JavaScript
- Homepage: http://mozilla.github.io/fathom/
- Size: 23.6 MB
- Stars: 1,972
- Watchers: 55
- Forks: 75
- Open Issues: 112
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# Fathom
Fathom is a supervised-learning system for recognizing parts of web pages—pop-ups, address forms, slideshows—or for classifying a page as a whole. A DOM flows in one side, and DOM nodes flow out the other, tagged with types and probabilities that those types are correct. A Prolog-like language makes it straightforward to specify the “smells” that suggest each type, and a neural-net-based trainer determines the optimal contribution of each smell. Finally, the FathomFox web extension lets you collect and label a corpus of web pages for training.
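To make the idea of weighted "smells" concrete, here is a minimal, self-contained sketch. It does not use Fathom's API: every name in it (`smells`, `coefficients`, `bias`, `probabilityOfType`, `tagNodes`) is hypothetical, the candidate nodes are plain objects rather than real DOM nodes, and the weights are made up rather than learned. It also assumes, for illustration only, that smell values are combined as a weighted sum passed through a sigmoid.

```javascript
// Hypothetical illustration only: none of these names come from Fathom's API.

// Each "smell" inspects a node-like object and returns a value in [0, 1].
const smells = {
  hasCloseButton: node => (node.hasCloseButton ? 1 : 0),
  coversViewport: node => Math.min(node.areaFraction, 1),
  highZIndex: node => (node.zIndex >= 1000 ? 1 : 0),
};

// Made-up weights and bias; in a real system a trainer would learn these
// from a labeled corpus of pages.
const coefficients = { hasCloseButton: 2.0, coversViewport: 3.0, highZIndex: 1.5 };
const bias = -4.0;

// Squash an unbounded sum into a probability-like value in (0, 1).
const sigmoid = x => 1 / (1 + Math.exp(-x));

// Combine all smells for one node into a single confidence value.
function probabilityOfType(node) {
  let sum = bias;
  for (const [name, smell] of Object.entries(smells)) {
    sum += coefficients[name] * smell(node);
  }
  return sigmoid(sum);
}

// Nodes flow in; nodes tagged with a type and a probability flow out.
function tagNodes(nodes, type) {
  return nodes.map(node => ({ node, type, probability: probabilityOfType(node) }));
}

// Stand-ins for DOM nodes, described only by the features the smells read.
const candidates = [
  { id: 'modal', hasCloseButton: true, areaFraction: 0.8, zIndex: 9999 },
  { id: 'footer', hasCloseButton: false, areaFraction: 0.1, zIndex: 0 },
];

const tagged = tagNodes(candidates, 'popUp');
for (const { node, type, probability } of tagged) {
  console.log(`${node.id}: ${type} with probability ${probability.toFixed(2)}`);
}
```

With these invented weights, the modal-like candidate scores well above 0.5 and the footer-like one well below it. In this sketch the hand-written smells stay fixed while the coefficients and bias are the part a trainer would tune.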
Continue reading at https://mozilla.github.io/fathom.
__[Documentation](https://mozilla.github.io/fathom)__