https://github.com/traven-b/mymodern
Crystal app lists checked-out and ready-hold books at multiple library websites. Uses CSS selectors, CSP concurrency. Beginner-friendly with mock data and easy inclusion of custom scraping logic.
https://github.com/traven-b/mymodern
beginner-friendly book-tracking cli-application concurrency crystal crystal-lang csp css-selectors fibers html-parsing http-client library-tools mock-data modular-design public-libraries web-scraping
Last synced: 2 months ago
JSON representation
Crystal app lists checked-out and ready-hold books at multiple library websites. Uses CSS selectors, CSP concurrency. Beginner-friendly with mock data and easy inclusion of custom scraping logic.
- Host: GitHub
- URL: https://github.com/traven-b/mymodern
- Owner: Traven-B
- License: mit
- Created: 2024-06-28T01:51:39.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-07-04T19:12:26.000Z (11 months ago)
- Last Synced: 2025-03-30T18:46:28.504Z (2 months ago)
- Topics: beginner-friendly, book-tracking, cli-application, concurrency, crystal, crystal-lang, csp, css-selectors, fibers, html-parsing, http-client, library-tools, mock-data, modular-design, public-libraries, web-scraping
- Language: Crystal
- Homepage:
- Size: 509 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# mymodern
mymodern is a command line program written in [Crystal][] that scrapes web
pages from public library websites. It uses CSS selector rules to parse the
pages and lists checked-out books and books on hold ready for pickup.The program uses Crystal's concurrency support to handle multiple HTTP requests
simultaneously. It is modular, allowing you to define modules for different
library websites and customize the scraping logic for each.## Installation
To set up mymodern on your machine:
1. Clone the repository.
2. Navigate to the project directory.
3. Run the following commands:```
shards install
shards build
```The library [kostya/lexbor](https://github.com/kostya/lexbor) parses HTML using
CSS selectors. It wraps native C code and the compilation step for this
dependency happens automatically when you run `shards install`Once installed, the library will be available locally within your project directory.
For more information about the project and how to customize it for your needs,
please refer to our:- [Detailed README](project_docs/DETAILED_README.md) for **detailed
installation instructions**, and subsequent setup and usage notes.- [Project Structure Documentation](project_docs/PROJECT_STRUCTURE.md) which
outlines the specific parts you'll need to adapt or modify to work with your
library's website.## Usage
```terminal
~/at_the_prompt/crystal/mymodern$ bin/mymodern --help
Usage: mymodern [OPTIONS]
Scrape pages at public libraries' web sites.
-m, --mock this option mocks everything
-t, --trace trace where it's all happening
-h, --help show this message
```Example program output:
```terminal
Hennepin Books OutThe Astronomer
Tuesday December 04, 2018Subterranean Twin Cities
Wednesday December 05, 2018
Renewed: 1 time
2 people waitingHennepin Books on Hold
A History of America in Ten Strikes
Wednesday November 14, 2018St. Paul Books Out
St. Paul Books on Hold
The mysterious flame of Queen Loana
Saturday November 17, 2018```
## Development
The program calls upon identically named methods provided by one or another
library's module.```
module Hennepin
def self.lib_data
# returns data in the form of a NamedTupledef self.parse_checkedout_page(page)
# return a sorted array of CheckedOutRecord
# uses already existing record and comparison definitionsdef self.parse_on_hold_page(page)
# return a sorted array of OnHoldRecord
```Code for another library where you're a regular patron would do the same, and
parts of the program that invoke these methods iterate over an array of the
module names, and don't hard code the number or names of the modules.You can adapt the modules that define these few methods, name them Springfield
and Shelbyville, and require them from a single file which also lists them in
an array of module names.```
require "./hennepin"
require "./stpaul"MODULE_NAMES = [Hennepin, StPaul]
```So this can be used to scrape web pages at 1, 2, or 3 libraries, and all things
being equal, runs in constant time no matter how many you're doing. *\* Because concurrency.
1 web site, 1.5 times faster. All things being equal, 2 websites, could be 3
times faster, 3 web sites, 4.5 times faster. Conceivably.## Contributing
1. Fork the repository ()
2. Create a new feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new Pull Request## Contributors
- [Traven-B](https://github.com/Traven-B) Michael Kamb - creator, maintainer
[Crystal]: https://crystal-lang.org
[kostya/lexbor]: https://github.com/kostya/lexbor