https://github.com/traven-b/myasync
Ruby app lists checked-out and ready-hold books at multiple library websites. Uses CSS selectors, async http concurrency. Beginner-friendly with mock data and easy inclusion of custom scraping logic.
https://github.com/traven-b/myasync
async beginner-friendly book-tracking cli-application concurrency css-selectors fibers html-parsing http-client library-tools mock-data modular-design public-libraries ruby web-scraping
Last synced: 4 months ago
JSON representation
Ruby app lists checked-out and ready-hold books at multiple library websites. Uses CSS selectors, async http concurrency. Beginner-friendly with mock data and easy inclusion of custom scraping logic.
- Host: GitHub
- URL: https://github.com/traven-b/myasync
- Owner: Traven-B
- Created: 2024-07-27T00:56:05.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-06T12:28:51.000Z (5 months ago)
- Last Synced: 2025-06-02T06:18:16.398Z (4 months ago)
- Topics: async, beginner-friendly, book-tracking, cli-application, concurrency, css-selectors, fibers, html-parsing, http-client, library-tools, mock-data, modular-design, public-libraries, ruby, web-scraping
- Language: HTML
- Homepage:
- Size: 125 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# myasync
myasync is a command line program written in [Ruby][] that scrapes web
pages from public library websites. It uses CSS selector rules to parse the
pages and lists checked-out books and books on hold ready for pickup.The program uses Ruby's concurrency support to handle multiple HTTP requests
simultaneously. It is modular, allowing you to define modules for different
library websites and customize the scraping logic for each.## Usage
The program works out of the box as a demo when given the --mock option.
```sh
app/myasync -h
Usage: myasync [OPTIONS]
Scrape pages at public libraries' web sites.
-m, --mock this option mocks everything
-s, --sleep-range 2,2.2 1 (or 2) comma separated numbers specifying
(range of) seconds that mocked requests sleep
-l, --local use server on localhost
-t, --trace trace where it's all happening
-h, --help, --usage Show this message
```Example program output:
```sh
Hennepin Books OutThe Astronomer
Tuesday December 04, 2018Subterranean Twin Cities
Wednesday December 05, 2018
Renewed: 1 time
2 people waitingHennepin Books on Hold
A History of America in Ten Strikes
Wednesday November 14, 2018St. Paul Books Out
St. Paul Books on Hold
The mysterious flame of Queen Loana
Saturday November 17, 2018
```## Installation
1. **Clone the Repository**:
```bash
git clone --depth 1 https://github.com/Traven-B/myasync.git
cd myasync
```2. **Install Dependencies**:
Ensure you have Bundler installed:
```bash
which bundle
```If not found:
```bash
gem install bundler
```Configure Bundler to install gems locally:
```bash
bundle config set --local path vendor/bundle
```to create a `.bundle/config` file with instruction to install gems in project directory.
```
---
BUNDLE_PATH: "vendor/bundle"
```Install the required gems:
```bash
bundle install
```3. **Run the Application**:
Use `bundle exec` to ensure the correct gem versions are used:```bash
bundle exec app/mymodest --mock
```## Quick Start
**When you first run the program, use the `--mock` option.**
This lets you see example output immediately, even if you haven’t set up credentials,
URLs, and scraping routines.The following will work also:
```sh
myasync --help
myasync --mock # See simulated output right away
myasync --mock --trace # With trace/debug output
myasync --mock --sleep-range 2 --trace
myasync --mock --sleep-range 2,2.2 --trace
```### Fake the Internet: Use a Local Server
Instead of faking internet requests, you can run against a local server:
```sh
$ bundle exec app/myasync --local
Failed to open TCP connection to localhost:3000Start a server on localhost. Run:
bundle exec lib/local_server.rb```
Start a server on localhost in a separate terminal:
```sh
$ bundle exec lib/local_server.rb
== Sinatra (v4.1.1) has taken the stage on 3000 for development with backup from WEBrick
(Ctrl-C to exit)
```Then try your command again:
```sh
$ bundle exec app/myasync --local
```The --local option is for advanced development and concurrency testing, but
like --mock, it also works out of the box.## Project Structure
- `lib/`: Contains the main code using the Faraday HTTP gem - a more adaptable but involved implementation.
- `application.rb` (symlink): Points to one of the two async implementations below.
- `_app_Async_do.rb`: Implementation using `Async do ... end` blocks within a method.
- `_app_async_def.rb`: Version using `async/await` with syntactic sugar.
- `archive/`: Contains earlier code using the `async/http/internet` gem.
- A similar setup with two different async idioms is present there as well.
- `use_async_method_or_block.sh`: Script to switch which async implementation is active by updating the `application.rb` symlink.
- `lib/local_server.rb`: Sinatra server used when `myasync --local` is specified.**Note:** The active async implementation is controlled by the `application.rb`
symlink. Use the provided script to switch between versions as needed.## Switching Async Implementations
To choose which async implementation is active, use:
```
./use_async_method_or_block.sh async_def # Use async/await version
./use_async_method_or_block.sh Async_do # Use Async do ... end version```
Running the script with no arguments reports which version is currently active.
**Note on Git and symlinks:**
Git tracks the symlink’s name and target path (not the contents of the target file). If you switch the symlink to point to a different file, `git status` may show it as modified. You only need to `git add` and commit if you want to record the new target in version control.## Development
### Modular Design
**myasync** is designed to support multiple library websites through a
module system. Adding a new library only requires:1. **Defining a module** with parsing logic and configuration.
2. **Adding fixture HTML files** for mock/local modes.
3. **No changes to the core CLI code** are needed.### Adding a New Library Module
1. **Create a module** in `lib/` (e.g., `springfield.rb`):
```ruby
module Springfield
BASE_URL_ACTUAL = "https://springfield.lib.example.com"
BASE_URL_LOCAL = "http://localhost:3000/springfield"
BASE_URL = Local.local? ? BASE_URL_LOCAL : BASE_URL_ACTUALdef self.lib_data
{
post_url: "#{BASE_URL_LOCAL}login",
checked_out_url:"#{BASE_URL}/checkedout",
checked_out_fixture: "springfield_checked_out.html", # Your fixture file name
# ... other config
}
enddef self.parse_checkedout_page(page)
# Custom parsing logic for this library
end
end
```2. **Add the module** to `MODULE_NAMES` in `lib/module_names.rb`:
```ruby
MODULE_NAMES = [Spingfield, Shelbyville]
```3. **Add fixture HTML files** to `mock_data/html_pages/` (e.g., `springfield_checked_out.html`).
### How Discovery Works
Automatic Discovery: The program automatically discovers which library modules
to use, their URLs, and related configuration. Just define your modules and add
them to MODULE_NAMES - no manual wiring or hardcoded lists needed.Minimal Configuration: Each module provides its own URLs and fixture filenames
via its lib_data method. The main program and local server use this info
directly, so you don’t need to configure URLs or file paths elsewhere.Mock Mode: The --mock option automatically uses your fixture HTML files for all
intercepted network requests, with no need for real HTTP calls.Local Server: The local_server.rb script dynamically maps URLs from all modules
in MODULE_NAMES to serve their corresponding fixture files for development and
concurrency testing. No manual URL setup is required - the server reads
everything it needs from your modules.In short: Whether running in normal, mock, or local server mode, the program
“just works” as long as your modules and fixtures are in place. Add a new
library module, list it in MODULE_NAMES, and both the CLI and local server will
pick it up automatically.---
### Example Workflow for a New Library
1. Define `lib/libraries/seattle.rb` with parsing logic.
2. Add `Seattle` to `MODULE_NAMES`.
3. Add `seattle_checked_out.html`, `seattle_on_hold.html` to `mock_data/html_pages/`.
4. Test immediately:```bash
bundle exec app/mymodest --mock # Uses your fixtures
bundle exec lib/local_server.rb # Serves them at localhost:3000/seattle/...
```For more information on how to customize this code for your use, please refer
to our sister project in the Ruby-like language Crystal.Please refer to the following documents from the Crystal project for detailed
information:- [Detailed README](https://github.com/Traven-B/mymodern/blob/main/project_docs/DETAILED_README.md) for **detailed installation instructions**, and subsequent setup and usage notes.
- [Project Structure Documentation](https://github.com/Traven-B/mymodern/blob/main/project_docs/PROJECT_STRUCTURE.md) which outlines the specific parts you'll need to adapt or modify to work with your library's website.## Contributing
1. Fork the repository ()
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new Pull Request## Contributors
- [Traven-B](https://github.com/Traven-B) Michael Kamb - creator, maintainer
[Ruby]: https://www.ruby-lang.org