{"id":29791630,"url":"https://github.com/autoparallel/learner","last_synced_at":"2025-07-28T00:32:51.612Z","repository":{"id":260835920,"uuid":"882475213","full_name":"Autoparallel/learner","owner":"Autoparallel","description":"Making learning sh*t less annoying","archived":false,"fork":false,"pushed_at":"2025-01-04T15:10:59.000Z","size":705,"stargazers_count":36,"open_issues_count":41,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-04T16:20:06.042Z","etag":null,"topics":["automation","learning","papers","research"],"latest_commit_sha":null,"homepage":"https://crates.io/crates/learner","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Autoparallel.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-02T21:48:10.000Z","updated_at":"2024-12-30T22:30:52.000Z","dependencies_parsed_at":"2024-11-17T22:23:03.749Z","dependency_job_id":"d923a659-7ba9-42c3-a739-66707f2097a1","html_url":"https://github.com/Autoparallel/learner","commit_stats":null,"previous_names":["autoparallel/learnerd","autoparallel/learner"],"tags_count":33,"template":true,"template_full_name":null,"purl":"pkg:github/Autoparallel/learner","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Autoparallel%2Flearner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Autoparallel%2Flearner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Autoparallel%2Flearner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/Autoparallel%2Flearner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Autoparallel","download_url":"https://codeload.github.com/Autoparallel/learner/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Autoparallel%2Flearner/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267446802,"owners_count":24088561,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-27T02:00:11.917Z","response_time":82,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automation","learning","papers","research"],"created_at":"2025-07-28T00:32:43.306Z","updated_at":"2025-07-28T00:32:51.598Z","avatar_url":"https://github.com/Autoparallel.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# learner\n*A Rust-powered academic research management 
system*\n\n[![Library](https://img.shields.io/badge/lib-learner-blue)](https://crates.io/crates/learner)\n[![Crates.io](https://img.shields.io/crates/v/learner)](https://crates.io/crates/learner)\n[![docs.rs](https://img.shields.io/docsrs/learner)](https://docs.rs/learner)\n\u0026nbsp;\u0026nbsp;|\u0026nbsp;\u0026nbsp;\n[![CLI](https://img.shields.io/badge/cli-learnerd-blue)](https://crates.io/crates/learnerd)\n[![Crates.io](https://img.shields.io/crates/v/learnerd)](https://crates.io/crates/learnerd)\n[![CI](https://github.com/autoparallel/learner/actions/workflows/check.yaml/badge.svg)](https://github.com/autoparallel/learner/actions/workflows/check.yaml)\n[![License](https://img.shields.io/crates/l/learner)](LICENSE)\n\n\u003cimg src=\"assets/header.svg\" alt=\"learner header\" width=\"600px\"\u003e\n\n\u003c/div\u003e\n\n[Features](#features) | \n[Installation](#installation) | \n[Usage](#usage) | \n[Configuration](#configuration) | \n[Roadmap](#roadmap) | \n[Contributing](#contributing) | \n[Development](#development) | \n[SDK](#sdk) | \n[License](#license) | \n[Acknowledgements](#acknowledgements)\n\n---\n## Features\n\n- Paper Metadata Management\n  - Support for arXiv, IACR, and DOI sources\n  - Automatic source detection from URLs or identifiers\n  - Full metadata extraction including authors and abstracts\n\n- Local Database\n  - SQLite-based storage with full-text search\n  - Configurable document storage\n  - Platform-specific defaults\n\n- Interactive Interfaces\n  - Terminal User Interface (TUI) with vim-style navigation\n  - Command-line interface (CLI) for scripting and automation with shell CLI completions\n  - Search, filter, and preview functionality\n  - Document management and viewing\n  - Daemon support for background operations\n\n## Installation\n\n### Library\n\n```toml\n[dependencies]\nlearner = { version = \"*\" }  # Uses latest version\n```\n\n### CLI Tool\n\n```bash\ncargo +nightly install learnerd --features tui\n```\n\nThis installs 
both the CLI tool and TUI interface, accessible via the `learner` command.\n\n\nTo obtain shell completions for `learner`:\n```\n# replace fish with your shell: bash, zsh or whatever\n# then, move completions to somewhere reasonable, and source them from your shell setup config.\nlearner -g fish \u003e learner_completions.fish\nsource learner_completions.fish\n```\n\n## Usage\n\n### Library Usage\n\n```rust\nuse learner::{Paper, Database};\n\n#[tokio::main]\nasync fn main() -\u003e Result\u003c(), Box\u003cdyn std::error::Error\u003e\u003e {\n    let db = Database::open(Database::default_path()).await?;\n    \n    // Add papers from various sources\n    let paper = Paper::new(\"https://arxiv.org/abs/2301.07041\").await?;\n    paper.save(\u0026db).await?;\n    \n    // Download associated document\n    let storage = Database::default_storage_path();\n    paper.download_pdf(\u0026storage).await?;\n    \n    Ok(())\n}\n```\n\n### Command Line Interface\n\n```bash\n# Initialize database\nlearner init --default-retrievers\n\n# Add papers\nlearner add 2301.07041\nlearner add \"https://arxiv.org/abs/2301.07041\" --pdf\nlearner add \"10.1145/1327452.1327492\" --no-pdf\n\n# Search papers\nlearner search \"quantum computing\"\nlearner search \"quantum\" --author \"Feynman\" --detailed\nlearner search \"neural\" --source arxiv --before 2023\n\n# Remove papers\nlearner remove \"outdated paper\"\nlearner remove \"temp\" --force --remove-pdf\n```\n\n### Terminal User Interface\nIf you install with\n```\ncargo install learnerd --features tui\n```\nyou can get access to a Terminal User Interface (TUI). 
To launch the interactive TUI just do:\n```bash\nlearner\n```\n\nTUI navigation:\n- `↑`/`k`, `↓`/`j`: Navigate papers\n- `←`/`h`, `→`/`l`: Switch panes\n- `:`: Enter command mode\n- `o`: Open selected PDF\n- `q`: Quit\n\nTUI commands:\n```bash\n:add      # Add a paper\n:remove   # Remove paper(s)\n:search   # Search papers\n```\n\n(TODO:) Search within TUI supports all filters:\n```bash\n:search \"quantum\" --author \"Feynman\"\n:search \"neural\" --source arxiv --before 2023\n```\n\n### System Daemon Management\n\n`learnerd` can run as a background service for paper monitoring and updates.\nCurrently, there are no distinct processes it runs but there is a tracking issue: [issue #83](https://github.com/Autoparallel/learner/issues/83).\n\n#### System Service \n```bash\n# Install and start\nsudo learnerd daemon install\nsudo systemctl enable --now learnerd  # Linux\nsudo launchctl load /Library/LaunchDaemons/learnerd.daemon.plist  # macOS\n\n# Remove\nsudo learnerd daemon uninstall\n```\n\n#### Logs\n- Linux: /var/log/learnerd/\n- macOS: /Library/Logs/learnerd/\n\nFiles: `learnerd.log` (main, rotated daily), `stdout.log`, `stderr.log`\n\n#### Troubleshooting\n\n- **Permission Errors:** Check ownership of log directories\n- **Won't Start:** Check system logs and remove stale PID file if present\n- **Installation:** Run commands as root/sudo\n\n## Configuration\n\nThe `learner` system uses a flexible configuration system that allows customization of paper sources, storage paths, and retrieval behavior.\n\n### Default Locations\n\n- **Config**: \n  - Linux: `~/.config/learner/config.toml`\n  - macOS: `~/Library/Application Support/learner/config.toml`\n  - Windows: `%APPDATA%\\learner\\config.toml`\n\n- **Database**:\n  - Linux: `~/.local/share/learner/learner.db`\n  - macOS: `~/Library/Application Support/learner/learner.db`\n  - Windows: `%APPDATA%\\learner\\learner.db`\n\n- **Papers**:\n  - Linux/macOS: `~/Documents/learner/papers`\n  - Windows: 
`Documents\\learner\\papers`\n\n### Configuration File\n\nThe configuration file (`config.toml`) allows you to customize:\n```toml\n# Base configuration\n[config]\ndatabase_path = \"/custom/path/to/db.sqlite\"      # Where the database itself is stored\nstorage_path = \"/custom/path/to/papers\"          # Where the documents are stored\nretrievers_path = \"/custom/path/to/retrievers\"   # Where retriever configurations are stored\n```\n\n### Adding Custom Sources\n\n1. Create a source configuration in TOML:\n```toml\n[sources.new_source]\nname = \"New Paper Source\"\nbase_url = \"https://api.example.com\"\npattern = \"^PREFIX-\\\\d+$\"  # Regex for identifier validation\nendpoint_template = \"/api/v1/papers/{identifier}\"\nheaders = { \"API-Key\" = \"your-key\" }  # Optional headers\n\n# For JSON responses\nresponse_format = { type = \"json\" }\nfield_maps.title = { path = \"data.title\" }\nfield_maps.abstract = { path = \"data.description\" }\nfield_maps.pdf_url = { \n    path = \"data.files.pdf\",\n    transform = { type = \"url\", base = \"https://cdn.example.com\", suffix = \".pdf\" }\n}\n\n# For XML responses\nresponse_format = { type = \"xml\" }\nfield_maps.title = { path = \"paper/title\" }\nfield_maps.authors = { path = \"paper/authors/author\" }\n```\nPut this TOML configuration file in your `~/.learner/retrievers/` (or equivalent) directory.\nExamples can be found in `crates/learner/config/retrievers/`.\n\n### Source Requirements\n\nCustom sources must provide:\n1. A unique identifier pattern (regex)\n2. An API endpoint that returns paper metadata\n3. 
Field mappings for required metadata:\n   - Title\n   - Authors\n   - Abstract\n   - Publication date\n   - Optional: PDF URL, DOI\n\n### Supported Response Formats\n\n- **JSON**: \n  - Path-based field extraction\n  - Value transformations (dates, URLs)\n  - Array handling for authors/references\n\n- **XML**:\n  - XPath-style field selection\n  - Namespace handling\n  - Multiple value aggregation\n\n## Project Structure\n\n1. `learner` - Core library\n   - Paper metadata extraction and management\n   - Database operations and search\n   - PDF handling and source-specific clients\n   - Error handling and type safety\n\n2. `learnerd` - CLI application\n   - Paper and document management interface\n   - System daemon capabilities\n   - Logging and diagnostics\n\n## Roadmap\n\n- [ ] Generic LLM integration (similar to the configurable `Retriever` abstraction)\n- [ ] RAG system\n- [ ] Document version control and annotations\n- [ ] Paper discovery and streaming\n- [ ] Configurable daemon process (e.g., watch file system, RSS, automated LLM querying)\n- [ ] REST API and Daemonize so `learner` can be a plugin with/for other apps (e.g., Raycast, Syncthing)\n- [ ] Database improvements (more searchable fields, tags, organization)\n- [ ] TUI improvements (organization, flexibility, in-terminal paper reading)\n- [ ] Citation analysis and related works.\n\n## Contributing\n\nContributions welcome! 
Please open an issue before making major changes.\n\n### CI Workflow\n\nOur automated pipeline ensures:\n\n- Code Quality\n  - rustfmt and taplo for consistent formatting\n  - clippy for Rust best practices\n  - cargo-udeps for dependency management\n  - cargo-semver-checks for API compatibility\n\n- Testing\n  - Full test suite across workspace and platforms\n\nAll checks must pass before merging pull requests.\n\n## Development\n\nThis project uses [just](https://github.com/casey/just) as a command runner.\n\n```bash\n# Setup\ncargo install just\njust setup\n\n# Common commands\njust test       # run tests\njust fmt        # format code\njust ci         # run all checks\njust build-all  # build all targets\n```\n\n\u003e [!TIP]\n\u003e Running `just setup` and `just ci` locally is a quick way to get up to speed and see that the repo is working on your system!\n\n## SDK\nThis repository now supplies a very basic SDK for validating `Retriever` and `Resource` TOML configurations.\nTo work with this SDK, use:\n```\n# Setup\njust setup-sdk\n\n# Validations\njust validate-retriever \u003cPATH\u003e \u003cOPTIONAL_INPUT\u003e # optionally supply a URL or identifier\njust validate-resource \u003cPATH\u003e\n\n# Examples\njust validate-retriever crates/learner/config/retrievers/arxiv.toml 2301.07041\njust validate-resource crates/learner/config/resources/thesis.toml\n```\n\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## Acknowledgements\n\n- [arXiv API](https://arxiv.org/help/api/index) for paper metadata\n- [IACR](https://eprint.iacr.org/) for cryptography papers\n- [CrossRef](https://www.crossref.org/) for DOI resolution\n- [SQLite](https://www.sqlite.org/) for local database support\n\n---\n\n\u003cdiv align=\"center\"\u003e\nMade for making learning sh*t less 
annoying.\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautoparallel%2Flearner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fautoparallel%2Flearner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautoparallel%2Flearner/lists"}