https://github.com/autoparallel/learner

Making learning sh*t less annoying
Host: GitHub
URL: https://github.com/autoparallel/learner
Owner: Autoparallel
Created: 2024-11-02T21:48:10.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-01-04T15:10:59.000Z (over 1 year ago)
Last Synced: 2025-01-04T16:20:06.042Z (over 1 year ago)
Topics: automation, learning, papers, research
Language: Rust
Homepage: https://crates.io/crates/learner
Size: 688 KB
Stars: 36
Watchers: 1
Forks: 2
Open Issues: 41
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md

# learner

*A Rust-powered academic research management system*

[![Library](https://img.shields.io/badge/lib-learner-blue)](https://crates.io/crates/learner)

[![Crates.io](https://img.shields.io/crates/v/learner)](https://crates.io/crates/learner)

[![docs.rs](https://img.shields.io/docsrs/learner)](https://docs.rs/learner)

  |  

[![CLI](https://img.shields.io/badge/cli-learnerd-blue)](https://crates.io/crates/learnerd)

[![Crates.io](https://img.shields.io/crates/v/learnerd)](https://crates.io/crates/learnerd)

[![CI](https://github.com/autoparallel/learner/actions/workflows/check.yaml/badge.svg)](https://github.com/autoparallel/learner/actions/workflows/check.yaml)

[![License](https://img.shields.io/crates/l/learner)](LICENSE)





[Features](#features) | 

[Installation](#installation) | 

[Usage](#usage) | 

[Configuration](#configuration) | 

[Roadmap](#roadmap) | 

[Contributing](#contributing) | 

[Development](#development) | 

[SDK](#sdk) | 

[License](#license) | 

[Acknowledgments](#acknowledgments)

---

## Features

- Paper Metadata Management

  - Support for arXiv, IACR, and DOI sources

  - Automatic source detection from URLs or identifiers

  - Full metadata extraction including authors and abstracts

- Local Database

  - SQLite-based storage with full-text search

  - Configurable document storage

  - Platform-specific defaults

- Interactive Interfaces

  - Terminal User Interface (TUI) with vim-style navigation

  - Command-line interface (CLI) for scripting and automation, with shell completions

  - Search, filter, and preview functionality

  - Document management and viewing

  - Daemon support for background operations

## Installation

### Library

```toml

[dependencies]

learner = { version = "*" }  # Uses latest version

```

### CLI Tool

```bash

cargo +nightly install learnerd --features tui

```

This installs both the CLI tool and the TUI, accessible via the `learner` command.

To obtain shell completions for `learner`:

```bash
# Replace `fish` with your shell (e.g., bash or zsh).
# Then move the completions file somewhere reasonable and source it from your shell config.
learner -g fish > learner_completions.fish
source learner_completions.fish
```

## Usage

### Library Usage

```rust
use learner::{Paper, Database};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open (or create) the database at the platform default location
    let db = Database::open(Database::default_path()).await?;

    // Add papers from various sources
    let paper = Paper::new("https://arxiv.org/abs/2301.07041").await?;
    paper.save(&db).await?;

    // Download associated document
    let storage = Database::default_storage_path();
    paper.download_pdf(&storage).await?;

    Ok(())
}
```

### Command Line Interface

```bash

# Initialize database

learner init --default-retrievers

# Add papers

learner add 2301.07041

learner add "https://arxiv.org/abs/2301.07041" --pdf

learner add "10.1145/1327452.1327492" --no-pdf

# Search papers

learner search "quantum computing"

learner search "quantum" --author "Feynman" --detailed

learner search "neural" --source arxiv --before 2023

# Remove papers

learner remove "outdated paper"

learner remove "temp" --force --remove-pdf

```
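The `add` commands above accept a bare arXiv ID, a URL, or a DOI, which relies on automatic source detection. The following is a minimal sketch of how such detection could work using only the standard library. The `Source` enum, `detect_source`, and the matching rules are illustrative assumptions, not `learner`'s actual implementation (which is driven by retriever configuration).

```rust
/// Paper sources named in the feature list (hypothetical enum for this sketch).
#[derive(Debug, PartialEq)]
enum Source {
    Arxiv,
    Iacr,
    Doi,
}

/// True when `s` is non-empty and made only of ASCII digits.
fn all_digits(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_digit())
}

/// Guess the source of a URL or bare identifier.
/// These heuristics are assumptions for illustration only.
fn detect_source(input: &str) -> Option<Source> {
    let s = input.trim();

    // URLs: decide by host name.
    if s.starts_with("http://") || s.starts_with("https://") {
        if s.contains("arxiv.org/") {
            return Some(Source::Arxiv);
        }
        if s.contains("eprint.iacr.org/") {
            return Some(Source::Iacr);
        }
        if s.contains("doi.org/") {
            return Some(Source::Doi);
        }
        return None;
    }

    // DOIs and IACR ePrint ids both contain a slash.
    if let Some((prefix, suffix)) = s.split_once('/') {
        // DOI: "10.<registrant>/<suffix>"
        if prefix.starts_with("10.") && !suffix.is_empty() {
            return Some(Source::Doi);
        }
        // IACR ePrint: "<4-digit year>/<number>"
        if prefix.len() == 4 && all_digits(prefix) && all_digits(suffix) {
            return Some(Source::Iacr);
        }
        return None;
    }

    // New-style arXiv id: 4 digits, a dot, then 4 or 5 digits (no version suffix).
    if let Some((left, right)) = s.split_once('.') {
        if left.len() == 4
            && all_digits(left)
            && (4..=5).contains(&right.len())
            && all_digits(right)
        {
            return Some(Source::Arxiv);
        }
    }
    None
}
```

With these rules, `detect_source("2301.07041")` yields `Some(Source::Arxiv)` and `detect_source("10.1145/1327452.1327492")` yields `Some(Source::Doi)`. Versioned arXiv IDs and old-style IDs are deliberately left out to keep the sketch short.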

### Terminal User Interface

If you install with

```

cargo install learnerd --features tui

```

you get access to a Terminal User Interface (TUI). To launch it, run:

```bash

learner

```

TUI navigation:

- `↑`/`k`, `↓`/`j`: Navigate papers

- `←`/`h`, `→`/`l`: Switch panes

- `:`: Enter command mode

- `o`: Open selected PDF

- `q`: Quit

TUI commands:

```bash

:add      # Add a paper

:remove   # Remove paper(s)

:search   # Search papers

```

(TODO) Search within the TUI is intended to support all filters:

```bash

:search "quantum" --author "Feynman"

:search "neural" --source arxiv --before 2023

```
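To make the filter semantics concrete, here is a small in-memory sketch of how `--author`, `--source`, and `--before` could combine with the query text. Everything in it is hypothetical: `PaperRecord`, `SearchFilter`, and `search` are illustrative names, the real tool searches a SQLite database rather than a slice, and treating `--before 2023` as "publication year earlier than 2023" is an assumption.

```rust
/// A minimal paper record; the field names are illustrative.
#[derive(Debug, Clone)]
struct PaperRecord {
    title: String,
    authors: Vec<String>,
    source: String,
    year: u32,
}

/// Filters mirroring the `--author`, `--source`, and `--before` flags.
#[derive(Default)]
struct SearchFilter {
    query: String,
    author: Option<String>,
    source: Option<String>,
    before_year: Option<u32>,
}

/// Case-insensitive substring test.
fn contains_ci(haystack: &str, needle: &str) -> bool {
    haystack.to_lowercase().contains(&needle.to_lowercase())
}

impl SearchFilter {
    /// A paper matches when the query and every filter that is set all match.
    fn matches(&self, paper: &PaperRecord) -> bool {
        contains_ci(&paper.title, &self.query)
            && self
                .author
                .as_ref()
                .map_or(true, |a| paper.authors.iter().any(|name| contains_ci(name, a)))
            && self
                .source
                .as_ref()
                .map_or(true, |s| paper.source.eq_ignore_ascii_case(s))
            && self.before_year.map_or(true, |y| paper.year < y)
    }
}

/// Return references to the papers that pass the filter, in input order.
fn search<'a>(papers: &'a [PaperRecord], filter: &SearchFilter) -> Vec<&'a PaperRecord> {
    papers.iter().filter(|p| filter.matches(p)).collect()
}
```

Unset filters are `None` and match everything, so a plain `:search "quantum"` corresponds to a `SearchFilter` with only `query` filled in.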

### System Daemon Management

`learnerd` can run as a background service for paper monitoring and updates.

Currently, the daemon does not run any distinct processes; this is tracked in [issue #83](https://github.com/Autoparallel/learner/issues/83).

#### System Service 

```bash

# Install and start

sudo learnerd daemon install

sudo systemctl enable --now learnerd  # Linux

sudo launchctl load /Library/LaunchDaemons/learnerd.daemon.plist  # macOS

# Remove

sudo learnerd daemon uninstall

```

#### Logs

- Linux: `/var/log/learnerd/`

- macOS: `/Library/Logs/learnerd/`

Files: `learnerd.log` (main, rotated daily), `stdout.log`, `stderr.log`

#### Troubleshooting

- **Permission Errors:** Check ownership of log directories

- **Won't Start:** Check system logs and remove stale PID file if present

- **Installation:** Run commands as root/sudo

## Configuration

`learner` reads a configuration file that lets you customize paper sources, storage paths, and retrieval behavior.

### Default Locations

- **Config**: 

  - Linux: `~/.config/learner/config.toml`

  - macOS: `~/Library/Application Support/learner/config.toml`

  - Windows: `%APPDATA%\learner\config.toml`

- **Database**:

  - Linux: `~/.local/share/learner/learner.db`

  - macOS: `~/Library/Application Support/learner/learner.db`

  - Windows: `%APPDATA%\learner\learner.db`

- **Papers**:

  - Linux/macOS: `~/Documents/learner/papers`

  - Windows: `Documents\learner\papers`
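As a rough illustration of the platform-specific defaults listed above, the sketch below assembles the config and database paths from a home directory (Linux/macOS) or the value of `%APPDATA%` (Windows). The `Platform` enum and all function names are hypothetical; the library exposes its own helpers such as `Database::default_path()`, and this is not their implementation.

```rust
use std::path::PathBuf;

/// Platforms covered by the defaults listed above (hypothetical enum).
#[derive(Clone, Copy, Debug)]
enum Platform {
    Linux,
    MacOs,
    Windows,
}

/// Pick the platform at compile time; anything that is neither macOS
/// nor Windows is treated as Linux in this sketch.
fn current_platform() -> Platform {
    if cfg!(target_os = "macos") {
        Platform::MacOs
    } else if cfg!(target_os = "windows") {
        Platform::Windows
    } else {
        Platform::Linux
    }
}

/// Directory that holds `config.toml`.
fn config_dir(platform: Platform, home: &str, appdata: &str) -> PathBuf {
    match platform {
        Platform::Linux => PathBuf::from(home).join(".config").join("learner"),
        Platform::MacOs => PathBuf::from(home)
            .join("Library")
            .join("Application Support")
            .join("learner"),
        Platform::Windows => PathBuf::from(appdata).join("learner"),
    }
}

/// Directory that holds `learner.db`.
fn data_dir(platform: Platform, home: &str, appdata: &str) -> PathBuf {
    match platform {
        Platform::Linux => PathBuf::from(home)
            .join(".local")
            .join("share")
            .join("learner"),
        // macOS and Windows keep the database next to the config file.
        Platform::MacOs | Platform::Windows => config_dir(platform, home, appdata),
    }
}

fn default_config_path(platform: Platform, home: &str, appdata: &str) -> PathBuf {
    config_dir(platform, home, appdata).join("config.toml")
}

fn default_database_path(platform: Platform, home: &str, appdata: &str) -> PathBuf {
    data_dir(platform, home, appdata).join("learner.db")
}
```

Passing the base directories in as arguments, instead of reading environment variables inside the functions, keeps the path logic easy to test on any host.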

### Configuration File

The configuration file (`config.toml`) allows you to customize:

```toml
# Base configuration
[config]
database_path = "/custom/path/to/db.sqlite"     # Where the database itself is stored
storage_path = "/custom/path/to/papers"         # Where the documents are stored
retrievers_path = "/custom/path/to/retrievers"  # Where retriever configurations are stored
```

### Adding Custom Sources

1. Create a source configuration in TOML:

```toml
[sources.new_source]
name = "New Paper Source"
base_url = "https://api.example.com"
pattern = "^PREFIX-\\d+$"  # Regex for identifier validation
endpoint_template = "/api/v1/papers/{identifier}"
headers = { "API-Key" = "your-key" }  # Optional headers

# For JSON responses
response_format = { type = "json" }
field_maps.title = { path = "data.title" }
field_maps.abstract = { path = "data.description" }
# Inline tables must fit on one line in TOML 1.0
field_maps.pdf_url = { path = "data.files.pdf", transform = { type = "url", base = "https://cdn.example.com", suffix = ".pdf" } }

# For XML responses, use these settings instead of the JSON ones above
# (a TOML key may only be defined once per table)
# response_format = { type = "xml" }
# field_maps.title = { path = "paper/title" }
# field_maps.authors = { path = "paper/authors/author" }
```

Put this TOML configuration file in your `~/.learner/retrievers/` (or equivalent) directory.

Examples can be found in `crates/learner/config/retrievers/`.
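To show how the `base_url`, `endpoint_template`, `pattern`, and `transform` settings might fit together, here is a minimal sketch. It is an assumption-laden illustration, not the crate's `Retriever` code: `SourceConfig`, `request_url`, `matches_example_pattern`, and `url_transform` are hypothetical names, the pattern check is a hand-rolled ASCII-only stand-in for the example regex `^PREFIX-\d+$`, and the `url` transform is assumed to mean base, then the extracted value, then the suffix.

```rust
/// Subset of a source configuration; field names follow the TOML example.
struct SourceConfig {
    base_url: String,
    endpoint_template: String,
}

impl SourceConfig {
    /// Substitute `{identifier}` into the template and append it to the base URL.
    fn request_url(&self, identifier: &str) -> String {
        let endpoint = self.endpoint_template.replace("{identifier}", identifier);
        format!("{}{}", self.base_url.trim_end_matches('/'), endpoint)
    }
}

/// Hand-rolled stand-in for the example pattern `^PREFIX-\d+$`
/// (ASCII digits only; a real implementation would compile the regex).
fn matches_example_pattern(identifier: &str) -> bool {
    identifier.strip_prefix("PREFIX-").map_or(false, |digits| {
        !digits.is_empty() && digits.chars().all(|c| c.is_ascii_digit())
    })
}

/// Assumed meaning of the `url` transform: base + "/" + value + suffix.
fn url_transform(base: &str, value: &str, suffix: &str) -> String {
    format!(
        "{}/{}{}",
        base.trim_end_matches('/'),
        value.trim_start_matches('/'),
        suffix
    )
}
```

For the example configuration, an identifier `PREFIX-42` would be requested from `https://api.example.com/api/v1/papers/PREFIX-42` under these assumptions.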

### Source Requirements

Custom sources must provide:

1. A unique identifier pattern (regex)

2. An API endpoint that returns paper metadata

3. Field mappings for required metadata:

   - Title

   - Authors

   - Abstract

   - Publication date

   - Optional: PDF URL, DOI
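The requirements above can be checked mechanically before a source is used. The sketch below reports which required field mappings are missing from a configuration. The key names `authors` and `publication_date` and the function `missing_fields` are assumptions for illustration; only `title`, `abstract`, `authors`, and `pdf_url` appear as `field_maps` keys in the example above, and the SDK's validation commands are the authoritative check.

```rust
use std::collections::HashMap;

/// Required metadata fields, using assumed key names.
const REQUIRED_FIELDS: [&str; 4] = ["title", "authors", "abstract", "publication_date"];

/// Return the required fields that have no mapping, in the order listed above.
/// `field_maps` maps a field name to its extraction path.
fn missing_fields(field_maps: &HashMap<String, String>) -> Vec<&'static str> {
    REQUIRED_FIELDS
        .iter()
        .copied()
        .filter(|field| !field_maps.contains_key(*field))
        .collect()
}
```

An empty result means every required field has a mapping; optional fields such as the PDF URL and DOI are simply not checked.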

### Supported Response Formats

- **JSON**: 

  - Path-based field extraction

  - Value transformations (dates, URLs)

  - Array handling for authors/references

- **XML**:

  - XPath-style field selection

  - Namespace handling

  - Multiple value aggregation
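Path-based extraction for JSON responses can be pictured as walking a dot-separated path such as `data.title` through nested objects. The following sketch uses a tiny hand-written `Value` type so that it needs no external crates; `Value`, `extract`, and `as_strings` are hypothetical and do not reflect the parser `learner` actually uses.

```rust
use std::collections::HashMap;

/// A tiny JSON-like value, just enough to illustrate path lookup.
#[derive(Debug, PartialEq)]
enum Value {
    Text(String),
    Array(Vec<Value>),
    Object(HashMap<String, Value>),
}

/// Follow a dot-separated path through nested objects.
/// Returns `None` if a key is missing or a non-object is reached early.
fn extract<'a>(root: &'a Value, path: &str) -> Option<&'a Value> {
    path.split('.').try_fold(root, |node, key| match node {
        Value::Object(map) => map.get(key),
        _ => None,
    })
}

/// Flatten a text value or an array of texts (e.g. an author list) into strings.
fn as_strings(value: &Value) -> Vec<String> {
    match value {
        Value::Text(s) => vec![s.clone()],
        Value::Array(items) => items.iter().flat_map(as_strings).collect(),
        Value::Object(_) => Vec::new(),
    }
}
```

Array handling for authors then amounts to extracting the array at the configured path and flattening it with `as_strings`.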

## Project Structure

1. `learner` - Core library

   - Paper metadata extraction and management

   - Database operations and search

   - PDF handling and source-specific clients

   - Error handling and type safety

2. `learnerd` - CLI application

   - Paper and document management interface

   - System daemon capabilities

   - Logging and diagnostics

## Roadmap

- [ ] Generic LLM integration (similar to the configurable `Retriever` abstraction)

- [ ] RAG system

- [ ] Document version control and annotations

- [ ] Paper discovery and streaming

- [ ] Configurable daemon process (e.g., watch file system, RSS, automated LLM querying)

- [ ] REST API and daemonization so `learner` can act as a plugin for other apps (e.g., Raycast, Syncthing)

- [ ] Database improvements (more searchable fields, tags, organization)

- [ ] TUI improvements (organization, flexibility, in-terminal paper reading)

- [ ] Citation analysis and related works

## Contributing

Contributions welcome! Please open an issue before making major changes.

### CI Workflow

Our automated pipeline ensures:

- Code Quality

  - rustfmt and taplo for consistent formatting

  - clippy for Rust best practices

  - cargo-udeps for dependency management

  - cargo-semver-checks for API compatibility

- Testing

  - Full test suite across workspace and platforms

All checks must pass before merging pull requests.

## Development

This project uses [just](https://github.com/casey/just) as a command runner.

```bash

# Setup

cargo install just

just setup

# Common commands

just test       # run tests

just fmt        # format code

just ci         # run all checks

just build-all  # build all targets

```

> [!TIP]

> Running `just setup` and `just ci` locally is a quick way to get up to speed and see that the repo is working on your system!

## SDK

This repository now supplies a very basic SDK for validating `Retriever` and `Resource` TOML configurations.

To work with this SDK, use:

```

# Setup

just setup-sdk

# Validations

just validate-retriever   # optionally supply url/identifier

just validate-resource 

# Examples

just validate-retriever crates/learner/config/retrievers/arxiv.toml 2301.07041

just validate-resource crates/learner/config/resources/thesis.toml

```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- [arXiv API](https://arxiv.org/help/api/index) for paper metadata

- [IACR](https://eprint.iacr.org/) for cryptography papers

- [CrossRef](https://www.crossref.org/) for DOI resolution

- [SQLite](https://www.sqlite.org/) for local database support

---



Made for making learning sh*t less annoying.