https://github.com/dhondta/zotero-cli

Tinyscript tool for sorting and exporting Zotero references based on pyzotero
citations cli-app pagerank pyzotero reference-management research-tools tinyscript zotero

# Zotero CLI


Sort and rank your Zotero references easily from your CLI.
Ask questions about your Zotero documents with a local GPT model.

[![PyPi](https://img.shields.io/pypi/v/zotero-cli-tool.svg)](https://pypi.python.org/pypi/zotero-cli-tool/)
![Platform](https://img.shields.io/badge/platform-linux-yellow.svg)
[![Python Versions](https://img.shields.io/pypi/pyversions/peid.svg)](https://pypi.python.org/pypi/zotero-cli-tool/)
[![Build Status](https://github.com/dhondta/zotero-cli/actions/workflows/publish-package.yml/badge.svg)](https://github.com/dhondta/zotero-cli/actions/workflows/publish-package.yml)
[![Read The Docs](https://readthedocs.org/projects/zotero-cli/badge/?version=latest)](https://zotero-cli.readthedocs.io/en/latest/?badge=latest)
[![Known Vulnerabilities](https://snyk.io/test/github/dhondta/zotero-cli/badge.svg?targetFile=requirements.txt)](https://snyk.io/test/github/dhondta/zotero-cli?targetFile=requirements.txt)
[![DOI](https://zenodo.org/badge/321932121.svg)](https://zenodo.org/badge/latestdoi/321932121)
[![License](https://img.shields.io/pypi/l/zotero-cli-tool.svg)](https://pypi.python.org/pypi/zotero-cli-tool/)

This [Tinyscript](https://github.com/dhondta/python-tinyscript) tool relies on [`pyzotero`](https://github.com/urschrei/pyzotero) for communicating with [Zotero's Web API](https://www.zotero.org/support/dev/web_api/v3/start). It lets you list field values, show items as tables in the CLI, or export sorted items to an Excel file.

```session
$ pip install zotero-cli-tool
```

## :fast_forward: Quick Start

The first time you start the tool, it asks for your API identifier and key and caches them in `~/.zotero/creds.txt`, with permissions set to `rw` for your user only. Data is cached in `~/.zotero/cache/`. If you use a shared group library, either pass the `-g` (`--group`) option in your `zotero-cli` command or, to set it permanently, create an empty file `~/.zotero/group` (e.g. with `touch`).
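
The owner-only permissions and the marker file described above can be illustrated in Python. This is a minimal sketch, not the tool's actual code: the helper names `save_credentials` and `uses_group_library` and the two-line file layout are assumptions made for illustration; only the owner-only `rw` permissions and the `group` marker file come from the description above.

```python
import os
import stat
from pathlib import Path


def save_credentials(path: Path, api_id: str, api_key: str) -> None:
    """Write credentials to a file that only its owner can read and write."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode 0o600 applies when the file is newly created (subject to the umask).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        # Assumed layout: identifier on the first line, key on the second.
        f.write(f"{api_id}\n{api_key}\n")
    # Tighten the mode in case the file already existed with looser permissions.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def uses_group_library(config_dir: Path) -> bool:
    """Return True if the empty marker file `group` exists in the config directory."""
    return (config_dir / "group").is_file()
```

Calling `save_credentials(Path.home() / ".zotero" / "creds.txt", api_id, api_key)` would mirror the location mentioned above; on POSIX systems the resulting file shows up as `-rw-------` in `ls -l`.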

- Manually update cached data

```sh
$ zotero-cli reset
```

Note that this can take a while, which is why the data is cached for later use.

- Count items in a collection

```sh
$ zotero-cli count --filter "collections:biblio"
123
```

- List values for a given field

```sh
$ zotero-cli list itemType

Type
----
computer program
conference paper
document
journal article
manuscript
thesis
webpage

```

- Show entries with the given set of fields, filtered on multiple criteria and limited to a given number of items

```sh
$ zotero-cli show year title itemType numPages --filter "collections:biblio" --filter "title:detect" --limit ">date:10"

Year  Title                                                                                                                             Type              #Pages
----  -----                                                                                                                             ----              ------
2016  Classifying Packed Programs as Malicious Software Detected                                                                        conference paper  3
2016  Detecting Packed Executable File: Supervised or Anomaly Detection Method?                                                         conference paper  5
2016  Entropy analysis to classify unknown packing algorithms for malware detection                                                     conference paper  21
2017  Packer Detection for Multi-Layer Executables Using Entropy Analysis                                                               journal article   18
2018  Sensitive system calls based packed malware variants detection using principal component initialized MultiLayers neural networks  journal article   13
2018  Effective, efficient, and robust packing detection and classification                                                             journal article   15
2019  Efficient automatic original entry point detection                                                                                journal article   14
2019  All-in-One Framework for Detection, Unpacking, and Verification for Malware Analysis                                              journal article   16
2020  Experimental Comparison of Machine Learning Models in Malware Packing Detection                                                   conference paper  3
2020  Building a smart and automated tool for packed malware detections using machine learning                                          thesis            99

```

- Export entries

```sh
$ zotero-cli export year title itemType numPages --filter "collections:biblio" --filter "title:detect" --limit ">date:10"
$ file export.xlsx
export.xlsx: Microsoft Excel 2007+

```

> **Supported formats**: `csv`, `html`, `json`, `md` (Markdown), `rst` (reStructuredText), `xml`, `xlsx`, `yaml`
> Get help with `zotero-cli export --help`

- Use a predefined query

```sh
$ zotero-cli show - --query "top-50-most-relevants"
```

> **Note**: "`-`" is used for the `field` positional argument to tell the tool to select the predefined list of fields included in the query.

This is equivalent to:

```sh
$ zotero-cli show year title numPages itemType --limit ">rank:50"
```

Available queries:
- `no-attachment`: all items with no attachment; displayed fields: `title`
- `no-url`: all items with no URL; displayed fields: `year`, `title`
- `top-10-most-relevants`: the 10 best-ranked items; displayed fields: `year`, `title`, `numPages`, `itemType`
- `top-50-most-relevants`: same as `top-10-most-relevants`, but with the 50 best-ranked items
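
The relationship between a predefined query and its explicit form can be pictured as a lookup table. The sketch below is illustrative only: `PREDEFINED_QUERIES` and `expand_query` are hypothetical names, not part of the tool, and the `>rank:10` limit for the top-10 query is inferred by analogy with the top-50 equivalence stated above. The `no-attachment` and `no-url` queries are left out because their filters are not documented here.

```python
from typing import Dict, List, Tuple

# Hypothetical table: query name -> (displayed fields, limit expression).
# Only the top-50 expansion is stated in the text; top-10 is inferred by analogy.
PREDEFINED_QUERIES: Dict[str, Tuple[List[str], str]] = {
    "top-10-most-relevants": (["year", "title", "numPages", "itemType"], ">rank:10"),
    "top-50-most-relevants": (["year", "title", "numPages", "itemType"], ">rank:50"),
}


def expand_query(command: str, name: str) -> List[str]:
    """Build the explicit argument list equivalent to `<command> - --query <name>`."""
    fields, limit = PREDEFINED_QUERIES[name]
    return ["zotero-cli", command, *fields, "--limit", limit]
```

With this table, `expand_query("show", "top-50-most-relevants")` yields the same arguments as the explicit `show` command given above.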

- Mark items

```sh
$ zotero-cli mark read --filter "title:a nice paper"
$ zotero-cli mark unread --filter "title:a nice paper"
```

> **Markers**:
>
> - `read` / `unread`: by default, items are displayed in bold; marking an item as read makes it display in normal weight
> - `irrelevant` / `relevant`: excludes an item from (or restores it to) the output list of items
> - `ignore` / `unignore`: completely ignores an item (or restores it), including in the ranking algorithm
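
One way to picture the three marker pairs is as boolean flags that affect rendering, output, and ranking differently. This is a sketch under assumptions: the `Item` class, the flag names, and the helper functions are hypothetical and do not reflect how the tool stores markers; it also assumes that irrelevant items still take part in the ranking, which the description above implies but does not state.

```python
from dataclasses import dataclass
from typing import List

# Each marker sets or clears one flag (flag names are illustrative).
MARKERS = {
    "read": ("read", True), "unread": ("read", False),
    "irrelevant": ("irrelevant", True), "relevant": ("irrelevant", False),
    "ignore": ("ignored", True), "unignore": ("ignored", False),
}


@dataclass
class Item:
    title: str
    read: bool = False
    irrelevant: bool = False
    ignored: bool = False


def mark(item: Item, marker: str) -> None:
    """Apply a marker such as 'read' or 'ignore' to an item."""
    attr, value = MARKERS[marker]
    setattr(item, attr, value)


def displayed(items: List[Item]) -> List[Item]:
    """Items shown in the output: irrelevant and ignored ones are left out."""
    return [i for i in items if not (i.irrelevant or i.ignored)]


def ranked(items: List[Item]) -> List[Item]:
    """Items fed to the ranking algorithm: only ignored ones are left out."""
    return [i for i in items if not i.ignored]


def render(item: Item) -> str:
    """Unread items in bold (ANSI escape codes), read items in normal weight."""
    return item.title if item.read else f"\033[1m{item.title}\033[0m"
```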

## :computer: Local GPT

This feature is based on [PrivateGPT](https://github.com/imartinez/privateGPT). It can be used to ingest local Zotero documents and ask questions about them using a chosen GPT model.

- Install optional dependencies

```sh
$ pip install zotero-cli-tool[gpt]
```

- Install a model from the following list:

  - `ggml-gpt4all-j-v1.3-groovy.bin` (default)
  - `ggml-gpt4all-l13b-snoozy.bin`
  - `ggml-mpt-7b-chat.bin`
  - `ggml-v3-13b-hermes-q5_1.bin`
  - `ggml-vicuna-7b-1.1-q4_2.bin`
  - `ggml-vicuna-13b-1.1-q4_2.bin`
  - `ggml-wizardLM-7B.q4_2.bin`
  - `ggml-stable-vicuna-13B.q4_2.bin`
  - `ggml-mpt-7b-base.bin`
  - `ggml-nous-gpt4-vicuna-13b.bin`
  - `ggml-mpt-7b-instruct.bin`
  - `ggml-wizard-13b-uncensored.bin`

```sh
$ zotero-cli install
```

The most recently installed model is selected for the `ask` command (see below).

- Ingest your documents

```sh
$ zotero-cli ingest
```

- Ask questions about your documents

```sh
$ zotero-cli ask
Using embedded DuckDB with persistence: data will be stored in: /home/morfal/.zotero/db
Found model file.
[...]
Enter a query:

```

## :bulb: Special Features

Some additional fields can be used for listing/filtering/showing/exporting data.

- Computed fields

  - `authors`: the list of `creators` with `creatorType` equal to `author`
  - `citations`: the number of relations the item has to other items with a later date
  - `editors`: the list of `creators` with `creatorType` equal to `editor`
  - `numAttachments`: the number of child items with `itemType` equal to `attachment`
  - `numAuthors`: the number of `creators` with `creatorType` equal to `author`
  - `numCreators`: the number of `creators`
  - `numEditors`: the number of `creators` with `creatorType` equal to `editor`
  - `numNotes`: the number of child items with `itemType` equal to `note`
  - `numPages`: the (corrected) number of pages, taken either from the original `numPages` field or from the `pages` field
  - `references`: the number of relations the item has to other items with an earlier date
  - `year`: the year obtained by parsing the `date` field as a `datetime`
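
The creator-based fields above can be derived from the `creators` list that Zotero's Web API returns in an item's data. The following is a minimal sketch, not the tool's implementation: `creators_of` and `computed_fields` are hypothetical helpers, and the year is extracted with a simple regular expression as a stand-in for the `datetime` parsing mentioned above.

```python
import re
from typing import Any, Dict, List


def creators_of(item: Dict[str, Any], creator_type: str) -> List[Dict[str, str]]:
    """Return the creators of an item whose creatorType matches."""
    return [c for c in item.get("creators", []) if c.get("creatorType") == creator_type]


def computed_fields(item: Dict[str, Any]) -> Dict[str, Any]:
    """Derive a few of the computed fields from raw item data."""
    authors = creators_of(item, "author")
    editors = creators_of(item, "editor")
    # Simplification: take the first standalone 4-digit number in the date string.
    match = re.search(r"\b(\d{4})\b", item.get("date", ""))
    return {
        "authors": authors,
        "editors": editors,
        "numAuthors": len(authors),
        "numEditors": len(editors),
        "numCreators": len(item.get("creators", [])),
        "year": int(match.group(1)) if match else None,
    }
```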

- Extracted fields (from the `extra` field)

  - `comments`: custom field for adding comments
  - `results`: custom field for mentioning results related to the item
  - `what`: custom field for a short description of what the item is about
  - `zscc`: number of Scholar citations, computed with the [Zotero Google Scholar Citations](https://github.com/beloglazov/zotero-scholar-citations) plugin
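
Extracting such fields from `extra` could look like the sketch below. It assumes, without confirmation from the tool's source, that each field sits on its own line as `key: value` and that keys are matched case-insensitively; `parse_extra` is a hypothetical helper name. Non-numeric `zscc` values are mapped to `None`.

```python
from typing import Any, Dict, Iterable


def parse_extra(extra: str,
                keys: Iterable[str] = ("comments", "results", "what", "zscc")) -> Dict[str, Any]:
    """Pick known `key: value` lines out of an item's `extra` text."""
    wanted = set(keys)
    fields: Dict[str, Any] = {}
    for line in extra.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip().lower()
        # Keep only lines that contain a colon and start with a known key.
        if sep and key in wanted:
            fields[key] = value.strip()
    if "zscc" in fields:
        # Convert the citation count to an integer when it is purely numeric.
        fields["zscc"] = int(fields["zscc"]) if fields["zscc"].isdigit() else None
    return fields
```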

- PageRank-based reference ranking algorithm

  - `rank`: computed field that ranks references in order of relevance; it uses an algorithm similar to Google's PageRank while weighting references according to their year of publication (giving more importance to recent references, which cannot have as many citations as older references anyway)
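
To make the idea concrete, here is one possible way to bias PageRank toward recent items: a power iteration whose teleport (personalization) vector grows with the publication year. This is a sketch under assumptions, not the tool's algorithm: the function name `recency_pagerank`, the linear recency weighting, the damping factor, and the iteration count are all illustrative choices.

```python
from typing import Dict, List


def recency_pagerank(cites: Dict[str, List[str]], years: Dict[str, int],
                     damping: float = 0.85, iterations: int = 50) -> Dict[str, float]:
    """Rank items from a citation graph, favouring recent publication years.

    cites maps an item key to the keys it cites; years maps every key to its year.
    """
    nodes = list(years)
    oldest = min(years.values())
    # Assumed weighting: teleport probability grows linearly with recency.
    raw = {n: 1 + years[n] - oldest for n in nodes}
    total = sum(raw.values())
    teleport = {n: raw[n] / total for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new = {n: (1 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            targets = [t for t in cites.get(n, []) if t in years]
            if targets:
                # Split this item's rank evenly among the items it cites.
                share = damping * rank[n] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Item citing nothing: spread its rank along the teleport weights.
                for t in nodes:
                    new[t] += damping * rank[n] * teleport[t]
        rank = new
    return rank
```

In a graph where two items cite the same older one, the cited item ends up with the highest score, while the more recent of the two citing items scores above the older one because of the recency-weighted teleport vector.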

## :clap: Supporters

[![Stargazers repo roster for @dhondta/zotero-cli](https://reporoster.com/stars/dark/dhondta/zotero-cli)](https://github.com/dhondta/zotero-cli/stargazers)

[![Forkers repo roster for @dhondta/zotero-cli](https://reporoster.com/forks/dark/dhondta/zotero-cli)](https://github.com/dhondta/zotero-cli/network/members)
