https://github.com/jjasghar/ai-abstract-classifier

Run abstracts through a local LLM to give you some possible scores on what you are looking for.
abstracts cfp granite llm

Last synced: about 1 year ago

Host: GitHub
URL: https://github.com/jjasghar/ai-abstract-classifier
Owner: jjasghar
License: other
Created: 2025-01-27T23:09:25.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-02-14T23:11:31.000Z (over 1 year ago)
Last Synced: 2025-06-13T16:37:22.660Z (about 1 year ago)
Topics: abstracts, cfp, granite, llm
Language: Python
Homepage:
Size: 77.1 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# ai-abstract-classifier

## Scope

This application uses [AnythingLLM][anythingllm] to take a selection of [abstracts][abstracts]
and ask a local LLM ([granite][granite] ideally) whether each abstract was written by an AI and/or
is a possible sales pitch.
It produces a file called `overview.csv` with a confidence score of up to `100` that the abstract is
AI-written or too "sales-y."
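
As a rough illustration of the scoring step, the sketch below pulls a number out of a model reply, clamps it to the `0` to `100` range, and writes a CSV. It is not taken from `main.py`: the function names, the column names (`ai_score`, `sales_score`), and the fallback of `0` for an unparseable reply are all assumptions made for this example.

```python
import csv
import io
import re


def parse_score(reply: str) -> int:
    """Pull the first integer out of a model reply and clamp it to 0..100."""
    match = re.search(r"\d+", reply)
    if match is None:
        return 0  # assumption: treat an unparseable reply as "no signal"
    return max(0, min(100, int(match.group())))


def write_overview(rows, handle) -> None:
    """Write one line per abstract; the column names here are hypothetical."""
    writer = csv.DictWriter(handle, fieldnames=["code", "ai_score", "sales_score"])
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {
        "code": "ABC123",
        "ai_score": parse_score("Score: 85"),
        "sales_score": parse_score("I'd say 140/100"),  # clamped to 100
    }
]
buffer = io.StringIO()  # stands in for open("overview.csv", "w", newline="")
write_overview(rows, buffer)
print(buffer.getvalue())
```

Clamping keeps a chatty or over-enthusiastic model reply from producing a score outside the documented range.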

You can also inject a `csv` instead of reading from an API. The supported APIs are [pretalx][pretalx] and [sessionize][sessionize] for the time being.

## CSV notes

Take a look at [test_data/testing.csv](./test_data/testing.csv) as an example. You run it via `python main.py -c CSV_FILE`; check `python main.py -h` for help.

**NOTE**: This is `,` separated for the time being, so you'll need to remove all the `,` from the actual abstracts so the file can be parsed correctly.

The columns needed in the `csv` are as follows:
- code
- title
- abstract
- description

## Configuration

Everything is configured in the [config.toml](./config.toml.example) file. Copy it to
the working directory and then do something like the following:

First, install AnythingLLM from [here](https://anythingllm.com/desktop) and configure it
along the lines of [this](https://ibm.github.io/opensource-ai-workshop/lab-3/).

**Note**: As of this release you will need to configure the model you want this to use via the
"default" AnythingLLM configuration. It seems for _now_ you can't programmatically change the workspace
for different models, so this is the workaround.

Check out [testing_notes.md](./test_data/testing_notes.md) for some of the numbers from runs with other
models on the same data.

Run the following commands:

```bash
git clone git@github.com:jjasghar/ai-abstract-classifier.git
cd ai-abstract-classifier
python3.11 -m venv --upgrade-deps venv
source venv/bin/activate
pip install -r requirements.txt
cp config.toml.example config.toml
vim config.toml
python main.py -h
```

## Utils

There is a [jsons_to_markdown.py](./utils/jsons_to_markdown.py) script that converts the [chat_primes](./chat_primes/jsons/) to readable format(s).

## License & Authors

If you would like to see the detailed LICENSE click [here](./LICENSE).

- Author: JJ Asghar

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

[anythingllm]: https://github.com/Mintplex-Labs/anything-llm
[abstracts]: https://talks.devopsdays.org/devopsdays-austin-2024/cfp
[granite]: http://ollama.com/library/granite3.1-dense
[sessionize]: https://sessionize.com
[pretalx]: https://pretalx.com/p/about/
