{"id":15442521,"url":"https://github.com/ycatsh/connor","last_synced_at":"2025-04-08T03:16:44.215Z","repository":{"id":257792984,"uuid":"862134316","full_name":"ycatsh/connor","owner":"ycatsh","description":"A starting take on a fast and local utility that organizes files based on their textual content using NLP","archived":false,"fork":false,"pushed_at":"2024-11-28T20:06:26.000Z","size":13692,"stargazers_count":61,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-31T14:14:29.112Z","etag":null,"topics":["artificial-intelligence","cli","cosine-similarity","file-organizer","gui","latent-dirichlet-allocation","natural-language-processing","pyqt6","python","sentence-transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ycatsh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-24T05:15:04.000Z","updated_at":"2025-03-03T00:55:22.000Z","dependencies_parsed_at":null,"dependency_job_id":"765c0a5b-e2a9-4fb9-8275-604dab1a23dd","html_url":"https://github.com/ycatsh/connor","commit_stats":{"total_commits":44,"total_committers":3,"mean_commits":"14.666666666666666","dds":"0.045454545454545414","last_synced_commit":"cffab0f55863f6b691264002f18bf2854c5c8fa5"},"previous_names":["ycatsh/connor"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ycatsh%2Fconnor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ycat
sh%2Fconnor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ycatsh%2Fconnor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ycatsh%2Fconnor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ycatsh","download_url":"https://codeload.github.com/ycatsh/connor/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247767236,"owners_count":20992548,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","cli","cosine-similarity","file-organizer","gui","latent-dirichlet-allocation","natural-language-processing","pyqt6","python","sentence-transformers"],"created_at":"2024-10-01T19:28:17.432Z","updated_at":"2025-04-08T03:16:44.180Z","avatar_url":"https://github.com/ycatsh.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003e\n\u003cimg src=\"./.github/logo.png\" alt=\"Connor\"\u003e\n\u003c/h1\u003e\n\nConnor is a file organizer written in [Python](https://www.python.org/). It makes use of the [sentence-transformers](https://sbert.net/) framework for the main organization process and the [PyQt6](https://doc.qt.io/qtforpython-6/) GUI toolkit for the graphical user interface. **It is by no means meant to replace organizing files by hand. It is just a concept**. 
Connor runs fully locally and uses natural language processing to quickly organize computer files based on their textual content.\n\u003cbr\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n![releases](https://img.shields.io/github/v/release/ycatsh/connor?color=507591\u0026labelColor=1d1e1f\u0026style=flat)\n![issues-open](https://img.shields.io/github/issues/ycatsh/connor?color=507591\u0026labelColor=1d1e1f\u0026style=flat)\n![stars](https://img.shields.io/github/stars/ycatsh/connor?color=507591\u0026labelColor=1d1e1f\u0026style=flat)\n\n\u003c/div\u003e\n\nhttps://github.com/user-attachments/assets/b0d151c6-9a8b-4710-92e9-d410edc57b84\n\n## Features\nConnor works locally on your computer, using the pre-trained NLP model `sentence-transformers/paraphrase-MiniLM-L6-v2` to understand the meaning of the data and calculate the cosine similarity between files. Folders are named using topic modeling with Latent Dirichlet Allocation (LDA).\n\nThe file names and contents are read, and the cosine similarity between the content of every pair of files is calculated. Files whose similarity scores are above the provided threshold are grouped as key-value pairs in a dictionary where each category corresponds to a folder.\n\nLatent Dirichlet Allocation is then used to generate topic names for the contents of each folder, i.e., the categories in the dictionary. Folders are created using the most relevant topic names, and the corresponding files are then moved into their appropriate folders.\n\nFiles that cannot be read, such as images (image support will be added later), executables, and binaries, are organized into a ``_misc`` folder based on their file extensions.\n\n\u003cbr\u003e\n\n### File Organization Summary\n1. Organize files within a selected folder or manually uploaded files (uploading files is supported only in the GUI).\n2. 
Organize text-based files (`.docx`, `.txt`, `.pdf`, etc.) using NLP.\n3. Create a separate folder named \"Miscellaneous\" for dissimilar or unprocessable files based on extension.\n4. Provide a summary (tree structure) of the organization process upon completion.\n\n### Customization Options\n1. Similarity Threshold: Allows you to choose a similarity percentage threshold for grouping similar files.\n2. Reading Word Limit: You can set a limit on the number of words to read from the file content.\n3. Folder Name Word Limit: You can specify the maximum number of words allowed in the created folder names.\n4. Default Parameters: You can modify these three parameters and save them for future sessions.\n\n### User Preferences\n- Command Line Interface: Simple and concise command line interface to quickly organize folders.\n- Graphical User Interface: Provides a simple and straightforward GUI for ease of use, with a file upload feature.\n\n\n\u003cbr\u003e\n\u003cbr\u003e\n\n\n## Installation\nInstallation instructions are provided for both the GUI and the CLI. You can choose the one you want to install. If you build the application from [source](https://github.com/ycatsh/connor#source), adding the run file to your PATH is recommended.\n\n**Install Connor via pip:**\n1. Make sure you have `python` and `pip` installed and added to your PATH.\n2. Run `pip install connor-nlp`  \n\n\u003cbr\u003e\n\n**Install the GUI version of Connor (executable):**\n1. Go to the [latest release](https://github.com/ycatsh/connor/releases).\n2. Follow the steps there.\n3. Run the executable (`.exe`).  \n\n\n\u003cbr\u003e\n\u003cbr\u003e\n\n\n## Usage\n\n### Command Structure\n\n```bash\nconnor [command] [options]\n```\n\n### Commands\n#### `run`: Run the folder organization process.\n\n**Usage:**\n```bash\nconnor run \u003cfolder_path\u003e\n```\n\n**Options:**\n- `folder_path`: Required. 
Absolute path to the folder that you want to organize.\n\n**Example:**\n```bash\nconnor run /path/to/your/folder\n```\n\n\u003cbr\u003e\n\n#### `settings`: Update the default settings for the tool.\n\n**Usage:**\n```bash\nconnor settings [options]\n```\n\n**Options:**\n- `-f, --folder-word-limit`: Set the maximum number of words in folder names. (default: 3)\n- `-r, --reading-limit`: Specify the word limit for reading files. (default: 200)\n- `-t, --similarity-threshold`: Define the similarity threshold percentage. (default: 50)\n- `--show`: Show the current settings.\n\n**Example:**\n```bash\nconnor settings -f 2 -r 150 -t 60\n```\n\n```console\n$ connor settings --show\nTo see how to update: Connor settings [-h]\n\nCurrent settings:\n  folder words limit     3\n  reading limit          200\n  similarity threshold   50%\n```\n\n\u003cbr\u003e\n\n#### `--gui`: Run Connor as a full-fledged GUI from the terminal.\n\n**Usage:**\n```bash\nconnor --gui\n```\n\n\u003cbr\u003e\n\n### Help\nTo view help information for commands and options, use the ``-h`` or `--help` flag.  \n\n**Example:**\n```console\n$ connor -h\nusage: Connor [-h] [--gui] {settings,run} ...\n\nConnor: Fast and local NLP file organizer\n\npositional arguments:\n  {settings,run}\n    settings      Update the settings for the organizer\n    run           Run the folder organization process\n\noptions:\n  -h, --help      show this help message and exit\n  --gui           Run the application in GUI mode.\n```\n\n\u003cbr\u003e\n\u003cbr\u003e\n\n\n## Source\n#### 1. Clone the repository:\n```bash\ngit clone https://github.com/ycatsh/connor.git\ncd connor\n```  \n#### 2. Create and activate a virtual environment:\n```bash\npython3 -m venv venv\nsource venv/bin/activate\n```  \n#### 3. Install dependencies:\n```bash\npip3 install -r requirements.txt\n```\n#### 4. Run the program:\nFor GUI:\n```bash\npython3 run.py --gui\n```\nFor CLI:\n```bash\npython3 run.py -h\n```\n\n#### 5. 
Install locally (optional):\n```bash\npip3 install .\n```  \n  \n**Example:**  \n```bash\nconnor --gui\n```\n```bash\nconnor -h\n```\n\n\n\u003cbr\u003e\n\u003cbr\u003e\n\n\n## License\nThis project is distributed under the MIT License, which can be found in the LICENSE file in the root directory of the project. I reserve the right to place future versions of this project under a different license.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fycatsh%2Fconnor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fycatsh%2Fconnor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fycatsh%2Fconnor/lists"}