Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/vivekkdagar/pyquantify
CLI-based Semantic Analysis Tool
https://github.com/vivekkdagar/pyquantify
ai cli github linguagem-python linux machine-learning matplotlib-pyplot naive-bayes-classifier nlp package projects pypi python-packaging python3 semantic-analysis windows windows-11 wordcloud
Last synced: 2 months ago
JSON representation
CLI-based Semantic Analysis Tool
- Host: GitHub
- URL: https://github.com/vivekkdagar/pyquantify
- Owner: vivekkdagar
- License: gpl-3.0
- Created: 2024-01-16T11:11:16.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-04-06T18:55:44.000Z (9 months ago)
- Last Synced: 2024-09-28T23:01:18.588Z (3 months ago)
- Topics: ai, cli, github, linguagem-python, linux, machine-learning, matplotlib-pyplot, naive-bayes-classifier, nlp, package, projects, pypi, python-packaging, python3, semantic-analysis, windows, windows-11, wordcloud
- Language: Python
- Homepage: https://pypi.org/project/pyquantify/
- Size: 2.01 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# pyquantify
Pyquantify is a powerful CLI tool for semantic analysis. It leverages natural language processing to unveil insights from text, files, or websites, empowering sophisticated data visualization and exploration.
## Badges
![Pyquantify Version](https://img.shields.io/badge/PyQuantify-1.2.0-brightgreen)
[![GPLv3 License](https://img.shields.io/badge/License-GPL%20v3-yellow.svg)](https://opensource.org/licenses/)
[![Python 3](https://img.shields.io/badge/Python-3-brightgreen)](https://www.python.org/)
## Table Of Contents
* [Features](#features)
* [Installation](#installation)
- [Install pyquantify with pip](#install-pyquantify-with-pip)
- [or you can build locally](#or-you-can-build-locally)
* [Before running pyquantify](#before-running-pyquantify)
* [Usage/Examples](#usageexamples)
* [FAQ](#faq)
* [Screenshots](#screenshots)
* [Acknowledgements](#acknowledgements)## Demo
## Features
1. **Text Summarization**:
- Utilizes the BERT model for summarizing text.
- Provides caching functionality to speed up summarization for previously processed text.
- Supports exporting summaries to text files.2. **Text Analysis**:
- Preprocesses text data including tokenization and part-of-speech tagging.
- Generates various metrics such as character count, word count, sentence count, etc.
- Analyzes morphological data including lemmatized forms, part-of-speech tags, and word frequencies.
- Performs sentiment analysis using the TextBlob library.
- Visualizes data through word clouds and word frequency charts.3. **Text Processing**:
- Offers functionality for cleaning and preprocessing text data.
- Implements functions for generating word clouds and word frequency charts.
- Calculates cosine similarity between two texts.4. **Data Loading and Exporting**:
- Supports loading text data from raw input, files, or websites.
- Provides export functionality for analyzed data, summaries, sentiment analysis results, and keywords extracted from text.5. **CLI Interface**:
- Implements a command-line interface (CLI) using Click library.
- Offers commands for various text analysis and summarization tasks, including data visualization and sentiment analysis.
- Provides options for specifying data loading mode and exporting analysis results.6. **Parallel Processing**: Utilizes multiprocessing and concurrent.futures for parallel processing of tasks, improving performance for tasks like sentiment analysis and summarization.
7. **Unit Testing**:
- Includes unit tests for different modules and functionalities using the `unittest` framework.
- Uses mocking to isolate and test individual components such as data loading, summarization, and exporting.8. **Exception Handling and Error Reporting**:
- Handles exceptions gracefully and provides informative error messages.
- Reports errors such as unsupported operating systems, file not found, and invalid input modes.
## Installation#### Install pyquantify with pip
```bash
pip install pyquantify
```
#### or you can build locallyClone the project
```bash
git clone https://github.com/vivekkdagar/pyquantify.git
```Go to the project directory
```bash
cd pyquantify
```Build the package:
```bash
python3 -m build
```Install the package:
```bash
pip install dist/*gz
```## Before running pyquantify
Clone the project
```bash
git clone https://github.com/vivekkdagar/pyquantify.git
```Go to the project directory
```bash
cd pyquantify
```Run the script nltk_datasets.py in scripts directory
```bash
python3 nltk_datasets.py
```Download dataset for spacy
```bash
python3 -m spacy download en_core_web_sm
```## Usage/Examples
Pyquantify provides several commands for analyzing and visualizing text data. Below is a guide on how to use the key functionalities:
1. **Search for a Specific Word in Morphological Analysis:**
```bash
pyquantify search-word --mode [raw/file/website] --word [desired_word]
```- `--mode`: Specify the data loading mode (raw input, file, or website).
- `--word`: Specify the word you want to search for.2. **Generate Word Frequency Plot:**
```bash
pyquantify visualize --mode [raw/file/website] --freq-chart --export
```- `--mode`: Specify the data loading mode (raw input, file, or website).
- `--freq-chart`: Flag to generate word frequency chart.
- `--export`: Optional flag to export the frequency plot to a file.3. **Generate Word Cloud:**
```bash
pyquantify visualize --mode [raw/file/website] --wordcloud --export
```- `--mode`: Specify the data loading mode (raw input, file, or website).
- `--wordcloud`: Flag to generate word cloud.
- `--export`: Optional flag to export the word cloud to a file.4. **Text Analysis and Metrics Generation:**
```bash
pyquantify analyze --mode [raw/file/website] --n [number_of_rows] --export
```- `--mode`: Specify the data loading mode (raw input, file, or website).
- `--n`: Optional parameter to display a specific number of rows in the analysis.
- `--export`: Optional flag to export the analysis results to files.5. **Summarize Text:**
```bash
pyquantify summarize --mode [raw/file/website] --export
```- `--mode`: Specify the data loading mode (raw input, file, or website).
- `--export`: Optional flag to export the summary to a file.6. **Sentiment Analysis**
```bash
pyquantify sentiment-analysis --mode [raw/file/website] --export
```- `--mode`: Specify the data loading mode (raw input, file, or website).
- `--export`: Optional flag to export the summary to a file.7. **View the pyquantify git page**
```bash
pyquantify git
```8. **Extract keywords from the data**
```bash
pyquantify keywords --mode [raw/file/website] --export
```- `--mode`: Specify the data loading mode (raw input, file, or website).
- `--export`: Optional flag to export the extracted keywords to a file.9. **Calculate Cosine Similarity:**
```bash
pyquantify similarity --mode [raw/file/website] --other [raw/file/website]
```- `--mode`: Specify the data loading mode for the first text (raw input, file, or website).
- `--other`: Specify the data loading mode for the second text (raw input, file, or website).Feel free to explore additional options and functionalities by checking the help documentation for each command:
```bash
pyquantify [command] --help
```## FAQ
### Q: What is Pyquantify?
Pyquantify is a tool designed for in-depth analysis of textual data, focusing on extracting meaning and linguistic insights. It provides features like word frequency, morphology, and metrics generation, enhancing data exploration and visualization.
### Q: Why Develop Pyquantify as a Semantic Profiler?
Pyquantify was created for the DSA subject in the fifth semester of college. The goal was to offer a versatile NLP tool, empowering users to analyze and profile text efficiently. The tool's features aim to deepen understanding and exploration of linguistic aspects within textual data.
### Q: Why Did Pyquantify evolve from a Word Frequency Counter?
Originally conceived as a word frequency counter, Pyquantify's development took a different direction. The decision to expand its capabilities was driven by the desire to create a more comprehensive tool for natural language processing. The project evolved to encompass semantic profiling, offering a richer set of features such as morphology analysis, metrics generation, and enhanced data visualization. This shift aimed to provide users with a more powerful and versatile solution for exploring and understanding textual data beyond simple word frequency analysis.
### Q: Why the name change from NLPFreq to Pyquantify?
NLPFreq felt limiting and didn't capture the full scope of the project. Pyquantify more accurately reflects its capabilities as a Python-based tool for quantitative data analysis.
## Screenshots![App Screenshot](https://via.placeholder.com/468x300?text=App+Screenshot+Here)
## Acknowledgements
- [NeuralNine's Publish Your Own Python Package](https://www.youtube.com/watch?v=tEFkHEKypLI)
- [Github Readme Generator](https://readme.so/)
- [Table of Content Generator](https://derlin.github.io/bitdowntoc/)