https://github.com/web-lifter/keyword-clustering
A script designed to help SEO specialists, data analysts, and digital marketers optimise their keyword targeting strategies.
keyword-analysis keyword-extraction semantic-analysis semantic-segmentation seo
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/web-lifter/keyword-clustering
- Owner: web-lifter
- Created: 2023-08-28T03:34:31.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-23T02:19:56.000Z (3 months ago)
- Last Synced: 2025-03-28T20:51:28.395Z (2 months ago)
- Topics: keyword-analysis, keyword-extraction, semantic-analysis, semantic-segmentation, seo
- Homepage: https://weblifter.com.au
- Size: 1.51 MB
- Stars: 4
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# 3D Keyword Clustering for SEO
## Overview
The 3D Keyword Clustering script is designed to help SEO specialists, data analysts, and digital marketers optimise their keyword targeting strategies. By using machine learning algorithms and natural language processing techniques, this script clusters keywords in a 3D space based on their relevance to specific web pages and search queries.
## Table of Contents
- [Installation](#installation)
- [Usage](#usage)
- [How It Works](#how-it-works)
- [Contributing](#contributing)
- [License](#license)
## Installation
1. Clone this repository:
```bash
git clone https://github.com/web-lifter/keyword-clustering.git
```
## Usage
To run the script, navigate to the directory where main.py is located and execute:
```bash
python main.py
```
## How It Works
### TF-IDF Vectorization
The script uses Term Frequency-Inverse Document Frequency (TF-IDF) to convert each keyword into a numerical vector. This quantifies the 'importance' of each keyword in relation to the corpus.
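For reference, this is the weighting that scikit-learn's `TfidfVectorizer` applies under its default settings (smoothed IDF, L2 normalisation). The symbols $n$, $\mathrm{df}$, and $w$ are notation introduced here for illustration and do not appear in the script:

$$
\mathrm{idf}(t) = \ln\frac{1 + n}{1 + \mathrm{df}(t)} + 1,
\qquad
w(t, d) = \mathrm{tf}(t, d)\cdot\mathrm{idf}(t)
$$

Here $n$ is the number of documents in the corpus, $\mathrm{df}(t)$ is the number of documents containing term $t$, and $\mathrm{tf}(t, d)$ is the count of $t$ in document $d$. Each document vector is then scaled to unit Euclidean length. With `ngram_range=(1, 2)`, a "term" is either a single word or a two-word sequence.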
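```python
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import TfidfVectorizer

def fit_vectorizer(corpus):
    stop_words = set(stopwords.words('english'))  # needs nltk.download('stopwords') once
    return TfidfVectorizer(stop_words=list(stop_words), ngram_range=(1, 2)).fit(corpus)
```
### Cosine Similarity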
Cosine similarity is calculated between the TF-IDF vectors of the keyword and the unique web pages/search queries.
```python
from sklearn.metrics.pairwise import cosine_similarity

def calculate_similarity(vectorizer, phrase1, phrase2):
    vectors = vectorizer.transform([phrase1, phrase2]).toarray()
    return cosine_similarity(vectors)[0, 1]  # off-diagonal entry: phrase1 vs phrase2
```
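To make the arithmetic behind the score concrete, here is a minimal standard-library sketch of cosine similarity on hand-written vectors. It is an illustration only, not the script's code: the `cosine` helper, the three-word vocabulary, and the weights are all made up for the example.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a vector with no weighted terms matches nothing
    return dot / (norm_u * norm_v)

# Hypothetical term weights over the vocabulary ["seo", "audit", "pricing"].
keyword = [0.8, 0.6, 0.0]
page_a = [0.6, 0.8, 0.0]   # shares both weighted terms with the keyword
page_b = [0.0, 0.0, 1.0]   # shares no terms with the keyword

print(round(cosine(keyword, page_a), 2))  # 0.96
print(cosine(keyword, page_b))            # 0.0
```

Because TF-IDF weights are never negative, scores fall between 0 (no shared terms) and 1 (identical direction).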
### Keyword Scoring
Each keyword is given a similarity score based on its relevance to specific web pages and primary topics.
```python
def compute_keyword_similarity_scores(keyword, topics, pages, vectorizer):
    # Score the keyword against every primary topic.
    topic_scores = {topic: calculate_similarity(vectorizer, keyword, topic) for topic in topics}
    ...
```
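The rest of that function is elided in this README, and the sketch below does not attempt to reconstruct it. It only illustrates one way per-topic and per-page scores might be used: picking the best match for a keyword. The helpers `score_against` and `best_match` are hypothetical. So that the block runs without scikit-learn, it uses a simple word-overlap (Jaccard) measure as a stand-in for the TF-IDF cosine similarity described above.

```python
def jaccard(a, b):
    """Stand-in similarity: shared words divided by total distinct words."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def score_against(keyword, candidates, similarity=jaccard):
    """Map each candidate phrase to its similarity with the keyword."""
    return {c: similarity(keyword, c) for c in candidates}

def best_match(scores):
    """Return the (phrase, score) pair with the highest score."""
    return max(scores.items(), key=lambda item: item[1])

# Hypothetical inputs, for illustration only.
topics = ["keyword research", "link building"]
pages = ["keyword research tools guide", "link building services"]
keyword = "free keyword research tools"

topic_scores = score_against(keyword, topics)
page_scores = score_against(keyword, pages)

print(best_match(topic_scores))  # ('keyword research', 0.5)
print(best_match(page_scores))   # ('keyword research tools guide', 0.6)
```

Passing the similarity function as a parameter means a TF-IDF based scorer with the same two-argument shape could be substituted for `jaccard`.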
### Benefits
This approach offers a nuanced way to cluster keywords for SEO. By thinking of a website as a 3D object rather than a 2D plane, we can better understand the relationships between keywords, web pages, and user queries. This could significantly enhance SEO strategies, making them more dynamic and tailored to various dimensions of user engagement and content relevance.
## Contributing
If you would like to contribute, please read CONTRIBUTING.md for details on the code of conduct and the process for submitting pull requests.
## License
This project is licensed under the MIT License. See the LICENSE.md file for details.