https://github.com/laothes/community-dectection-airlines
Python implementations of Community Detection Algorithms, including Cliques, k-cores, SNN, IHCS, LCMA, Girvan-Newman.
https://github.com/laothes/community-dectection-airlines
cliques community-detection flights girvan-newman-algorithm ihcs k-core lcma master pyecharts snn
Last synced: 3 months ago
JSON representation
Python implementations of Community Detection Algorithms, including Cliques, k-cores, SNN, IHCS, LCMA, Girvan-Newman.
- Host: GitHub
- URL: https://github.com/laothes/community-dectection-airlines
- Owner: laothes
- License: mit
- Created: 2024-12-03T14:09:49.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-12-15T03:13:30.000Z (6 months ago)
- Last Synced: 2025-02-07T21:39:00.762Z (5 months ago)
- Topics: cliques, community-detection, flights, girvan-newman-algorithm, ihcs, k-core, lcma, master, pyecharts, snn
- Language: HTML
- Homepage:
- Size: 6.77 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Community Detection in Airport Networks
This repository implements various community detection algorithms in Python to analyze airport networks. The algorithms aim to identify clusters of airports with high connectivity within the network.
## Project Structure
```
.
├── data/
│ ├── airports.dat # Airport information dataset
│ └── routes.dat # Flight routes dataset
├── echart/ # Dynamic visualization results
│ ├── AirportBased/ # Visualizations for airport-based analysis
│ │ └── *.html # Interactive network visualizations
│ └── RouteBased/ # Visualizations for route-based analysis
│ └── *.html # Interactive network visualizations
├── algorithms/
│ ├── ihcs.py # Iterated Highly Connected Subgraphs implementation
│ ├── lcma.py # Local Clique Merging Algorithm implementation
│ ├── snn.py # Shared Near Neighbor implementation
│ └── divisive.py # Modified Girvan-Newman algorithm implementation
├── notebooks/
│ ├── community-detection_airportbased.ipynb # Airport-based analysis
│ └── community-detection_routebased.ipynb # Route-based analysis
└── requirements.txt
```## Data Source
This project utilizes data from OpenFlights: https://openflights.org/data.php.
- `airports.dat`: Contains airport information including ID, name, location, etc.
- `routes.dat`: Contains route information including source and destination airports.## Implemented Algorithms
* **Cliques**: Identifies complete subgraphs where every pair of nodes is connected.
* **k-cores**: Finds dense subgraphs where each node has at least k neighbors within the subgraph.
* **Shared Near Neighbor (SNN)**: Groups nodes based on the number of shared neighbors they have.
* **Iterated Highly Connected Subgraphs (IHCS)**: Iteratively identifies and merges highly connected subgraphs.
* **Local Clique Merging Algorithm (LCMA)**: Progressively merges local cliques to form larger communities.
* **Girvan-Newman Algorithm**: A divisive hierarchical clustering algorithm based on edge betweenness centrality, adapted specifically for airport networks.## Dependencies
Core Dependencies:
* networkx (for graph manipulation and algorithms)
* pandas (for data manipulation)
* numpy (for numerical operations)
* pyecharts (for network visualization)
- pyecharts.charts
- pyecharts.options
- pyecharts.globalsPython Standard Library:
* typing (for type hints)
* itertools (for combinatorial operations)
* collections (for specialized container datatypes)You can install all required packages using:
```bash
pip install -r requirements.txt
```## Analysis Notebooks
The project includes two main analysis notebooks with different network construction approaches:
1. `community-detection_airportbased.ipynb`
- Airport-centric network construction:
* Filters airports based on minimum flight threshold (n = 300 flights)
* Only includes routes where both source and destination airports meet this threshold
* Results in a network focused on major aviation hubs
- Implements all algorithms using this filtered network structure
- Visualizes communities and their relationships
- Suitable for analyzing patterns among major airports2. `community-detection_routebased.ipynb`
- Route-centric network construction:
* Filters routes based on minimum frequency threshold (n = 5 occurrences)
* Only includes airports that appear in these filtered routes
* Results in a network focused on frequently used connections
- Implements all algorithms using this filtered network structure
- Includes route-specific visualizations and analysis
- Better for analyzing actual flight traffic patternsBoth approaches allow for adjusting the threshold n to analyze different network densities and focus on different aspects of the air transportation network.
## Independent Algorithm Implementations
The project features four custom algorithm implementations:
1. `ihcs.py` - Iterated Highly Connected Subgraphs
- Iteratively identifies dense subgraphs
- Merges subgraphs based on connectivity thresholds
- Optimized for airport network analysis2. `lcma.py` - Local Clique Merging Algorithm
- Identifies and merges local cliques
- Progressive community formation
- Specifically adapted for airport connectivity patterns3. `snn.py` - Shared Near Neighbor
- Groups nodes based on shared neighbor counts
- Identifies communities through neighbor similarity
- Effective for finding naturally clustered airports4. `divisive.py` - Modified Girvan-Newman Algorithm
- Based on NetworkX's girvan_newman implementation
- Customized for airport network analysis
- Uses edge betweenness centrality to identify community boundaries
- Includes additional geographical considerations for airport communities## Visualization Results
The `echart` folder contains interactive HTML visualizations organized by analysis type:
### Airport-Based Analysis (`echart/AirportBased/`)
- Network visualizations based on airport connectivity
- Communities detected using airport-centric approach
- Geographic distribution of major airport hubs### Route-Based Analysis (`echart/RouteBased/`)
- Network visualizations based on route frequency
- Communities detected using route-centric approach
- Frequent route pattern visualizations### Visualization Suggestions
1. Display Dimensions:
- Width: 1600px
- Height: 800px
- **Note**: Best viewed on large screens or displays with at least 1920×1080 resolution2. Viewing Instructions:
- Open any `.html` file in the respective folders using a web browser
- Allow a few moments for large visualizations to fully render
- Use browser zoom controls to adjust the view if needed## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
* OpenFlights.org for providing the airport network dataset
* NetworkX developers for the graph analysis library
* PyeCharts team for the visualization tools