https://github.com/codegen-sh/codebase-analytics
Analytics for codebase maintainability and complexity
https://github.com/codegen-sh/codebase-analytics
Last synced: 3 months ago
JSON representation
Analytics for codebase maintainability and complexity
- Host: GitHub
- URL: https://github.com/codegen-sh/codebase-analytics
- Owner: codegen-sh
- Created: 2025-02-14T00:06:49.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-25T22:31:02.000Z (about 1 year ago)
- Last Synced: 2025-02-25T23:27:29.449Z (about 1 year ago)
- Language: TypeScript
- Homepage: https://codebase-analytics.vercel.app
- Size: 153 KB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Codebase Analytics
A web application that provides comprehensive analytics for GitHub repositories. The project combines a Modal-based FastAPI backend with a Next.js frontend to provide efficient and beautiful codebase metrics.
Users submit a GitHub repository through the frontend. The Modal API processes the request using custom implementations of the metric calculations using the `codegen` library. Results are returned to the frontend for display.
## How it Works
### Backend (Modal API)
The backend is built using [Modal](https://modal.com/) and [FastAPI](https://fastapi.tiangolo.com/), providing a serverless API endpoint for code research.
There is a main API endpoint that handles codebase analysis requests. It uses the `codegen` library for these operations.
The following metrics are calculated:
- `Maintainability Index`: Measures the maintainability of the codebase, on a scale of 0-100.
- `Cyclomatic Complexity`: Measures the complexity of the codebase. Higher values indicate more complex code.
- `Halstead Volume`: Quantifies the information content in the code based on operators and operands. Higher values indicate more complex code.
- `Depth of Inheritance`: Measures the depth of inheritance of the codebase.
- `Lines of Code (LOC, SLOC, LLOC)`: Measures the number of lines of code in the codebase. LOC, SLOC, and LLOC represent the total, source, and logical lines of code, respectively.
- `Comment Density`: Measures the density of comments in the codebase, given as a percentage of the total lines of code. More comments can indicate better documentation.
```python
complexity = calculate_cyclomatic_complexity(func)
operators, operands = get_operators_and_operands(func)
volume, _, _, _, _ = calculate_halstead_volume(operators, operands)
loc = len(func.code_block.source.splitlines())
mi_score = calculate_maintainability_index(volume, complexity, loc)
```
Monthly commit history is also pulled from the GitHub API and displayed in a chart.
The codebase is also graded on a scale of A-F for overall quality and maintainability. The complexity of the codebase is also given a qualitative rank.
### Frontend (Next.js)
The frontend provides an interface for users to submit a GitHub repository and research query. The components come from the [shadcn/ui](https://ui.shadcn.com/) library. This triggers the Modal API to perform the code research and returns the results to the frontend.
## Getting Started
1. Set up environment variables in an `.env` file:
```
OPENAI_API_KEY=your_key_here
```
2. Deploy or serve the Modal API:
```bash
modal serve backend/api.py
```
`modal serve` runs the API locally for development, creating a temporary endpoint that's active only while the command is running.
```bash
modal deploy backend/api.py
```
`modal deploy` creates a persistent Modal app and deploys the FastAPI app to it, generating a permanent API endpoint.
After deployment, you'll need to update the API endpoint in the frontend configuration to point to your deployed Modal app URL.
3. Run the Next.js frontend:
```bash
cd frontend
npm install
npm run dev
```
## Learn More
More information about the `codegen` library can be found [here](https://codegen.com/).