https://github.com/wal33d/us-school-districts-api
High-performance, memory-optimized API for US school district lookups using official NCES boundary data
api education express geospatial gis nces nodejs open-data pm2 rest-api school-districts typescript
- Host: GitHub
- URL: https://github.com/wal33d/us-school-districts-api
- Owner: Wal33D
- License: mit
- Created: 2025-04-15T01:40:22.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-09-12T04:59:07.000Z (10 months ago)
- Last Synced: 2025-10-04T22:52:41.605Z (9 months ago)
- Topics: api, education, express, geospatial, gis, nces, nodejs, open-data, pm2, rest-api, school-districts, typescript
- Language: TypeScript
- Homepage: https://github.com/Wal33D/us-school-districts-api#readme
- Size: 266 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# US School Districts Service
High-performance API for US school district boundary lookups using NCES official data with SQLite.
## Overview
Provides school district information based on geographic coordinates for various platforms. Uses official NCES (National Center for Education Statistics) shapefile data stored in SQLite, with bounding-box spatial indexing for fast lookups. Built with TypeScript and Express, with a memory footprint of about 100MB.
## Performance Improvements (v2.0)
| Metric | Old (R-tree) | New (SQLite) | Improvement |
|--------|--------------|--------------|-------------|
| Memory Usage | 1,700 MB | 100 MB | **95% reduction** |
| Success Rate | 70% | 100% | **Perfect reliability** |
| Response Time | Degraded under load | <50ms consistent | **Stable performance** |
| Startup Time | 30+ seconds | <1 second | **30x faster** |
## Features
- **SQLite Database** - Pre-built spatial database for instant lookups
- **Spatial Indexing** - Bounding box indexes for O(log n) performance
- **Memory Efficient** - Only ~100MB RAM usage (vs 1.7GB before)
- **100% Reliability** - No timeouts or failures in a 1000-request stress test (10 concurrent)
- **NCES Data** - Official government school district boundaries
- **TypeScript** - Full type safety with zero errors/warnings
- **Production Ready** - PM2 support with graceful shutdown
## Installation
```bash
# Install dependencies
npm install
# Build TypeScript
npm run build
# Setup database (one-time - requires NCES shapefile)
npm run setup-db
# Run development server
npm run dev
# Run production
npm start
```
## Database Setup
The service requires a one-time database creation from NCES shapefiles:
1. **Download NCES Data**:
   - Visit [NCES School District Boundaries](https://nces.ed.gov/programs/edge/Geographic/DistrictBoundaries)
   - Download the latest shapefile (e.g., `EDGE_SCHOOLDISTRICT_TL24_SY2324.zip`)
   - Extract to `school_district_data/` directory
2. **Build Database**:
   ```bash
   npm run build
   npm run setup-db
   ```
This creates a ~200MB SQLite database with 13,382 US school districts.
## Environment Variables
### Core Settings
| Variable | Description | Default |
|----------|-------------|---------|
| PORT | Service port | 3712 |
| NODE_ENV | Environment mode | development |
| LOG_LEVEL | Logging level | info |
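A minimal sketch of how these variables could be read with their documented defaults. The `ServiceConfig` type and `loadConfig` function are hypothetical names used for illustration; they are not taken from the project's source.

```typescript
// Hypothetical config loader. Only the variable names and defaults come from
// the table above; everything else is an illustrative assumption.
interface ServiceConfig {
  port: number;
  nodeEnv: string;
  logLevel: string;
}

function loadConfig(env: Record<string, string | undefined>): ServiceConfig {
  const port = Number.parseInt(env.PORT ?? "3712", 10);
  // Reject values that are not valid TCP port numbers.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    port,
    nodeEnv: env.NODE_ENV ?? "development",
    logLevel: env.LOG_LEVEL ?? "info",
  };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`.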
## API Endpoints
### Health Check
```http
GET /health
```
Returns service health status and memory usage.
### Single Lookup
```http
GET /school-district?lat=42.3601&lng=-71.0589
```
**Response:**
```json
{
  "status": true,
  "districtId": "2502790",
  "districtName": "Boston School District",
  "gradeRange": {
    "lowest": "Pre-K",
    "highest": "12"
  },
  "area": {
    "landSqMiles": 48.34,
    "waterSqMiles": 41.27
  },
  "schoolYear": "2023-2024",
  "stateCode": "25",
  "coordinates": {
    "lat": 42.3601,
    "lng": -71.0589
  }
}
```
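For TypeScript clients, the sample response can be described with a type and checked at runtime before use. This is a sketch inferred only from the sample above: the names `DistrictLookupResponse` and `isDistrictLookupResponse` are hypothetical, and treating every field as required is an assumption.

```typescript
// Shape inferred from the sample response; which fields are optional is unknown.
interface DistrictLookupResponse {
  status: boolean;
  districtId: string; // kept as a string so leading zeros survive
  districtName: string;
  gradeRange: { lowest: string; highest: string };
  area: { landSqMiles: number; waterSqMiles: number };
  schoolYear: string;
  stateCode: string; // "25" in the sample is the FIPS code for Massachusetts
  coordinates: { lat: number; lng: number };
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// Hypothetical runtime guard for parsed JSON of unknown shape.
function isDistrictLookupResponse(value: unknown): value is DistrictLookupResponse {
  if (!isRecord(value)) return false;
  const { gradeRange, area, coordinates } = value;
  return (
    typeof value.status === "boolean" &&
    typeof value.districtId === "string" &&
    typeof value.districtName === "string" &&
    typeof value.schoolYear === "string" &&
    typeof value.stateCode === "string" &&
    isRecord(gradeRange) &&
    typeof gradeRange.lowest === "string" &&
    typeof gradeRange.highest === "string" &&
    isRecord(area) &&
    typeof area.landSqMiles === "number" &&
    typeof area.waterSqMiles === "number" &&
    isRecord(coordinates) &&
    typeof coordinates.lat === "number" &&
    typeof coordinates.lng === "number"
  );
}
```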
### POST Lookup
```http
POST /lookup
Content-Type: application/json

{
  "lat": 42.3601,
  "lng": -71.0589
}
```
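A request body like this is worth validating before it reaches the database. The sketch below shows one way to do that; `parseCoordinate` is a hypothetical helper, not the service's actual validation, and the error messages are illustrative.

```typescript
interface Coordinate {
  lat: number;
  lng: number;
}

// Hypothetical validator: returns a Coordinate on success or an error message.
function parseCoordinate(body: unknown): Coordinate | string {
  if (typeof body !== "object" || body === null) {
    return "Body must be a JSON object";
  }
  const { lat, lng } = body as { lat?: unknown; lng?: unknown };
  // Latitude is bounded by the poles, longitude by the antimeridian.
  if (typeof lat !== "number" || !Number.isFinite(lat) || lat < -90 || lat > 90) {
    return "lat must be a number between -90 and 90";
  }
  if (typeof lng !== "number" || !Number.isFinite(lng) || lng < -180 || lng > 180) {
    return "lng must be a number between -180 and 180";
  }
  return { lat, lng };
}
```

Returning a message instead of throwing keeps the handler free to map invalid input to a `400` response, if that is the behavior wanted.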
### Batch Lookup
```http
POST /school-districts/batch
Content-Type: application/json

{
  "coordinates": [
    {"lat": 42.3601, "lng": -71.0589},
    {"lat": 40.7128, "lng": -74.0060}
  ]
}
```
Returns an array of results, one per coordinate (max 100 coordinates per batch).
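Clients with more than 100 coordinates need to split them across several requests. A minimal sketch of that splitting step, using hypothetical names (`chunkCoordinates`, `MAX_BATCH_SIZE`); only the limit of 100 comes from this README.

```typescript
interface Coordinate {
  lat: number;
  lng: number;
}

// Batch limit stated in this README.
const MAX_BATCH_SIZE = 100;

// Hypothetical helper: splits a list into consecutive chunks of at most `size` items.
function chunkCoordinates(
  coords: readonly Coordinate[],
  size: number = MAX_BATCH_SIZE
): Coordinate[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: Coordinate[][] = [];
  for (let i = 0; i < coords.length; i += size) {
    chunks.push(coords.slice(i, i + size));
  }
  return chunks;
}
```

Each chunk can then be sent as the `coordinates` array of one batch request.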
## Production Deployment
### PM2 Configuration
```bash
# Start with PM2
pm2 start ecosystem.config.js
# Monitor
pm2 monit
# Reload (zero-downtime)
pm2 reload ecosystem.config.js
# View logs
pm2 logs us-school-districts-api
```
The service is configured with:
- Auto-restart on failure
- Memory limit: 150MB (auto-restart if exceeded)
- Daily restart at 3 AM for maintenance
- Graceful shutdown handling
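For orientation, the settings listed above map onto standard PM2 options roughly as follows. This is a sketch, not the repository's actual `ecosystem.config.js`: the script path and the `kill_timeout` value are assumptions, and in a real config file the object would be exported with `module.exports`.

```typescript
// Hypothetical PM2 app definition mirroring the bullet list above.
const ecosystem = {
  apps: [
    {
      name: "us-school-districts-api",
      script: "dist/server.js", // assumed build output path
      autorestart: true, // restart the process if it exits unexpectedly
      max_memory_restart: "150M", // restart when memory exceeds 150MB
      cron_restart: "0 3 * * *", // cron expression for a daily restart at 03:00
      kill_timeout: 5000, // assumed value: ms PM2 waits before force-killing
      env: { NODE_ENV: "production", PORT: "3712" },
    },
  ],
};
```

`kill_timeout` is what gives a graceful shutdown handler time to finish in-flight requests before PM2 kills the process.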
## Architecture
### SQLite-Based Design
```
┌─────────────────────────────────────────┐
│ School Districts API │
├─────────────────────────────────────────┤
│ │
│ Request → Bounding Box Query │
│ ↓ │
│ SQLite Spatial Index │
│ ↓ │
│ Candidates (1-5 districts) │
│ ↓ │
│ Point-in-Polygon Check │
│ ↓ │
│ Return Match or Nearest │
│ │
├─────────────────────────────────────────┤
│ Memory: ~100MB Response: <50ms │
└─────────────────────────────────────────┘
```
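The two-stage lookup in the diagram can be sketched in plain TypeScript. This is an illustration of the technique only: the service runs the bounding-box stage as an indexed SQLite query, whereas the sketch filters an in-memory array. The types and function names are hypothetical, polygons are reduced to a single outer ring, and the "nearest" fallback is omitted.

```typescript
type Point = { lat: number; lng: number };
type Ring = Array<[number, number]>; // [lng, lat] vertices

interface District {
  id: string;
  bbox: { minLng: number; minLat: number; maxLng: number; maxLat: number };
  ring: Ring; // outer boundary only; holes and multipolygons are not handled
}

// Stage 1: cheap rectangle test that discards most districts.
function inBbox(p: Point, b: District["bbox"]): boolean {
  return p.lng >= b.minLng && p.lng <= b.maxLng && p.lat >= b.minLat && p.lat <= b.maxLat;
}

// Stage 2: ray casting. Count how many edges a ray heading east from p
// crosses; an odd count means the point is inside the ring.
function inRing(p: Point, ring: Ring): boolean {
  let prev = ring[ring.length - 1];
  if (prev === undefined) return false; // empty ring
  let inside = false;
  for (const cur of ring) {
    const [xi, yi] = cur;
    const [xj, yj] = prev;
    const straddles = yi > p.lat !== yj > p.lat;
    if (straddles && p.lng < ((xj - xi) * (p.lat - yi)) / (yj - yi) + xi) {
      inside = !inside;
    }
    prev = cur;
  }
  return inside;
}

function findDistrict(p: Point, districts: District[]): District | undefined {
  return districts.filter((d) => inBbox(p, d.bbox)).find((d) => inRing(p, d.ring));
}
```

The bounding-box stage matters because the exact polygon test is linear in the number of vertices, so it should only run on the handful of candidates that survive the filter.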
### Why SQLite?
1. **Memory Efficiency**: Database on disk, not in RAM
2. **Fast Queries**: Prepared statements + spatial indexes
3. **Reliability**: Boundary data stays out of the JavaScript heap, avoiding garbage collection issues
4. **Portability**: Single file database
5. **Read-Only Safety**: Database opened in read-only mode
## CLI Tool
Test the API using the built-in CLI:
```bash
# Single coordinate lookup (the `--` passes the flags through npm to the CLI)
npm run cli -- lookup --latitude 42.3601 --longitude -71.0589
# Batch processing from file
npm run cli -- batch --file coordinates.json
# Health check
npm run cli -- health
# Performance test
npm run cli -- test --requests 1000
```
## Development
```bash
# Development with auto-reload
npm run dev
# Run tests
npm test
# Lint and format
npm run lint
npm run format
# Type checking
npm run type-check
# Clean build
npm run clean && npm run build
```
## Performance
Stress test results (1000 requests, 10 concurrent):
```
Total Requests: 1000
Successful: 1000 (100.0%)
Failed: 0 (0.0%)
Average Response: 109ms
Requests/Second: 90.58
Memory Usage: ~100MB (stable)
```
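The average response time and the throughput in these results are related through Little's law: with a fixed number of requests in flight, average latency is approximately concurrency divided by throughput. A small sketch of that arithmetic; `estimateLatencyMs` is a hypothetical helper.

```typescript
// Little's law rearranged: latency ≈ requests in flight / throughput.
function estimateLatencyMs(concurrency: number, requestsPerSecond: number): number {
  if (requestsPerSecond <= 0) {
    throw new RangeError("requestsPerSecond must be positive");
  }
  return (concurrency / requestsPerSecond) * 1000;
}

// Figures from the stress test above: 10 concurrent requests, 90.58 requests/second.
const estimatedMs = estimateLatencyMs(10, 90.58);
```

For these figures the estimate comes to roughly 110 ms, in line with the reported 109ms average. This suggests the average reflects time spent waiting behind other in-flight requests, so it is not directly comparable to the <50ms figure quoted elsewhere in this README.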
## Troubleshooting
### Database Not Found
```
Error: Database not found at .../districts.db
```
**Solution**: Extract the NCES shapefile into `school_district_data/`, then run `npm run setup-db`.
### High Memory Usage
If memory exceeds 150MB, PM2 restarts the service automatically. If this happens repeatedly, check for:
- Memory leaks in custom code
- Excessive concurrent requests
- Large batch sizes
### Slow Lookups
Normal response time is <50ms. If slower:
- Check disk I/O performance
- Verify spatial indexes exist
- Monitor concurrent request load
## License
MIT
## Author
**Waleed Judah** (Wal33D)
- Email: aquataze@yahoo.com
- GitHub: [@Wal33D](https://github.com/Wal33D)