An open API service indexing awesome lists of open source software.

https://github.com/gitchrisqueen/cpcc_task_automation

A multi-page Streamlit app showcasing generative AI uses cases with LangChain, OpenAI, and others to help automate task for instructors at CPCC.
https://github.com/gitchrisqueen/cpcc_task_automation

ai automation langchain streamlit

Last synced: about 2 months ago
JSON representation

A multi-page Streamlit app showcasing generative AI uses cases with LangChain, OpenAI, and others to help automate task for instructors at CPCC.

Awesome Lists containing this project

README

          

# CPCC Task Automation

> An intelligent automation platform that helps college instructors save hours each week by automating attendance tracking, project feedback, and exam grading.

## Overview

**CPCC Task Automation** is a Python-based educational automation platform designed for CPCC instructors. It combines web scraping (Selenium), AI-powered analysis (OpenAI API), and a multi-page Streamlit interface to automate time-consuming teaching tasks.

**Target Users**: College instructors at Central Piedmont Community College (CPCC), particularly those teaching programming courses.

**Value Proposition**: Transform 5-10 hours of weekly administrative work into 15 minutes of automated processing.

### Core Features

- **Attendance Tracking**: Automatically scrapes BrightSpace activities (assignments, quizzes, discussions) and records attendance in MyColleges
- **Project Feedback**: AI-generated personalized feedback on student submissions using GPT models
- **Exam Grading**: Automated exam grading with custom error definitions and rubrics
- **Student Lookup**: Find and analyze student information across systems

## Quick Start

### Prerequisites

- Python 3.12+
- [Poetry](https://python-poetry.org/docs/#installation) 1.7.1+
- Chrome browser (for web scraping)
- Git

### Installation

1. **Clone the repository:**
```bash
git clone https://github.com/gitchrisqueen/cpcc_task_automation
cd cpcc_task_automation
```

2. **Install dependencies:**
```bash
poetry install
```

3. **Configure credentials:**
Create `.streamlit/secrets.toml` with your credentials (see Configuration section below)

### Running the Application

#### Option 1: Interactive Launcher (Recommended)
```bash
./run.sh
```
Follow the prompts to choose between Streamlit UI or CLI mode.

#### Option 2: Streamlit UI
```bash
poetry run streamlit run src/cqc_streamlit_app/Home.py
```
Open your browser to `http://localhost:8501`

#### Option 3: Command Line Interface
```bash
poetry run python src/cqc_cpcc/main.py
```
Follow the interactive prompts to select an action.

## Configuration

### Required Settings

Configure these settings in `.streamlit/secrets.toml` (for local development) or environment variables (for deployment):

```toml
OPENAI_API_KEY = "sk-..." # OpenAI API key for AI features (legacy)
OPENROUTER_API_KEY = "sk-..." # OpenRouter API key for AI routing (recommended)
INSTRUCTOR_USERID = "your_username" # MyColleges/BrightSpace username
INSTRUCTOR_PASS = "your_password" # MyColleges/BrightSpace password
FEEDBACK_SIGNATURE = "Professor Name" # Your signature for feedback documents
ATTENDANCE_TRACKER_URL = "https://..." # Google Sheets URL for attendance tracking
```

**Note:** The application now uses OpenRouter.ai for AI routing by default, which provides automatic model selection and access to multiple AI providers. You can still use OpenAI directly if preferred.

### Optional Settings

```toml
HEADLESS_BROWSER = "true" # Run browser in headless mode
WAIT_DEFAULT_TIMEOUT = "30" # Selenium wait timeout (seconds)
MAX_WAIT_RETRY = "3" # Max retries for wait operations
RETRY_PARSER_MAX_RETRY = "3" # Max retries for LLM output parsing
```

## Tech Stack

### Core Technologies
- **Python**: 3.12+
- **Web Scraping**: Selenium 4.x, webdriver-manager, chromedriver-autoinstaller
- **AI/ML**: OpenAI API (GPT-4o, GPT-4o-mini), LangChain-Core (types), LangChain-OpenAI (optional)
- **UI Framework**: Streamlit 1.x (multi-page app)
- **Testing**: pytest, pytest-mock, pytest-asyncio

### Key Libraries
- **Data Processing**: pandas, BeautifulSoup4, python-docx, mammoth
- **Date/Time**: dateparser, datetime
- **Vector Store**: ChromaDB
- **Environment**: os-env for configuration
- **Display**: pyvirtualdisplay (for headless browser automation)

## Project Structure

```
cpcc_task_automation/
├── src/
│ ├── cqc_cpcc/ # Core automation package
│ │ ├── main.py # CLI entry point
│ │ ├── attendance.py # Attendance automation
│ │ ├── brightspace.py # BrightSpace scraping
│ │ ├── my_colleges.py # MyColleges integration
│ │ ├── project_feedback.py # AI feedback generation
│ │ ├── exam_review.py # Exam grading logic
│ │ ├── find_student.py # Student lookup
│ │ └── utilities/ # Shared utilities
│ │ ├── selenium_util.py # Selenium helpers
│ │ ├── date.py # Date/time utilities
│ │ ├── logger.py # Logging configuration
│ │ └── AI/ # AI/LangChain modules
│ └── cqc_streamlit_app/ # Streamlit UI package
│ ├── Home.py # Main entry point
│ └── pages/ # Multi-page app routes
├── tests/ # Unit and integration tests
├── docs/ # Documentation
├── scripts/ # Shell automation scripts
├── pyproject.toml # Poetry configuration
└── docker-compose.yml # Docker configuration
```

## Running Tests

```bash
# Run all tests
poetry run pytest

# Run only unit tests
poetry run pytest -m unit

# Run only integration tests
poetry run pytest -m integration

# Run with coverage report
poetry run pytest --cov=src --cov-report=html

# Show slowest tests
poetry run pytest --durations=5
```

## Available Scripts

```bash
./run.sh # Interactive launcher
./scripts/run_tests.sh # Run test suite
./scripts/kill_selenium_drivers.sh # Kill stuck Selenium processes
```

## Features in Detail

### 1. Take Attendance

Automatically calculates student attendance by analyzing activity completion in BrightSpace and records results in MyColleges and a tracking spreadsheet.

**How it works:**
1. Logs into MyColleges to retrieve course list
2. For each course, scrapes BrightSpace activities (assignments, quizzes, discussions)
3. Identifies students who completed activities in the configured date range
4. Records attendance in MyColleges and tracking spreadsheet

**Time savings**: 2-3 hours per week → 10-15 minutes automated

### 2. Give Feedback

Generates personalized, AI-powered feedback on student programming projects using OpenAI GPT models.

**How it works:**
1. Downloads student submission files from BrightSpace
2. Parses content (code, documents)
3. Sends to OpenAI with project instructions and rubric
4. Generates structured feedback with specific issues and suggestions
5. Creates Word documents with feedback

**Time savings**: 5-10 minutes per student → 30 seconds automated

### 3. Grade Exam

Automates programming exam grading using AI to identify errors according to defined rubrics.

**How it works:**
1. Analyzes exam instructions and solution code
2. Generates error definitions (syntax, logic, style)
3. Evaluates each student submission against the solution
4. Calculates scores based on rubric
5. Generates detailed feedback reports

**Time savings**: 5-8 minutes per student → 1 minute automated

## Documentation

For detailed technical documentation, see:

- **[docs/README.md](docs/README.md)** - Documentation hub and index
- **[ARCHITECTURE.md](docs/ARCHITECTURE.md)** - System architecture and design decisions
- **[PRODUCT.md](docs/PRODUCT.md)** - Product features and user personas
- **[CONTRIBUTING.md](docs/CONTRIBUTING.md)** - Development guidelines

## Important Notes

### BrightSpace Integration
- Uses Selenium web scraping (not BrightSpace API)
- Attendance is inferred from activity completion dates
- Default date range: last 7 days, ending 2 days ago
- May take 5-10 minutes per course to scrape data

### MyColleges Integration
- Requires instructor login credentials
- Supports Duo 2FA authentication
- Records official attendance per course section

### AI Features
- Uses OpenAI GPT-4o (primary) and GPT-4o-mini (retry)
- API usage is metered (pay per token)
- Typical cost: $0.05-$0.15 per student submission
- Includes retry logic for malformed responses

### Security
- Credentials stored in environment variables (not in code)
- API keys never logged
- HTTPS for all web requests
- No long-term storage of student data

## Testing

### Running Tests Locally

The project uses pytest for testing. You can run tests using the provided script:

```bash
# Interactive mode - select test type from menu
./scripts/run_tests.sh

# Non-interactive mode - specify test type
./scripts/run_tests.sh unit # Run unit tests only
./scripts/run_tests.sh all # Run all tests
./scripts/run_tests.sh integration # Run integration tests
./scripts/run_tests.sh e2e # Run end-to-end tests
```

Or use Poetry directly:

```bash
# Run unit tests with coverage
poetry run pytest -m unit --ignore=tests/e2e --cov=src

# Run all tests
poetry run pytest
```

### Continuous Integration (CI)

The project uses GitHub Actions for automated testing:

- **unit-tests.yml**: Automatically runs unit tests on every pull request and push to `master`
- Ensures code quality before merging
- Generates coverage reports
- Blocks merge if tests fail (when branch protection is enabled)

To enable required status checks for pull requests, see **[docs/ci-branch-protection.md](docs/ci-branch-protection.md)** for detailed setup instructions.

## GitHub Actions

The project includes automated workflows:
- **unit-tests.yml**: Unit test CI workflow (runs on PRs and pushes to master)
- **Selenium_Action.yml**: Manual web scraping workflow
- **Cron_Action.yml**: Scheduled automation workflow

## Docker Support

```bash
docker-compose up
```

Requires environment variables to be configured in `.env` file.

## Support

- **Issues**: [GitHub Issues](https://github.com/gitchrisqueen/cpcc_task_automation/issues)
- **Email**: christopher.queen@gmail.com

## License

Copyright (c) 2024 Christopher Queen Consulting LLC

## Acknowledgments

Built with:
- [Streamlit](https://streamlit.io/) - Web UI framework
- [OpenAI](https://openai.com/) - GPT models and AI capabilities
- [Selenium](https://www.selenium.dev/) - Web automation
- [LangChain Core](https://www.langchain.com/) - Type definitions and callbacks