{"id":29666502,"url":"https://github.com/liteobject/speech-to-form","last_synced_at":"2025-07-22T15:10:03.316Z","repository":{"id":304923908,"uuid":"1018642751","full_name":"LiteObject/speech-to-form","owner":"LiteObject","description":"Voice-enabled web form using Flask and OpenAI. Users fill out form fields by speaking; the app extracts structured data from speech and prompts for missing info until the form is complete.","archived":false,"fork":false,"pushed_at":"2025-07-16T03:43:51.000Z","size":18,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-17T06:22:56.316Z","etag":null,"topics":["ai","flask","llm","openai","python","voice-recognition"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LiteObject.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-12T17:52:20.000Z","updated_at":"2025-07-16T03:45:11.000Z","dependencies_parsed_at":"2025-07-17T11:16:03.688Z","dependency_job_id":"1ada184d-238b-4999-9475-d6f000b0269f","html_url":"https://github.com/LiteObject/speech-to-form","commit_stats":null,"previous_names":["liteobject/speech-to-form"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/LiteObject/speech-to-form","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiteObject%2Fspeech-to-form","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiteObject%2Fspeech-to-form/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiteObject%2Fspeech-to-form/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiteObject%2Fspeech-to-form/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LiteObject","download_url":"https://codeload.github.com/LiteObject/speech-to-form/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiteObject%2Fspeech-to-form/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266516526,"owners_count":23941451,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-22T02:00:09.085Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","flask","llm","openai","python","voice-recognition"],"created_at":"2025-07-22T15:10:02.282Z","updated_at":"2025-07-22T15:10:03.301Z","avatar_url":"https://github.com/LiteObject.png","language":"Python","readme":"# Voice-Enabled Form Demo Application\n\nThis is a Flask web application that demonstrates intelligent voice-enabled form filling using speech recognition and AI-powered text processing with real-time field highlighting and step-by-step guidance.\n\n## ✨ Features\n\n### Core Functionality\n- **Smart Web Form**: Interactive form with fields for name, email, phone, and address\n- **Speech Recognition**: Uses browser's Web Speech API for real-time voice input\n- **AI-Powered Processing**: OpenAI GPT-4o-mini integration with intelligent regex fallback\n- **Step-by-Step Guidance**: Highlights current field and guides users through form completion\n- **Real-time Field Highlighting**: Visual feedback showing which field to fill next\n\n### Advanced Features  \n- **Intelligent Email Extraction**: Converts speech patterns like \"john at example dot com\" to valid email format\n- **Flexible Phone Number Processing**: Handles various phone number formats and speech variations\n- **Missing Information Detection**: Automatically identifies and requests incomplete fields\n- **Comprehensive Logging**: Debug-level logging for development and troubleshooting\n- **Health Check Endpoint**: Monitor application and OpenAI API connectivity status\n- **Graceful Fallback**: Seamless switch to regex extraction when OpenAI is unavailable\n\n## 🚀 Quick Start\n\n### 1. Project Structure\n```\nspeech-to-form/\n├── app.py                 # Main Flask application with OpenAI integration\n├── requirements.txt       # Python dependencies\n├── .env                  # Environment variables (create from .env.example)\n├── .env.example          # Example environment configuration\n├── speech_to_form.log    # Application logs (auto-generated)\n└── templates/\n    └── index.html        # Frontend with step-by-step field highlighting\n```\n\n### 2. Installation\n```bash\n# Clone the repository\ngit clone \u003crepository-url\u003e\ncd speech-to-form\n\n# Install dependencies\npip install -r requirements.txt\n\n# Set up environment variables\ncp .env.example .env\n# Edit .env file with your OpenAI API key (optional)\n```\n\n### 3. Environment Setup\nCreate a `.env` file with your OpenAI API key (optional - the app works without it):\n```env\nOPENAI_API_KEY=your-actual-openai-api-key-here\n```\n\n**Note**: If no OpenAI API key is provided, the application automatically uses intelligent regex-based extraction as a fallback.\n\n### 4. Run the Application\n```bash\npython app.py\n```\n\nThe application will be available at `http://localhost:5000`\n\n## 🎤 How to Use\n\n### Step-by-Step Voice Form Filling\n1. **Open the Application**: Navigate to `http://localhost:5000` in your browser\n2. **Grant Microphone Permission**: Allow the browser to access your microphone when prompted\n3. **Follow Visual Guidance**: The first field (name) will be highlighted automatically\n4. **Start Recording**: Click \"Start Recording\" and speak your information for the highlighted field\n5. **Field-by-Field Progress**: After each successful extraction, the next empty field will be highlighted\n6. **Complete the Form**: Continue until all fields are filled\n\n### Example Speech Patterns\nThe application intelligently handles various speech patterns:\n\n**Natural Speech Examples:**\n- \"My name is John Doe\"\n- \"My email is john at example dot com\" *(converts to john@example.com)*\n- \"Phone number is five five five one two three four five six seven\"\n- \"I live at 123 Main Street, New York\"\n\n**Combined Information:**\n- \"Hi, I'm Sarah Johnson, my email is sarah at gmail dot com, phone 555-987-6543, and I live at 456 Oak Avenue, Chicago\"\n\n## 🔧 Technical Details\n\n### Speech Recognition\n- **Web Speech API**: Real-time continuous speech recognition\n- **Browser Support**: Chrome, Edge, Safari (best), Firefox (limited)\n- **Language**: English (US) with interim results\n- **Error Handling**: Automatic recovery and restart on connection issues\n\n### AI Processing Architecture\n- **Primary**: OpenAI GPT-4o-mini with structured JSON output\n- **Fallback**: Intelligent regex patterns for offline operation\n- **Dual Model Support**: Falls back from GPT-4o-mini to GPT-3.5-turbo if needed\n- **Smart Extraction**: Handles speech-to-text variations (e.g., \"at\" → \"@\", \"dot\" → \".\")\n\n### Form Management\n- **Required Fields**: name, email, phone, address *(age field removed in latest version)*\n- **Real-time Validation**: Tracks completion status for each field\n- **Progressive Highlighting**: Visual guidance through form completion\n- **Missing Field Detection**: Automatic prompts for incomplete information\n\n### Enhanced Features\n- **Comprehensive Logging**: Debug-level logging to `speech_to_form.log`\n- **Health Check**: `/health` endpoint for monitoring system status\n- **Reset Functionality**: Clean form state reset with field re-highlighting\n- **Error Recovery**: Graceful handling of API failures and network issues\n\n## ⚙️ API Endpoints\n\n| Endpoint | Method | Description |\n|----------|--------|-------------|\n| `/` | GET | Main application interface |\n| `/process_speech` | POST | Process voice input and extract form data |\n| `/reset_form` | POST | Reset form to initial state |\n| `/health` | GET | System health check and OpenAI connectivity status |\n\n## 🛠️ Customization\n\n### Adding New Form Fields\n1. Update `REQUIRED_FIELDS` dictionary in `app.py`\n2. Add corresponding HTML input fields in `templates/index.html`\n3. Update the field highlighting logic in JavaScript\n4. Modify extraction patterns (regex or OpenAI prompt)\n\n### Switching AI Providers\nReplace the `extract_information` method in the `FormProcessor` class with your preferred AI service integration.\n\n### Customizing Speech Patterns\nModify the regex patterns in the `_demo_extraction` method to handle your specific speech variations or language requirements.\n\n### Styling and UI\n- Modify CSS in `templates/index.html` for visual customization\n- Update field highlighting styles via `.field-highlight` class\n- Customize status messages and user guidance text\n\n## 🔒 Security Considerations\n\nFor production deployment, implement:\n- **Input Validation**: Sanitize and validate all user inputs\n- **Rate Limiting**: Protect API endpoints from abuse\n- **API Key Security**: Use environment variables and secure key management\n- **HTTPS**: Required for microphone access in production\n- **Authentication**: Add user authentication for sensitive applications\n- **Content Security Policy**: Prevent XSS attacks\n- **Logging Security**: Ensure logs don't contain sensitive information\n\n## 🐛 Troubleshooting\n\n### Speech Recognition Issues\n- **Not Working**: Ensure you're using a supported browser (Chrome, Edge, Safari)\n- **No Permission**: Check microphone permissions in browser settings\n- **HTTPS Required**: Microphone access requires HTTPS in production environments\n- **Network Issues**: Check internet connection for real-time processing\n\n### OpenAI API Issues\n- **Invalid API Key**: Verify your OpenAI API key in `.env` file\n- **Quota Exceeded**: Check your OpenAI account billing and usage limits\n- **Model Unavailable**: App automatically falls back to GPT-3.5-turbo, then regex\n- **Network Timeout**: App gracefully switches to offline regex extraction\n\n### Form and UI Issues\n- **Fields Not Highlighting**: Check browser console for JavaScript errors\n- **Form Not Updating**: Verify Flask server is running and responding\n- **Extraction Failures**: Review logs in `speech_to_form.log` for debugging\n\n### Debugging Tools\n- **Log Files**: Check `speech_to_form.log` for detailed processing information\n- **Health Check**: Visit `/health` endpoint to verify system status\n- **Browser Console**: Monitor for JavaScript errors and network issues\n- **Network Tab**: Inspect API requests and responses in browser dev tools\n\n## 📋 Dependencies\n\n```txt\nFlask==2.3.3\nopenai==1.3.0\npython-dotenv==1.0.0\n```\n\n## 🌐 Browser Compatibility\n\n| Browser | Support Level | Notes |\n|---------|---------------|-------|\n| Chrome | ✅ Excellent | Full Web Speech API support |\n| Edge | ✅ Excellent | Full Web Speech API support |\n| Safari | ✅ Good | Web Speech API supported |\n| Firefox | ⚠️ Limited | May require additional configuration |\n| IE | ❌ Not Supported | Use modern browser |\n\n## 📝 Recent Updates\n\n- **Field Highlighting**: Added step-by-step visual guidance\n- **Email Intelligence**: Improved speech-to-email conversion patterns\n- **Enhanced Logging**: Comprehensive debug logging throughout application\n- **Age Field Removal**: Simplified form to essential fields only\n- **Error Recovery**: Better handling of API failures and network issues\n- **Health Monitoring**: Added system health check endpoint","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fliteobject%2Fspeech-to-form","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fliteobject%2Fspeech-to-form","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fliteobject%2Fspeech-to-form/lists"}