{"id":28409094,"url":"https://github.com/rohitkumar-tech/toddler-vision-spark","last_synced_at":"2026-02-28T20:01:56.347Z","repository":{"id":292621639,"uuid":"981423430","full_name":"RohitKumar-tech/toddler-vision-spark","owner":"RohitKumar-tech","description":null,"archived":false,"fork":false,"pushed_at":"2025-05-11T05:49:51.000Z","size":202,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-02T13:58:32.520Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RohitKumar-tech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-11T04:22:51.000Z","updated_at":"2025-05-11T05:49:55.000Z","dependencies_parsed_at":"2025-05-11T06:29:46.075Z","dependency_job_id":"66cd1247-0dd7-4d88-b004-33931255227a","html_url":"https://github.com/RohitKumar-tech/toddler-vision-spark","commit_stats":null,"previous_names":["rohitkumar-tech/toddler-vision-spark"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RohitKumar-tech%2Ftoddler-vision-spark","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RohitKumar-tech%2Ftoddler-vision-spark/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RohitKumar-tech%2Ftoddler-vision-spark/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RohitKumar-tech%2Ftoddler-vision-spar
k/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RohitKumar-tech","download_url":"https://codeload.github.com/RohitKumar-tech/toddler-vision-spark/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RohitKumar-tech%2Ftoddler-vision-spark/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":258814562,"owners_count":22762064,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-02T05:19:56.426Z","updated_at":"2026-02-28T20:01:51.307Z","avatar_url":"https://github.com/RohitKumar-tech.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI-Driven Early Detection of Autism in Toddlers Using Multimodal Video Data\n\n## Table of Contents\n\n1. [Project Overview](#project-overview)\n2. [Features \u0026 Clinical Signs](#features--clinical-signs)\n3. [System Architecture](#system-architecture)\n4. [Modules](#modules)\n\n   * [Face \u0026 Pose Extraction](#face--pose-extraction)\n   * [Eye Contact Analysis](#eye-contact-analysis)\n   * [Repetitive Behavior Detection](#repetitive-behavior-detection)\n   * [Gesture \u0026 Language-Delay Proxies](#gesture--language-delay-proxies)\n   * [Social Reciprocity Assessment](#social-reciprocity-assessment)\n5. [Installation](#installation)\n6. [Usage](#usage)\n7. [Data Preparation \u0026 Annotation](#data-preparation--annotation)\n8. [Training \u0026 Evaluation](#training--evaluation)\n9. [Demo \u0026 Integration](#demo--integration)\n10. [Future Work](#future-work)\n11. 
[License](#license)\n\n---\n\n## Project Overview\n\nThis repository contains a proof-of-concept AI pipeline designed to detect early behavioral signs of Autism Spectrum Disorder (ASD) in toddlers using non-invasive video data. By analyzing visual cues such as eye contact, repetitive movements, and gesture patterns, the system computes a risk score to flag potential early signs of ASD and support timely clinical follow-up.\n\n**Key Objectives:**\n\n* Identify and quantify measurable visual behaviors linked to early ASD markers.\n* Implement modular computer-vision and machine-learning components for rapid prototyping.\n* Provide an end-to-end demo (live or recorded) with risk output and simple visuals.\n\n---\n\n## Tech Stack\n\n* **Frontend:** React.js, Vite, Tailwind CSS, shadcn/ui for components\n* **Backend:** FastAPI (Python) serving AI modules via REST\n* **Database \u0026 Auth:** Supabase (PostgreSQL, Auth)\n* **AI \u0026 CV:** MediaPipe, OpenCV, custom TensorFlow/PyTorch models for gaze and behavior analysis\n* **Deployment:** Docker for containerization; Vercel or Netlify for frontend hosting\n\n---\n\n## Features \u0026 Clinical Signs\n\nThe pipeline targets three primary observable signs:\n\n1. 
**Reduced Eye Contact** – quantified as the percentage of gaze samples not directed at the caregiver or toy during interaction prompts.\n2. **Repetitive Motor Behaviors** – detection of periodic movements (e.g., hand flapping, body rocking) via temporal analysis of pose keypoints.\n3. **Social Reciprocity** – assessment of head-turn response to name-calling and frequency of shared-attention gestures (e.g., pointing).\n\nAdditional proxies include gesture rates (pointing or showing) as indirect indicators of early language use.\n\n---\n\n## System Architecture\n\n```\n[Camera Feed]\n      ↓\n [Face \u0026 Pose Extraction]\n      ↓\n┌──────────────────────────┐\n│ Eye Contact Module       │\n└──────────────────────────┘\n      ↓\n┌──────────────────────────┐\n│ Repetitive Behavior      │\n└──────────────────────────┘\n      ↓\n┌──────────────────────────┐\n│ Gesture \u0026 Reciprocity    │\n└──────────────────────────┘\n      ↓\n┌──────────────────────────┐\n│ Feature Fusion \u0026 Class.  │\n└──────────────────────────┘\n      ↓\n    ASD Risk Score\n```\n\n---\n\n## Modules\n\n### 1. Face \u0026 Pose Extraction\n\n* **Tools:** MediaPipe Face Mesh \u0026 Pose, OpenPose (optional)\n* **Output:** JSON files per frame containing 468 facial landmarks (MediaPipe Face Mesh) + 33 body keypoints (MediaPipe Pose).\n\n### 2. Eye Contact Analysis\n\n* **Approach:** CNN-based gaze estimator predicts a 2D gaze vector.\n* **ROI Calibration:** Define a caregiver/toy bounding box; compute `gaze_avoid_percent`.\n\n### 3. Repetitive Behavior Detection\n\n* **Data:** Time series of wrist and torso keypoints.\n* **Analysis:** FFT or autocorrelation to detect peaks in the 2–5 Hz range.\n* **Output:** `repetition_score` per clip.\n\n### 4. Gesture \u0026 Language-Delay Proxies\n\n* **Gesture Detection:** Angle-based decision tree to classify pointing or showing.\n* **Metric:** Gestures-per-minute.\n* **Response Latency:** Time-to-look or touch after visual stimulus.\n\n### 5. 
Social Reciprocity Assessment\n\n* **Name-Call Response:** Detect a change in head orientation within 2 s of the prompt.\n* **Metric:** `name_response_rate` (successes/total prompts).\n\n---\n\n## Installation\n\n```bash\n# Clone the repo\ngit clone https://github.com/yourusername/asd-detection-poc.git\ncd asd-detection-poc\n\n# Setup Backend\npython3 -m venv venv\nsource venv/bin/activate\npip install -r requirements.txt\n\n# Setup Frontend\ncd frontend\nnpm install\n```\n\n### Supabase Configuration\n\n1. Create a Supabase project and copy the API URL and anon key.\n2. In `frontend/.env`, add:\n\n   ```env\n   VITE_SUPABASE_URL=https://your-project.supabase.co\n   VITE_SUPABASE_ANON_KEY=your-anon-key\n   ```\n3. In `backend/.env`, configure any needed environment variables for database or auth.\n\n---\n\n## Usage\n\n1. **Start Backend API**\n\n   ```bash\n   cd backend\n   uvicorn app.server:app --reload\n   ```\n\n2. **Start Frontend**\n\n   ```bash\n   cd frontend\n   npm run dev\n   ```\n\n3. Open the local URL printed by `npm run dev` (`http://localhost:3000` if the dev server is configured for that port; Vite's default is `http://localhost:5173`) to access the React app, which streams webcam input, visualizes gaze heatmaps and repetition timelines, and displays the ASD risk gauge.\n\n---\n\n1. **Extract Keypoints**\n\n   ```bash\n   python scripts/extract_keypoints.py --video data/sample.mp4 --out data/keypoints.json\n   ```\n\n2. **Compute Metrics**\n\n   ```bash\n   python scripts/compute_gaze.py --keypoints data/keypoints.json\n   python scripts/detect_repetition.py --keypoints data/keypoints.json\n   python scripts/compute_gestures.py --keypoints data/keypoints.json\n   python scripts/detect_headturn.py --video data/sample.mp4\n   ```\n\n3. **Run Inference Server**\n\n   ```bash\n   uvicorn app.server:app --reload\n   ```\n\n   Open `http://localhost:8000` to stream the webcam and view the risk score.\n\n---\n\n## Data Preparation \u0026 Annotation\n\n1. **Scenario Recording:** Capture 30–60 s clips covering name-calling, toy interaction, and free play.\n2. 
**Annotation Tool:** Use CVAT or Label Studio to mark:\n\n   * Face bounding boxes\n   * Name-call events (timestamps)\n   * Repetitive behavior segments\n3. **Export:** Save annotations in JSON or CSV for training and evaluation.\n\n---\n\n## Training \u0026 Evaluation\n\n* **Train Classifier:**\n\n  ```bash\n  python scripts/train_model.py --features data/features.csv --labels data/labels.csv\n  ```\n* **Evaluate:** Generates accuracy, sensitivity, specificity, and ROC-AUC plots.\n\n---\n\n## Demo \u0026 Integration\n\n* **Frontend Dashboard:** Live webcam view with an overlaid gaze heatmap, repetition timeline, and final risk gauge.\n* **Docker:** Optional `Dockerfile` provided for one-command deployment:\n\n  ```bash\n  docker build -t asd-detector .\n  docker run -p 8000:8000 asd-detector\n  ```\n\n---\n\n## Future Work\n\n* Integrate audio-based speech analysis for complementary cues.\n* Expand to include physiological sensors (e.g., eye-tracking glasses).\n* Validate on a larger, clinically diverse dataset.\n\n---\n\n## License\n\nThis project is licensed under the MIT License. See [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frohitkumar-tech%2Ftoddler-vision-spark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frohitkumar-tech%2Ftoddler-vision-spark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frohitkumar-tech%2Ftoddler-vision-spark/lists"}