{"id":28331719,"url":"https://github.com/vladimir-skvortsov/devweek-2025","last_synced_at":"2025-10-18T09:46:32.446Z","repository":{"id":295210085,"uuid":"985190130","full_name":"vladimir-skvortsov/devweek-2025","owner":"vladimir-skvortsov","description":null,"archived":false,"fork":false,"pushed_at":"2025-05-24T07:31:09.000Z","size":97079,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-06-03T05:38:26.438Z","etag":null,"topics":["ai","hackathon","ml"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vladimir-skvortsov.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-17T08:44:23.000Z","updated_at":"2025-05-24T07:31:12.000Z","dependencies_parsed_at":"2025-05-24T08:42:39.749Z","dependency_job_id":null,"html_url":"https://github.com/vladimir-skvortsov/devweek-2025","commit_stats":null,"previous_names":["vladimir-skvortsov/devweek-2025"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/vladimir-skvortsov/devweek-2025","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vladimir-skvortsov%2Fdevweek-2025","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vladimir-skvortsov%2Fdevweek-2025/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vladimir-skvortsov%2Fdevweek-2025/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vladimir-skvortsov
%2Fdevweek-2025/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vladimir-skvortsov","download_url":"https://codeload.github.com/vladimir-skvortsov/devweek-2025/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vladimir-skvortsov%2Fdevweek-2025/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260692930,"owners_count":23047526,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","hackathon","ml"],"created_at":"2025-05-26T18:50:16.602Z","updated_at":"2025-10-18T09:46:32.441Z","avatar_url":"https://github.com/vladimir-skvortsov.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# devweek-2025\n\n## Running the Application\n\nThis project consists of a frontend and backend service. 
You can run them separately or together using the provided shell script.\n\n### Prerequisites\n\n- Python 3.x\n- Node.js and npm\n- pip (Python package manager)\n\n### Running the Services\n\nUse the following commands to run the services:\n\n```bash\n# Run both frontend and backend services\n./run.sh all\n\n# Run only the backend service\n./run.sh backend\n\n# Run only the frontend service\n./run.sh frontend\n\n# Show help\n./run.sh help\n```\n\n### Service Details\n\n- **Backend**: FastAPI application running on `http://localhost:8000`\n- **Frontend**: React application running on `http://localhost:3000`\n\n### Development\n\nThe services are configured with hot-reload enabled, so any changes to the code will automatically restart the respective service.\n\n## List of Used Datasets\n\n### 1. Kaggle Datasets\n1. **`sunilthite/llm-detect-ai-generated-text-dataset`**\n   - File: `Training_Essay_Data.csv`\n   - Description: AI-generated vs human-written essays\n   - Columns: `text`, `generated` (0/1)\n\n2. **`prajwaldongre/llm-detect-ai-generated-vs-student-generated-text`**\n   - File: `LLM.csv`\n   - Description: Student essays vs AI-generated text\n   - Columns: `Text`, `Label` ('student'/'AI')\n\n3. **`thedrcat/daigt-v4-train-dataset`**\n   - Files: `daigt_magic_generations.csv`, `train_v4_drcat_01.csv`\n   - Description: AI detection training data\n   - Columns: `text`, `label` (0/1)\n\n4. **`carlmcbrideellis/llm-7-prompt-training-dataset`**\n   - Files:\n     - `train_essays_RDizzl3_seven_v1.csv`\n     - `train_essays_RDizzl3_seven_v2.csv`\n     - `train_essays_7_prompts.csv`\n     - `train_essays_7_prompts_v2.csv`\n   - Description: Essay prompts and responses\n   - Columns: `text`, `label` (0/1)\n\n5. **`starblasters8/human-vs-llm-text-corpus`**\n   - File: `data.csv`\n   - Description: Human vs LLM text collection\n   - Columns: `text`, `source` ('Human'/'LLM')\n\n6. 
**Kaggle Competition: `llm-detect-ai-generated-text`**\n   - File: `train_essays.csv`\n   - Description: AI detection competition data\n   - Columns: `text`, `generated` (0/1)\n\n7. **`d0rj3228/russian-literature`**\n   - Format: Text files\n   - Description: Russian literary texts (human-only)\n   - Columns: `text` (all marked human)\n\n8. **`artalmaz31/complex-russian-dataset`**\n   - Format: Text files\n   - Description: Complex Russian texts (human-only)\n   - Columns: `text` (all marked human)\n\n9. **`mar1mba/russian-sentiment-dataset`**\n   - File: `sentiment_dataset.csv`\n   - Description: Russian sentiment analysis data\n   - Columns: `text` (all marked human)\n\n10. **`vsevolodbogodist/data-jokes`**\n    - File: `dataset.csv`\n    - Description: Russian jokes dataset\n    - Columns: `text` (all marked human)\n\n### 2. HuggingFace Datasets\n11. **`shahxeebhassan/human_vs_ai_sentences`**\n    - Description: Short human vs AI sentences\n    - Columns: `text`, `label` (0=AI, 1=human)\n\n12. **`ardavey/human-ai-generated-text`**\n    - Description: Human vs AI text samples\n    - Columns: `text`, `label` (0=AI, 1=human)\n\n### 3. Local Datasets\n13. **`raw/ruatd-2022-bi-train.csv`**\n    - Description: Russian text authenticity dataset (train)\n    - Columns: `Text`, `Class` ('H'=human)\n\n14. **`raw/ruatd-2022-bi-val.csv`**\n    - Description: Russian text authenticity dataset (validation)\n    - Columns: `Text`, `Class` ('H'=human)\n\n15. 
**`raw/generated.csv`**\n    - Description: Mixed English/Russian generated content\n    - Columns: `Text`, `is_human`, `language` ('en'/'ru')\n\n### Final Dataset Composition\n- The combined dataset is deduplicated on the `text` column\n- Final column structure:\n  ```python\n  ['id', 'text', 'is_human', 'lang']  # lang = 'en' or 'ru'\n  ```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvladimir-skvortsov%2Fdevweek-2025","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvladimir-skvortsov%2Fdevweek-2025","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvladimir-skvortsov%2Fdevweek-2025/lists"}