{"id":27954093,"url":"https://github.com/pdoup/avoulos","last_synced_at":"2026-05-15T13:05:55.094Z","repository":{"id":129337309,"uuid":"434892807","full_name":"pdoup/avoulos","owner":"pdoup","description":"Big Data Analytics Project - Fall '21 ","archived":false,"fork":false,"pushed_at":"2022-04-18T09:33:06.000Z","size":4029,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-06T03:45:23.240Z","etag":null,"topics":["big-data-analytics","spark"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pdoup.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-12-04T12:17:11.000Z","updated_at":"2022-07-05T17:32:05.000Z","dependencies_parsed_at":"2023-06-25T23:02:51.161Z","dependency_job_id":null,"html_url":"https://github.com/pdoup/avoulos","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/pdoup/avoulos","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pdoup%2Favoulos","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pdoup%2Favoulos/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pdoup%2Favoulos/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pdoup%2Favoulos/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pdoup","download_url":"https://codeload.github.com/pdoup/avoulos/tar.gz/refs/heads
/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pdoup%2Favoulos/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33067500,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-15T11:35:32.926Z","status":"ssl_error","status_checked_at":"2026-05-15T11:35:31.362Z","response_time":103,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["big-data-analytics","spark"],"created_at":"2025-05-07T17:20:06.980Z","updated_at":"2026-05-15T13:05:55.088Z","avatar_url":"https://github.com/pdoup.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Analysis of the Greek Parliament Proceedings '89 – '20\nBig Data Analytics Project - Fall 2021  - [Link to *iMEdD* GitHub repo](https://github.com/iMEdD-Lab/Greek_Parliament_Proceedings)\n\n![pap](https://www.in.gr/wp-content/uploads/2019/11/Vouli_EMEAgr_980x620_02.jpg)\n---\n\n* [X] Task 1 : Given all speeches (for all years) we need to detect the different topics (i.e., thematic areas), most important keywords and how they change across years\n* [X] Task 2 : Given all speeches we need to detect pairwise similarities between parliament members \u0026 detect the top-k pairs with the highest degree of similarity\n* [X] Task 3 : For each member and also for each party we need to detect how the most 
important keywords evolve across years.\n* [X] Task 4 : Detect any significant deviation (per member, per party or in general) with respect to the speeches before and after the crisis\n* [X] Task 5 : Taking into account all speeches, we need to detect if we can group them in meaningful clusters. Check the participation of each member in each cluster and also the participation of each party in each cluster.\n* [X] Task 6 : % of Male/Female positions in the parliament over the years\n\n---\n\n\u003e **How to package and run a Spark application**\n\n1. Run ``` package ``` from the ``` sbt shell ``` (IntelliJ)\n2. Once the ``` .jar ``` is created in the ``` target ``` folder, run this command from inside that folder\n```bash\nspark-submit \\\n  --class \u003cName of the main class\u003e \\\n  --master local[*] \\\n  --executor-memory 8G \\\n  --total-executor-cores 4 \\\n  /path/to/examples.jar \u003cadd optional arguments here\u003e\n```\n\n---\n\n### **Some useful links**\n- [Spark ML Library Documentation](https://spark.apache.org/docs/3.0.1/ml-guide.html)\n- [Spark Tutorial](https://www.tutorialspoint.com/apache_spark/index.htm)\n- [Spark Tutorial YT](https://www.youtube.com/watch?v=S2MUhGA3lEw)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpdoup%2Favoulos","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpdoup%2Favoulos","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpdoup%2Favoulos/lists"}