{"id":27968698,"url":"https://github.com/cyberaula/edvl","last_synced_at":"2026-05-19T07:02:07.414Z","repository":{"id":66688691,"uuid":"319026486","full_name":"CyberAula/edvl","owner":"CyberAula","description":"Educational Data Virtual Lab","archived":false,"fork":false,"pushed_at":"2022-09-15T10:07:26.000Z","size":74686,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-05-07T21:06:32.937Z","etag":null,"topics":["apache-zeppelin","big-data-platform","data","education","fiware","fiware-cosmos","fiware-draco","fiware-keyrock","fiware-ngsi","fiware-orion","human-data-interaction","ipynb","notebook","notebooks","spark","streaming-data","upm","zeppelin","zeppelin-notebook"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CyberAula.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-12-06T12:21:27.000Z","updated_at":"2024-04-27T09:17:19.000Z","dependencies_parsed_at":"2023-07-19T12:45:28.159Z","dependency_job_id":null,"html_url":"https://github.com/CyberAula/edvl","commit_stats":null,"previous_names":["cyberaula/edvl"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CyberAula%2Fedvl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CyberAula%2Fedvl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CyberAula%2Fedvl/releases","manifes
ts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CyberAula%2Fedvl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CyberAula","download_url":"https://codeload.github.com/CyberAula/edvl/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252954429,"owners_count":21830904,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-zeppelin","big-data-platform","data","education","fiware","fiware-cosmos","fiware-draco","fiware-keyrock","fiware-ngsi","fiware-orion","human-data-interaction","ipynb","notebook","notebooks","spark","streaming-data","upm","zeppelin","zeppelin-notebook"],"created_at":"2025-05-07T21:06:35.332Z","updated_at":"2026-05-19T07:02:07.407Z","avatar_url":"https://github.com/CyberAula.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"#  Educational Data Virtual Lab (EDVL) \n\nThe **Educational Data Virtual Lab** (EDVL) is a component of the **ADA** project that will be used for the delivery of the practical and hands-on part of the Urban Mobility Data Science courses. \n\nIt is based on  **Apache Zeppelin** and the European **FIWARE** platform, in which the specific components of Data Science applied to Urban Mobility will be integrated. \n\n**Apache Zeppelin** is a new and upcoming web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more. 
It provides data exploration, visualization, sharing and collaboration features and supports a plethora of languages and technologies.\n\n**FIWARE** is a curated framework of open source components to accelerate the development of smart solutions, which enable the connection to IoT with Context Information Management and Big Data services in the Cloud. Furthermore, it provides standard APIs for data management and exchange, as well as harmonised data models.\n\n## Requirements\n\n* Docker and Docker-compose\n\n## Installation\n\n* Clone this project\n```shell\ngit clone https://github.com/ging/edvl\ncd edvl\n```\n\n* Run the whole scenario\n```shell\ndocker-compose up\n```\n\n* Open the browser at http://localhost:8079 (credentials: admin/password1, user1/password2)\n\n\n## Example notebooks\n\nEDVL comes with a curated set of notebooks that can be used to get started in data science training. They are available in the ``notebook`` directory. To run any of the notebooks you just need to:\n\n* Click \"Import note\". Pick a name and choose the \"Select JSON File/IPYNB File\" option\n\n* Choose the notebook that you want to explore from the ``notebook`` directory\n\n* Open the notebook and run all of the chunks one by one in order.\n\n\nBelow is a description of the notebooks available.\n\n### MongoDB with native visualizations\n\nNotebook ``1. ExampleMongo.zpln`` showcases Apache Zeppelin's native visualizations when querying a Mongo database. It shows how data can be explored interactively through many graphs and visualizations.\n\n### SparkML\n\nStepping up from mere data exploration, notebook ``2. ExampleSparkML.zpln`` shows how EDVL can be used for the complete lifecycle of machine learning, from data acquisition and storage provided by FIWARE Generic Enablers, to model training and prediction thanks to the SparkML library.\n\n### MongoSpark with Scala\n\nInstead of directly querying a Mongo database, notebook ``3. 
ExampleMongoSpark.zpln`` shows how MongoSpark can be used to query data using the Scala language, and how the data retrieved can be plotted using web visualization libraries.\n\n### Python Pandas\n\nApache Zeppelin supports one of the most common languages for data analysis (i.e., Python). In notebook ``4. ExamplePandas.zpln``, a common workflow of analyzing a CSV file using Python Pandas is provided.\n\n### Spark streaming\n\nBoth batch and real-time analysis are supported. Thanks to Spark Streaming and the FIWARE Cosmos Spark Connector, data can be analyzed as soon as it arrives from the FIWARE Context Broker and plotted in real time using web visualization libraries (``5. ExampleStreamingPrint.zpln`` and ``6. ExampleStreamingGraph.zpln``).\n\n### Legacy Jupyter Notebook\n\nApache Zeppelin can import Jupyter Notebooks so that existing code can be reused. This way, users who are migrating from Jupyter can resume their work immediately. An example is provided in notebook ``7. Jupyter2Zeppelin.ipynb``.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyberaula%2Fedvl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcyberaula%2Fedvl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyberaula%2Fedvl/lists"}