{"id":23053153,"url":"https://github.com/fairdataihub/knowmore","last_synced_at":"2025-08-15T03:32:25.111Z","repository":{"id":38082049,"uuid":"408915198","full_name":"fairdataihub/KnowMore","owner":"fairdataihub","description":"Automated Knowledge Discovery Tool for SPARC Datasets","archived":false,"fork":false,"pushed_at":"2024-05-22T00:41:28.000Z","size":16560,"stargazers_count":3,"open_issues_count":15,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-05-22T22:08:16.886Z","etag":null,"topics":["clustering","codeathon","data","data-science","fair","hackathon","knowledge","knowledge-graph","machine-learning","nlp"],"latest_commit_sha":null,"homepage":"https://fairdataihub.org/knowmore","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fairdataihub.png","metadata":{"files":{"readme":"README.flask.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-09-21T17:35:10.000Z","updated_at":"2024-05-29T21:22:34.518Z","dependencies_parsed_at":"2024-01-20T23:20:54.690Z","dependency_job_id":"d8e189a9-0405-491d-b493-97cda88691fd","html_url":"https://github.com/fairdataihub/KnowMore","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fairdataihub%2FKnowMore","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fairdataihub%2FKnowMore/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fairdataihub%2FKnowMore/releases",
"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fairdataihub%2FKnowMore/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fairdataihub","download_url":"https://codeload.github.com/fairdataihub/KnowMore/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":229890103,"owners_count":18140043,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clustering","codeathon","data","data-science","fair","hackathon","knowledge","knowledge-graph","machine-learning","nlp"],"created_at":"2024-12-16T00:16:43.996Z","updated_at":"2024-12-16T00:16:44.547Z","avatar_url":"https://github.com/fairdataihub.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Overview\n![architecture diagram](/docs/knowmore.osparc-integration.png)\n\n# Prerequisites \nWe recommend using Anaconda to create and manage your development environments for KnowMore. 
All the subsequent instructions assume you are using [Anaconda (Python 3 version)](https://www.anaconda.com/products/individual).\n\n# Setup\n\n## Clone repo\nClone the repo and its submodules:\n```\ngit clone https://github.com/SPARC-FAIR-Codeathon/KnowMore.git --recurse\n```\n## Setup Flask\n### cd into the root folder of this repo\n\nOpen Anaconda Prompt (Windows) or the system command-line interface, then navigate to the KnowMore folder:\n```sh\n$ cd ./KnowMore\n```\n\n### Setup conda env\n```sh\n$ conda create -n \"knowmore-flask-env\" python=3.6\n$ conda activate knowmore-flask-env\n```\n\n### Install Python dependencies\n```sh\n$ conda install pip\n$ pip install -r requirements.txt\n```\n\n### Setup env vars\nThe required environment variables are listed in the table below, along with information on how to obtain them.\n\n\n\u003ctable\u003e\n\u003cthead\u003e\n  \u003ctr\u003e\n    \u003cth\u003eSuggested name\u003c/th\u003e\n    \u003cth\u003eValue or instructions for obtaining it\u003c/th\u003e\n    \u003cth\u003ePurpose\u003c/th\u003e\n  \u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eFLASK_ENV\u003c/td\u003e\n    \u003ctd\u003e\"development\"\u003c/td\u003e\n    \u003ctd\u003eProd or dev\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eSERVER_NAME\u003c/td\u003e\n    \u003ctd\u003e\"localhost:5000\"\u003c/td\u003e\n    \u003ctd\u003eserver url\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e LISTEN_PORT\u003c/td\u003e\n    \u003ctd\u003e\"5000\"\u003c/td\u003e\n    \u003ctd\u003ePort for flask app to listen to\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eCLIENT_URL\u003c/td\u003e\n    \u003ctd\u003e\"http://localhost:3000\"\u003c/td\u003e\n    \u003ctd\u003eSparc-App url\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eOSPARC_TEST_MODE\u003c/td\u003e\n    \u003ctd\u003efalse\u003c/td\u003e\n    
\u003ctd\u003eWhether to use test mode, so you don't have to contact osparc to develop your frontend\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eOSPARC_API_KEY\u003c/td\u003e\n    \u003ctd\u003e \u003ca href=\"mailto: support@osparc.io\"\u003e Contact osparc support \u003c/a\u003e \u003c/td\u003e\n    \u003ctd\u003e Sending jobs to osparc\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eOSPARC_API_SECRET\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"mailto: support@osparc.io\"\u003e Contact osparc support \u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003eSending jobs to osparc \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eSECRET_KEY\u003c/td\u003e\n    \u003ctd\u003e\u003c/td\u003e\n    \u003ctd\u003eflask secret key\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\n\n\nEach of them can be set in your conda environment as follows (reactivate the environment afterwards for the changes to take effect):\n```sh\n$ conda env config vars set MY_VAR=value1 MY_OTHER_VAR=value2\n```\n\n### Start flask server\n```sh\n$ flask run\n```\n\nOr, if you require remote access (NOTE: untested):\n\n```sh\n$ flask run --host=0.0.0.0\n```\n\n### View your flask app\nhttp://127.0.0.1:5000/\n\n## Test mode\nWant to develop without starting the osparc jobs? \n\nSet the env var (to anything other than the string 'false'):\n```\nOSPARC_TEST_MODE=true\n```\n\nWith this set, the app does not contact osparc and returns sample data instead. This is helpful for debugging the frontend without having to wait for an osparc job every time. \n\n## Run in Docker\n### 1) Install Docker\n\n[See Docker's official documentation](https://docs.docker.com/get-docker/).\n\n### 2) Create image and start container\n```\ndocker-compose up -d\n```\n\nNote that this will use the env vars as specified in your .env file. 
Make sure you still set those before building the image.\n\n### 3) Check Container Status\n```\ndocker ps\n```\n\nYou should see something like this: \n\n```\nCONTAINER ID   IMAGE               COMMAND                  CREATED          STATUS         PORTS                                                        NAMES\n368be64c60c1   knowmore_flaskapp   \"/entrypoint.sh /sta…\"   10 seconds ago   Up 9 seconds   80/tcp, 443/tcp, 0.0.0.0:5000-\u003e5000/tcp, :::5000-\u003e5000/tcp   knowmore-flask-web-app\n```\n\nNote that this container is exposing port 5000 (the port where flask is listening) to your host.\n\n### 4) Check and follow logs \n```\ndocker logs knowmore-flask-web-app -f\n```\n\n### 5) Test the endpoints\n- Using a browser or `curl` from your host, try out `http://localhost:5000/`.\n    * Response should be: `status: up`\n- Get a sample image: `http://localhost:5000/api/results-images/example-job-id/Plots-PlotID-3.7.png`\n    * Flask should return the image file.\n\n# Deploy\n```\ngit push heroku main\n```\n\n# TODOs\n- use production server, rather than dev server\n- build a new docker image (current one is outdated)\n- upload all files to s3 instead of to local filesystem (especially due to the nature of [Heroku's ephemeral filesystem](https://devcenter.heroku.com/articles/dynos#ephemeral-filesystem))\n\n\n# Debugging\n## Helpful scripts\n### Test connection to osparc\n```\ncurl http://127.0.0.1:5000/api/check-osparc-job/123e4567-e89b-12d3-a456-426614174000\n# example response: {\"error\": \"{\\\"errors\\\":[\\\"project 123e4567-e89b-12d3-a456-426614174000 not found\\\"]}\", \"status_code\": 500}\n```\n\n### Test out the python methods without using frontend\n```\npython3 manual-job-starter.py\n```\n\nTo use an existing job instead of creating a new one, pass in a single arg with the job uuid of the python job (NOT THE MATLAB JOB ID),\ne.g., \n```\npython3 manual-job-starter.py 
e9012487-5ff1-4112-aa4f-8165915973fa\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffairdataihub%2Fknowmore","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffairdataihub%2Fknowmore","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffairdataihub%2Fknowmore/lists"}