{"id":25856944,"url":"https://github.com/m42e/mayan-automatic-metadata","last_synced_at":"2025-03-01T18:29:29.252Z","repository":{"id":45256817,"uuid":"224806496","full_name":"m42e/mayan-automatic-metadata","owner":"m42e","description":"A (to be) framework for automatic, external processing of mayan documents for assigning tags and metadata","archived":false,"fork":false,"pushed_at":"2024-07-06T19:36:30.000Z","size":45,"stargazers_count":13,"open_issues_count":5,"forks_count":4,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-02-23T19:16:25.817Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/m42e.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-29T08:06:45.000Z","updated_at":"2024-07-06T19:36:33.000Z","dependencies_parsed_at":"2024-07-06T20:50:41.355Z","dependency_job_id":null,"html_url":"https://github.com/m42e/mayan-automatic-metadata","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/m42e%2Fmayan-automatic-metadata","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/m42e%2Fmayan-automatic-metadata/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/m42e%2Fmayan-automatic-metadata/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/m42e%2Fmayan-automatic-metadata/manifests","owner_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/owners/m42e","download_url":"https://codeload.github.com/m42e/mayan-automatic-metadata/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241407290,"owners_count":19958102,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-01T18:29:28.347Z","updated_at":"2025-03-01T18:29:29.240Z","avatar_url":"https://github.com/m42e.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# About Mayan automatic metadata\n\n\u003e [!WARNING]  \n\u003e This is currently unmaintained, as my use-case is done and I see this as a proof of concept.\n\u003e Feel free to contact me if you would like to maintain it.\n\nI wanted to be able to have my documents automatically tagged as I don't like to repeat these kinds of tasks over and over again.\n\nThis MAM (*mayan automatic metadata*) tool consists of three parts:\n\n- The point where mayan triggers some work.\n- The worker, which does the processing.\n- The plugins, which *you* have to write on your own. Maybe I can share some at some point in time.\n\n## HowTo\n\nThe prerequisites you need:\n\n- Running mayan, accessible from the node you run this on, and vice versa (for the webhook)\n- A user in mayan who is allowed to access the documents and the documents' parsed content as well as the OCR content.\n- docker (You may be able to do it outside of docker, but I don't care)\n\nYou could either let MAM have its own docker stack (including network), in which case you have to publish some ports. 
Or, the preferred way, have it in the same network, which is what is described here.\n\nAdd the contents of the docker-compose file to the one you are using for mayan already.\n\n```\nversion: '2'\n\nservices:\n  mayan-mam-web:\n    container_name: mayan-mam-web\n    image: m42e/mayan-mam-web\n    restart: always\n    environment:\n      REDIS_URL: redis://results:6379/\n      \n  mayan-mam-worker:\n    container_name: mayan-mam-worker\n    image: m42e/mayan-mam-worker\n    restart: always\n    environment:\n      REDIS_URL: redis://results:6379/\n      MAYAN_USER: mam-user\n      MAYAN_PASSWORD: secretpassword\n      MAYAN_URL: https://yourinstance/api/\n      USE_GIT_PLUGINS: 0\n      GIT_PLUGINS_URL: https://mygithubuser:\u003capplicationtoken\u003e@github.com/mygithubuser/mam-plugins\n\n```\n\nThere are two services here. One is the *webfrontend* (hilarious to call it that, as it only creates tasks for the worker, no real frontend).\nThis offers an endpoint to use in mayan workflows. The endpoint is located at the base url, so from inside the stack: `http://mayan-mam-web:8000/`.\n\nThe following environment variables are relevant:\n\n- `REDIS_URL`: provide a proper redis url for the task queuing\n- `MAYAN_USER`: a user which is allowed to access mayan, this user will read documents, add metadata and tags\n- `MAYAN_PASSWORD`: for authentication a password for the user is required\n- `MAYAN_URL`: The url to your mayan instance. Include the /api/ at the end.\n- `GIT_PLUGINS_URL`: The url, including any required authentication, to the git repository containing your plugin. 
(Example: \u003chttps://github.com/m42e/mayan-automatic-metadata-plugins\u003e)\n- `USE_GIT_PLUGINS`: If set to 1 the plugin directory will be cleared and the plugins of the git repository specified will be used.\n\nYou could also mount (with docker) the `/app/plugins` directory to a folder of your choice, where you place the plugins.\n\n\n# Trigger it\n\nThe way of triggering is quite simple. Just send a `POST` or `GET` request to the endpoint with the document id attached, e.g. `http://mayan-mam-web:8000/345`.\nThis will enqueue the task for the worker.\n\n*NB*: This should be done after the OCR content is available.\n\n# The worker's job\n\nThe worker receiving the processing request will get the required information from mayan, read the document's data and apply its strategies to get the values and add tags.\n\nThat's all folks.\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fm42e%2Fmayan-automatic-metadata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fm42e%2Fmayan-automatic-metadata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fm42e%2Fmayan-automatic-metadata/lists"}