{"id":31030154,"url":"https://github.com/eriknovak/wac","last_synced_at":"2025-09-13T22:59:02.200Z","repository":{"id":245179514,"uuid":"727880528","full_name":"eriknovak/WAC","owner":"eriknovak","description":"The Wasserstein distance-based news Article Clustering algorithm","archived":false,"fork":false,"pushed_at":"2024-02-21T07:41:34.000Z","size":15135,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-06-20T09:56:30.005Z","etag":null,"topics":["news-clustering","online-algorithm","transformers","wasserstein-distance"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/eriknovak.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-05T19:09:02.000Z","updated_at":"2024-06-20T09:56:36.190Z","dependencies_parsed_at":"2024-06-20T10:10:57.496Z","dependency_job_id":null,"html_url":"https://github.com/eriknovak/WAC","commit_stats":null,"previous_names":["eriknovak/wac"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/eriknovak/WAC","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eriknovak%2FWAC","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eriknovak%2FWAC/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eriknovak%2FWAC/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eriknovak%2FWAC/manifests","owner_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/owners/eriknovak","download_url":"https://codeload.github.com/eriknovak/WAC/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eriknovak%2FWAC/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":275038248,"owners_count":25394640,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-13T02:00:10.085Z","response_time":70,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["news-clustering","online-algorithm","transformers","wasserstein-distance"],"created_at":"2025-09-13T22:58:16.919Z","updated_at":"2025-09-13T22:59:02.186Z","avatar_url":"https://github.com/eriknovak.png","language":"Jupyter Notebook","readme":"# WAC: **W**asserstein distance-based news **A**rticle **C**lustering\n\nThis project contains the implementation of the **W**asserstein distance-based news **A**rticle **C**lustering algorithm.\nThe algorithm is an unsupervised two-step online clustering algorithm that uses the Wasserstein distance (and distances\nsimilar to it). 
The two steps are (1) monolingual clustering of news articles and (2) multilingual clustering of events into clusters.\n\nThe articles and events are represented using an SBERT language model, which is fine-tuned for clustering tasks.\n\nThe remainder of this README contains the instructions for running the experiments.\n\n## 📚 Papers\n\nIf you use any of the components in your research, please refer to (and cite) the papers:\n\n**TODO**\n\n## ☑️ Requirements\n\nBefore starting the project, make sure these requirements are available:\n\n- [python]. For setting up your research environment and python dependencies (version 3.8 or higher).\n- [git]. For versioning your code.\n\n## 🛠️ Setup\n\n### Create a python environment\n\nFirst, create the virtual environment where all the modules will be stored.\n\n#### Using venv\n\nUsing the `venv` module, run the following commands:\n\n```bash\n# create a new virtual environment\npython -m venv venv\n\n# activate the environment (UNIX)\nsource ./venv/bin/activate\n\n# activate the environment (WINDOWS)\n./venv/Scripts/activate\n\n# deactivate the environment (UNIX \u0026 WINDOWS)\ndeactivate\n```\n\n### Install\n\nTo install the requirements, run:\n\n```bash\npip install -e .\n```\n\n## 🗃️ Data\n\nThe data used in the experiments are a curated set of news articles retrieved from the Event Registry and prepared for the scientific paper[^1].\n\nTo download the data, run:\n\n```bash\nbash scripts/00_download_data.sh\n```\n\nThis will download the data files and store them in the `data/raw` folder.\n\n## ⚗️ Experiments\n\nTo run the experiments, run the following command:\n\n```bash\n# run the experiments\nbash scripts/run_exp_pipeline.sh\n```\n\nThe command above will perform a series of experiments by executing the following steps (the names of the files are listed in the `scripts/run_exp_pipeline.sh` file):\n\n```bash\n# prepare the data examples for the experiment\npython scripts/01_prepare_data.py \\\n    --input_file 
./data/raw/dataset.test.json \\\n    --output_file ./data/processed/dataset.test.csv\n\n# cluster articles into events\npython scripts/02_article_clustering.py \\\n    --input_file ./data/processed/dataset.test.csv \\\n    --output_file ./data/processed/article_clusters/dataset.test.csv \\\n    --rank_th 0.5 \\\n    --time_std 3 \\\n    --multilingual \\\n    --ents_th 0.0 \\\n    -gpu\n\n# cluster events based on their similarity\npython scripts/03_event_clustering.py \\\n    --input_file ./data/processed/article_clusters/dataset.test.csv \\\n    --output_file ./data/processed/event_clusters/dataset.test.csv \\\n    --rank_th 0.7 \\\n    --time_std 3 \\\n    --w_reg 0.1 \\\n    --w_nit 10 \\\n    -gpu\n\n# evaluate the clusters\npython scripts/04_evaluate.py \\\n    --label_file_path ./data/processed/dataset.test.csv \\\n    --pred_file_dir ./data/processed/event_clusters \\\n    --output_file ./results/dataset.test.csv\n\n```\n\nThe results will be stored in the `results` folder.\n\n### Results\n\nThe hyper-parameters were selected by evaluating the performance of the clustering algorithm on the dev set. We performed a grid search across the following hyper-parameters:\n\n| Clustering | Parameter    | Grid Search          | Description                                                                                     |\n| :--------- | :----------- | :------------------- | :---------------------------------------------------------------------------------------------- |\n| article    | rank_th      | [0.4, 0.5, 0.6, 0.7] | Threshold for deciding if an article should be added to the cluster.                            |\n| article    | ents_th      | [0.2, 0.3, 0.4, 0.5] | Threshold for deciding if an article should be added to the cluster (considering the entities). |\n| article    | time_std     | [1, 2, 3, 5]         | The std for temporal similarity between the article and event.                                  
|\n| article    | multilingual | [True, False]        | Whether to use monolingual or multilingual clustering.                                          |\n| event      | rank_th      | [0.6, 0.7, 0.8, 0.9] | Threshold for deciding if events should be merged.                                              |\n| event      | time_std     | [1, 2, 3]            | The std for temporal similarity between events.                                                 |\n\n#### Performance results\n\nThe best performance is obtained with the following parameters:\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth style=\"text-align:center;\" colspan=\"1\"\u003e\u003c/th\u003e\n    \u003cth style=\"text-align:center;\" colspan=\"3\"\u003eArticle Clustering\u003c/th\u003e\n    \u003cth style=\"text-align:center;\" colspan=\"2\"\u003eCluster Merging\u003c/th\u003e\n    \u003cth style=\"text-align:center;\" colspan=\"3\"\u003eStandard\u003c/th\u003e\n    \u003cth style=\"text-align:center;\" colspan=\"3\"\u003eBCubed\u003c/th\u003e\n    \u003cth\u003e\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth style=\"text-align:left;\"\u003eVariant name\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003erank_th\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eents_th\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003etime_std\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003erank_th\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003etime_std\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eF1\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eP\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eR\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eF1\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eP\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eR\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eclusters\u003c/th\u003e\n  
\u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd style=\"text-align:left;\"\u003eWAC\u003csub\u003eMONO\u003c/sub\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.5\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.7\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e87.00\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e98.45\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e77.95\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e85.42\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e93.04\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e78.95\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e1066\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd style=\"text-align:left;\"\u003eWAC\u003csub\u003eMONO\u003c/sub\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.6\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.7\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e69.50\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e98.71\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e53.63\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e81.08\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e94.14\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e71.20\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e1108\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd 
style=\"text-align:left;\"\u003eWAC\u003csub\u003eMONO+NER\u003c/sub\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.5\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.2\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.7\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e85.02\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e98.52\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e74.77\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e84.78\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e93.51\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e77.54\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e1089\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd style=\"text-align:left;\"\u003eWAC\u003csub\u003eMONO+NER\u003c/sub\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.6\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.2\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.7\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e67.23\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e98.12\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e51.14\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e79.72\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e93.80\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e69.32\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e1109\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd style=\"text-align:left;\"\u003eWAC\u003csub\u003eMULTI\u003c/sub\u003e\u003c/td\u003e\n    
\u003ctd style=\"text-align:center;\"\u003e0.5\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.7\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e92.20\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e98.55\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e86.62\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e86.67\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e92.94\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e81.20\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e1074\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd style=\"text-align:left;\"\u003eWAC\u003csub\u003eMULTI\u003c/sub\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.6\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.7\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e74.43\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e98.81\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e59.70\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e81.98\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e94.00\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e72.68\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e1112\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n#### Cluster merging assessment analysis\n\nTo evaluate the impact 
the cluster merging process has on the algorithm’s performance, we compare the WAC algorithm variants to those where the cluster merging phase was not performed. Note that we compare only the WAC\u003csub\u003eMULTI\u003c/sub\u003e variant, as it already generates multilingual clusters during the article clustering phase.\n\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth style=\"text-align:center;\" colspan=\"1\"\u003e\u003c/th\u003e\n    \u003cth style=\"text-align:center;\" colspan=\"3\"\u003eArticle Clustering\u003c/th\u003e\n    \u003cth style=\"text-align:center;\" colspan=\"2\"\u003eCluster Merging\u003c/th\u003e\n    \u003cth style=\"text-align:center;\" colspan=\"3\"\u003eStandard\u003c/th\u003e\n    \u003cth style=\"text-align:center;\" colspan=\"3\"\u003eBCubed\u003c/th\u003e\n    \u003cth\u003e\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth style=\"text-align:left;\"\u003eVariant name\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003erank_th\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eents_th\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003etime_std\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003erank_th\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003etime_std\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eF1\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eP\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eR\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eF1\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eP\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eR\u003c/th\u003e\n    \u003cth style=\"text-align:center;\"\u003eclusters\u003c/th\u003e\n  \u003c/tr\u003e\n\n  \u003ctr\u003e\n    \u003ctd style=\"text-align:left;\"\u003eWAC\u003csub\u003eMULTI\u003c/sub\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.5\u003c/td\u003e\n    \u003ctd 
style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.7\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e92.20\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e98.55\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e86.62\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e86.67\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e92.94\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e81.20\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e1074\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd style=\"text-align:left;\"\u003eWAC\u003csub\u003eMULTI/MERGE\u003c/sub\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.5\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e56.04\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e98.71\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e39.12\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e71.14\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e96.98\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e56.17\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e2339\u003c/td\u003e\n  \u003c/tr\u003e\n\n  \u003ctr\u003e\n    \u003ctd style=\"text-align:left;\"\u003eWAC\u003csub\u003eMULTI\u003c/sub\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.6\u003c/td\u003e\n 
   \u003ctd style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.7\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e74.43\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e98.81\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e59.70\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e81.98\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e94.00\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e72.68\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e1112\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd style=\"text-align:left;\"\u003eWAC\u003csub\u003eMULTI/MERGE\u003c/sub\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e0.6\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e3\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e-\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e24.28\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e99.40\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e13.83\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e47.10\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e\u003cb\u003e99.04\u003c/b\u003e\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e31.59\u003c/td\u003e\n    \u003ctd style=\"text-align:center;\"\u003e4675\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n\n\n## 📣 Acknowledgments\n\nThis work is developed by [Department of Artificial Intelligence][ailab] at [Jozef Stefan Institute][ijs].\n\nThis work was supported by the Slovenian Research Agency, and the 
European Union's Horizon 2020 project Humane AI Net [[H2020-ICT-952026]].\n\n[python]: https://www.python.org/\n[git]: https://git-scm.com/\n[ailab]: http://ailab.ijs.si/\n[ijs]: https://www.ijs.si/\n[H2020-ICT-952026]: https://cordis.europa.eu/project/id/952026\n\n[^1]: S. Miranda, A. Znotiņš, S. B. Cohen, and G. Barzdins, “Multilingual clustering of streaming news” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 2018, pp. 4535–4544.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feriknovak%2Fwac","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feriknovak%2Fwac","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feriknovak%2Fwac/lists"}