# On the Complexity of Opinions and Online Discussions

This is the code accompanying the paper:

> On the Complexity of Opinions and Online Discussions. Utkarsh Upadhyay, Abir De, Aasish Pappu, Manuel Gomez-Rodriguez. WSDM, 2019.

The code is organized as a collection of scripts that perform the various tasks used in the paper.
The key scripts are:

  1. Determining whether a discussion has a given dimension: [`2d_sat.py`](#2d_satpy)
  2. Filling in the comment-voter matrix using:
     1. our proposed polynomial algorithm: [`SR_fill.py`](#sr_fillpy)
     2. Z3 embeddings: [`z3_fill.py`](#z3_fillpy)
     3. our proposed exact rank method: [`low-rank-completion.py`](#low-rank-completionpy)
     4. soft-impute: [`soft-impute.py`](#soft-imputepy)
  3. Finding the embeddings for a partial binary matrix using Z3: [`matrix-Z3-embed.py`](#matrix-z3-embedpy)

## 2d_sat.py

    Usage: 2d_sat.py [OPTIONS] IN_FILE

      Reads data from IN_FILE with the following format:

            comment_tree_id, commenter_id, voter_id, vote_type
            1, 200, 3000, 1
            1, 201, 3000, -1
            ...

         or (in case of real data):

            r.reply_to        r.message_id    r.uid_alias     r.vote_type
            1         200     3000    UP
            1         201     3000    DOWN
            ...

      Outputs a CSV which indicates whether each comment/article tree has a
      sign-rank of 2 or not.

      Redirect output to a file to save it.

    Options:
      --dims INTEGER                  Which dimensional embedding to test for.
      --cpus INTEGER                  How many CPUs to use.
      --timeout INTEGER               Time after which to give up (ms).
      --real / --no-real              Assume the format of real data.
      --improve PATH                  Improve the results from the provided file.
                                      Will only run for `unknown` ids in the file.
      --context-id / --no-context-id  Use the context_id instead of
                                      comment_tree_id to group comments into
                                      matrices.
      --nrows INTEGER                 Number of rows from the CSV to read.
      --help                          Show this message and exit.

## SR_fill.py

    Usage: SR_fill.py [OPTIONS] IN_MAT_FILE OP_MAT_FILE OP_SC_FILE

      Read M_partial from IN_MAT_FILE and fill in the matrix using SR. The vote
      at (i, j) will be removed before filling in the matrix. The resulting
      matrix will be saved at OP_MAT_FILE and the given LOO entry will be placed
      along with the original vote in OP_LOO_FILE.

      Additionally, the best guess for the Sign Rank will be placed in
      OP_SC_FILE along with the source node which results in it.

    Options:
      -i INTEGER                    LOO i
      -j INTEGER                    LOO j
      --op-loo PATH                 Output path for the LOO.
      --seed INTEGER                Seed which was used to create this test-case.
      --min-avg / --no-min-avg      This flag causes minimization of the average
                                    SC instead of the worst-case SC. It is much
                                    faster.
      --transpose / --no-transpose  Whether to transpose the matrix or not.
      --help                        Show this message and exit.

## z3_fill.py

    Usage: z3_fill.py [OPTIONS] IN_MAT_FILE OP_MAT_FILE OP_LOO_FILE

      Read M_partial from IN_MAT_FILE and fill in the matrix using Z3. The vote
      at (i, j) will be removed before filling in the matrix. The resulting
      matrix will be saved at OP_MAT_FILE and the given LOO entry will be placed
      along with the original vote in OP_LOO_FILE.

    Options:
      -i INTEGER       LOO i
      -j INTEGER       LOO j
      --sat_2d TEXT    Is the matrix 2D-SAT w/o LOO?
      --sat_1d TEXT    Is the matrix 1D-SAT w/o LOO?
      --seed INTEGER   Seed which was used to create this test-case.
      --guess INTEGER  Whether to use the given guess {-1, +1} for doing LOO
                       prediction; 0 means no guessing.
      --help           Show this message and exit.

## low-rank-completion.py

    Usage: low-rank-completion.py [OPTIONS] IN_MAT_FILE

      Read M_partial from IN_MAT_FILE and optimize the embeddings to maximize
      the likelihood under the logit model.

    Options:
      --dims INTEGER              The dimensionality of the embedding.
      --seed INTEGER              The random seed to use for initializing
                                  matrices, in case initial values are not given.
      --suffix TEXT               Suffix to add before saving the embeddings.
      --init-c-vecs TEXT          File which contains the initial embedding of c_vecs.
      --init-v-vecs TEXT          File which contains the initial embedding of v_vecs.
      -i INTEGER                  Which i index to LOO.
      -j INTEGER                  Which j index to LOO.
      --alpha FLOAT               Bound on the spikiness of M.
      --sigma FLOAT               Variance of the (logistic) noise to add.
      --lbfgs / --no-lbfgs        Whether to use L-BFGS instead of BFGS.
      --loo-output TEXT           Where to save the LOO output.
      --loo-only / --no-loo-only  Whether to save only the LOO output or the
                                  complete recovered matrix.
      --uv / --no-uv              Whether to impose the alpha constraint on both U
                                  and V or on U.V^T.
      --verbose / --no-verbose    Verbose output.
      --help                      Show this message and exit.

## soft-impute.py

    Usage: soft-impute.py [OPTIONS] IN_MAT_FILE

      Read M_partial from IN_MAT_FILE and complete the matrix using the
      soft-impute method.

    Options:
      --dims INTEGER              The dimensionality of the embedding.
      --seed INTEGER              The random seed to use for initializing
                                  matrices, in case initial values are not given.
      --suffix TEXT               Suffix to add before saving the embeddings.
      -i INTEGER                  Which i index to LOO.
      -j INTEGER                  Which j index to LOO.
      --loo-output TEXT           Where to save the LOO output.
      --loo-only / --no-loo-only  Whether to save only the LOO output or the
                                  complete recovered matrix.
      --verbose / --no-verbose    Verbose output.
      --help                      Show this message and exit.

## matrix-Z3-embed.py

    Usage: matrix-Z3-embed.py [OPTIONS] MAT_FILE

      Read the partial matrix in MAT_FILE and save embeddings for the file to
      the `mat_file.commenters` and `mat_file.voters` files.

    Options:
      --dim INTEGER      What dimension to use while splitting the matrix.
      --timeout INTEGER  What timeout to use (minutes).
      --help             Show this message and exit.

# Requirements

These Python packages are required:

  - `click`
  - `cvxpy`
  - `dccp`
  - `numpy`
  - `decorated_options`
  - `seaborn`
  - `z3-solver`
  - `networkx`
  - `pqdict`
  - `fancyimpute`

All of these can be installed using `pip`, and some of them are also available
on `conda` (preferred). If using `pip`, the `requirements.txt` file in the
`code` folder will be helpful.
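The simulated-data format accepted by `2d_sat.py` groups votes by `comment_tree_id` and pivots them into one comment-voter matrix per tree. A minimal sketch of that grouping (not the script's own code; `read_vote_matrices` is a name chosen here for illustration):

```python
import csv
import io
from collections import defaultdict

def read_vote_matrices(csv_text):
    """Group rows of the simulated-data format into one comment-voter
    matrix per comment_tree_id, with entries in {+1, -1} (0 = no vote)."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    next(reader)  # header: comment_tree_id, commenter_id, voter_id, vote_type
    votes = defaultdict(dict)          # tree_id -> {(commenter, voter): vote}
    for tree_id, commenter, voter, vote in reader:
        votes[tree_id][(commenter, voter)] = int(vote)
    matrices = {}
    for tree_id, entries in votes.items():
        commenters = sorted({c for c, _ in entries})
        voters = sorted({v for _, v in entries})
        matrices[tree_id] = [
            [entries.get((c, v), 0) for v in voters] for c in commenters
        ]
    return matrices

data = """comment_tree_id, commenter_id, voter_id, vote_type
1, 200, 3000, 1
1, 201, 3000, -1
1, 200, 3001, 1
"""
mats = read_vote_matrices(data)  # {"1": [[1, 1], [-1, 0]]}
```

Unobserved votes are encoded as 0 so the partial matrices can be fed to the completion steps described above.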
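`2d_sat.py` delegates the general dimension test to Z3, but the flavor of the problem can be seen in the simplest case: a partial ±1 matrix has sign rank 1 exactly when it can be written as `M[i][j] = s[i] * t[j]` with `s`, `t` in {+1, -1}, which reduces to checking that no cycle in the bipartite comment-voter graph multiplies to -1. A sketch of that check with a parity-tracking union-find (this is intuition only, not the paper's algorithm, and is weaker than the 1D-SAT test the scripts perform):

```python
from itertools import product

def has_sign_rank_one(M):
    """Check whether a partial matrix M (entries +1/-1, 0 = unobserved)
    can be written as M[i][j] = s[i] * t[j] with s[i], t[j] in {+1, -1}."""
    n_rows, n_cols = len(M), len(M[0])
    # Union-find over rows (0..n_rows-1) and columns (offset by n_rows),
    # tracking each node's sign relative to its set's root.
    parent = list(range(n_rows + n_cols))
    parity = [1] * (n_rows + n_cols)

    def find(x):
        if parent[x] == x:
            return x, 1
        root, sign = find(parent[x])
        parent[x], parity[x] = root, sign * parity[x]
        return parent[x], parity[x]

    for i, j in product(range(n_rows), range(n_cols)):
        if M[i][j] == 0:
            continue
        ri, si = find(i)
        rj, sj = find(n_rows + j)
        if ri == rj:
            if si * sj != M[i][j]:
                return False           # inconsistent (odd) cycle found
        else:
            # Merge: sign(ri) relative to rj follows from s_i * t_j = M_ij.
            parent[ri] = rj
            parity[ri] = M[i][j] * si * sj
    return True
```

For example, `[[1, -1], [1, -1]]` is consistent (voter 2 simply opposes everything), while `[[1, 1], [1, -1]]` is not.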
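All of the fill scripts share the same leave-one-out (LOO) protocol exposed through their `-i`/`-j` options: hide the vote at `(i, j)`, complete the matrix, and record the predicted entry next to the original vote. A sketch of that harness with a deliberately trivial row-majority filler standing in for the real methods (SR, Z3, soft-impute, low-rank completion); `majority_fill` and `leave_one_out` are names invented here:

```python
def majority_fill(M):
    """Placeholder completion: predict each missing entry from the sign of
    the sum of observed votes in its row, +1 on ties. The actual scripts
    use SR, Z3, soft-impute, or low-rank completion instead."""
    filled = [row[:] for row in M]
    for i, row in enumerate(M):
        row_sum = sum(v for v in row if v != 0)
        guess = 1 if row_sum >= 0 else -1
        for j, v in enumerate(row):
            if v == 0:
                filled[i][j] = guess
    return filled

def leave_one_out(M, i, j, fill_fn=majority_fill):
    """Hide the vote at (i, j), run the completion, and return the
    (original_vote, predicted_vote) pair the scripts write alongside
    their LOO outputs."""
    original = M[i][j]
    M_partial = [row[:] for row in M]
    M_partial[i][j] = 0                # 0 marks a missing vote
    predicted = fill_fn(M_partial)[i][j]
    return original, predicted

M = [[1, 1, -1],
     [1, -1, -1],
     [-1, 1, 1]]
```

Here `leave_one_out(M, 0, 1)` returns `(1, 1)` (a correct reconstruction), while `leave_one_out(M, 0, 2)` returns `(-1, 1)` (the naive filler gets it wrong).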
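`low-rank-completion.py` maximizes the likelihood of the observed votes under a logit model over comment and voter embeddings. The exact parameterization in the script is not shown here; the following sketch only illustrates the general shape of such an objective, assuming P(vote = +1) = sigmoid(c_i · v_j / sigma):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def log_likelihood(M, C, V, sigma=1.0):
    """Log-likelihood of the observed +1/-1 votes in M (0 = missing),
    assuming P(M_ij = +1) = sigmoid(c_i . v_j / sigma), where C and V hold
    the comment and voter embedding vectors. NOTE: this parameterization
    is an assumption for illustration, not the script's actual objective."""
    ll = 0.0
    for i, row in enumerate(M):
        for j, m in enumerate(row):
            if m == 0:
                continue
            score = sum(c * v for c, v in zip(C[i], V[j])) / sigma
            # For m in {+1, -1}, P(m | score) = sigmoid(m * score).
            ll += math.log(sigmoid(m * score))
    return ll
```

Embeddings whose inner products agree in sign with the observed votes score a higher log-likelihood than ones that disagree, which is what the optimizer exploits; `--sigma` scales how sharply the likelihood rewards large-margin agreements.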