{"id":26933184,"url":"https://github.com/xiaohan2012/signed-local-community","last_synced_at":"2025-04-02T09:17:47.951Z","repository":{"id":48130762,"uuid":"236661763","full_name":"xiaohan2012/signed-local-community","owner":"xiaohan2012","description":"Code for paper \"Searching for polarization in signed graphs: a local spectral approach\" (published at WebConf 2020)","archived":false,"fork":false,"pushed_at":"2024-02-03T12:59:45.000Z","size":157880,"stargazers_count":9,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-04-14T18:06:59.803Z","etag":null,"topics":["community-detection","convex-optimization","duality-theory","graph-algorithms","graph-mining","linear-algebra","local-community-detection","paper","polarization","political-polarization","python","research","semidefinite-programming","signed-graph","social-network-analysis","spectral-methods","webconf"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xiaohan2012.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null}},"created_at":"2020-01-28T05:02:08.000Z","updated_at":"2024-02-03T12:58:54.000Z","dependencies_parsed_at":"2022-08-12T19:20:28.893Z","dependency_job_id":null,"html_url":"https://github.com/xiaohan2012/signed-local-community","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiaohan2012%2Fsigned-local-community","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiaohan2012%2Fsigned-local-community/tag
s","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiaohan2012%2Fsigned-local-community/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiaohan2012%2Fsigned-local-community/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xiaohan2012","download_url":"https://codeload.github.com/xiaohan2012/signed-local-community/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246785481,"owners_count":20833498,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["community-detection","convex-optimization","duality-theory","graph-algorithms","graph-mining","linear-algebra","local-community-detection","paper","polarization","political-polarization","python","research","semidefinite-programming","signed-graph","social-network-analysis","spectral-methods","webconf"],"created_at":"2025-04-02T09:17:46.574Z","updated_at":"2025-04-02T09:17:47.017Z","avatar_url":"https://github.com/xiaohan2012.png","language":"Jupyter Notebook","readme":"# Searching for polarization in signed graphs: a local spectral approach, WebConf 2020\n\nSee the paper ([Arxiv version](https://arxiv.org/pdf/2001.09410.pdf))  \n\n\n# Software dependency\n## install python packages\n\nMake sure you have [conda](https://docs.conda.io/en/latest/) installed.\n\nThen run `conda env create -f environment.yml` to create the virtual environment.\n\nActivate it by `conda activate polar`\n\n\n## Install database (optional)\n\nWe use [postgres](https://www.postgresql.org/) to store results 
of experiments that are 1) repeated many times and 2) relatively time-consuming to run.\nFor example, results from seeding on real-world graphs are stored in the database.\n\nYou don't need the database if you only call the API in Python (`core.py`).\nHowever, if you want to reproduce the results in the paper, you need it.\n\n## testing\n\nrun `pytest test*.py` (you might see a few failures, usually 1 from `test_core.py`, due to numerical instability)\n\n# API usage \n\na typical example of calling functions in Python is:\n\n```python\nimport networkx as nx\nfrom core import query_graph_using_sparse_linear_solver, sweep_on_x_fast\n# read the graph\ng = nx.read_gpickle('{path_of_graph}')\n\n# compute the optimal x vector\nx, obj_val = query_graph_using_sparse_linear_solver(g, [seeds1, seeds2], kappa=0.9, verbose=0, ub=g.graph['lambda1'])\n\n# sweep on x to find C1 and C2\nC1, C2, C, best_t, best_sbr, ts, sbr_list = sweep_on_x_fast(g, x, top_k=100)\n\nprint('community 1', C1)\nprint('community 2', C2)\n```\n\n# Command line usage\n\n## run queries on graphs\n\ncurrently, two query modes are supported:\n\n- query a single seed: run a batch of single-seed queries on graph: `query_single_seed_in_batch.py`\n  - save commands that query single seed (for parallel computing): `python3 print_query_single_seed_commands.py {graph} \u003e cmds/{graph}.txt`\n- query a seed pair (one node from each polarized side): run a batch of seed-pair queries on graph: `query_seed_pair_in_batch.py`\n  - save commands that query seed pairs: `python3 print_query_seed_pair_commands.py {graph} \u003e cmds/{graph}_pairs.txt`\n\nNote that the above two commands require postgres to be installed.\n\n## exporting results from database \n\nuse `export_single_seed_result_from_db.py` or `export_pair_result_from_db.py`\n\nThe output is in `pandas.DataFrame` format. 
\n\n## useful scripts\n\n- `augment_result.py`: augment our result by various graph statistics\n\n### pre/post-processing scripts for [FOCG, KDD 2016](https://www.kdd.org/kdd2016/papers/files/rpp0799-chuA.pdf)\n\n- `prepare_data_for_matlab.py`: convert graph to Matlab format for FOCG to use\n- `augment_focg_result.py`: augment FOCG result by various graph statistics\n\n\n## scalability evaluation\n\n- run `scalability_evaluation.py`\n\n# Reproducing the figures/tables in the submission\n\n## data pre-processing\n\n- run `preprocess_graph.py` (remember to change the variable `graph`)\n- or you can use the processed ones under `graphs/{graph}.pkl`\n\n## Figure 1: the motivation plot\n\nrun `intro-plot.ipynb`\n\n## Table 2: graph statistics\n\nrun `graph_stat_table.ipynb`\n\n## Figure 4: synthetic graph experiment \n\nrun the following (it takes ~1.5 hours on an 8-core machine in total):\n\n- effect of noise parameter: `run_experiment_effect_of_eta.py`\n- effect of number of outlier nodes: `run_experiment_effect_of_outlier_size.py`\n- effect of number of seeds: `run_experiment_effect_of_seed_size.py`\n\nthen, make the plot using `experiment_on_synthetic_graphs.ipynb`\n\n## Figure 5: real graph experiment\n\nYou need to have postgres installed in order to save the results. 
\n\n**run PolarSeeds**\n\nDo the following for all graphs (word, bitcoin, epinions, etc.):\n\n- run `python3 print_query_seed_pair_commands.py {graph_name} \u003e {cmd_list}.txt` to get the list of execution commands\n  - `{cmd_list}.txt` will contain the list of commands to run to get the result\n- run `python3 export_pair_result_from_db.py` to export the data (remember to change the `graph` variable in the script)\n- run `python3 augment_pair_result.py {graph_name}` to add evaluation metric values\n\n**run FOCG**\n\n- before running FOCG, preprocess the graphs so they're Matlab-compatible: run `prepare_data_for_matlab.py` (remember to update the `graph` variable)\n- check [this repository](https://github.com/xiaohan2012/KOCG.SIGKDD2016) and run the file `DemoRun.m` in Matlab\n  - make sure the input graphs from the previous step are in the right paths\n  - copy the output `.mat` file under `outputs/focg-{graph_name}.mat`\n- run `python3 augment_focg_result.py {graph_name}` to add evaluation metric values\n\n**make the plot**\n\nrun `FOCG-vs-PolarSeeds.ipynb` to make the plot\n\n## Figure 6: case studies\n\n- (a) and (b): run `case-study-overlapping-community.ipynb`\n- (c): run `case-study-distrust-radiation.ipynb`\n\n# Jupyter notebooks along the way\n\nthe following notebooks are records of the thought process and how the project has evolved:\n\n- `signed-laplacian-eigen-value.ipynb`: demo of what the bottom-most eigenvector looks like on a toy graph (to better understand signed spectral theory in general)\n- `proof-of-concept.ipynb`: the very early one that demos how this method works for small toy graphs and some investigation on the effect of kappa\n- `experiment_on_synthetic_graphs.ipynb`: effect of different parameters on synthetic graphs\n- `tuning-kappa.ipynb`: trying to better understand the effect of `kappa` on synthetic graphs\n- `binary-search-on-alpha.ipynb`: binary search on alpha plus conjugate gradient method to solve the program\n- `fast_sweeping.ipynb`: 
efficient way to sweep on `x` (reduces time cost by orders of magnitude)\n- `case-study-on-word-graph.ipynb`: manually checking the result on word graph + some visualization\n- `why-constraint-not-tight.ipynb`: for some nodes, typically with small degrees, `alpha` tends to be very close to `lambda_1`, making the constraint not tight\n- `explore-seed-pair-query-result.ipynb`: checking query result on real graphs (some statistics and viz)\n- `explore-fog-result-on-real-graphs.ipynb`: checking query result by [FOCG, KDD 2016](https://dl.acm.org/citation.cfm?id=2939672.2939855) on real graphs\n- `FOCG-vs-PolarSeeds.ipynb`: comparing [FOCG, KDD 2016](https://dl.acm.org/citation.cfm?id=2939672.2939855) with our method\n- `dig-out-more-communities-on-word-graph.ipynb`: find out more polarized communities on \"word\" graph\n- `case-study-overlapping-community.ipynb`: demo on overlapping communities in \"word\" graph\n- `case-study-distrust-radiation.ipynb`: case study of distrust radiation\n- `intro-plot.ipynb`: plots of the motivational example\n\n# Misc\n\n### notes on sbatch ([Aalto Triton](https://scicomp.aalto.fi/triton/) utility script)\n\n- edit `sbatch_query_single_seed_in_batch.sh`  and make sure to update the following:\n  - `--array=1-{n}`, where `n` is the number of commands to run (use `wc -l {cmds_path.txt}` to get that number)\n  - `graph=\"{graph_name}\"`: set the graph name accordingly\n  - in addition, number of cpus, memory requirement, max running time can be set\n- submit the job by `sbatch sbatch_run_queries_in_batch.sh`\n- the same applies to the other query mode, corresponding to file `sbatch_query_pairs_in_batch.sh`\n\n# To cite this paper\n\n```bibtex\n@inproceedings{xiao2020searching,\n  title={Searching for polarization in signed graphs: a local spectral approach},\n  author={Xiao, Han and Ordozgoiti, Bruno and Gionis, Aristides},\n  booktitle={Proceedings of The Web Conference 2020},\n  pages={362--372},\n  year={2020}\n}\n```\n\t  
\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxiaohan2012%2Fsigned-local-community","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxiaohan2012%2Fsigned-local-community","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxiaohan2012%2Fsigned-local-community/lists"}