{"id":19007716,"url":"https://github.com/mirong1707/social_graph","last_synced_at":"2026-06-19T08:31:09.304Z","repository":{"id":160990196,"uuid":"500607997","full_name":"Mirong1707/Social_Graph","owner":"Mirong1707","description":"📊 Exploring socially strongly connected components","archived":false,"fork":false,"pushed_at":"2023-06-05T13:20:31.000Z","size":36,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-21T14:49:25.261Z","etag":null,"topics":["gephi","graphs","python","social-network"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Mirong1707.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-06-06T22:09:42.000Z","updated_at":"2024-02-02T16:53:32.000Z","dependencies_parsed_at":null,"dependency_job_id":"69ba3c67-2d74-40b5-92ca-536e9ef49170","html_url":"https://github.com/Mirong1707/Social_Graph","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Mirong1707/Social_Graph","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Mirong1707%2FSocial_Graph","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Mirong1707%2FSocial_Graph/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Mirong1707%2FSocial_Graph/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Mirong1707%2FSocial_Graph/manifests","owner_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/owners/Mirong1707","download_url":"https://codeload.github.com/Mirong1707/Social_Graph/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Mirong1707%2FSocial_Graph/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34523982,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-19T02:00:06.005Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gephi","graphs","python","social-network"],"created_at":"2024-11-08T18:39:09.132Z","updated_at":"2026-06-19T08:31:09.248Z","avatar_url":"https://github.com/Mirong1707.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Social_Graph\nThis is a social network research project 👥. \u003cbr/\u003e\u003cbr/\u003e\n[Parsing from social network](#parsing)\u003cbr/\u003e\n[Build graph of friends](#building)\u003cbr/\u003e\n[Graph visualization](#vis)\u003cbr/\u003e\n[Graph analyzing](#analyzing)\u003cbr/\u003e\n[Strongly connected components](#strong)\u003cbr/\u003e\n[Algorithm description and correctness](#algo)\u003cbr/\u003e\n[Histograms](#histogram)\u003cbr/\u003e\n\nPurpose: to find, cluster and classify the strongly connected components in a graph of friends from a social network, using Python and Gephi. 
\u003cbr/\u003e \u003cbr/\u003e \u003cbr/\u003e\n\n\n\n\u003ca name=\"parsing\"\u003e\u003c/a\u003e\n### Parsing from social network\nTo collect the people to analyse, we first generate an easy-to-analyse JSON file of user ids. The following is one way of parsing VKontakte: it downloads the members of a group in pages of 1000 ids.\n\n```python\nimport requests\nimport json\n\nall_id = []\n\n# Fetch the group members page by page, 1000 ids per request\nfor i in range(13):\n    print(i)\n    res = requests.get(f\"https://api.vk.com/method/groups.getMembers?group_id=45\u0026offset={i * 1000}\u0026count=1000\"\n                       f\"\u0026access_token=4b32289d4b32289d4b32289d1d4b40471c44b324b32289d15e80a8caa89b73d1d2a5579\u0026v=5.107\")\n    all_id.extend(json.loads(res.text)['response']['items'])\n\n# Save all collected ids to a single JSON file\nwith open('ids.json', 'w') as f:\n    f.write(json.dumps({'ids': all_id}))\n```\n\u003ca name=\"building\"\u003e\u003c/a\u003e \n### Build graph of friends\nNext, the JSON was converted into two handy text files. One lists all the vertices of the graph; the other describes one edge per line as a pair of vertex ids. The code below produces the edge file.\n```python\ntxt = \"test.txt\"\nf = open(txt).read()\nans = open('rebra.txt', 'w')\nflag = False  # True while inside a run of digits\nk = 0  # number currently being read\ncheck = 1  # 1: the next number opens an edge line, -1: it closes one\nfor i in f:\n    if not (ord('0') \u003c= ord(i) \u003c= ord('9')):\n        # A non-digit ends the current number, so write it out\n        if k != 0:\n            if check == 1:\n                ans.write(str(k) + ' ')\n            else:\n                ans.write(str(k) + '\\n')\n            check *= -1\n        k = 0\n        flag = False\n    if not flag and ord('0') \u003c= ord(i) \u003c= ord('9'):\n        flag = True\n    if flag and ord('0') \u003c= ord(i) \u003c= ord('9'):\n        k = k * 10 + int(i)\nans.close()\n\n```\n\n\u003ca name=\"vis\"\u003e\u003c/a\u003e \n### Graph visualization\nA table of vertices and edges was then generated in an xlsx file for further analysis. The table generation code is given below, in the Histograms section, since nearly the same code serves both tasks. These tables can now be imported into Gephi, a graph visualizer. 
This is what the whole graph looks like. A few giant vertices stand out immediately (the size of a vertex is proportional to its degree). These are the pages of two teachers, https://vk.com/id238683 and https://vk.com/id133883. They are closely involved in organising various extracurricular activities at the school and therefore communicate a lot with every class, which is why they have so many connections. Janina Yadova, for example, gave us the opportunity to get into this internship. \n![image](https://user-images.githubusercontent.com/31445859/172258062-c489ede5-cd30-4a30-ae55-b4eb10cb6f66.png)\n\n\nThe nodes with the most connections are usually teachers, so it is not difficult to tell them apart from everyone else. \nNext, I did a little investigation by hand. Because a physical layout model is applied to the graph, communities stand out as clusters of vertices. I chose a small subgraph and characterised each group by contacting some of its members and finding out how they were connected to our school.\n![image](https://user-images.githubusercontent.com/31445859/172258080-7ed0d39b-d37e-4a63-89bd-7c8b51cc42f8.png)\n\n\u003ca name=\"analyzing\"\u003e\u003c/a\u003e \n### Graph analyzing\nI determined each person's class directly by asking everyone on Facebook, under the pretext of research. Group 1 in the picture is the community of the current grades 11-1 and 11-2. They are so closely connected because they were mixed together over the course of their studies. That is, when we moved from grade 7 to grade 8, we were roughly re-sorted by interest (maths or physics), but the old connections remained. By and large the two classes are pretty close, speaking as a student of one of them. Number 2 is grade 11-3, number 3 is grade 11-5, number 4 is grade 11-7, and the group in question is 3 people from 11-4. 
Their class is badly scattered, since many of its members' pages are either not subscribed to the group or closed. As a result, only a few of them appear, surrounded by other classes of the same year, and the rest are either spread among other classes or missing from the graph altogether. As you can see, the communities are quite clearly formed, and separating them manually is not a problem. \n\n\u003ca name=\"strong\"\u003e\u003c/a\u003e \n### Strongly connected components\nBut separating everything by hand is slow and impractical. Therefore, I devised an algorithm that selects a class given the id of one of its members. That is, it selects the strongly connected component to which that member belongs. Here is the code that does this.\n```python\nimport requests\nimport json\n\n\ndef findFriend(id, da):\n    # Neighbours of the given vertex (d is the global adjacency dict built below)\n    ss = set(d[id])\n    # ds[i]: number of i's neighbours that are also neighbours of the given vertex\n    ds = {}\n    for i in ss:\n        ds[i] = 0\n        for j in d[i]:\n            if j in ss:\n                ds[i] += 1\n\n    # Rank the neighbours by that count, highest first\n    anss = sorted(((ds[i], i) for i in ss), reverse=True)\n    # Add a geometrically decaying score to every ranked candidate\n    k = 1\n    for i in anss:\n        if i[1] in da:\n            da[i[1]] += k\n            k *= 0.95\n\n\ndef GetNameById(user_id):\n    # Look up the user's first and last name through the VK API\n    res = requests.get(f\"https://api.vk.com/method/users.get?user_ids={user_id}\u0026\"\n                       f\"access_token=4b32289d4b32289d4b32289d1d4b40471c44b324b32289d15e80a8caa89b73d1d2a5579\u0026v=5.107\u0026lang=ru\")\n    res = json.loads(res.text)['response'][0]\n    return res['first_name'] + ' ' + res['last_name']\n\n\n# Read the edge list, skipping blank lines such as a trailing one\nf = [line for line in open(\"rebra.txt\").read().split(\"\\n\") if line.strip()]\nd = {}  # adjacency lists keyed by vertex id\nst = set()\nfor ind, i in enumerate(f):\n    # Parse the two vertex ids digit by digit\n    a = 0\n    b = 0\n    j = 0\n    flag = False\n    while j \u003c len(i):\n        if i[j] == ' ':\n            flag = True\n            j += 1\n            continue\n        if not flag:\n            a = a * 10 + int(i[j])\n        
else:\n            b = b * 10 + int(i[j])\n        j += 1\n    st.add(a)\n    st.add(b)\n    if d.get(a) is None:\n        d[a] = []\n    if d.get(b) is None:\n        d[b] = []\n    # Store the edge in both directions\n    d[a].append(b)\n    d[b].append(a)\n# Start from one known member and treat all of their friends as candidates\nid0 = 420403096\nsuspected = set()\nfor i in d[id0]:\n    suspected.add(i)\nds = {}\ndans = {}\nfor i in suspected:\n    dans[i] = 0\nii = 0\nfindFriend(id0, dans)\nans = []\nfor i in suspected:\n    ans.append((dans[i], i))\nans = sorted(ans, reverse=True)\n# Repeat the scoring starting from the best-ranked candidates\nfor i in ans:\n    findFriend(i[1], dans)\n    if ii \u003e 10:\n        break\n    ii += 1\n\n# Normalise each score by the candidate's degree and print the top of the ranking\nans = []\nfor i in suspected:\n    ans.append((dans[i] / (len(d[i]) + 5), i))\nans = sorted(ans, reverse=True)\njj = 0\nfor i in ans:\n    print(i[0], GetNameById(i[1]), jj)\n    jj += 1\n    if jj \u003e 30:\n        break\n\n\n```\n\u003ca name=\"algo\"\u003e\u003c/a\u003e \n### Algorithm description and correctness\nIn a nutshell, the algorithm takes all vertices connected to the vertex in question and then picks out the first 30 members that are closely connected to each other. The same procedure is also applied to several of that vertex's friends, in order to exclude false components to which the vertex is connected by only a couple of edges.\nI applied this algorithm to my own page and got the following result. It is 90% correct: the list also includes our class teacher (who may arguably be counted as a class member) and a couple of people from 11-2, with whom, as I said before, we have close social ties.\n![image](https://user-images.githubusercontent.com/31445859/172258100-a2ae4a3e-0c3c-456b-a5cb-fd99a92a015f.png)\n\n\u003ca name=\"histogram\"\u003e\u003c/a\u003e \n### Histograms\nNext, a histogram was built. 
Here is the code that generates the xlsx table I referred to above.\n```python\nfrom openpyxl import load_workbook\n\nwb = load_workbook('t1.xlsx')\nsheet = wb['test']  # replaces the deprecated get_sheet_by_name()\n# Read the edge list, skipping blank lines such as a trailing one\nf = [line for line in open(\"rebra.txt\").read().split(\"\\n\") if line.strip()]\nd = {}  # degree of each vertex\ndd = {}  # edges already counted, keyed by (smaller id, larger id)\nst = set()  # all vertex ids\nfor ind, i in enumerate(f):\n    # Parse the two vertex ids digit by digit\n    a = 0\n    b = 0\n    j = 0\n    flag = False\n    while j \u003c len(i):\n        if i[j] == ' ':\n            flag = True\n            j += 1\n            continue\n        if not flag:\n            a = a * 10 + int(i[j])\n        else:\n            b = b * 10 + int(i[j])\n        j += 1\n    if a \u003e b:\n        a, b = b, a\n    # Count each undirected edge only once\n    if dd.get((a, b)) is None:\n        dd[(a, b)] = 1\n        st.add(a)\n        st.add(b)\n        if d.get(a) is None:\n            d[a] = 0\n        if d.get(b) is None:\n            d[b] = 0\n        d[a] += 1\n        d[b] += 1\n# d1[k]: number of vertices whose degree is k\nst1 = set()\nd1 = {}\nfor i in st:\n    st1.add(d[i])\n    if d1.get(d[i]) is None:\n        d1[d[i]] = 0\n    d1[d[i]] += 1\n# Write one row per degree, in ascending order, starting from row 2\nans1 = []\niii = 0\nfor i in sorted(st1):\n    ans1.append([i, d1[i]])\n    sheet[f'A{iii + 2}'] = i\n    sheet[f'B{iii + 2}'] = d1[i]\n    iii += 1\nprint(ans1)\n\nwb.save('t1.xlsx')\n\n```\nThe resulting table has two columns: the first holds the number of links, and the second the number of nodes with that many links. A chart was plotted from this data.\n![image](https://user-images.githubusercontent.com/31445859/172258118-6ea3dcba-9373-4498-a374-bda3e10d38db.png)\n\nIt is hard to say why the distribution looks the way it does. It is especially strange that there are so many nodes with so few connections. In my opinion, this is because people interested in our school subscribe to the group, but that interest rarely leads to the person or their children actually coming here, and they forget to unsubscribe from the group. 
Note that the chart omits the first point, for 0 connections, where there are about 2500 nodes, which supports the above theory.\nThe age distribution of the group's members was studied in the same way. The table is built by the same method. Here is the resulting chart. It is clearly not exact, since many people give an incorrect age (over 100 years), but on average the distribution makes sense.\n![image](https://user-images.githubusercontent.com/31445859/172258133-60b67dc4-e6ac-4ea5-99d9-7444517c686d.png)\n\nThis concludes the study. Thank you for your attention.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmirong1707%2Fsocial_graph","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmirong1707%2Fsocial_graph","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmirong1707%2Fsocial_graph/lists"}