{"id":15093311,"url":"https://github.com/netesf13d/conversations-analysis","last_synced_at":"2026-01-04T17:09:09.589Z","repository":{"id":255563840,"uuid":"850047723","full_name":"netesf13d/conversations-analysis","owner":"netesf13d","description":"Load and analyze Facebook Messenger and Whatsapp conversations","archived":false,"fork":false,"pushed_at":"2024-10-21T21:25:59.000Z","size":6377,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-27T12:11:27.187Z","etag":null,"topics":["analytics","conversations-analysis","facebook","messenger","statistics","whatsapp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/netesf13d.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-30T19:11:33.000Z","updated_at":"2024-10-21T21:26:03.000Z","dependencies_parsed_at":"2024-09-06T03:27:18.645Z","dependency_job_id":"fbdaea5c-86bb-47dd-a405-0f9fc33eed1a","html_url":"https://github.com/netesf13d/conversations-analysis","commit_stats":{"total_commits":21,"total_committers":1,"mean_commits":21.0,"dds":0.0,"last_synced_commit":"a9a22787361b45429ff8d8965747ee4944a3887c"},"previous_names":["netesf13d/conversations-analysis"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/netesf13d%2Fconversations-analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/netesf13d%2Fconversations-analysis/tags","release
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/netesf13d%2Fconversations-analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/netesf13d%2Fconversations-analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/netesf13d","download_url":"https://codeload.github.com/netesf13d/conversations-analysis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244959283,"owners_count":20538624,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics","conversations-analysis","facebook","messenger","statistics","whatsapp"],"created_at":"2024-09-25T11:21:11.356Z","updated_at":"2026-01-04T17:09:09.556Z","avatar_url":"https://github.com/netesf13d.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# conversations-analysis\n\nA package to analyze conversations, from file loading to data visualization. You can get insights into the participants' activity, their usage of media, words, and reactions, and much more. Conversation manipulation and data analysis are essentially agnostic to the messaging application used (apart from data import).\n\nMessaging applications currently supported are:\n* Facebook Messenger archives, in both JSON and HTML format. Support is only partial for HTML (media info is not loaded).\n* Whatsapp archives in text format. 
These, however, suffer from several limitations:\n  - Reactions are absent from the exported archive (a Whatsapp feature)\n  - Some components of text archives are locale dependent: date formatting and the text tokens marking attached and missing files. I set those I could, but yours may well be missing. Feel free to ask for an update.\n\n\n## Detailed overview\n\nThe package features:\n* Conversation loading and instantiation from files (subpackage `conversations`)\n  - From Facebook Messenger archives. Both JSON and HTML archives are supported, although support for HTML is only partial.\n  - From Whatsapp text archives\n* Conversation manipulation and data extraction (subpackage `conversations`)\n  - Search in conversation messages, filter the messages\n  - Extract media information\n  - Extract raw conversation statistics\n* Analysis of conversation data (module `analysis`)\n  Centered around pandas DataFrame manipulation. Some standard procedures are implemented.\n  - The sum of the various entries\n  - The time-binned sum of the various entries\n  - The rolling sum of the various entries\n* Data visualization (module `plot`)\n  Plotting functions adapted to the three different analysis functions.\n  - Pie charts for the simple sums\n  - Bar plots for binned sums\n  - Stack plots for rolling sums\n\n\n## Example usage\n\nFor a thorough study of an example dummy conversation, two Jupyter notebooks are available [here](https://github.com/netesf13d/conversations-analysis/tree/main/examples); they can easily be adapted to your own archive files. Nevertheless, using the package to get a quick analytics report on a conversation is easy!\n\n### Importing a conversation and exporting data for analysis\n\nYou can download your conversations archive from [https://accountscenter.facebook.com/info_and_permissions/dyi/](https://accountscenter.facebook.com/info_and_permissions/dyi/). Prefer the JSON format, which is better supported. 
After unzipping, your conversations are located in `facebook-\u003cyourname\u003e\u003crandomnumber\u003e/your_activity_across_facebook/messages/inbox/`. Just provide the path to your conversation to instantiate a `MessengerConversation`.\n```\nfrom conversation_analysis import (MessengerConversation, ConversationStats,\n                                   pie_plot, bar_plot, stack_plot)\n\n# conv_paths is the list of paths where your conversation archive is located\n# it may be distributed over multiple archive files, for example\nconv_paths = ['facebook-user1234_archive2017/your_activity_across_facebook/messages/inbox/myconversation',\n              'facebook-user1234_archive2018/your_activity_across_facebook/messages/inbox/myconversation']\nconv = MessengerConversation.from_facebook_json(conv_paths)\nmessages_data = conv.messages_data()\n```\n\nExport message statistics and use them to instantiate a `ConversationStats` object, which provides easy-to-use methods to compute various quantities, as shown below.\n```\ntimestamp = messages_data.pop('timestamp')\ngroup = messages_data.pop('sender')\nmessages_stats = ConversationStats(timestamp, group, data=messages_data)\n```\n\n\n### Getting global conversation stats\n\nThe `ConversationStats.sum` method, as its name suggests, sums the various quantities to get global statistics, suitable for plotting in a pie chart.\n```\ndf = messages_stats.sum() # pandas DataFrame\nfig, axs = pie_plot(df)\n```\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://github.com/netesf13d/conversations-analysis/blob/main/examples/figures_and_data/msg_pc_participant.png\" width=\"600\" /\u003e\n\u003c/p\u003e\n\n\n### Getting hourly messages stats\n\nThe `ConversationStats.binned_sum` method does the same, except that it first bins the data according to the selected post-date binning entry. The example here bins the data by `'hour'`, but it could also be by `'day_name'`, `'month_name'`, `'year'`, and so on. 
Such data is well represented by a bar plot.\n```\ndf = messages_stats.binned_sum(binning_entries=('hour',), groups=None, timespan=None)\nfig, ax = bar_plot(df['has_content'])\n```\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://github.com/netesf13d/conversations-analysis/blob/main/examples/figures_and_data/msg_bp_participants_hour.png\" width=\"600\" /\u003e\n\u003c/p\u003e\n\n\n### Time evolution of the number of messages sent\n\nThe `ConversationStats.rolling_sum` method provides more advanced processing of the data by computing a windowed rolling sum. This can be used to get fine-grained information on the time evolution of various statistics, which is well represented by a stack plot.\n```\ndf = messages_stats.rolling_sum(sampling_freq='5D', window_size=10,\n                                win_type='gaussian', win_args={'std': 2})\nfig, ax = stack_plot(df['has_content'], baseline='wiggle',\n                     timescale='day', xlabel_strftime='%Y-%m')\n```\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://github.com/netesf13d/conversations-analysis/blob/main/examples/figures_and_data/msg_sp_participants_whole.png\" width=\"600\" /\u003e\n\u003c/p\u003e\n\n\n### Computing word counts statistics\n\nFinally, you can compute other stats easily: reaction usage, media usage, who receives reactions, etc. The functions above can be easily adapted to any quantity that you can think of. 
For example, here is how to compute and plot word count statistics.\n```\nfrom conversation_analysis import word_count_dataframe\n\nwd_counts = conv.word_counts(groups=None, casefold=True, remove_diacritics=True)\nwords = ['road', 'parameter', 'astronaut', 'media', 'strongly', 'call']\nword_count_df = word_count_dataframe(wd_counts, words)\nfig, axs = pie_plot(word_count_df)\n```\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://github.com/netesf13d/conversations-analysis/blob/main/examples/figures_and_data/word_pc.png\" width=\"600\" /\u003e\n\u003c/p\u003e\n\n\n## Dependencies\n\n- [numpy](https://numpy.org/)\n- [pandas](https://pandas.pydata.org/)\n- [matplotlib](https://matplotlib.org/)\n\n\n## Notes\n\nThe type annotations in the code are by no means rigorous. They are meant to clarify the nature of the various parameters.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnetesf13d%2Fconversations-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnetesf13d%2Fconversations-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnetesf13d%2Fconversations-analysis/lists"}