Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/suji04/chat_entropy_analysis
A simple python script to find and compare WhatsApp chat entropy
- Host: GitHub
- URL: https://github.com/suji04/chat_entropy_analysis
- Owner: Suji04
- Created: 2019-05-05T05:10:21.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2019-05-06T14:38:36.000Z (over 5 years ago)
- Last Synced: 2024-10-24T10:07:26.646Z (about 2 months ago)
- Topics: data-science, entropy, python3, shannon-entropy, whatsapp
- Language: Jupyter Notebook
- Size: 140 KB
- Stars: 3
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Chat_Entropy_Analysis
A simple python script to find and compare WhatsApp chat entropy.

## Formula
Here I'm using the Shannon entropy formula to find the word-based chat entropy of WhatsApp messages (with 3 of my friends).
![](https://github.com/Suji04/Chat_Entropy_Analysis/blob/master/shannon_equation.jpg)
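For reference, the image above shows the standard word-based Shannon entropy, which in LaTeX form (with $p(w)$ the relative frequency of word $w$ in the chat, over the vocabulary $V$) is:

```latex
H = -\sum_{w \in V} p(w) \log_2 p(w)
```

Using base-2 logarithms, $H$ is measured in bits per word.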
## How to use?
1) Download this notebook
2) Export your WhatsApp chats (without media) as a .txt file. You can find the export option in WhatsApp (top-left corner --> more options).
3) Save the .txt files in the same directory as the notebook.
4) Rename the files as 'friend A.txt', 'friend B.txt', 'friend C.txt'.
5) Put your username in the code at the right place (please see the notebook).
6) Now just see the results and share them with your friends!
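The steps above can be sketched in Python. This is a minimal sketch, not the notebook's exact code: the `word_entropy` and `extract_my_words` helper names are mine, and the message-line regex assumes the common WhatsApp export format (`"05/05/19, 10:21 pm - Name: message"`), which varies by locale.

```python
import math
import re
from collections import Counter

def word_entropy(words):
    """Shannon entropy of a word sequence, in bits per word."""
    counts = Counter(words)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def extract_my_words(chat_text, username):
    """Collect the words of every message sent by `username` in a WhatsApp
    .txt export, assuming lines like 'date, time - Name: message text'."""
    words = []
    for line in chat_text.splitlines():
        match = re.match(r".*? - ([^:]+): (.*)", line)
        if match and match.group(1).strip() == username:
            words.extend(match.group(2).lower().split())
    return words

# Example usage (file name taken from step 4 above):
# chat = open("friend A.txt", encoding="utf-8").read()
# print(word_entropy(extract_my_words(chat, "YourName")))
```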
## How to interpret?
In the context of information theory, entropy is a measure of the information contained in a message: higher entropy indicates more information. The results of this project can give you an idea of how much information you or your friends provide while chatting. However, this is an oversimplified model, so it may not produce meaningful results in complicated situations (e.g. use of different languages, or heavy use of different words with the same meaning).
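As a toy illustration of the interpretation (a hypothetical `entropy` helper, computing word-based Shannon entropy as in the formula above): a chat that repeats one word carries zero bits per word, while a chat where every word is distinct reaches the maximum for its length.

```python
import math
from collections import Counter

def entropy(words):
    """Word-based Shannon entropy in bits per word."""
    counts = Counter(words)
    n = len(words)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Repetitive chat: every word identical -> 0 bits per word.
low = entropy(["ok"] * 10)
# Varied chat: 10 distinct words -> maximal entropy, log2(10) bits per word.
high = entropy(list("abcdefghij"))
```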
## My Results
![](https://github.com/Suji04/Chat_Entropy_Analysis/blob/master/download.png)
## Help
If you have any new ideas about this, please contribute via a pull request or contact me. If you liked this repo, star it!