https://github.com/nating/whatsapp-stats

A program for visualising data from a WhatsApp chat.
statistics visualize-data whatsapp-chat


**WhatsApp does not endorse or sponsor this project.**

# WhatsApp Stats [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com)

A web app for visualising data from a WhatsApp chat.

The current version can be used [here](https://geoffnatin.com/whatsapp-stats) for chats exported from iOS.

## Overview
WhatsApp offers users the ability to export chats as text files.
The format of these text files makes them easy to parse and to gather interesting data from.

## Terminology
To gain insight into the statistics of a chat, it is necessary to define some terms.

|Term |Meaning|
|---------------------|-------|
|Chat |A chat of messages between two or more users.
|Member |A participant in a chat.
|Member Group |A set of participants in a chat.
|Message |An image, audio file, or text sent into a chat.
|Image |An image sent into a chat with or without text.
|Audio |An audio file, from a voice recording or otherwise, sent into a chat.
|Video |A video sent into a chat with or without text.
|Location |A location, sent into a chat.
|Document |A document sent into a chat with or without text.
|Text |Some simple text sent into a chat.
|Conversation |A series of messages sent into a chat without a **significant delay** between them (must be defined more clearly).
|Lit |A conversation is said to be *'lit'* for a given time period if its average delay between messages for that time period is less than **40 seconds** (research needed).
|Response |**Definition Needed** (A message sent in reply to an earlier message. Difficult to define by time because a user might not respond for days.)
|Conversation starter |A message is said to be a *'conversation starter'* if it is the first message sent after a **significant delay** since the last conversation in the chat.
|Conversation finisher|A message is said to be a *'conversation finisher'* if it is the last message of a conversation.
|Seen |A message is said to be seen after **24hrs** (research needed) have passed since it was sent into the chat.
|Activity |The creation of a chat, members leaving a chat, members joining a chat, or any type of message into a chat.

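The *Conversation*, *Lit*, *Conversation starter* and *Conversation finisher* terms can be sketched in a few lines of Python. This is only an illustration under assumptions: the one-hour `SIGNIFICANT_DELAY` is a placeholder for the still-undefined "significant delay", the 40-second `LIT_THRESHOLD` is the unresearched figure from the table, and `is_lit` looks at a whole conversation rather than a chosen time period. All names here are hypothetical and not part of the project.

```python
from datetime import datetime, timedelta
from typing import List

# Assumed values: the table leaves "significant delay" undefined
# and marks the 40-second figure as needing research.
SIGNIFICANT_DELAY = timedelta(hours=1)
LIT_THRESHOLD = timedelta(seconds=40)


def split_into_conversations(
    times: List[datetime], gap: timedelta = SIGNIFICANT_DELAY
) -> List[List[datetime]]:
    """Group message timestamps into conversations separated by `gap` or more."""
    conversations: List[List[datetime]] = []
    for moment in sorted(times):
        if conversations and moment - conversations[-1][-1] < gap:
            conversations[-1].append(moment)  # continues the current conversation
        else:
            conversations.append([moment])  # this message is a conversation starter
    return conversations


def is_lit(conversation: List[datetime], threshold: timedelta = LIT_THRESHOLD) -> bool:
    """True if the average delay between consecutive messages is below `threshold`."""
    if len(conversation) < 2:
        return False  # a single message has no delays to average
    # The consecutive delays sum to (last - first), so divide by their count.
    average_delay = (conversation[-1] - conversation[0]) / (len(conversation) - 1)
    return average_delay < threshold
```

With this grouping, the first timestamp of each conversation belongs to its conversation starter and the last to its conversation finisher.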
## Queries

Queries about the data in a WhatsApp chat are the heart of the project. Implementing each query to find figures is easy. The difficult, time-consuming part is integrating the query into the app by presenting the data in a clear, easy-to-read way.

Here are some examples of information about a chat that a user may be interested in:

**Integrated**
* Message counts of members of a chat
* Image count of members of a chat
* Audio file count of members of a chat

**Implemented**
* Message count of a chat
* Image count of a chat
* Audio file count of a chat
* Video count of a chat

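The count queries above amount to tallying messages by type, either for the whole chat or per member. A minimal Python sketch, assuming messages have already been parsed into a hypothetical `Message` record with a `sender` and a `kind` (neither name comes from the project):

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Message:
    """Hypothetical parsed message; the field names are assumptions."""
    sender: str
    kind: str  # e.g. "text", "image", "audio", "video"


def count_by_kind(messages: Iterable[Message]) -> Counter:
    """Chat-wide counts per message type; the values sum to the message count."""
    return Counter(message.kind for message in messages)


def count_by_member(messages: Iterable[Message]) -> Dict[str, Counter]:
    """Per-member counts per message type."""
    per_member: Dict[str, Counter] = {}
    for message in messages:
        per_member.setdefault(message.sender, Counter())[message.kind] += 1
    return per_member
```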
**Ideas**
* Average message count per day/month/year
* Emoji count of members of a chat
* Conversation starters/finishers count of members of a chat
* Conversation count of members of a chat
* Conversation counts of member groups
* Time of the day/week/year chat is most active
* Time of the day/week/year a member is most active
* Chat activity before vs after a member is added/leaves
* What were the most lit conversations about
* Chat activity over time
* Member activity over time
* Member who gets the most/least responses
* Average number of responses to a conversation starter of members of a chat

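Two of the ideas above, average message count per day and the time of day a chat is most active, could look roughly like this in Python. It is a sketch under assumptions: the input is a list of message timestamps, days without messages count towards the average, and both function names are hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import List, Optional


def average_messages_per_day(times: List[datetime]) -> float:
    """Mean messages per calendar day, from the first message's day to the last."""
    if not times:
        return 0.0
    first_day, last_day = min(times).date(), max(times).date()
    days = (last_day - first_day).days + 1  # inclusive of both end days
    return len(times) / days


def most_active_hour(times: List[datetime]) -> Optional[int]:
    """Hour of the day (0-23) with the most messages, or None for an empty chat."""
    if not times:
        return None
    hour_counts = Counter(moment.hour for moment in times)
    return hour_counts.most_common(1)[0][0]
```

Filtering the timestamps by sender before calling either function gives the per-member versions of the same queries.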
## Parsing chat activity

### iOS
A WhatsApp chat's text file is made up of lines of activity. Each activity is represented by its own line and takes the form:
```
dd/mm/yy, hh:mm:ss:
```
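As a minimal sketch of handling that form, the Python below splits one exported line into its timestamp and the activity text that follows. It assumes the `dd/mm/yy, hh:mm:ss:` prefix is fixed-width and zero-padded; `split_line` is a hypothetical helper, not the project's own code.

```python
from datetime import datetime
from typing import Optional, Tuple

TIMESTAMP_FORMAT = "%d/%m/%y, %H:%M:%S"
TIMESTAMP_LENGTH = len("dd/mm/yy, hh:mm:ss")  # 18 characters


def split_line(line: str) -> Optional[Tuple[datetime, str]]:
    """Return (timestamp, activity text), or None if the line lacks the prefix."""
    prefix, rest = line[:TIMESTAMP_LENGTH], line[TIMESTAMP_LENGTH:]
    if not rest.startswith(":"):
        return None  # no colon where the timestamp should end
    try:
        when = datetime.strptime(prefix, TIMESTAMP_FORMAT)
    except ValueError:
        return None  # the prefix is not a valid date and time
    return when, rest[1:].strip()
```

Lines for which `split_line` returns `None` do not start with a timestamp. How to treat them is left open here.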

Activities can take different forms:

We can see that *Messages* always begin with ``, while other types of activity may not.

|Activity |Form in chat exported without media|
|----------------------|---|
|Member addition |`' was added'` or `' added you'` or `You were added`
|Member removal |`' was removed'` or ` removed you`
|Member leaving |`' left'` or `'You left'`
|Encryption message |`'Messages you send to this group are now secured with end-to-end encryption.'` or `'Messages you send to this chat and calls are now secured with end-to-end encryption.'`
|Admin change |`'You're now an admin'`
|Number change |`' changed from ‪''‬ to '‪` or `'changed from ‪''‬ to '‪'`
|Group chat creation |`'You created the group "''"'` or ` created this group`
|Change of subject |`' changed the subject to "''"'` or `'You changed the subject to "''`
|Location |`': location: https://maps.google.com/?q='','`
|Change of icon |`' changed this group's icon'` or `'You changed this group's icon'`
|Image |`': '`
|Audio |`': `
|Video |`': `
|Document |`': '`
|Contact |`': '`
|Text |`': '`

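The non-message rows of the table can be matched on their fixed wording. The Python sketch below uses only the literal fragments listed above and assumes it is given the activity text with the timestamp already removed, and that messages have been filtered out beforehand (a text message ending in " left" would otherwise be misread). `classify_system_activity` and the label strings are hypothetical.

```python
from typing import Optional

ENCRYPTION_NOTICES = (
    "Messages you send to this group are now secured with end-to-end encryption.",
    "Messages you send to this chat and calls are now secured with end-to-end encryption.",
)

# (label, fragments that may appear anywhere in the activity text)
CONTAINS_RULES = [
    ("group chat creation", ('You created the group "',)),
    ("change of subject", (' changed the subject to "',)),
    ("number change", ("changed from ",)),
]

# (label, fragments that end the activity text)
SUFFIX_RULES = [
    ("member addition", (" was added", " added you", "You were added")),
    ("member removal", (" was removed", " removed you")),
    ("member leaving", (" left",)),  # also covers "You left"
    ("admin change", ("You're now an admin",)),
    ("group chat creation", (" created this group",)),
    ("change of icon", (" changed this group's icon",)),
]


def classify_system_activity(text: str) -> Optional[str]:
    """Return a label for a non-message activity, or None if nothing matches."""
    text = text.strip()
    if text in ENCRYPTION_NOTICES:
        return "encryption message"
    # Check quoted forms first so a subject such as "... left" is not misread.
    for label, fragments in CONTAINS_RULES:
        if any(fragment in text for fragment in fragments):
            return label
    for label, suffixes in SUFFIX_RULES:
        if text.endswith(suffixes):
            return label
    return None
```

Exported files may contain invisible Unicode direction marks around names and numbers, so a real parser would probably need to strip those before matching.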
### Android
Chats exported from Android have a different format, so parsing them is different. I have not ventured as far as implementing the parsing of chats exported from Android, as I have an iPhone and only really attempted this project out of casual interest. If you wish to expand the repo to deal with chats exported from other operating systems, I encourage you to make a pull request!