Data visualization of Telegram messages (encrypted chats)
We are going to visualize the timestamps of messages in the Telegram database. This also includes encrypted chats as we analyze the SQLite database of the app.
Obtaining the database
We pull the database of Telegram using the ADB tool. You can read here how this tool works and how to set it up. Make sure your phone is rooted and you set Root access to ADB only. Then you can restart ADB using adb root. Finally you can pull the database to your current working directory using:
The timestamps.csv should contain now a single 1. The following queries show how to query the timestamps for encrypted and non encrypted chats.
Non encrypted chat
For non encrypted chats you can use:
SELECTdateFROMmessagesWHEREuid=(SELECTuidFROMusersWHEREnameLIKE'%lower case name of person%');
For encrypted chat we first need to query an other uid. The new uid is longer as it is bit-shifted by 64 to the left. In order to find the chat messages for encrypted chats, we first undo this in the query and then find the corresponding user in the enc_chats table.
SELECTdateFROMmessagesWHEREmid<0ANDsubstr(printf('%X',uid),0,length(printf('%X',uid))-7)=(SELECTprintf('%X',uid)FROMenc_chatsWHEREnameLIKE'%lower case name of person%');
There are also the data-blobs with column name data which are not further discussed here. Maybe an other post will take a deeper look at the actual message content.
We can UNION and sort the results now to get ta complete overview of the timestamps.
TL;DR (show me the command!)
SELECT * FROM (
SELECT date FROM messages WHERE uid = (
SELECT uid FROM users WHERE name LIKE '%lower case name of person%'
SELECT date FROM messages
WHERE mid < 0 AND substr(printf('%X', uid), 0, length(printf('%X', uid)) - 7) = (
SELECT printf('%X', uid) FROM enc_chats WHERE name LIKE '%lower case name of person%'
) ORDER BY date;
Creating a data visualization
The following script allows you to plot the data using pandas and matplotlib with the kxcd style:
You can find more information on the dflab blog about reversing the Telegram database. Thanks for sharing the knowledge! The following section was the key to this post:
if “mid” is negative, “uid” is negative or positive and has a length of about 19 characters – the interlocutor’s data can be find in the “Enc_chats” table by the converted “uid” value. To find the appropriate contact in the “Enc_chats” table, you need to convert the decimal “uid” number to hexadecimal, then cut off the last 8 zeros from the received number and convert the eight-digit hexadecimal number back to decimal;