In 2018 more than 145 billion messages were exchanged each day via WhatsApp, Messenger and text messages. Thanks to our smartphones, iPads, computers, and always renewing message services (emails, SMS, MSN Messenger, WhatsApp, Messenger...) we have an increasingly long individual history of messages. Those messages are our past and reflect our history.

What if we could print a book with all those messages?

However we rarely take the time to look back through them. Let's be honest: scrolling and scrolling Messenger is not a fun nor convenient way to browse memories. The more you go back in time, the longer it takes and the slower the app becomes, until you give up. The same goes for WhatsApp or your text messages. Your history is hardly accessible.

Printing to better remember

What if we could create a book with all those messages? Like a photo book you print of your holidays, but with all the messages you exchanged with somebody? You could let it in your living room, grab it at anytime, open it at a random page and start remembering. It wouldn't need a battery nor an internet connection.

Actually, you can, multiple services exist for this. Almost. Because they don't allow you to merge messages from different services, nor do they take emails, you have little control on the layout (which is very questionable), and if you have many messages it will create multiple volumes 🙅.

Example of book from Zapptales
Example from zapptales: the book is far away from a sober novel but looks more like a photo book. And it's limited to one application at a time (WhatsApp here).

13 years of exchanges in one book

I have cousin (Juliette) whom I am very close, and we always exchanged a lot: first through letters, then with MSN Messenger and emails, and since we have iPhones through text messages or Messenger. My idea was to print all those messages in one single book, mixed and sorted by date. No existing service is allowing to mix sources nor accept MSN Messenger or emails. So I decided to edit the book by myself 🤓.

1️⃣ — Gathering all messages

I created a sqlite database, with a table messages, basically with the fields date, fromMe (is this message from me or my cousin), text and medias (array of paths to images/videos). My goal: to fill this table from all the sources we used. And then I would create a PDF from this centralized data.

Letters

It was the easiest: once I remembered where I had stored them, I just had to scan them, and I manually created entries in the database with the date of the letter and the path to the picture of it.

Emails

I used a first tool (imapbox) to export all my emails from the three email accounts I have been using in the past. Then I parsed the results with a node script: I discarded emails not from or to my cousin (who has also been using multiple addresses I needed to identify), and saved in the database the emails we exchanged.

I also had to do some cleaning, like removing our signatures or the previous exchanges your email client usually appends when you answer to an email.

MSN Messenger

The longest thing for MSN Messenger messages has been to retrieve them in old-old-old backups of previous computers I still keep on my Synology. Once I had the files, they are just XMLs, so I parsed them with a custom node script and inserted them into the database.

I just had an issue with exchanges I encrypted with some MSN plugin, and whose I completly forgot the password. Impossible to get them back 😔. I wish I used a password manager at this time.

(Facebook) Messenger

This one was easier: just ask Messenger/Facebook for your data, and you get a JSON file wich all the messages you exchanged as well as the photographs 🙌.

WhatsApp

We don't use it with my cousin, so I haven't looked into it.

Text messages

I had multiple sources: my current iPhone, old encrypted iPhone backups, and very old non-encrypted iPhone backups.

Those last ones were the easiest: basically you just take the file named 3d0d7e5fb2ce288813306e4d4636395e047a3d28 (🤷‍♂️) and it is a SQLite database. I just had to check the structure, the read them with a custom node script and insert the messages (and their attachements, which are other files in the backup whose filename is a hash of their original name 🤯) in the global database.

For my current iPhone (which is only backuped into iCloud) and old encrypted iPhone backups, I used the trial of iPhone Backup Extractor, just to get the SMS folder (which contains the SQLite database and the attachements), and then processed as before (except the table structure was different so I couldn't directly reuse the script 🙄).

Screenshot of my Google Activity showing multiple searches around iPhone backups
You can see the difficulites I had to find the right tools and understand backups logic (which changed sometimes around iOS 6) from my Google Activity 😬

2️⃣ — Creating the PDF

Now I have 9,800 messages, and I need to create an A5 PDF (the format I chose), with a rough maximum number of pages around 900 (after a quick benchmark, I didn't find any editor willing to print bigger books, and I didn't want a super thin paper either).

I thought of using LaTeX, which is designed to create atom-perfect PDFs, but I wanted advanced things like the ability to have in the footer of each page the date of the first and last message on this page. The point of LaTeX is that it handles the page layout, so you don't know what messages are on which page. I also wanted to control precisely how images were displayed. Those are probably feasable with LaTeX, but it didn't seem easy, and I don't have an advanced knowledge of it.

The left page shows the date of the first message, and the right page the date of the last message, allowing you to navigate easily to a given date.

So, I decided to create an HTML version of the book, using React, and then to export it as a PDF using a web broswer (they are experts in this).

The complete workflow. book.pdf contains QRCode for attachements, pointing to an URL served by the express server at the bottom.

Rendering a book in HTML

Being able to display in the footer of each page the date of the first and last message means I need to know exactly which messages are on which page. I can't let the browser handle the page layout. I must compute myself the page-breaks.

First, each message is rendered, with the right CSS style (font family, font size, and especially a max width in cm), and its height is measured in pixels in JavaScript (with divRef.clientHeight).
Note: for emails, which can be especially long, it's a bit clever: they are splitted in multiple parts (on \n) and each part's height is measured, so an email can be spread on two pages. All other messages are rendered in one block. I could do something like for emails, or even compute the line breaks manually so a single sentence can be spread on two pages, but as most of the message are short, it isn't worth it.

Then, knowing the height of each message, and the maximum height of a page, I can compute which messages can go on which page 🎉. The date of the first and last message can finally be displayed in the footer.

My book version: on the right my messages, on the left those from my cousin. Each new day is display with a centered title. Each message has just the time displayed on the right or the left, with another font style. On the bottom: the page number (centered), the date of the first message (on the left page), and the date of the last message (on the right page).
Emails are rendered a bit differently because they have a title. But all other messages (MSN Messenger, SMS, Messenger) have the exact same style. Note the attachement and the QRCode associated on the top.

Special case: attachements

Images are simply rendered with a maximum height. For videos, I used ffmpeg to extract an image, which was displayed with a ▶️ icon. Other attachements aren't displayed. However, as soon as there is one attachement, after rendering the images, a unique QRCode to this message is generated and displayed, pointing to a quick Express website I created for the occasion: it displays the list of attachements (photos, videos, audio, pdf…) of the given message.

You can see pictures from a SMS of my cousin, with the QRCode (blurred) next to them.

3️⃣ — Printing the book

This part was far more easier than the previous ones. Maybe the difficulty was to find someone able to print such a big book. I chose Pumbo.

Once I had my book in Chrome, I just had to "File > Print > Save as a PDF", and upload it to Pumbo website. The final PDF had 750 pages and a size of 300MB 😅. But Pumbo is used to files of this size.

What was remaining was the cover: you just need to create a PDF with the right size and the right margins (which depends on the number of pages and the type of paper but Pumbo explains everything).

Conclusion

Creating this book took me multiple evenings of work and week-ends, over 1.5 month. I wish it was easier, not just to create the PDF, but to gather the past data over the previous 13 years. Old iPhone backups, emails, very old MSN Messenger files, WhatsApp, Facebook Messenger, all are using different formats, stored at different places and sometimes very difficult to find.

Today, two versions of this book exist: one for me, with the messages I sent aligned on the right, and one for Juliette, with her messages aligned on the right. I offered her version as a gift for her 25 birthday, a few months ago. This work is for her ❤️.