In 2018 more than 145 billion messages were exchanged each day via WhatsApp, Messenger and text messages. Thanks to our smartphones, iPads, computers, and always renewing message services (emails, SMS, MSN Messenger, WhatsApp, Messenger...) we have an increasingly long individual history of messages. Those messages are our past and reflect our history.
What if we could print a book with all those messages?
However we rarely take the time to look back through them. Let's be honest: scrolling and scrolling Messenger is not a fun nor convenient way to browse memories. The more you go back in time, the longer it takes and the slower the app becomes, until you give up. The same goes for WhatsApp or your text messages. Your history is hardly accessible.
Printing to better remember
What if we could create a book with all those messages? Like a photo book you print of your holidays, but with all the messages you exchanged with somebody? You could let it in your living room, grab it at anytime, open it at a random page and start remembering. It wouldn't need a battery nor an internet connection.
Actually, you can, multiple services exist for this. Almost. Because they don't allow you to merge messages from different services, nor do they take emails, you have little control on the layout (which is very questionable), and if you have many messages it will create multiple volumes 🙅.
13 years of exchanges in one book
I have cousin (Juliette) whom I am very close, and we always exchanged a lot: first through letters, then with MSN Messenger and emails, and since we have iPhones through text messages or Messenger. My idea was to print all those messages in one single book, mixed and sorted by date. No existing service is allowing to mix sources nor accept MSN Messenger or emails. So I decided to edit the book by myself 🤓.
1️⃣ — Gathering all messages
I created a sqlite database, with a table messages, basically with the fields date, fromMe (is this message from me or my cousin), text and medias (array of paths to images/videos). My goal: to fill this table from all the sources we used. And then I would create a PDF from this centralized data.
It was the easiest: once I remembered where I had stored them, I just had to scan them, and I manually created entries in the database with the date of the letter and the path to the picture of it.
I used a first tool (imapbox) to export all my emails from the three email accounts I have been using in the past. Then I parsed the results with a node script: I discarded emails not from or to my cousin (who has also been using multiple addresses I needed to identify), and saved in the database the emails we exchanged.
I also had to do some cleaning, like removing our signatures or the previous exchanges your email client usually appends when you answer to an email.
The longest thing for MSN Messenger messages has been to retrieve them in old-old-old backups of previous computers I still keep on my Synology. Once I had the files, they are just XMLs, so I parsed them with a custom node script and inserted them into the database.
I just had an issue with exchanges I encrypted with some MSN plugin, and whose I completly forgot the password. Impossible to get them back 😔. I wish I used a password manager at this time.
This one was easier: just ask Messenger/Facebook for your data, and you get a JSON file wich all the messages you exchanged as well as the photographs 🙌.
We don't use it with my cousin, so I haven't looked into it.
I had multiple sources: my current iPhone, old encrypted iPhone backups, and very old non-encrypted iPhone backups.
Those last ones were the easiest: basically you just take the file named 3d0d7e5fb2ce288813306e4d4636395e047a3d28 (🤷♂️) and it is a SQLite database. I just had to check the structure, the read them with a custom node script and insert the messages (and their attachements, which are other files in the backup whose filename is a hash of their original name 🤯) in the global database.
For my current iPhone (which is only backuped into iCloud) and old encrypted iPhone backups, I used the trial of iPhone Backup Extractor, just to get the SMS folder (which contains the SQLite database and the attachements), and then processed as before (except the table structure was different so I couldn't directly reuse the script 🙄).
2️⃣ — Creating the PDF
Now I have 9,800 messages, and I need to create an A5 PDF (the format I chose), with a rough maximum number of pages around 900 (after a quick benchmark, I didn't find any editor willing to print bigger books, and I didn't want a super thin paper either).
I thought of using LaTeX, which is designed to create atom-perfect PDFs, but I wanted advanced things like the ability to have in the footer of each page the date of the first and last message on this page. The point of LaTeX is that it handles the page layout, so you don't know what messages are on which page. I also wanted to control precisely how images were displayed. Those are probably feasable with LaTeX, but it didn't seem easy, and I don't have an advanced knowledge of it.
So, I decided to create an HTML version of the book, using React, and then to export it as a PDF using a web broswer (they are experts in this).
Rendering a book in HTML
Being able to display in the footer of each page the date of the first and last message means I need to know exactly which messages are on which page. I can't let the browser handle the page layout. I must compute myself the page-breaks.
Note: for emails, which can be especially long, it's a bit clever: they are splitted in multiple parts (on \n) and each part's height is measured, so an email can be spread on two pages. All other messages are rendered in one block. I could do something like for emails, or even compute the line breaks manually so a single sentence can be spread on two pages, but as most of the message are short, it isn't worth it.
Then, knowing the height of each message, and the maximum height of a page, I can compute which messages can go on which page 🎉. The date of the first and last message can finally be displayed in the footer.
Special case: attachements
Images are simply rendered with a maximum height. For videos, I used ffmpeg to extract an image, which was displayed with a ▶️ icon. Other attachements aren't displayed. However, as soon as there is one attachement, after rendering the images, a unique QRCode to this message is generated and displayed, pointing to a quick Express website I created for the occasion: it displays the list of attachements (photos, videos, audio, pdf…) of the given message.
3️⃣ — Printing the book
This part was far more easier than the previous ones. Maybe the difficulty was to find someone able to print such a big book. I chose Pumbo.
Once I had my book in Chrome, I just had to "File > Print > Save as a PDF", and upload it to Pumbo website. The final PDF had 750 pages and a size of 300MB 😅. But Pumbo is used to files of this size.
What was remaining was the cover: you just need to create a PDF with the right size and the right margins (which depends on the number of pages and the type of paper but Pumbo explains everything).
Creating this book took me multiple evenings of work and week-ends, over 1.5 month. I wish it was easier, not just to create the PDF, but to gather the past data over the previous 13 years. Old iPhone backups, emails, very old MSN Messenger files, WhatsApp, Facebook Messenger, all are using different formats, stored at different places and sometimes very difficult to find.
Today, two versions of this book exist: one for me, with the messages I sent aligned on the right, and one for Juliette, with her messages aligned on the right. I offered her version as a gift for her 25 birthday, a few months ago. This work is for her ❤️.