💾 Archived View for dctrud.randomroad.net › gemlog › 20151021-fixing-email-archives.gmi captured on 2022-04-29 at 11:40:13. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
GMail Import Dates & MIME Multipart Problems
I have a collection of email that has followed me around since the mid-2000s, moving between web hosts, GMail, Outlook.com and iCloud as they've added useful features, as I've moved from Android to iOS, and so on. I've come across two problems as I've moved mail around over the years:
The first problem: when you move mail from another IMAP account into GMail, using the GMail online account import tool, everything looks great. Unfortunately, when you later copy that mail somewhere else (e.g. iCloud), the received dates may show up incorrectly.
To fix:
1) Download your email from Google Takeout, which gives you an mbox file containing all mail. Unfortunately this loses any folder structure, but never mind.
2) Examine the mbox file. The messages imported by GMail will have additional headers inserted at the top of each mail, e.g.:
From 1245753982836402098@xxx Sat Aug 25 12:06:17 +0000 2007
Delivered-To: xxxxx@gmail.com
Received: by 10.107.187.193 with SMTP id l184csp149097iof; Mon, 21 Sep 2015 16:49:31 -0700 (PDT)
X-Received: by 10.107.170.32 with SMTP id t32mr30219550ioe.173.1442879371734; Mon, 21 Sep 2015 16:49:31 -0700 (PDT)
Received: from 303668833448.apps.googleusercontent.com named unknown by gmailapi.google.com with HTTPREST; Mon, 21 Sep 2015 19:49:31 -0400
Received: from web38814.mail.mud.yahoo.com (209.191.125.105) by spam2.34sp.com with SMTP; 25 Aug 2007 13:13:01 +0100
The original Received header (the last one above) shows the message was received /25 Aug 2007/. Unfortunately, clients other than GMail will display the /21 Sep 2015/ date from the Received headers added by GMail.
To fix this, remove the GMail headers (Delivered-To, X-Received and the two 2015 Received headers in the example) from each message in the mbox file, leaving the original Received header in place. This can be accomplished with some creative regex, e.g. in Sublime Text. The headers vary a little between the imports I've seen. The fixed mbox file can then be imported into a mail client.
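If regex in an editor gets unwieldy, the same clean-up can be sketched in Python. This is only a rough illustration, not the method used above: the patterns are guesses based on the one example shown, and the function names (`strip_gmail_import_headers`, `fix_mbox_text`) are made up for this sketch. It only strips headers when the gmailapi/googleusercontent Received header is present, on the assumption that this is what marks an imported message, so mail delivered to GMail normally is left alone.

```python
import re

# Assumption: these patterns cover the headers GMail's import added in the
# example above. Other imports may need different patterns.
GMAIL_IMPORT_HEADER = re.compile(
    r"^(Delivered-To:"
    r"|X-Received:"
    r"|Received: by \d+\.\d+\.\d+\.\d+ with SMTP id\s"
    r"|Received: from \S+\.apps\.googleusercontent\.com\s)",
    re.IGNORECASE,
)
# Assumption: this header is what identifies a message as imported.
IMPORT_MARKER = re.compile(
    r"^Received: from \S+\.apps\.googleusercontent\.com\s", re.IGNORECASE
)


def split_header_fields(header_block):
    """Split a header block into fields, keeping folded lines attached."""
    fields = []
    for line in header_block.splitlines(keepends=True):
        if line[:1] in (" ", "\t") and fields:
            fields[-1] += line  # continuation of the previous header
        else:
            fields.append(line)
    return fields


def strip_gmail_import_headers(message):
    """Drop the GMail import headers from the top of one mbox message."""
    header_block, sep, body = message.partition("\n\n")
    fields = split_header_fields(header_block)
    # Keep the mbox "From " separator line if the message starts with one.
    start = 1 if fields and fields[0].startswith("From ") else 0
    end = start
    while end < len(fields) and GMAIL_IMPORT_HEADER.match(fields[end]):
        end += 1
    if not any(IMPORT_MARKER.match(f) for f in fields[start:end]):
        return message  # not an imported message, leave it untouched
    return "".join(fields[:start] + fields[end:]) + sep + body


def fix_mbox_text(mbox_text):
    """Apply the fix to every message in the text of an mbox file.

    Assumption: a line starting with "From " begins a new message.
    """
    parts = re.split(r"(?m)^(?=From )", mbox_text)
    return "".join(strip_gmail_import_headers(part) for part in parts)
```

Stripping stops at the first header that does not match, so the original Received header and everything below it is kept as it was. It is worth running this on a copy of the mbox file and spot-checking a few messages before importing the result.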
The second problem: after using a number of different clients to move mail around over several years (Thunderbird, Outlook, OSX Mail, Windows Live Mail), I've often seen MIME multipart emails with HTML and attachments become broken. The message is moved or copied but will no longer display properly - I see the raw source of all the MIME parts, cannot view attachments, etc.
The problem turns out to be that somehow the clients or servers inserted spurious /Content-Type/ headers above the original /Content-Type/ header for the MIME mail. The added header prevents the message being read correctly.
An example:
From: xxxxxxx <xxxxxx@gmail.com>
Sender: <xxxx@gmail.com>
To: "xxxxxx"
References: <df696fdc-b801-4be8-bcb0-f3d59169dcba@SwitchService>
Date: Wed, 11 Sep 2013 16:19:50 -0500
Message-ID: <06460AE9-3D64-4422-88A2-C05D869A22BC@xxxxxxxxx>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Outlook 15.0
Content-Language: en-us
X-Google-Sender-Auth: 1IItZObPtrZmxbo-5dealV_naxQ
Content-type: multipart/alternative; boundary="B_3521739430_567641463"

> This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible.

--B_3521739430_567641463
Content-type: text/plain; charset="UTF-8"
Content-transfer-encoding: 7bit
In an email client I see all the source, from the original MIME Content-Type header...
Content-type: multipart/alternative; boundary="B_3521739430_567641463"
... which is being overridden by the header further up, which has crept in at some point:
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Once this block is removed the message is correctly recognized as MIME multipart, and displays properly. This can also be fixed in an mbox file with some find and replace.
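As an alternative to find and replace, the removal can be sketched in Python. Again this is a hedged illustration under assumptions drawn from the single example above: the function name `remove_spurious_content_type` is invented here, and the rule it applies (an earlier non-multipart Content-Type sitting above a later multipart one) is my reading of the pattern, not something guaranteed to cover every broken message.

```python
def split_header_fields(header_block):
    """Split a header block into fields, keeping folded lines attached."""
    fields = []
    for line in header_block.splitlines(keepends=True):
        if line[:1] in (" ", "\t") and fields:
            fields[-1] += line  # continuation of the previous header
        else:
            fields.append(line)
    return fields


def field_name(field):
    """Lower-cased header name, e.g. 'content-type'."""
    return field.split(":", 1)[0].strip().lower()


def field_value(field):
    """Lower-cased header value with surrounding whitespace removed."""
    return field.split(":", 1)[1].strip().lower()


def remove_spurious_content_type(message):
    """Remove an extra Content-Type block sitting above a multipart one."""
    # Only the top-level headers (before the first blank line) are examined,
    # so the Content-Type headers of the individual MIME parts are untouched.
    header_block, sep, body = message.partition("\n\n")
    fields = split_header_fields(header_block)
    ct = [i for i, f in enumerate(fields) if field_name(f) == "content-type"]
    if len(ct) < 2:
        return message  # nothing duplicated, leave the message alone
    first, last = ct[0], ct[-1]
    if field_value(fields[first]).startswith("multipart/"):
        return message
    if not field_value(fields[last]).startswith("multipart/"):
        return message
    # Mirror the block removed in the example: the spurious Content-Type
    # plus the MIME-Version and Content-Transfer-Encoding directly around it.
    drop = {first}
    if first > 0 and field_name(fields[first - 1]) == "mime-version":
        drop.add(first - 1)
    if (first + 1 < len(fields)
            and field_name(fields[first + 1]) == "content-transfer-encoding"):
        drop.add(first + 1)
    kept = [f for i, f in enumerate(fields) if i not in drop]
    return "".join(kept) + sep + body
```

The function takes the text of a single message and returns it unchanged unless it finds the duplicated header, so it should be safe to run over every message in an mbox file, though checking a few fixed messages in a mail client first is sensible.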