Working with a .pst file in Linux

Lately, I’ve been toying with the idea of using Linux as my primary workstation, as opposed to Windows 10. I’ve always liked Linux, but I primarily work in Microsoft technologies – most importantly, for my day job!

However, it dawned on me that I use a computer in different ways. For example, there are the “Johnny Homeowner”/typical end-user things I do: check e-mail, use Facebook and Twitter, watch YouTube videos, browse the web, etc. None of those require Microsoft Windows. In fact, even when my main workstation is Windows, I only use it to connect to the virtual machines where my development workstation is set up. For end-user things, and for connecting to other operating systems, your host machine could be just about anything… so could it be Linux?

With all of that said, I started to see how I could replace the functionality that I use every day. I was, and continue to be, shocked at how easy this was! In fact, I am now down to a short list of nice-to-have apps I’d like to find. One of the bigger items (or the most important to me) was dealing with archived e-mail in an old .pst file.

My approach to personal e-mail:
For the past several years, I used inbox zero with Microsoft Outlook and a Live account, and stored all of my archived e-mails under “Reference”. That lasted until sometime last year, when my live.com mailbox became corrupted. That turned into a multi-day frustration-fest. There were two big outcomes from that: 1) they couldn’t fix my mailbox, so I had to abandon it and that e-mail address, and 2) while working with Microsoft PSS, they instructed me to temporarily copy all of my archived messages to a .pst file on my hard drive (within Outlook). I later learned that you can move messages OUT of an IMAP folder into a .pst, but you cannot move them from a .pst to an IMAP folder.

So, for better or for worse, all of my old e-mail (just things I need/want to keep) was now in a separate .pst. That gets backed up nightly and I’ve continued to use it. Instead of having my archive stored in my Live/Outlook account online, it was now in an external .pst file on my computer.

What about .pst files on Linux?
Well, the .pst file format is proprietary to Microsoft. I don’t know of any e-mail products or add-ins that let you use it. However, there are several utilities that let you pull messages out of the .pst. That’s what I did.

First, I installed a utility called readpst (if apt cannot find a package by that name, the tool also ships in the pst-utils package):

$ sudo apt-get install readpst

Then I ran this command line:

$ readpst -o ~/ArchivedMessages -D -j 4 -r -tea -u -w -m ./ArchiveBackup.pst

This extracts the messages (keeping the folder structure) and creates files in both .msg and .eml format. The .eml format is significant because I’ve been using Mozilla Thunderbird for mail, and I can double-click on an .eml file (which represents a whole e-mail, intact, with attachments) and it opens as a Thunderbird message, which I can forward, copy, print, etc.

So instead of a folder structure of messages in Outlook, I now have a directory structure on Linux with one file per message (with the attachments MIME-encoded within).
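
A quick sanity check after an extraction like this is to count how many .eml files landed in each folder and compare that against what Outlook showed. Here is a minimal sketch; count_eml is a name I made up for illustration, and it assumes the folder paths contain no newlines:

```shell
# count_eml DIR: print "<count> <folder>" for every folder under DIR that
# directly contains .eml files, with the fullest folder first.
count_eml() {
    find "$1" -type f -name '*.eml' -exec dirname {} \; |
        sort | uniq -c | sort -rn
}

# Example (the path matches the -o argument used above):
# count_eml ~/ArchivedMessages
```

The .msg copies are ignored on purpose, so each message is only counted once.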

How do you search/forward those messages?
For searching, I feel compelled to do it the Linux-y way: from the command-line. I can easily use “grep” to recursively look in all of those messages for a piece of text. For example:

$ grep -T -I -o -H -n -r "..........very.........." .

You can go look up each one of those arguments if you like – I just open the man page:

$ man grep

and slowly build up my command. In this case, if I were in the ArchivedMessages folder, I could run this command and bring back every place where I had the word “very”, together with exactly 10 characters on the left and 10 characters on the right (those are the dots; each one has to match a character on the same line). There are several ways to do this; this is just one.

[Screenshot: EmailGrep]

This returns the file name, the line number where these match, and the keyword with 10 characters on either side of that keyword.
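
One caveat with the ten-dot pattern: a hit that sits closer than 10 characters to the start or end of its line is skipped entirely, because every dot needs a character to match. A sketch of a variant that accepts up to 10 characters on each side instead is below. grep_context is a hypothetical name, the keyword is treated as a regular expression, and I dropped -T to keep the output plain:

```shell
# grep_context KEYWORD [DIR]: show each hit with up to 10 characters of
# context on either side. -E enables the {0,10} interval syntax; the other
# flags are the same ones used in the command above.
grep_context() {
    grep -E -I -o -H -n -r ".{0,10}$1.{0,10}" "${2:-.}"
}

# Example, run from inside the ArchivedMessages folder:
# grep_context very
```

With this version a line that simply starts with “very” still shows up, which the fixed-width pattern would miss.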

Now, doesn’t this smell like it should be a script? Of course! So, I’ll likely refine this. Maybe make a “findmail” where you pass in the keyword, and the script does the rest, sorts it, and gives you a pretty outcome.
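
Here is a rough first cut of what that “findmail” could look like. It is only a sketch: the name and the output layout are my own choices, and it ranks files by the number of matching lines rather than showing the surrounding text.

```shell
# findmail KEYWORD [DIR]: hypothetical helper, not an existing tool.
# Lists every file under DIR (default: current directory) that contains
# KEYWORD, ignoring case, with the files that match most often first.
findmail() {
    # -r recurse, -I skip binary files (such as .msg), -i ignore case,
    # -H always print the file name, -c print a count of matching lines
    grep -r -I -i -H -c -e "$1" "${2:-.}" |
        awk '{
            n = $0; sub(/.*:/, "", n)          # count is after the last colon
            f = $0; sub(/:[0-9]+$/, "", f)     # file name is everything before it
            if ((n + 0) > 0) printf "%5d  %s\n", n, f
        }' |
        sort -rn
}

# Example:
# findmail very ~/ArchivedMessages
```

Splitting on the last colon, rather than the first, keeps this working even if a folder name happens to contain a colon.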

I’ll also probably create a script to rename the files, maybe to something like FromLine_ToLine_Subject_Date.eml. That way I could easily search on obvious things like that, and tell what is in a file right from the file system, without opening the e-mail.
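
As a starting point for that rename script, here is a sketch that only works out the new name and leaves the actual renaming to you. The helper names (header, slug, eml_name) are made up for illustration, and it deliberately ignores folded (multi-line) and MIME-encoded headers, so treat the result as a draft:

```shell
# header FILE NAME: print the value of the first NAME header in FILE,
# looking only at the header block (everything before the first blank line).
header() {
    awk -v name="$2" '
        /^\r?$/ { exit }    # a blank line ends the headers
        tolower(substr($0, 1, length(name) + 1)) == (tolower(name) ":") {
            value = substr($0, length(name) + 2)
            gsub(/^[ \t]+|[ \t\r]+$/, "", value)
            print value
            exit
        }' "$1"
}

# slug TEXT: keep letters, digits, "." and "@"; squeeze every other run of
# characters into a single "-", cap the length at 40, trim stray dashes.
slug() {
    printf '%s' "$1" | tr -cs 'A-Za-z0-9.@' '-' | cut -c1-40 | sed 's/^-*//; s/-*$//'
}

# eml_name FILE: print a From_To_Subject_Date.eml style name for FILE.
eml_name() {
    printf '%s_%s_%s_%s.eml\n' \
        "$(slug "$(header "$1" From)")" \
        "$(slug "$(header "$1" To)")" \
        "$(slug "$(header "$1" Subject)")" \
        "$(slug "$(header "$1" Date)")"
}

# Dry run: print the mv commands instead of running them.
# find ~/ArchivedMessages -type f -name '*.eml' | while IFS= read -r f; do
#     echo mv -n "$f" "$(dirname "$f")/$(eml_name "$f")"
# done
```

Printing the commands first makes it easy to spot two messages that would end up with the same name before anything is actually moved.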

As for viewing, printing, or forwarding an old e-mail (like, for legal purposes): as discussed, I can simply double-click the .eml file and it opens in Mozilla Thunderbird as if it were any other message. From there I can forward or print it just like any other e-mail!

The future:
I’m not terribly excited about this .pst approach, but it pretty much works. Instead, one of my upcoming projects is to figure out – once and for all – how to host a simple IMAP and SMTP server on Linux, so I can host my own e-mail. I’ll pull all of my e-mail accounts into this one mailbox, which I can then manage within IMAP. Since I will own it, it will be more secure, it can be backed up, and I should be able to have one central store for all the e-mail I need to keep across all e-mail accounts.

If Hillary Clinton can bring up a private mail server, why can’t I – I’m an IT Professional for crying out loud!!

…that’s for another day. In the meantime, I’m definitely satisfied that I have access to, and can search, my old .pst archive. That’s what I needed for now.

Bottom line:
Again, as I experiment with using Linux as my “Johnny Homeowner”, regular end-user computer, I am shocked at how easy this transition was. Ubuntu even supports my fingerprint reader for logging in! So not only are the must-haves covered, even the nice-to-haves are too!

So, with this e-mail worry behind me, I think that’s the last of ALL my requirements. So when I asked “could you switch from Windows to Linux?” – my answer at this point is: yes. Now, I still definitely have a need for Microsoft Windows, especially for .NET development – but for my regular end-user functionality, and just connecting to other machines (via RDP, SSH, and VNC), Linux has totally sold me!

Posted in Computers and Internet, General, Infrastructure, Linux, Organization will set you free, Professional Development, Security, Uncategorized
10 comments on “Working with a .pst file in Linux”
  1. […] never did find a good way to parse through my old messages (which I did need to do a few times). I wrote about using the Linux command line for this, but even then, it’s not really ideal because you view those messages in a raw […]

  2. Ray says:

    works great; thanks!

  3. Emerson says:

    You can use a Thunderbird add-in and import the result into Thunderbird. The name is ImportExportTools.

  4. Anony Mouse says:

    Evolution opens / imports pst files

  5. Here are my installation instructions for Recoll, and for converting several Outlook PST files so that Recoll can search them:
    For me, the default Recoll installation supported PDF (test this!), DOCX, TAR, ZIP etc.

    sudo add-apt-repository "deb http://archive.canonical.com/ $(lsb_release -sc) partner"
    sudo apt-get install recoll antiword
    recoll

    1. The first line is probably not required: it adds the partner installation repository.
    2. Antiword is optional. It is needed to support older .doc files.
    3. Enable following symbolic links and set the root directory in Recoll Preferences if necessary.
    4. Create a cron job for Recoll indexing using the GUI, or make it start on every login.
    5. Change the Recoll setting in preferences from English to All languages if appropriate for you.
    6. Start the indexing. At least for me, it was surprisingly fast and didn’t use all resources, so I was able to continue using the laptop.
    7. I have found one bug in Recoll so far: if you search for a file name with “PST”, it doesn’t find it even though the name is in uppercase. “pst” works, and it finds both uppercase and lowercase names.
    8. See more about Recoll at https://www.lesbonscomptes.com/recoll/features.html

    If you wish to add support for Outlook PST files, then you need to execute the following as well.

    sudo apt-get install readpst
    mkdir ~/PST
    find -L ~ -name "*.pst" -print | awk '{ printf "%s%s %s%s%s %s\n", "mkdir ~/PST/", $1, "; readpst -o ~/PST/", $1, " -D -j 4 -r -tea -u -w", $1 }' > /tmp/myPstFiles
    cat /tmp/myPstFiles
    chmod 755 /tmp/myPstFiles
    /tmp/myPstFiles

    1. Change the root directory from ~ to / in the find command if necessary.
    2. My find script has a bug in it: it creates a directory structure that is too long. But it was easier for me to modify the temp file manually than to find a fix for this. The main goal was for this to work with several PST files, and it does.
    3. See more about readpst at http://www.five-ten-sg.com/libpst/rn01re01.html and https://blog.robseder.com/2015/08/29/working-with-a-pst-file-in-linux/

  6. Brooke says:

    Run your own server! There are plenty of free/open source things to do this out there. My favorite was the home/trial license for CommunigatePro (www.stalker.com). Fully fledged mail server, chat, VOIP and more for free! Just import your mail, connect via IMAP to your CGP server and drop the IMAP folders into it. It saves your mailboxes in flat text files so it’s easy to move/grep on the server too!
