
Export your OSX AddressBook to Office365 and fix names

I am currently playing with a few Office365 test accounts. When looking for a way to copy my OSX Address Book entries to Office365, I ran into a few issues.

Exporting and importing

The first issue I encountered is that Office365 (at first sight) can only import CSV files, and my Mac's address book won't export to CSV. There is a workaround involving a spreadsheet, but let's not go there, as it is an ugly trick.

A less ugly trick is to generate one huge VCF file containing all your contacts in vCard format. If you select all your contacts and drag them to a folder, AddressBook generates such a massive VCF file. You can then open that file in Outlook, which will import all contacts for you.

Standards and multiword family names

When doing a test-import of my contacts using the above trick, I quickly noticed something odd: quite a few family names weren't right. My wife was listed as "Sofie Bogaert", while her real name is "Sofie Van den Bogaert". What was going on here?

Multi-word family names are quite common in Dutch, probably even more common than in French or German. For a lot of my contacts with long family names, Outlook only listed the last part of the family name.

When taking a peek at my wife's vCard in the address book export, I noticed that her name was listed as:

N:Bogaert;Sofie;Van den;;

I took a peek at the vCard RFC, and noticed that the contact's full name can be composed of multiple parts in the vCard:

Special note: The structured property value corresponds, in sequence, to the Family Names (also known as surnames), Given Names, Additional Names, Honorific Prefixes, and Honorific Suffixes.

For some reason, as part of a previous data migration, a few of my contacts with multi-word family names had their family name split up in parts, and all but the last part was saved as an "Additional name" in my Mac's address book. I never noticed this, because my Mac and iPhone render the name correctly (which isn't surprising, as Apple was one of the early members of the Versit Consortium which created the original vCard standard in 1995), but Office365 and Outlook seem to ignore all parts except the family and given names.
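To make the five-part structure concrete, here is a minimal sketch in plain Ruby that splits an N line into named components. The helper name parse_n and the hash keys are my own, not part of any library, and escaped semicolons are not handled. The two puts lines only illustrate the difference described above between a client that reads just the given and family names and one that also uses the additional names.

```ruby
# Names of the five N components, in the order RFC 6350 lists them.
# (These symbol names are illustrative, not from any library.)
N_COMPONENTS = %i[family given additional prefixes suffixes].freeze

# Split an N property line into a hash of named components.
# Simplification: escaped semicolons (\;) inside a component are not handled.
def parse_n(line)
  value = line.sub(/\AN(;[^:]*)?:/, "")  # drop "N:" or "N;PARAM=...:"
  parts = value.split(";", -1)           # -1 keeps trailing empty fields
  N_COMPONENTS.zip(parts).to_h { |key, part| [key, part.to_s] }
end

name = parse_n("N:Bogaert;Sofie;Van den;;")

# A client that only reads the given and family names
puts "#{name[:given]} #{name[:family]}"

# A client that also uses the additional names
puts [name[:given], name[:additional], name[:family]].reject(&:empty?).join(" ")
```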

A bit of Ruby to the rescue...

To fix these problems, I first split the monolithic VCF file into one file per contact (I used vcf-split), and then applied the following magic to parse the vCards, look for additional names and prefixes/suffixes, and "embed" them in the name fields. The output of the script is a new, monolithic VCF file, ready to be imported into Outlook.

require "rubygems"
require "vcardigan"
require "pp"

open('all.vcf', 'a') do |outfile|

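  # Skip the output file itself, which the glob would otherwise pick up
  Dir["*.vcf"].reject { |f| f == "all.vcf" }.each do |f|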
    puts "Processing #{f}"
    data = File.read(f)
    vcard = VCardigan.parse(data)
    n = vcard.n.first.values
    # per RFC 6350 https://tools.ietf.org/html/rfc6350
    # The structured property value corresponds, in
    #  sequence, to the Family Names (also known as surnames), Given
    #  Names, Additional Names, Honorific Prefixes, and Honorific
    #  Suffixes.
    newn = n
    if not n[3].empty? ## Honorific Prefixes (eg: Dr.)
      newn[1] = n[3] + " " + newn[1]
      newn[3] = ""
    end
    if not n[2].empty? ## Additional Names
      newn[0] = n[2] + " " + newn[0]
      newn[2] = ""
    end
    if not n[4].empty? ## Honorific Suffixes. Eg: MSc
      newn[0] = newn[0] + ", " + n[4]
      newn[4] = ""
    end
    # Before: N:Lastname;Firstname;Von;Dr.;Msc
    # After: N:Von Lastname, Msc;Dr. Firstname;;;
    vcard.name = newn
    outfile.puts vcard
  end
end
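If you'd rather not install a separate tool for the splitting step, it can probably be done in a few lines of Ruby as well. The sketch below is not how vcf-split works. It is a rough stand-in that assumes every card sits between a BEGIN:VCARD and an END:VCARD line and that cards are not nested. The helper names split_vcards and write_vcards, and the contact-0001.vcf naming scheme, are made up for this example.

```ruby
# Split the text of a multi-contact VCF export into one string per vCard.
# Assumption: each card is delimited by BEGIN:VCARD / END:VCARD lines and
# cards are not nested inside each other.
def split_vcards(text)
  text.scan(/^BEGIN:VCARD\r?$.*?^END:VCARD\r?$/mi)
end

# Write each card to its own numbered file in dir and return the paths.
# The file naming scheme is arbitrary.
def write_vcards(cards, dir)
  cards.each_with_index.map do |card, i|
    path = File.join(dir, format("contact-%04d.vcf", i + 1))
    File.write(path, card + "\n")
    path
  end
end
```

Something like write_vcards(split_vcards(File.read("contacts.vcf")), ".") would then leave one file per contact in the current directory, where contacts.vcf stands for whatever your export is called. Keep the export itself out of the directory the script above runs in, or it will be picked up by the glob too.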