Backup of David's Livejournal

Happy Birthday, Me. I got you data portability!


My list of addresses has made its way from a physical address book to a Palm Pilot, to Microsoft Outlook, to Google Contacts, and on to the iPhone Contacts app. Each transition along the way has played fast and loose with the mappings of the individual fields.

Nowadays, the normal way for a Windows user to export contacts from an iPhone is to sync between the iPhone and Microsoft Outlook, then export from Outlook to a CSV file. I hate having to go through that middleman.

I prefer to take stronger ownership of my own data, and have settled on Google Contacts as the primary place for it. There are a few reasons. I really like the open "group" (tagging) feature of contacts. I like that they have a public API for accessing and manipulating contacts. But most important is the ability to easily export and import contacts. As Joel Spolsky noticed ten years ago, a good way to get me to try a service is to make it easy for me to change my mind and leave the service.

By default only the contacts group named "My Contacts" will sync with the iPhone when you set up an Exchange sync between the two end points. That suits my purposes just fine.

I've taken some time to groom my contacts list for Google Contacts. Here are some notes from that experience:

When you export contacts, use the Google CSV format. Your contacts will be exported to a UTF-16 file, and all the special characters you use will be retained. If you choose Outlook CSV format, then the file generated will be 8-bit regardless of the characters used in your contact list, and characters that don't map to 8-bit characters will be changed to question marks. So for 安室奈美恵's sake, choose Google CSV format.
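
To see why the 8-bit export loses names, here is a minimal sketch in Python 3 (not the Python 2 used in the snippets further down). It assumes the 8-bit encoding is Windows-1252, which is only an assumption about what the Outlook CSV export uses, and the helper name `survives` is made up for illustration.

```python
# Python 3 sketch. 'cp1252' is an assumed stand-in for the 8-bit export encoding.
NAME = '安室奈美恵'

def survives(text, encoding):
    """Return True if text round-trips through the given encoding unchanged."""
    return text.encode(encoding, errors='replace').decode(encoding) == text

# An 8-bit code page has no slot for these characters, so each becomes '?'.
lossy = NAME.encode('cp1252', errors='replace').decode('cp1252')
print(lossy)                       # ?????
print(survives(NAME, 'cp1252'))    # False
print(survives(NAME, 'utf-16'))    # True
```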

Most people edit their CSV files in a spreadsheet editor. That's fine, but I don't trust my own eyes and hands to get everything right, so I prefer to do batch editing programmatically.

If you want to do some batch processing of the CSV file in Python, here are some snippets. These snippets have been pared down to the essentials, and don't represent good coding practices.

To read in the Google CSV:
# unicode_csv_reader and UnicodeWriter are provided
# in the documentation for the csv module (Python 2).
import codecs

f = unicode_csv_reader(codecs.open( 'google.csv', 'r', 'utf-16' ))

headings = next(f)
col = {}
for i in range(len(headings)):
    col[headings[i]] = i

rows = []
for row in f:
    rows.append(row)
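
For anyone on Python 3, where the csv module handles Unicode itself and the two recipes aren't needed, the same read might look like the sketch below. The function name `read_google_csv` is invented here; the rest mirrors the snippet above.

```python
import csv

def read_google_csv(path):
    """Read a UTF-16 Google CSV export; return (headings, col, rows)."""
    # newline='' lets the csv module handle embedded line breaks itself.
    with open(path, newline='', encoding='utf-16') as f:
        reader = csv.reader(f)
        headings = next(reader)
        rows = list(reader)
    # Map each heading to its column index, like the col dict above.
    col = {name: i for i, name in enumerate(headings)}
    return headings, col, rows
```

Calling `headings, col, rows = read_google_csv('google.csv')` would then leave you with the same three names the Python 2 snippet builds up.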

If you want to sort the contacts by last name before printing them out:
import operator
rows.sort(key=operator.itemgetter(col['Family Name']))

Beverly Howard suggests that contacts with no "Name" field but only a "Company" field won't sync with Outlook. You could check for that:
for row in rows:
    if row[col['Organization 1 - Name']] and not row[col['Name']]:
        # Ensure these get synced across all devices, some don't!
        print(row[col['Organization 1 - Name']])
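
A slightly fuller sketch of that check, written for Python 3, collects the matching rows rather than acting on them. The helper name `company_only` and the sample rows are made up; the column names are the ones from the Google CSV export used above.

```python
def company_only(rows, col):
    """Return the rows that have an organization but an empty Name field."""
    org, name = col['Organization 1 - Name'], col['Name']
    return [row for row in rows if row[org] and not row[name]]

# Hypothetical sample data in the same shape as the rows read from the CSV.
headings = ['Name', 'Organization 1 - Name']
col = {h: i for i, h in enumerate(headings)}
rows = [['Ann Example', 'Acme'], ['', 'Plumber Co'], ['Bob Example', '']]
print(company_only(rows, col))   # [['', 'Plumber Co']]
```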

Finally, after you've made your batch changes, you can write them back to a UTF-16 file like so:

out_file = open( 'google_out.csv', 'wb' )
google_out = UnicodeWriter( out_file, encoding='utf-16' )
google_out.writerow( headings )
google_out.writerows( rows )
out_file.close()  # flush everything to disk
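
In Python 3, UnicodeWriter isn't needed either. A minimal sketch of the write step, with the invented function name `write_google_csv`, could be:

```python
import csv

def write_google_csv(path, headings, rows):
    """Write headings and rows to a UTF-16 CSV file."""
    # The 'utf-16' codec writes a byte order mark at the start of the file.
    with open(path, 'w', newline='', encoding='utf-16') as f:
        writer = csv.writer(f)
        writer.writerow(headings)
        writer.writerows(rows)
```

The `with` block closes the file on the way out, so there is no separate close step to forget.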

It's been a long time coming, but I'm glad I've got more of my data in a place where it's easy for me to get it out and manipulate it any way I like.

Comments

 pastilla on Dec 30th 2010 at 1:57 PM
"So for 安室奈美恵's sake, choose Google CSV format." Firefly swearing! Yay!

 dblume on Dec 30th 2010 at 5:03 PM
Oh! I wish I'd thought of the Firefly swearing! You're right, it looks just like it. I actually chose a real Japanese name with the intention, "for her name's sake, don't let Outlook turn it into question marks, choose the format that'll retain the correct characters."

 sjonsvenson on Dec 30th 2010 at 8:17 PM
Happy B-day. I have lost enough contact data over the years. I converted back to an ink database with paper based storage. I never have a data-conversion problem with that. I also keep addresses and contacts on a computer. But in a plain HTML file.

 dblume on Dec 30th 2010 at 8:24 PM
Thank you! It's hard for me to reconcile you keeping addresses both on paper and in an HTML file. Surely, the paper should just be a copy of the file on disk, and the file on disk should be easier to maintain than an HTML file.

 sjonsvenson on Dec 30th 2010 at 9:03 PM
I used to keep my contacts on my Psion and transfer them to a nearby PC whenever I felt like moving things. On PCs I used various formats, but with most programs moving data from one version to the next was just as difficult and annoying as moving to a different program. I also always kept a backup on paper, in a Filofax. When my Psion finally broke down, after almost 15 years, the paper version was all I kept. It is the main contact list. The beauty of that is that my brother can update it just as well, and his wife can (and does). There is also the interesting fact that the list was started by my mother, somewhere in the sixties, and some of the original data is still there. My HTML file is the backup these days.

 dblume on Dec 30th 2010 at 10:37 PM
You know, this only convinces me to create a cronjob that'll sync all the contact information across my family's individual accounts. As each of us updates individual contacts, the rest of us can benefit from the new information.

 halophoenix on Dec 30th 2010 at 8:35 PM
Every now and again you make a post that really tickles that part of me that used to write code - the one that's been sleeping so long it's forgotten its memory. XD This is one of those posts. :D

 dblume on Dec 30th 2010 at 8:49 PM
Thanks! That's the sort of response that keeps me going with these technical posts. I'm really glad you "got it." All of a sudden, your data is back in your hands. You can do what you want with it. (I just made a handy new Christmas Card List for the wife, for example.)