Internet Security
Beware: I am a real neophyte when it comes to internet security. Having said that, I couldn't have fared any worse than Sony Pictures. They lost 1,000,000 plain-text passwords when a SQL injection vulnerability was exploited. I've been protecting against that attack since 2005. (At the part that asks, "Is the password secure?", I get to say that my passwords aren't stored in plain text. SQL injections have been the subject of security jokes for a long time, too. Ah, Little Bobby Tables.)
There have been and continue to be large breaches of personal data on the internet. Nathan Yau shares an infographic of the largest data breaches of all time.
My immediate family and I need a way to keep each other up to date with our changed account info and ID numbers. We need a solution that meets the following usability criteria:
- Accessible anywhere, from any device. It has to be practically just one click away.
- Trivial, memorable URL. We may be typing it directly into the URL bar.
- Always up-to-date. Any change made from anywhere is accessible immediately from any other client.
If it's not that easy to use, it won't be used, and there'd be no point in making it. On the other hand, it has to have the following security criteria:
- Accessible anywhere, from any device. It has to be secure even over a public wifi network.
- Secure from remote client attacks. It has to handle attacks over the internet.
- Secure from local attacks. It has to protect against disgruntled hosting company employees.
With all that in mind, I've decided to roll my own information vault. Here are some goals and notes from that venture:
Be A Low Value Target
My first line of defense is that my information vault is just for me and my family. This'll never store enough data of real value to make it a target for the economics of it. I might get attacked, but it'd only be for the idle challenge of it.
Block Direct Access of Data Files
Move data files off the server, even though they're encrypted, or into directories tightly controlled by permission settings and .htaccess instructions. Test both attacks. If your encrypted files can fall into your attacker's hands, they can try a local brute force attack. (More on that below.)
Use HTTP Secure
For any data that is accessible, use HTTPS. This is the first line of defense if you want your data accessible over a public wifi network.
Unique and Long Master Password
Force your users to use a long, random, impossible-to-guess master password. Prevent any sort of social attack: No names, dates, or places. In my case, since I'm the creator of the tool, I can do this.
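One way to get a password with no names, dates, or places in it is to let the machine pick it. This is only a sketch: the alphabet, the default length of 24, and the 16-character floor are my assumptions for illustration, not settings taken from the vault.

```python
import secrets
import string

# Assumed alphabet for illustration; trim the punctuation if a device makes it hard to type.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_master_password(length: int = 24) -> str:
    """Return a random password drawn from ALPHABET using the OS's secure random source."""
    if length < 16:
        # Assumed minimum; the point is to refuse short passwords outright.
        raise ValueError("master password must be at least 16 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_master_password())
```

The secrets module is used instead of random because it is meant for security-sensitive values.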
Use a Hard-To-Compute Hash for the Master Password
Related: Do not store the master password anywhere. And the salted hash you use for it should be secure. Refer to this Wikipedia article on cryptographic hash functions to see relative weaknesses of the functions. I've considered throwing in with a hashing algorithm that adapts to faster hardware to frustrate brute-force attacks (bcrypt and PBKDF2, for example, have a work factor you can raise over time).
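Here is a minimal sketch of a salted, deliberately slow hash using PBKDF2 from Python's standard library. It is not necessarily what the vault uses; the function names, the 16-byte salt, and the iteration count are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor; raise it as hardware gets faster.
ITERATIONS = 600_000


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = ITERATIONS) -> Tuple[bytes, int, bytes]:
    """Return (salt, iterations, digest). Store these, never the password itself."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(digest, expected)
```

Storing the iteration count next to the salt means the work factor can be raised later without invalidating the old hash; you re-hash the next time the correct password is entered.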
Don't Store any Data in Plain Text
This is a defense against a local attack from someone who can obtain file-level access, like a company employee with admin access.
Sony Pictures stored private data in plain text format, and thus enabled this interesting analysis of passwords in the Sony Pictures security breach. Consider your encryption algorithm carefully. I used AES, but am keeping my options open. I can change my backend at any time.
Limit Cookie Scope
Limit your HTTPS cookie scope with morsel attributes like max-age, httponly, domain, path, and secure.
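In Python's standard library those attributes are set on a Morsel from http.cookies. A small sketch follows; the cookie name, the 15-minute lifetime, the domain, and the path are all made-up values for illustration.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return a Set-Cookie header line with a tightly scoped session cookie."""
    cookie = SimpleCookie()
    cookie["session"] = session_id          # hypothetical cookie name
    morsel = cookie["session"]
    morsel["max-age"] = 900                 # expire after 15 minutes (assumed)
    morsel["httponly"] = True               # not readable from page scripts
    morsel["secure"] = True                 # only sent over HTTPS
    morsel["domain"] = "vault.example.com"  # hypothetical host
    morsel["path"] = "/vault"               # hypothetical path
    return cookie.output()


if __name__ == "__main__":
    print(build_session_cookie("abc123"))
```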
While you're at it, it doesn't hurt to salt cookie and session data with an identifier associated with the request. In Python you could use os.environ['REMOTE_ADDR'].
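One possible reading of that idea is to sign the session ID together with the client address, so a cookie lifted from one address fails verification from another. This is a sketch under assumptions: the token layout, the function names, and the secret key handling are mine.

```python
import hashlib
import hmac
import os


def sign_session(session_id: str, remote_addr: str, secret_key: bytes) -> str:
    """Return 'session_id.tag', where the tag covers both the ID and the client address."""
    message = f"{session_id}|{remote_addr}".encode("utf-8")
    tag = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return f"{session_id}.{tag}"


def verify_session(token: str, remote_addr: str, secret_key: bytes) -> bool:
    """Check that the token was signed for this client address."""
    session_id, sep, _tag = token.rpartition(".")
    if not sep:
        return False  # no tag present at all
    expected = sign_session(session_id, remote_addr, secret_key)
    return hmac.compare_digest(token.encode("utf-8"), expected.encode("utf-8"))


if __name__ == "__main__":
    key = os.urandom(32)  # in practice, a server-side secret kept out of the web root
    addr = os.environ.get("REMOTE_ADDR", "127.0.0.1")  # set by the web server under CGI
    token = sign_session("abc123", addr, key)
    print(verify_session(token, addr, key))
```

The trade-off is that a client whose address changes mid-session, such as a phone moving between networks, has to log in again.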
Protect Against Javascript / SQL Injection
Know what kinds of attacks can be performed. Encode characters that have special meaning for the languages you use, like the quote characters, <, >, and &, among others. In Python, the bare minimum you'd use is html.escape (the replacement for cgi.escape, which was removed in Python 3.8), but you'd want to use other functions depending on where/how the data is travelling or being displayed.
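Here is a small sketch of both defenses: html.escape (the standard-library successor to cgi.escape) for text that ends up in a page, and a parameterized query for text that ends up in SQL. The table, column, and function names are hypothetical, and sqlite3 is used only because it ships with Python.

```python
import html
import sqlite3


def render_note(user_text: str) -> str:
    """Escape <, >, & and quotes before placing user text inside HTML."""
    return "<p>" + html.escape(user_text, quote=True) + "</p>"


def find_account(conn: sqlite3.Connection, name: str) -> list:
    """Look up an account with a placeholder, so the name is never parsed as SQL."""
    cur = conn.execute("SELECT name, note FROM accounts WHERE name = ?", (name,))
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT, note TEXT)")
    conn.execute("INSERT INTO accounts VALUES (?, ?)", ("bank", "checking"))
    print(render_note('<script>alert("hi")</script>'))
    print(find_account(conn, "Robert'); DROP TABLE accounts;--"))
```

Escaping is about where the data is displayed; the placeholder is about where it is stored. Neither substitutes for the other.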
Analyze and Act Upon Suspicious Activity
It's not enough that your server is passively logging each access. Your site needs to analyze recent activity and take action (like email you or ban certain origins) when preset triggers are tripped.
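As an illustration of a preset trigger, here is a sketch that counts failed logins per origin in a sliding time window and bans the origin once a threshold is crossed. The class name, the thresholds, and the in-memory storage are assumptions; a real site would persist this and probably send an email as well.

```python
import time
from collections import defaultdict, deque


class FailureMonitor:
    """Track recent failed logins per origin and ban origins that trip the trigger."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 300):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # origin -> timestamps of recent failures
        self.banned = set()

    def record_failure(self, origin: str, now: float = None) -> bool:
        """Record one failure. Return True only when this call newly bans the origin."""
        now = time.time() if now is None else now
        events = self._failures[origin]
        events.append(now)
        # Drop failures that have aged out of the window.
        while events and now - events[0] > self.window_seconds:
            events.popleft()
        if len(events) >= self.max_failures and origin not in self.banned:
            self.banned.add(origin)
            return True
        return False

    def is_banned(self, origin: str) -> bool:
        return origin in self.banned
```

The True return value is the hook for the "take action" part: that is where a notification or a firewall rule would go.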
Keep Protecting
"Security is not a product, but a process." --Bruce Schneier, author of "Applied Cryptography"
This blog entry may have already fallen out of date with new measures I've taken to protect our information vault.
If I'm missing a vector of attack, or you have some practical advice for me, I'd appreciate hearing from you.
Go Left To Read Older Entries
Michael Heilemann said it succinctly two years ago when he explained that pagination navigation should have newer stuff on the right, and older stuff on the left.
He wrote:
Consider a blog like a diary. You start writing on the first page and then go towards the right. And since the first page of a blog is the latest entry, to go to the older entries, you have to press the arrow that points to the left.
Left = Old.
He's got it right, and Wordpress (among others) has it wrong. I love Wordpress, and in order to help Wordpress users fix the code in their themes, here's some sample code that corrects Wordpress's default pagination navigation:
For pages with single entries:
<div class="left"><?php previous_post('« %','','yes') ?></div>
<div class="right"><?php next_post(' % »','','yes') ?></div>
And somewhat confusingly, for pages with multiple entries you should have the following:
<div class="left"><?php next_posts_link('previous', 0) ?></div>
<div class="right"><?php previous_posts_link(' newer', 0) ?></div>
This intuitive idiom is important enough to get right, despite the fact that Wordpress's backwards implementation has gained some traction.
About My Lifestream
I'm really proud of my lifestream. Originally I got the idea from Jeremy Keith. (And I use a subset of his style. I intended to use my own style, but I simply love his, and I don't have any design skill.) A lifestream is an aggregation of your user activity feeds from across the internet. Essentially, it can be thought of as an automatic online diary. It writes itself.
I think I can be thought of as a late early adopter. I thought I had a lot of original ideas as I made my lifestream, but it turns out that more often than not, somebody else had already implemented one of the ideas. Happily, no one seems to have made all the same decisions as me, so my effort wasn't wasted. For me, my lifestream really is the best lifestream ever! Here's why:
The Best of Both Worlds
Jeremy implements his as an aggregation of RSS and Atom feeds with no persistent storage of previous entries. So, as newer entries are made, the oldest entries are lost forever. His lifestream is always only the most recent few entries. Jeff Croft, on the other hand, implements his with APIs, so he has access to the complete history of entries for any account. I maintain mine with feeds, but I imported my entire history from many accounts. My lifestream is huge, and spans years, even though I just started it a couple of months ago.
Also The Best of Both Worlds
Jeremy's lifestream is handy, because it never becomes unwieldy. It'll always be about the same size. Jeff Croft's and Emily Chang's persist every entry and thus continuously grow. They paginate their lifestream. You can view page 234 out of 399, for example.
I decided that 98% of the time, I'm only interested in something I wrote down in recent memory. Say, the last four weeks. So I made that the index page of my lifestream. Just the 28 most recent days of my online activity. It makes for a nice, small page.
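The index-page rule is easy to state in code. This is a sketch that assumes each entry is a (date, title) pair, which is my simplification and not the real backend's data model.

```python
from datetime import date, timedelta


def recent_entries(entries, today: date, days: int = 28):
    """Return entries from the last `days` calendar days (today included), newest first."""
    cutoff = today - timedelta(days=days - 1)
    recent = [entry for entry in entries if cutoff <= entry[0] <= today]
    return sorted(recent, key=lambda entry: entry[0], reverse=True)
```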
But the other 2% of the time, I'm searching for something older, or I'm feeling nostalgic. So I put my entire lifestream on one page, too. Sure, it's big, and I'll never browse it from a phone, but modern web browsers are perfectly capable of downloading it and rendering it, and will be able to do so for years to come. The entire history really has the same appeal to me as being able to search through a diary.
Even if I decide to paginate it eventually, it'll be easy; the backend will facilitate that.
The Details Matter
Since I provide my entire lifestream on one page, I also made sure to include the year for dates that precede this year. (E.g., October 5th, 2006. Note that that uses the intra-page anchor, another important detail.)
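Here is a sketch of that date rule: show the year only when the entry predates the current year, and derive an intra-page anchor from the date. The anchor format (an ISO date after the #) is my assumption; the real page may name its anchors differently.

```python
from datetime import date

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix, e.g. 5 -> '5th', 22 -> '22nd'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def entry_date_label(d: date, today: date) -> str:
    """Format a date for display, adding the year only for earlier years."""
    label = f"{MONTHS[d.month - 1]} {ordinal(d.day)}"
    if d.year < today.year:
        label += f", {d.year}"
    return label


def entry_anchor(d: date) -> str:
    """Hypothetical intra-page anchor for the day's section."""
    return "#" + d.isoformat()
```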
My lifestream has a discoverable RSS feed too.
But you know what? Nobody'd want a feed of a lifestream that constantly updates for individual entries. That's one thing that really bothers me about sweetcron feeds. They're just too noisy. Update, update, update!
So the RSS feed for my lifestream only provides weekly updates. That's what I'd really want from a lifestream feed. Just some sort of nice regular overview of all the activity over a certain period of time. And its permalinks are intra-page links into the huge complete history page.
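A weekly digest boils down to grouping entries by week before building feed items. A sketch, again assuming (date, title) pairs; the ISO-week grouping, the item titles, and the /lifestream/all path are illustrative choices, not the engine's actual ones.

```python
from collections import defaultdict
from datetime import date


def weekly_digests(entries):
    """Group (date, title) entries by ISO week and return one digest per week, newest first."""
    weeks = defaultdict(list)
    for day, title in entries:
        iso_year, iso_week, _weekday = day.isocalendar()
        weeks[(iso_year, iso_week)].append((day, title))

    digests = []
    for (iso_year, iso_week), items in sorted(weeks.items(), reverse=True):
        items.sort()  # oldest entry of the week first
        first_day = items[0][0]
        digests.append({
            "title": f"Week {iso_week}, {iso_year}",
            # Hypothetical permalink into the complete-history page.
            "permalink": f"/lifestream/all#{first_day.isoformat()}",
            "items": [title for _day, title in items],
        })
    return digests
```

Each digest becomes one feed item, so a subscriber sees one update per week no matter how many entries that week contains.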
Some of the accounts that I include in my lifestream don't support user activity feeds. For example, YouTube's feed for each user's Favorited videos doesn't have "date-favorited" information associated with it. Since I wrote my own lifestream engine, I was able to work around that problem. I doubt that most lifestream services like FriendFeed would go to the lengths I did in ensuring that I get exactly the information I want, regardless of whether or not the site's feed or API supports it.
It Helps Me Find Things
Searching the lifestream for things half-remembered turns out to be pretty successful. I sometimes don't know if I posted a link to delicious, or if I plurked it.
It Encourages Me To Write More Clearly
I always think twice before I write a clever title to a tweet, plurk, or blog entry. I realize now that I may well be searching for that entry in the lifestream later, and the lifestream may only have the title. (The lifestream also contains actual content from the entries, but the content isn't presented in the web pages. So maybe the content will be searchable too, eventually.)
Cleverness is out. Accessibility and searchability are in when you have a persistent searchable lifestream. Now, I strive for clarity in my titles.
I also stopped services that cross-post from one service to another. Having the lifestream made the idea of cross-posting even more redundant. If my livejournal friends don't want to see my tweets, I won't force them to with LoudTwitter.