Brian Slesinsky's Weblog

Sunday, 25 Feb 2007

Should We Support OpenID?

OpenID is a new way of logging in to websites that's been getting a lot of attention lately. After discussing it with some folks at work and reading Tim Bray's post, it seems likely that lots of people are trying to figure out whether it's any good. So am I. Since I haven't read the OpenID standards very closely and OpenID 2.0 is still in development, I'll concentrate on the big picture. Hopefully I won't get too many of the details wrong.

The security provided by OpenID seems very similar to email verification, so I'll start by discussing email.

What does email verification actually do? Suppose you're running a social website. (Millions of people do; the simplest social website is a blog that allows comments.) Everyone in this situation has to think about whether and how to authenticate people who post comments. If you require email verification before posting a comment, what have you gained? Anyone can get an unlimited number of throwaway email addresses, so you aren't stopping anyone from posting. Email verification by itself does nothing to build up trust or stop spammers; there is no reason I should trust an anonymous user more just because I think they gave me one of their email addresses.

Why do we trust email verification if it's so weak? One reason is that it makes email addresses actually useful for sending email. If I don't validate an email address then there is a higher risk of spamming someone with whom I have no relationship. For websites that automatically send email to their users, this is pretty important.

But there's another benefit that applies even for websites that rarely email their users: email verification makes it harder for one user to impersonate another without his or her consent. There's nothing to prevent users from sharing accounts if that's what they want to do, but most users don't want to, and hacked email accounts are rare enough in practice that the security is often good enough.

As a result, email addresses act something like a signature. If you see the email address of a friend on a social website, you probably assume that your friend has an account there. The reason this works at all is that we trust that the social website does email verification, has reasonable security, and doesn't forge users' posts. Despite giving security experts nightmares, this is a pretty reasonable assumption provided that nothing important is at risk.

To avoid confusion with real digital signatures and appease the experts, I'll call this property a pseudo-signature.

Like a signature on paper, a pseudo-signature is only useful if you know the person who signed the document and you know what their signature looks like. (Since many of us don't have any signed letters from our friends and have probably never seen their signatures, written signatures seem to be gradually becoming security theater, but that's another story.) For an email pseudo-signature to be useful, you have to recognize it as the email address of someone you know.

How does trust work with OpenID? Although it uses digital signatures internally, OpenID (at least in early versions) doesn't implement digitally signed documents. However, like email, it provides usable pseudo-signatures, in that if you trust the website you're reading and it authenticates posts using OpenID, it is reasonable to assume that the posts there were written by the person who owns the displayed OpenID.

At first, signing a post with an OpenID seems much less useful than signing it with an email address because we don't know our friends' OpenID's. But the solution, assuming OpenID's are worth the trouble, is the same as for email addresses. An unknown email address is also meaningless, but we learn email addresses from our friends and put them into address books. If OpenID takes off, we will need to learn our own OpenID's, add them to our email signatures and business cards, learn those of close friends, and put the OpenID's of everyone else we meet into our address books. Websites will need to have OpenID fields in their user databases and to display them next to posts. This is why making OpenID's at least as memorable as email addresses is so important.

OpenID is getting some early hype, but it has a long campaign ahead of it because it's a new kind of personal identifier that everyone will have to learn. Just as there was a time when nobody had an email address except us computer geeks, most of today's users don't have an OpenID and don't know why they should have one. (They probably shouldn't have to learn yet, because standards are still being worked out.) It's going to take a while to reach the point where we see OpenID's everywhere, if it happens at all. It's a high hurdle, but unlike most security proposals, OpenID's have a killer app, so I wouldn't count them out.

What about insecure OpenID providers? Banning authentication via some OpenID providers is sort of like banning email verification for some domains. It might be somewhat useful to create blacklists of known bad domains, but new domains can be so easily created that it doesn't help much, and the email address of someone you don't recognize is nearly meaningless anyway.

On the other hand, whitelists of good OpenID domains could be quite useful when they define groups of users. For example, if you want to allow access to all employees of a company, you could grant access to OpenID's that come from that company's OpenID server. (This implies that it might be useful to have a new type of DNS entry to point to a domain's definitive OpenID server, similar to the MX entry for email servers.)
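As a rough sketch of the whitelist idea, assuming an OpenID is a URL and that group membership is decided purely by its host name, a check might look like the following. The function name, the set name, and the example.com host are made up for illustration, and the check only means something after the site has already verified the OpenID through the protocol itself.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist: hosts whose OpenID servers we trust to define a group.
TRUSTED_OPENID_HOSTS = {"openid.example.com"}

def is_whitelisted(openid_url, trusted_hosts=TRUSTED_OPENID_HOSTS):
    """Return True if the OpenID is an http(s) URL whose host is on the whitelist."""
    parts = urlsplit(openid_url)
    # urlsplit lowercases the hostname; it is None when the URL has no host.
    host = parts.hostname or ""
    return parts.scheme in ("http", "https") and host in trusted_hosts
```

Comparing the whole host name, rather than looking for the trusted name somewhere inside the URL, is what keeps look-alike hosts and paths from slipping through.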

We hope our friends choose good email providers and their email accounts don't get hacked, but it's really up to them to practice good security. Similarly, it will be up to our friends to pick good OpenID providers. (At this point the security experts are probably thinking, "Lord help us," but maybe it's not so bad. Since the only purpose of an OpenID provider is to verify id's, and it's in people's best interests to keep their id's from being hacked, maybe there's some hope that trustworthy companies will win.)

Requiring https to post is sort of like refusing to accept your friends' email unless they use PGP. That makes it sound impossible, but since OpenID is still new, maybe it's workable. It would be as if PGP were around from the beginning and everyone used it, so non-encrypted email was suspect. Requiring encryption is certainly going to be easier now than later, and it was so successful for commerce sites that few people would now enter a credit card number into a website that doesn't have https, so it's worth a try.

This is a big change, so why should we do it? The main reason OpenID was invented is to make it easy to sign a post at a social website without doing the email verification dance, and without having to create yet another throwaway account with the same throwaway password as at dozens of other sites you visited once. It would be great if OpenID could entirely replace email verification for websites that have no other reason to ask for an email address, and eliminate passwords along with the usual lost-password recovery procedure, at least for throwaway accounts. Then email verification would only be needed for websites that actually plan on sending email.

Naive implementations of OpenID providers will have some pretty severe problems (such as phishing), but this seems easier to fix than email verification, and assuming there's no fatal flaw in the protocol, users who care about security can take action by choosing secure providers and recommending them to their friends. There's no reason an OpenID provider has to train its users to enter their passwords into easily-phishable HTML forms, or even have passwords at all; if your OpenID is important to you then you'll want to pick a very secure OpenID provider that uses card keys or something like that.

If you choose a good provider and log in using an alternate method so that there's no password prompt, the redirect to an OpenID provider becomes a request for your approval to pseudo-sign a document. The website doesn't need your approval because it could easily forge your signature just by putting your OpenID on a web page, but reputable websites will ask in order to avoid posting false messages.

Furthermore, it seems fairly easy to add real digital signatures to the protocol: the requesting website sends a dialog prompt, which the OpenID server displays for your approval and signs if you click "yes." The requesting website could then prove to a third party that the person with a certain OpenID clicked "yes" to a certain dialog prompt on some date. I'm not sure how much such proof would be worth if people get into the habit of clicking "yes" to lots of OpenID requests, but it seems like it's worth something.

How many OpenID's is the average person expected to have? I assume that at the very least, we will have different OpenID's for work and home, just as we do now for email. Employers will want their employees to have OpenID's on a company-controlled server, so that they can be confident in its security and revoke an ID when an employee leaves. Employees will want to have at least one personal OpenID on some other server that won't go away when they change jobs, and probably more if they have some activities that they don't want associated with their main personal ID. (And who doesn't?)

How will this affect privacy? If you use the same OpenID everywhere then the websites that have it can correlate their records and exchange information about you. This seems bad, but it's no worse than using the same email address everywhere. You can still create throwaway OpenID's, just as it's possible to create throwaway email addresses.

What about search? If your OpenID starts appearing everywhere then it obviously makes vanity searches and other people searches much easier. Those of us with unusual names have been living with this for a while already. It could become more of an issue if new ways of searching become possible using OpenID's.

How can businesses use OpenID? People will probably assume that, as with email, OpenID's that are associated with a company's domain name belong to employees of that company. Companies will have to make rules about what their employees can do with a company ID. It's interesting that these rules can probably be enforced much more easily than with email addresses because the company OpenID server can probably refuse to verify OpenID requests that are against the rules. I expect that there will be some employees (such as product evangelists) who are authorized to speak for the company and can use their company ID anywhere, while others are only allowed to use their company ID on certain company-approved external servers. Also, there will probably be company ID's that are associated with a function rather than a particular person.

Signing a document that appears on a company's own web servers with a company ID provides little benefit because we already assume companies control documents on their own servers. But they might be useful for integration with vendors. Suppose a company wants to outsource an employee benefit site to a vendor. If the company is running an OpenID server on the Internet and its employees know how to use it, the company can create a new subdirectory on their OpenID server containing alternate id's for employees that are allowed to access the vendor's website, and give the vendor the URL for that subdirectory. Assuming the vendor and the company trust each other to run secure servers, this should be sufficient to allow single sign-on without any custom development, and the subdirectory would work sort of like a group. As a result, the employee might have multiple company ID's, each for a different purpose, and might not even know or care how many they have, since some of them are used only for compartmentalizing communication between companies.
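On the vendor's side, the subdirectory-as-group idea could be sketched roughly like this, assuming the vendor has already verified the OpenID and only needs to decide whether it falls under the URL the company handed over. The function name and the URLs are hypothetical, and a real check would also need to think about percent-encoded path segments.

```python
import posixpath
from urllib.parse import urlsplit

def in_group(openid_url, group_url):
    """Return True if openid_url lies strictly under group_url.

    Scheme, host, and port must match, and the OpenID's path must sit
    below the group's directory.
    """
    oid, grp = urlsplit(openid_url), urlsplit(group_url)
    if (oid.scheme, oid.hostname, oid.port) != (grp.scheme, grp.hostname, grp.port):
        return False
    # Collapse "." and ".." segments so "/benefits/../staff/x" cannot pass.
    oid_path = posixpath.normpath(oid.path or "/")
    grp_path = posixpath.normpath(grp.path or "/").rstrip("/") + "/"
    return oid_path.startswith(grp_path)
```

For example, with a group URL of https://id.example.com/benefits/, an OpenID of https://id.example.com/benefits/alice would be accepted, while https://id.example.com/benefits-other/alice would not.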

With single sign-on solved, the tricky part then becomes exchanging databases of employee attributes. This is sort of like individuals managing their address books and sending updates to their friends. OpenID is not an address book protocol, but it can easily integrate with any address book because a person's OpenID is just another field. By not solving the whole problem in one protocol, we can cleanly separate the problems of authentication versus authorization and allow them to evolve separately. (Or alternately, by using the subdirectory trick with anonymous id's, we can make it harder for a vendor to correlate employee id's with other databases. This might be useful for allowing employees to use a vendor site without revealing too much to the vendor.)
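One way a company might generate those anonymous id's is to derive each one from a secret that only the company's OpenID server knows. This is only a sketch under that assumption: the keyed hash (HMAC), the URL layout, and every name below are illustrative choices, not anything OpenID specifies.

```python
import hashlib
import hmac

def vendor_alias(secret_key, employee_id, vendor, base_url):
    """Derive a stable per-vendor OpenID URL that does not reveal the employee id.

    secret_key is bytes known only to the company; the other arguments are strings.
    """
    # The NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    message = f"{vendor}\0{employee_id}".encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    base = base_url.rstrip("/")
    return f"{base}/{vendor}/{digest[:32]}"
```

The same employee gets the same id every time at a given vendor, but a different, unrelated-looking id at each vendor, so two vendors comparing notes could not match their records by OpenID alone.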

At this point we've gone far beyond OpenID's original purpose, as a way for a blogger to use the URL of their blog as an identifier when posting on another blog. OpenID can still be used that way and it will probably be its first killer app, but it seems likely to evolve into an Internet-scale identity protocol that's used for much more than that. What distinguishes OpenID as a protocol from other attempts at Internet-wide authentication is likely to be that it builds on top of https in a REST-style way, so it inherits the strengths and weaknesses of the web, for better or for worse. On the whole, building on top of https seems like a much better idea than building on smtp and other mail protocols, so it looks like a win.