Sunday, May 4, 2014

Identity 2.0

Is it just me, or has the Internet been turning into a really awful place?  The privacy abuse, surveillance, and censorship alone make it painful to contemplate the future, or even envision solutions.  We're like the proverbial boiling frogs, and the heat's turning up so high we can't help but notice it, yet we still don't jump out of the pot.

I was thinking about how ridiculous and depressing this was, when I suddenly realized that maybe we're not looking at the right problem.  The real root problem -- for all of it -- is identity.  And that's actually much easier to fix.  And we can do that ourselves.

See if this makes sense.

In the digital world, we’ve been split into a zillion shards of data, which are stored and traded by people we don’t know, and who continuously harvest it in pursuit of profit.  Our personal data has literally become the raw material for the bad behaviors we see.  We've lost control of our identities.

But maybe the problem isn’t that anyone is taking away our identities.  Maybe the problem is that we’re willingly giving them away... to strangers.  And when you put it that way, it's obvious that cannot ever end well.

So... What if all our stuff just remained private?  What if it were visible only to ourselves and those we choose to share it with?  What if all our interactions took place directly between us and our network, without the need for any third party services at all? 

Well, we could have social networks without Facebook, tweeting without Twitter, photo sharing without Instagram, email without Gmail, and IMing without, well, whatever multiple networks we all use today.  There’d be no need for YouTube or Tumblr or Pinterest or WhatsApp or Snapchat or Dropbox or any of it.

What if we could simply bypass these services altogether, and do all that creating and sharing privately amongst ourselves?  What if it was just… us?

I’ll tell you what would happen.  We’d take back our digital identities.  With this simple flip in perspective, we'd gain active command and control of our digital lives.  At the same time, we'd end all of the problems that result from giving our most personal and valuable stuff away to strangers.

We’d each retain ownership of everything we create, and be in precise command of what is shared, with whom, and under what terms.  Since we’d possess all our data (both things we create and things that are shared with us), we’d gain the ability to view and search and present everything in ways that simply don’t exist today.  And as an incidental outcome, we’d collectively create a whole new identity framework, one that would drive major innovation anywhere security and privacy are important.  Which is everywhere.

In this post I will propose a model that, I believe, can achieve this vision.  Its core component is a new atomic element for the Internet, the cloudspace.

The cloudspace interoperates seamlessly with today’s Internet, while adding a missing layer of personal privacy.  It supports every feature of every cloud service or social network, yet improves upon all of them in fundamental ways.  And like the World Wide Web, it’s free (free as in beer, and free as in open), so there's no owner, and no one can ever extract a tariff from the people who rely on it.  At the same time, it provides vast opportunities for innovation and even monetization -- just not in any way that involves seeing our private stuff.

And because it must be, Digital Identity 2.0 is a model that’s completely opt-in, at the beating heart level.  Anyone can join, and nobody can stop anyone else from joining.  The benefits start the moment there are two participants, but grow exponentially, in proper network effect fashion, with each person who adopts it.

To understand the proposal, first you need to understand how badly digital identity is screwed up today.

Identity 1.0

In a nutshell: the world has completely botched the implementation of identity in the digital world.  We're still using the same login/password model, unique to each service, that predates the Internet.  Is that really an accident?  Think how far we've come in so many other areas.

This seemingly prosaic annoyance is actually the root cause of many of our biggest problems.  Because we accept a 1960s-era identity model, control has been effectively surrendered to the people who provide these Internet services -- even though we instinctively know that they are more dependent on us than we are on them.

Because it seems out of our individual control, we accept all these awful problems as the price of creating and sharing content.  Every day we suppress this resentment as we spray many shards of data across multiple apps and services, where we explicitly give strangers control over our content, and allow them to monitor our actions.  (Prime example: Facebook has been able to track us across the Internet since its last terms of use update; in 2015 they'll be able to track us physically 24/7 via GPS.) 

When you consider all this, it becomes obvious that the only real solution is to stop letting those apps and services see what we do in the first place.  But clearly that won’t happen if it means forgoing all the things that these services enable us to do -- all the posting and tweeting and sharing and IMing and emailing.

Here's the dirty little secret.  None of that stuff we sell our souls for is magical, or even remotely hard, from a technical point of view.  Virtually all of it is defined by open standards and/or established conventions.  The only leverage is our need to hang where our friends are hanging, and our collective illusion that we need Facebook (or whomever) to do that.

Drop that illusion and things change in an instant.  For the people violating our privacy, spying on us, or trying to control what we see (hi Zuck), the nightmare scenario is a simple one.  If we grab control of our own identities, we will starve the Internet of the very content that it needs in order to abuse us.

And let me be crystal clear about this.  Once a cloudspace framework is in place, Facebook is obsolete.  Nobody needs it any more, and I suspect most people will be more than happy to escape its clutches.  In fact, obsolete is ANY service that relies on user data to profit: Gmail, Twitter, LinkedIn, Dropbox, Instagram, Snapchat, YouTube, Tumblr, Pinterest, Uber, etc., etc., etc.

Because, when we can do all this stuff ourselves, privately and securely, what's the value of those services going forward?  If they adapt fast enough, there may be a way to retain some partial value in directory or orchestration services, but good on them if any can make it worth our money.  On the other hand, any profit model built solely on seeing our private stuff is well and truly borked.  And that's a good thing.

Identity 2.0

To visualize the changes described above, consider this simple diagram.

Today is Digital Identity 1.0.  All interactions take place between us and some cloud service, which then completes a corresponding transaction with our intended party.  That’s what lets these services see what we do -- they insert themselves as middlemen.  Then, through contractual terms of use, we are rendered subservient.  And we must remain that way to keep using those services.

With Digital Identity 2.0, everything transacts directly between the parties, with no middleman required.  The difference is the green circles, which represent each user’s cloudspace and automatically handle all interactions without compromising privacy or security.

Technology Architecture

The model requires three primary components; all three are net new solutions, but are built on existing technology -- some of it only recently available with the emergence of the "API Economy."

The main component is the cloudspace, which is simply a digital identity database -- a private “lockbox” for all the content you create, and where you manage sharing when you choose to do so.  It's a personal data vault: it's like your Documents folder, plus your social interactions, plus everything you generate in the future.

Once the cloudspace is in place, two new categories will complete the picture: cloud hosting to handle the database interactions; and apps to manipulate and present the data.

1. Cloudspace
The primary component is a standard JSON database -- a modern, cloud-aware database, to be sure; and we’ll take full advantage of its capabilities.  But it’s just a data bucket, like your device hard disk or Dropbox or Google Drive.  And since it’s just a single computer file, it’s compatible with any technology or platform.
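To make the "just a data bucket" idea concrete, here is a minimal sketch in Python of a cloudspace held as one JSON document.  The field names (`owner`, `schema_version`, `items`, `kind`, `body`) and the helper functions are illustrative assumptions, not a defined format, and encryption is deliberately left out.

```python
import json


def new_cloudspace(owner_email):
    """Create an empty cloudspace; this layout is purely illustrative."""
    return {"owner": owner_email, "schema_version": 1, "items": {}}


def put_item(space, item_id, kind, body):
    """Store any JSON-serializable content under a free-form 'kind' label."""
    space["items"][item_id] = {"kind": kind, "body": body}


def to_file_text(space):
    """Serialize the whole cloudspace to the text of a single file."""
    return json.dumps(space, sort_keys=True, indent=2)


def from_file_text(text):
    """Rebuild the cloudspace from that file text on any platform."""
    return json.loads(text)


space = new_cloudspace("me@example.com")
put_item(space, "post-1", "post", {"text": "Hello from my cloudspace"})
put_item(space, "photo-1", "photo", {"caption": "Sunset", "blob_ref": "blob-17"})

# Round-trip through the single-file form and confirm nothing was lost.
restored = from_file_text(to_file_text(space))
print(restored == space)
```

Because the `kind` label is free-form, new data types can be added later without changing the container, which is the sense in which the schema stays extensible.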

Each user instantiates his or her own cloudspace; the signup requires only an email address.  What happens behind the scenes is what's different. 

The service creates a private data store for each user, encrypted so only the user can see.  Everything inside is manageable, via any tool written to the APIs.  This becomes your personal filespace, for all the things you currently store locally or on a network drive -- files, photos, videos, music, etc.  It also houses all the social or collaborative content you create -- your posts, tweets, IMs, email, etc. -- and orchestrates all the stuff that is shared by other users.

Perhaps most importantly, it's the destination for the coming explosion in personal data that will be generated by the "Internet of Things" -- all the GPS, Fitbit, Nest, home automation tools, etc. that will proliferate in the next few years.  That stuff is now scattered everywhere and it's growing worse; shouldn't it all be someplace only you can see?

Unlike all the drives and backups we need today, this single data repository can grow with you over your complete lifetime, since it’s cloud hosted and managed.  And because the cloudspace is standardized and self-contained, moving between cloud hosting services is easy and fully automated.

Through your content and interactions, your cloudspace forms the authoritative digital representation of your identity.  It grows and changes with you -- just like a real identity.  It's a single place to manage your digital identity; you're in complete control.

The database itself contains no application logic; it really is just a container.  But it has some quite useful data features, including the ability to sync efficiently/differentially across multiple copies, and a rich API set to expose its contents securely to other databases and applications.  It will have a radically extensible schema to support virtually any data type, now and into the future.  And it will have a ridiculously long private key that will prevent its encryption from getting cracked by anything short of a future quantum computer -- yet still be upgradable to keep up with such advances over time. 
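As a rough illustration of what syncing "differentially" could mean, the Python sketch below moves only the items that changed between two copies.  The per-item `rev` counter and both function names are assumptions made for this example; it is not the replication protocol of CouchDB or any other real database, and it ignores conflicting edits.

```python
def changes_since(source_items, known_revs):
    """Pick only items the other copy lacks or holds at an older revision."""
    return {
        item_id: item
        for item_id, item in source_items.items()
        if item["rev"] > known_revs.get(item_id, 0)
    }


def apply_changes(target_items, delta):
    """Merge a delta into a copy, never overwriting a newer local revision."""
    for item_id, item in delta.items():
        local = target_items.get(item_id)
        if local is None or item["rev"] > local["rev"]:
            target_items[item_id] = dict(item)


cloud = {
    "post-1": {"rev": 2, "body": "edited post"},
    "post-2": {"rev": 1, "body": "new post"},
    "note-1": {"rev": 1, "body": "unchanged note"},
}
phone = {
    "post-1": {"rev": 1, "body": "original post"},
    "note-1": {"rev": 1, "body": "unchanged note"},
}

# The phone reports which revisions it already holds; only newer items travel.
known = {item_id: item["rev"] for item_id, item in phone.items()}
delta = changes_since(cloud, known)
apply_changes(phone, delta)
print(sorted(delta))
```

Here the unchanged note never crosses the wire, which is the whole point of a differential sync on a data store meant to grow for a lifetime.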

There are a couple of other critical elements to the design of the identity database.  First, there’s a certificate that identifies you as the owner of the database, to other users and to applications.  It requires no third party certificate authority, because like your offline identity, its validity is proven over time based on your activities and relationships (e.g., a cloudspace that pays all your bills is pretty sure to be you).  It works because it always represents you -- just as a real-life identity does.

The other critical element is your contact list, which in this model becomes your social network.  Much like getting “friended” on Facebook, someone can request to be added to your contact list.  Since this is an automated process between the two users’ cloudspaces, the contact can be stored (and continuously synched) complete with metadata describing membership in public/private groups and other unique constructs, as well as preferences concerning communication and sharing.  And since your social network is now in a place where it’s completely under your control, it’s easy to fine-tune your personal groups for easy sharing, in a way that works the same across the different interaction modes. Because of these factors, the cloudspace model will peg the EFF’s Secure Messaging Scorecard.
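One way to picture a contact entry that carries its own group and preference metadata is the small Python sketch below.  The `Contact` fields, the `cs:` address style, and the `audience` helper are all hypothetical; the only idea taken from the text is that one contact list drives sharing across every interaction mode.

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    name: str
    cloudspace_id: str  # hypothetical address of the contact's cloudspace
    groups: set = field(default_factory=set)
    preferences: dict = field(default_factory=dict)  # mode -> accepts it?


def audience(contacts, group, mode):
    """Names of contacts in a group who accept a given interaction mode.

    A mode the contact has not mentioned is treated as accepted.
    """
    return [
        c.name
        for c in contacts
        if group in c.groups and c.preferences.get(mode, True)
    ]


contacts = [
    Contact("Ana", "cs:ana", {"family", "close"}, {"im": True, "email": True}),
    Contact("Ben", "cs:ben", {"work"}, {"im": False}),
    Contact("Cy", "cs:cy", {"family"}, {"im": False}),
]

print(audience(contacts, "family", "im"))     # ['Ana']
print(audience(contacts, "family", "email"))  # ['Ana', 'Cy']
```

The same group definition answers both questions, so "family" means the same people whether you are posting, emailing, or IMing.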

The database itself must be open source, perhaps derived from Apache CouchDB or another mature player in that space.  Open source code is critical for this component, to eliminate the possibility of “back doors” that can hide within closed source software, and to assure that all APIs are known.  Its open source nature keeps the critical storage component of the cloudspace from ever falling under the control of anyone who can extract tariffs.

With this infrastructure in place, other opportunities to improve digital interactions appear.  For example, if you are like me, today your electric bill appears as an email notification, and you go to the company’s site to pay.  Then you get an email confirmation.  The transaction takes place fully on the company’s site, and they retain all the information, not you.  Sure, you can always go and review your records there, or save the emails, but you'd need to do the same for your car payments, mobile phone, and for all other specific customer relationships you’ve accreted over the years.

With your cloudspace, the electric company could simply share the bill with you (companies can have identity managers too), and you could review and push payment in your cloudspace's UI -- theoretically the vendor wouldn't even need your payment account info.  Then everyone has a verified and complete record.  On your side, you could view all your bill payments in one place, or even integrate with financial apps automatically.

2. Cloudspace Hosting Services
For your cloudspace to be useful, it must communicate with other cloudspaces, and that can only take place in the cloud.

It will require the development of a new service type, but one that’s little different from Dropbox or Google Drive.  The big change is that those cloud services have to mimic a file system, whereas your cloudspace does that for you.

The host only knows you have one file, and must simply support the API orchestrations.  This is how cloud is changing technology, as today leading cloud services handle billions of API calls every day.  It’s all about making the authenticated connections, at speed and at scale.

Like the cloudspace database, there will be an open source cloud hosting app, probably based on OpenStack and Docker, so anyone could offer it.   Service options could range from fee-based to ad-supported to free, but will compete on the speed and reliability of their API processing.  A power user might happily pay a modest price for high performance.  A casual user might accept more latency, or (non-profiling) ads, for a free service, while still gaining all the identity benefits.

In some cases the APIs will be used to actually deliver data (e.g., email and IM), where in others they might use links or pointers (e.g., video and file sharing).  The critical part is that all of these activities take place within the context of each user’s authenticated identity and assigned permissions, and are encrypted end to end.
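The split between delivering data and delivering a pointer might look something like this Python sketch.  The envelope fields, the `ref` path format, and which kinds count as inline are assumptions for illustration only, and the end-to-end encryption step is omitted rather than faked.

```python
INLINE_KINDS = {"email", "im"}  # small content travels inside the message


def build_share(owner_id, item_id, item, recipient_id, permissions):
    """Build the message one cloudspace sends another when sharing an item."""
    envelope = {
        "from": owner_id,
        "to": recipient_id,
        "item_id": item_id,
        "kind": item["kind"],
        "permissions": sorted(permissions),
    }
    if item["kind"] in INLINE_KINDS:
        # Deliver the content itself.
        envelope["body"] = item["body"]
    else:
        # Deliver only a pointer; the content stays in the owner's cloudspace.
        envelope["ref"] = f"{owner_id}/items/{item_id}"
    return envelope


im = {"kind": "im", "body": "Lunch at noon?"}
video = {"kind": "video", "body": "<large binary content>"}

im_share = build_share("cs:me", "im-9", im, "cs:ana", {"read"})
video_share = build_share("cs:me", "vid-3", video, "cs:ana", {"read", "comment"})
print("body" in im_share, "ref" in video_share)
```

Either way the permissions ride along with the share, so the recipient's apps know what they are allowed to do with it.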

Imagine one big change: with some simple directory/aggregation services (an opportunity vector for ecosystem players), YouTube would be obsolete.  People would just post their videos to their cloudspaces and specify the sharing as public, and the cloud service would do the rest.  This would be especially attractive to bands or other organizations, who could decide how their content is presented (e.g., with/without ads), and establish direct relationships with their viewers.

That’s the primary role of the cloud hosting services.  Note that most services will also play in the app component space, discussed next, with device and/or browser-based capabilities for both content management and administration.  Notably, this will likely become your browser home page -- a highly customizable aggregation of everything important to you personally.

3. Apps
Equipped with this private digital space, you will need apps to manage all the posting and sharing and IMing and emailing -- equivalents to tools and services you use now on the public Internet.  Since the cloudspace really only has storage capability, the operational capability must be provided separately.  As you will see, this is a strength of the model, as it allows for universal platform support, and provides dramatic differentiation possibilities.  What now requires a complete service, with redundant database/hosting/sharing capabilities, instead only requires an app -- because the hard part is already done.  It’s especially appropriate for the mobile device space, where rich native client apps are in high demand.

It’s through apps (including the hosting service app discussed above) that you will interact with your cloudspace.  Apps may also range from free to ad-supported to commercial.  Like the cloud hosting component, if someone delivers value that people are willing to pay for, there’s a well-understood business model to use.

When you log into your cloudspace, your app(s) will interact with the content according to specific permissions you grant.  Apps provide the user interface, and abstract the functionality inherent in the cloudspace APIs.  In this way, you could choose an app based on lots of different factors and preferences.  For example, you might want an app that comprehensively manages your identity across different modalities (e.g., social, email, IM).  Or you might prefer targeted apps for a particular function (e.g., editors for files you create and store in your database).  Or you might want different apps doing the same things across mobile and desktop devices.  Since they’re all using the same data source, it’s completely flexible.
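Per-app permissions could be as simple as a table of what each app may do with each kind of content.  The Python sketch below assumes a hypothetical `grants` table and two made-up app names; it only shows the shape of the check, not a real cloudspace API.

```python
def grant(grants, app, kind, actions):
    """Record that an app may perform the given actions on one content kind."""
    grants.setdefault(app, {}).setdefault(kind, set()).update(actions)


def is_allowed(grants, app, kind, action):
    """Anything not explicitly granted is denied."""
    return action in grants.get(app, {}).get(kind, set())


grants = {}
grant(grants, "mail-app", "email", {"read", "write"})
grant(grants, "feed-app", "post", {"read"})

print(is_allowed(grants, "mail-app", "email", "write"))  # True
print(is_allowed(grants, "feed-app", "email", "read"))   # False
print(is_allowed(grants, "feed-app", "post", "write"))   # False
```

Because the grants live with the data rather than inside any one app, swapping one email app for another means changing one row, not migrating an account.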

Similarly, the cloudspace also contains APIs and other constructs that are especially useful in its role as an identity management tool.  For example, it contains the aforementioned certificate services for authentication and encryption.  It also has a live friend/contact list, to richly manage relationships and groups.  And it has a facility for managing multiple aliases, so we may present as different users to the Internet (or as anonymous), yet see everything in one view on our side.

Other innovation opportunities open up simply because the data is in one accessible place.  For example, instead of being stuck with whatever sorting algorithm a service wants to force on its users (“Top Stories,” anyone?), you could filter and tweak your social feed based on an app’s innovations in this area (please, a “mute user” button!).  Or you might use an app that combines all content into one feed, creating a truly “universal inbox.”
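Once everything sits in one place, a "universal inbox" with a mute button is only a few lines of app code.  This Python sketch assumes each item carries hypothetical `kind`, `author`, and `timestamp` fields, and it sorts newest first simply because that is one ordering a user might pick.

```python
def universal_feed(items, muted_authors=(), kinds=None):
    """Merge every kind of content into one feed, newest first.

    muted_authors: authors to hide entirely (the "mute user" button).
    kinds: optional set of kinds to show; None means show everything.
    """
    muted = set(muted_authors)
    visible = [
        item
        for item in items
        if item["author"] not in muted
        and (kinds is None or item["kind"] in kinds)
    ]
    return sorted(visible, key=lambda item: item["timestamp"], reverse=True)


items = [
    {"kind": "email", "author": "ana", "timestamp": 100, "text": "Dinner?"},
    {"kind": "post", "author": "ben", "timestamp": 300, "text": "Hot take"},
    {"kind": "im", "author": "cy", "timestamp": 200, "text": "On my way"},
    {"kind": "post", "author": "ana", "timestamp": 400, "text": "New photos"},
]

feed = universal_feed(items, muted_authors={"ben"})
print([item["text"] for item in feed])  # ['New photos', 'On my way', 'Dinner?']
```

The ordering rule is the user's choice here, not the service's, so a different app could sort or filter the same items however it likes.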

Because all the components are cloud-aware, apps have a lot of flexibility.  For example, an email app may request a local copy of all email content, while a social network app might prefer to leave its data in the cloud and access it remotely, allowing the cloud service to pull all the data together for presentation.  That option exists in the cloudspace's APIs, which can create and sync subset copies of content based on the data requested.
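A subset copy, such as the local mailbox an email app might ask for, can be sketched in Python as a filter over the full data set.  The `subset_copy` name and the `items`/`kind` layout are assumptions carried over for illustration; keeping the subset in step with the original afterward would be a separate sync step.

```python
import copy


def subset_copy(space, kinds):
    """Return an independent copy holding only the requested content kinds."""
    wanted = set(kinds)
    return {
        "owner": space["owner"],
        "items": {
            item_id: copy.deepcopy(item)
            for item_id, item in space["items"].items()
            if item["kind"] in wanted
        },
    }


space = {
    "owner": "me@example.com",
    "items": {
        "mail-1": {"kind": "email", "body": {"subject": "Hi"}},
        "post-1": {"kind": "post", "body": {"text": "Hello"}},
        "mail-2": {"kind": "email", "body": {"subject": "Re: Hi"}},
    },
}

local_mail = subset_copy(space, {"email"})
print(sorted(local_mail["items"]))  # ['mail-1', 'mail-2']
```

The deep copy matters: edits made to the local subset do not silently change the full cloudspace until they are explicitly synced back.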

The cloudspace also delivers state-of-the-art security, supporting multiple levels of permissions.  The ability to read or post something, in a specific app, could be controlled by a simple password (or fingerprint).  Escalated rights, perhaps with 2- or even 3-factor authentication, might be required for full data views, configuration changes, or content deletions.  These capabilities are also built into the cloudspace, not the apps you use, but would create great flexibility in app design.
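Escalating rights can be sketched in Python as a table of how many distinct authentication factors each action needs.  The specific mapping below (for example, three factors to delete) is an assumption chosen to match the examples in the paragraph above, and the factor names are placeholders.

```python
# Assumed policy: everyday actions need one factor, sensitive ones need more.
REQUIRED_FACTORS = {
    "read": 1,
    "post": 1,
    "full_view": 2,
    "configure": 2,
    "delete": 3,
}


def authorize(action, verified_factors):
    """Allow an action only if enough distinct factors have been verified."""
    needed = REQUIRED_FACTORS.get(action)
    if needed is None:
        return False  # unknown actions are denied by default
    return len(set(verified_factors)) >= needed


print(authorize("post", ["fingerprint"]))                    # True
print(authorize("delete", ["password", "fingerprint"]))      # False
print(authorize("delete", ["password", "fingerprint", "hardware_key"]))  # True
```

Since the check lives in the cloudspace rather than in each app, every app inherits the same escalation rules without having to implement them.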

Finally, since the apps can surface any data stored in the cloudspace, you gain capabilities you seldom see now on any public service, let alone all the services in one view.  It's the personal panopticon.  You will finally be able to find that old post that you made and want to comment on again.  And the ability to do personal analytics across your complete data set, privately, can deliver personal value without the profiling you get on the public Internet.

You could even apply digital rights management (DRM) to the content you share, something that's loathsome in the way the DMCA defines it, but extremely valuable when everyone's an equal.  You could delete something in your cloudspace with the assurance that it will be deleted everywhere, or you could prevent forwarding/resharing, etc.


If this post contains one insight, it's this: the only way to keep our identities from being abused is to keep our identities private.  But to do that, we need to change the fundamental nature of digital identity.  Fortunately, that may be relatively easy to do -- at least when compared to individually addressing all the identity-based problems we suffer today.

With the cloudspace, I have attempted to describe one possible solution to the digital identity problem -- one that, if it actually works, will truly restore power to the user, and solve most of our privacy and surveillance problems.  That's huge by itself.  But it will also greatly mitigate other issues like censorship (everything is encrypted), spam (every user is authenticated), passwords (you only need one), platform inconsistencies (native apps can all use the database), and system crashes (all your data is in the cloud, backed up).

I wrote this because, like many people, I am extremely troubled -- no, offended -- by these basic, but seemingly intractable Internet problems.  I just feel that there’s got to be a better way.  To me, the technical challenges seem solvable -- although like all product managers I have at times been guilty of underestimating development complexities.

I also understand that there are powerful corporate and government interests who don’t want all your stuff to “go dark” to them.  And I don’t underestimate just how hard they would fight.  Just the disruption to existing advertising and monetization frameworks would be huge.  But I also think that, with this approach or any other that enables direct, private, encrypted interactions, there’s not much anyone could do to stop it. 

And really, if Facebook or any of the other disintermediated companies were to be smart about it, they’d leverage existing skills and insights to profit from the app and/or cloud hosting opportunities, conceding that their business model must pivot away from profiting from our personal information.  But if I were to bet, they'd probably be outmanoeuvred by someone who does it better, and/or gets there faster.

So that's my simple proposal to fix the Internet.  I do hate to appear un-humble, but in this case, yeah, the ambition is that big.

I published this as a blog post for peer review -- the foundation of science, even among crank bloggers.  But even if this approach is proved unworkable, I’d like to hear what people think -- either publicly here, or privately via links at the top.  Most of all, I want to contribute to the discussion that we need to have about how identity should work in the digital era.

Arthur Fontaine
May 2014