[JOE]: Can you sit in the same chair?
[This episode is sponsored by Frontend Masters. They have a terrific lineup of live courses you can attend either online or in person. They also have a terrific backlog of courses you can watch including JavaScript the Good Parts, Build Web Applications with Node.js, AngularJS In-Depth, and Advanced JavaScript. You can go check them out at FrontEndMasters.com.]
[This episode is sponsored by Hired.com. Every week on Hired, they run an auction where over a thousand tech companies in San Francisco, New York, and L.A. bid on JavaScript developers, providing them with salary and equity upfront. The average JavaScript developer gets an average of 5 to 15 introductory offers and an average salary offer of $130,000 a year. Users can either accept an offer and go right into interviewing with the company or deny them without any continuing obligations. It’s totally free for users. And when you’re hired, they also give you a $2,000 bonus as a thank you for using them. But if you use the JavaScript Jabber link, you’ll get a $4,000 bonus instead. Finally, if you’re not looking for a job and know someone who is, you can refer them to Hired and get a $1,337 bonus if they accept a job. Go sign up at Hired.com/JavaScriptJabber.]
[This episode is sponsored by Rackspace. Are you looking for a place to host your latest creation? Want terrific support, high performance all backed by the largest open source cloud? What if you could try it for free? Try out Rackspace at JavaScriptJabber.com/Rackspace and get a $300 credit over six months. That’s $50 per month at JavaScriptJabber.com/Rackspace.]
[This episode is sponsored by Wijmo 5, a brand new generation of JavaScript controls. A pretty amazing line of HTML5 and JavaScript products for enterprise application development in that Wijmo 5 leverages ECMAScript 5 and each control ships with AngularJS directives. Check out the faster, lighter, and more mobile Wijmo 5.]
CHUCK:
Hey everybody and welcome to episode 148 of the JavaScript Jabber Show. This week on our panel, we have Jamison Dance.
JAMISON:
Hello friends.
CHUCK:
Tim Caswell.
TIM:
Hello.
CHUCK:
I’m Charles Max Wood from DevChat.TV. And this week we have two special guests. We have Matt Asher.
MATT:
Hello.
CHUCK:
And Dann, your last name is not on Skype.
DANN:
[Laughs] Sorry. Dann Toliver.
CHUCK:
Awesome. Do you guys want to introduce yourselves really quickly?
DANN:
Sure, yeah. I’m Dann. I’m the lead architect on EveryBit. I’m a partner at Bento Box and Bento Miso. I make things and give talks and run a bunch of meetups.
MATT:
And I’m Matt Asher. I’m the founder of EveryBit. My background is in print publishing and statistics. And then I made the move to online publishing and web development in the late 90s and have only looked back occasionally.
JAMISON:
[Chuckles]
CHUCK:
Awesome. So, you’re on today to talk about I.CX and EveryBit. Do you want to give us a little bit of background before we get going?
MATT:
Sure, no problem. So, the germ, the seed of the idea goes back a long way. But the actual development of the project began about a year ago. And we started it off by bringing together a whole bunch of developers in a Kickstarter way and trying to bang out some kind of version of a platform for decentralized publishing that would be secure and allow people to basically self-publish and share their content in a peer-to-peer way over the web browser. And we’ve stayed more or less true to that vision over time in terms of building it up. And I’ll let Dann give a brief introduction to the architecture behind the system.
DANN:
[Chuckles] Yeah, that’s quite a setup. [Chuckles]
DANN:
The brief introduction to the architecture behind the system. So, we’re trying to build a platform where applications rather than competing to overcome the network effects of large players in say the social networking space but we could apply this to any space, these applications can share a common substrate of data and user management. And can compete in this ecosystem based on their technological properties and their features, and how much they really add value to their end users rather than competing based on how many users they can lock in. It shares a lot in philosophy with, I actually listened to the interview you guys did I guess a couple of months ago with Ward Cunningham.
CHUCK:
Mmhmm.
JAMISON:
Sure.
DANN:
On the Smallest Federated Wiki. So, I listened to that last night and was amazed actually at how much his philosophy in thinking about that and the kinds of problems that he’s trying to solve mirrors our philosophy and the vision that we have for the future.
JAMISON:
So, I read the white paper and it was really well-written. You lay out the thesis that these internet companies that have sprung up started by being open. And as they’ve grown larger and more successful it’s more beneficial for them to close off their ecosystem. So, EveryBit is kind of a reaction against that. Is that accurate?
MATT:
It’s definitely informed by the view that I see of essentially a move, a strong move, over the years towards decentralization in publishing and more liberty, more personal autonomy, and ease of dissemination.
And in recent years, what we’ve seen is exactly that. That the companies that have managed to win, becoming gatekeepers and becoming these central places where people publish their stuff, are now beginning to realize that while this puts them in a position where they can benefit, [where] they can do things in order to shape people’s news feeds or lock people in or out of the system in order to benefit themselves financially. In essence recreating in a different way some of the old traditional structures that were there for publishing where you had to, in order to have other people view your stuff you had to go through a publisher. And you had to go through that process and it had to be vetted. And then somebody got a cut every which way, the book sellers, and there were royalties paid here and there all along the line.
And the great thing about the internet was it became a place where anyone could publish anything and you could go from one reader to potentially billions in a very short span of time. But then it is a reaction then to that reassertion of these companies in a role of gatekeepers and intermediaries between ourselves and our audience.
JAMISON:
So, that sounds very inspiring. How on earth do you make money?
MATT:
[Chuckles] So, there are a couple of facets to the business model here.
JAMISON:
I guess the better question is how am I sure that this will continue to exist?
MATT:
Right. So, we’re self-funded right now and comfortably self-funded at the moment. We are eventually looking to bring on extra investors to have more leverage for doing other projects, bringing on other developers. But also to get more people involved in the project. But the business model and the reason that we think this is going to continue is because there’s a very long-term play there on the usernames themselves. So basically, maybe I could back up and talk a little bit about the pieces of the system. Is that okay?
JAMISON:
Oh, for sure.
MATT:
And then talk about how that fits into the broader system and the reasons why you might think this will actually survive.
JAMISON:
Yeah, that sounds great.
MATT:
Great. So, the [chuckles]…
DANN:
You guys are really starting off with the hard questions. [Laughs] [Chuckles]
DANN:
We were expecting you to start asking technical questions.
JAMISON:
Oh no, I…
DANN:
And you started off by asking the business model questions.
JAMISON:
I want to get to technical questions, but it just sounded like something someone would do for… I didn’t see how someone would be benefitted from it.
DANN:
Yeah, yeah.
JAMISON:
And that makes me worry that it’s not going to exist.
DANN:
Right. And I’m actually really glad that you guys brought this up right off the bat. And yeah, I may jump in and answer after Matt answers also, because there’s a complicated answer to this question that touches on the technical aspects of the system and also touches on our motivations for doing this. And the way that we’ve shaped our incentives to mirror the incentives of the applications that are built on this and their end users. So, it’s a really good question. I’m glad we’re getting this out of the way off the bat. [Chuckles]
MATT:
Yeah, so let me just dig right in then to the pieces of the system. And then I can get to how we structured things. So, the system has content. We call them puffs. It has usernames. And it has a protocol, a system for secure communication. And the pieces of content, they’re static. And they are signed by users. And we use this signature both as an identifier for that piece of content and also to authenticate the users.
So, essentially when you think about trying to do something in a decentralized context you have a few things that you have to worry about. You don’t have someone to vouch for a user and that they really did create the piece of content. You have to ensure that that content hasn’t been mutated, changed, or deleted without some kind of notification along the way. And you have to think about how you’re going to maintain that content as private when you’re passing it around a decentralized network and make sure that only the intended recipients can see it.
So, we’ve put together these pieces in a way and we’ve decided on static content, static signed content, so that these pieces of content can be stored really anywhere. They can be passed around without worry and that anyone else on the system can authenticate that piece of content. And they can link to it in a sense without worrying that it’s going to mutate or change over time. Or they can in fact chain together pieces of content in order to build up a record or something similar to say a contract over time that’s built up with these pieces that can’t be mutated along the way.
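The chaining idea above can be sketched in a few lines. This is a minimal illustration, not EveryBit's implementation: the record shape (`content`, `previous`, `sig`), the helper names, and the choice of Ed25519 via Node's built-in `crypto` module are all assumptions made for the example.

```javascript
const crypto = require('crypto');

// Hypothetical record shape: { content, previous, sig }.
// `previous` holds the signature of the record before it (null for the first),
// so the signature doubles as the record's identifier.
function signRecord(privateKey, content, previous) {
  const body = JSON.stringify({ content, previous });
  const sig = crypto.sign(null, Buffer.from(body), privateKey).toString('base64');
  return { content, previous, sig };
}

// Walk the chain in order, checking each link and each signature.
function verifyChain(publicKey, records) {
  let expectedPrevious = null;
  for (const record of records) {
    if (record.previous !== expectedPrevious) return false;
    const body = JSON.stringify({ content: record.content, previous: record.previous });
    const ok = crypto.verify(null, Buffer.from(body), publicKey, Buffer.from(record.sig, 'base64'));
    if (!ok) return false;
    expectedPrevious = record.sig;
  }
  return true;
}

const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const first = signRecord(privateKey, 'Offer: 10 widgets', null);
const second = signRecord(privateKey, 'Accepted', first.sig);

const chainIsValid = verifyChain(publicKey, [first, second]);

// Changing an earlier record invalidates its signature, so the chain fails.
const tampered = [{ ...first, content: 'Offer: 99 widgets' }, second];
const tamperedIsValid = verifyChain(publicKey, tampered);

console.log({ chainIsValid, tamperedIsValid });
```

Because each record signs the previous record's signature along with its own content, nobody can swap out an earlier entry without breaking every later link.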
JAMISON:
Where are the pieces stored? Are they stored centrally or are they stored on each of the clients?
MATT:
Eventually they’re stored on each of the clients. Right now we have a dual system going on. We have both storage on a server and we also have storage that we’re using in local storage on the individual client. Actually, I’ll let Dann talk a bit about that.
DANN:
Yeah. So, the ultimate vision for this platform is fully decentralized. And we’ve set it up in such a way that once it achieves that ultimate vision of fully decentralized and fully distributed, we essentially don’t have any control over it at that point. So, we’ve got a roadmap and we’re building it out to the point where we can release it essentially into the wild. So, in asking about the sustainability you’re really in a way asking about the sustainability of two different things. One is the company that is behind this. And the other is the platform itself.
And the platform, once it achieves its targeted goal, is self-sustaining. It doesn’t really need anything else. And in fact, in a way if we’ve designed it correctly, then it’s not only that we can’t control it anymore or shut it down. It’s that nobody can. Essentially, it should be self-sustaining in the same way that the internet is. And then the question about the company is that the company has reserved a block of usernames from the username space in the same way that one would reserve say some domain names from the domain name space and we’ll be selling those to sustain the development of this project as it moves forward. But it’s an open source project as you guys know. So, there’s also that side of the development. Does that answer your question, kind of, Jamison?
JAMISON:
Yeah, yeah.
DANN:
Your first question, not your second one about where things are stored. [Chuckles]
JAMISON:
Sure, sure. I like how you mentioned the internet, because it does seem in a way similar. It’s this utility layer of a way to pass data around and route things to people, just built on top of the internet already.
DANN:
That’s right, yeah. So, we envision a future in which applications can share the same username space and data propagation space. And so, offline first or client only, there’s a lot of different names for this kind of application model where you’re essentially building your entire application in the browser’s space and you’re not connecting to a server, really. Or maybe you’re connecting to something like Firebase or one of these other services that allow you to do some storage. So, we’re doing that, but at a distributed level essentially, so that applications can share the same username space. And as more users come into that username space, it continues to grow that plot of ground that these applications are, I don’t know. I’m visual so I’m envisioning this ecosystem. And applications are growing in this ecosystem. And the substrate of that, the soil, is composed of the shared username space and shared data pool that they’re drawing from.
TIM:
I think I’m starting to grasp this. I came in a little late here.
DANN:
Right, yeah.
TIM:
So the storage, is this kind of like a distributed hash table type storage?
DANN:
Yeah, exactly. So, the user records…
JAMISON:
So, do you want to define what a distributed hash table is first of all? There might be people listening that don’t know what that means.
TIM:
Oh yeah, that’s a good idea. I’ll let you guys do it since you’re implementing it. [Laughter]
DANN:
Sure thing. Should we do that first, is the question. So, there are two different layers of storage, right? One is the data storage and one is the user records. All of our data is immutable. The unit of data that’s being passed around in this system is signed by the private key of the user that created that data. So, if I receive a chunk of data in the system and I look at that piece of data, I can check whether the signature matches the public key of the user who created it. And if it doesn’t, then I know that it’s been changed somewhere along the way.
This is really important because we don’t have a central server that’s serving as a trusted third party that tells me that you, Jamison, are the one who created this piece of data and that I have received it in an unmodified way. I’m looking at the data that you actually created. We don’t have that, once we go into the decentralized, distributed version of this. So, there is no trusted third party. And in fact, your data is getting to me via many other browsers, potentially, going through WebRTC in this network to get from you to me. And any one of those could change it along the way. So, we need a system in place to be able to detect changes if they happen along the way. So, signing the data gives us that.
It also means as a side benefit, that the data is immutable. And that means that we can avoid a huge portion of the hard, horrible, problems that happen in distributed computing where things are being updated and you have to propagate those updates across the network while there are partitions that are possibly happening, right? So, we can avoid that because the data doesn’t get updated. So, this is at the data level a much easier problem.
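A small sketch may help show why signing lets the receiver detect changes made in transit. The relay functions and message shape here are hypothetical, and Ed25519 from Node's `crypto` module stands in for whatever signature scheme the platform actually uses.

```javascript
const crypto = require('crypto');

const author = crypto.generateKeyPairSync('ed25519');

// The author signs the text with their private key.
function createMessage(privateKey, text) {
  const sig = crypto.sign(null, Buffer.from(text), privateKey).toString('hex');
  return { text, sig };
}

// Anyone holding the author's public key can check the signature.
function isAuthentic(publicKey, message) {
  return crypto.verify(null, Buffer.from(message.text), publicKey, Buffer.from(message.sig, 'hex'));
}

// Hypothetical relays standing in for other browsers on the network.
const honestRelay = (message) => ({ ...message });
const maliciousRelay = (message) => ({ ...message, text: message.text + ' (edited)' });

const original = createMessage(author.privateKey, 'hello from Jamison');
const viaHonestPath = honestRelay(honestRelay(original));
const viaMaliciousPath = honestRelay(maliciousRelay(original));

const honestOk = isAuthentic(author.publicKey, viaHonestPath);
const maliciousOk = isAuthentic(author.publicKey, viaMaliciousPath);

console.log({ honestOk, maliciousOk });
```

The receiver never has to trust the relays. It only needs the author's public key, which is why looking up that key reliably becomes the next problem to solve.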
To get back to your question about what a DHT is, a distributed hash table is a way of solving a slightly different type of problem, which is that we want to take a bunch of data, distribute it over a bunch of nodes, and have each of the nodes contain just a slice of that data. So, we’re going to have a ton of data in the system. And we would like to be able to efficiently retrieve that data from the node that contains it. So, one way that we can do that is to take a hash of the data. And as new nodes come into the system, so this is a tricky part of the DHT. Let’s say we have a bunch of nodes already. We’ve already taken a hash of the identifier of the nodes. And then we match the first few bits of the hash of the node identifier with the first few bits of the hash of the data to figure out which node contains it. Does that make sense so far?
TIM:
Mmhmm.
DANN:
Okay. And then the next step is I have a bunch of connections to people in this network, let’s say approximately log(n) connections. And I can find the person that I’m connected to on this network, or the node that I’m connected to, that is closest to the address of the node that I’m actually looking for.
So, this is a key property here. I’m not connected to everyone on the network directly. I’m only connected to a very small subset of nodes. And then I can find the one that’s closest to the one that I’m looking for. And I can send a message to that node and say, “Hey, I’m looking for this piece of data at this node. Can you help me find it?” And then they will send a message to the node that is closest to the node that we’re ultimately looking for. And if you think of this as… this is easier if we have pictures, right? Because then you can draw. I’m describing the Chord Algorithm. And you can actually draw the chords on the circle, right?
So, with log(n) hops, essentially we’re dividing by two every time we do one of these hops. So, with log(n) hops we can get to the node that we’re looking for. And it receives the message that I’m looking for this information. And then it sends me back the information along that same pathway.
Does that make sense so far?
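The routing described above can be simulated on a toy ring. This sketch assumes a simplified Chord-style layout in which every one of the 1024 ids is occupied and each node links to the nodes 1, 2, 4, ... 512 positions ahead of it. Real Chord also handles sparse rings and nodes joining or leaving, which is omitted here.

```javascript
const RING_BITS = 10;
const RING_SIZE = 2 ** RING_BITS; // 1024 node ids arranged in a circle

// Each node keeps RING_BITS links: the nodes 2^0, 2^1, ... 2^9 steps clockwise.
function fingers(nodeId) {
  const links = [];
  for (let k = 0; k < RING_BITS; k++) {
    links.push((nodeId + 2 ** k) % RING_SIZE);
  }
  return links;
}

function clockwiseDistance(from, to) {
  return (to - from + RING_SIZE) % RING_SIZE;
}

// Greedy routing: at every hop, forward to the linked node that gets
// closest to the target without overshooting it.
function route(start, target) {
  const path = [start];
  let current = start;
  while (current !== target) {
    const remaining = clockwiseDistance(current, target);
    let best = null;
    for (const candidate of fingers(current)) {
      const step = clockwiseDistance(current, candidate);
      if (step <= remaining && (best === null || step > clockwiseDistance(current, best))) {
        best = candidate;
      }
    }
    current = best;
    path.push(current);
  }
  return path;
}

// Worst case over every possible target, starting from node 0.
let maxHops = 0;
for (let target = 0; target < RING_SIZE; target++) {
  maxHops = Math.max(maxHops, route(0, target).length - 1);
}

console.log(route(0, 512)); // a single hop using the longest link
console.log({ maxHops });   // never more than RING_BITS hops
```

Each hop covers at least half of the remaining distance, which is where the log(n) bound comes from: 10 links per node are enough to reach any of 1024 nodes in at most 10 hops.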
JAMISON:
It does. So…
[Chuckles]
JAMISON:
I think I missed this part earlier. There’s a property on the key that encodes some idea of locality so you know? How do you know which nodes you talk to are closer to where the data is located?
DANN:
How do you know which nodes you talk to are close to where the data is located? Yes.
JAMISON:
So, if I’m a node in the hash table…
DANN:
That’s right.
JAMISON:
And I don’t have the data you’re looking for, I need to know where to go to get you closer to it.
DANN:
Yeah. So, in the simplest form of this, we’re going to number each of the nodes. So, every node has a number associated with it. And then say there are 1024 nodes in our system, right? So, that gives us 10 bits of information that can be used to encode the identity of each node, right?
JAMISON:
Mmhmm.
DANN:
Okay, so we can use that 10 bits of information as an address for that node. We can make a hash of the data that we’re looking for. And then we know that the data that we’re looking for is going to be stored at the node that matches the first 10 bits of that hash that we’ve taken. And we know this because anytime we’re creating data we send that data to that particular node whose hash matches the hash of, whose identifier rather matches the first n bits of the hash. In this case, the hash is actually the signature. So, we already have a string, a bitwise string that is uniquely identifying that piece of data. So, we don’t have to hash it again. We can just look at the signature of it and send it to the appropriate place.
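As a rough sketch of the 1024-node example, the first 10 bits of a signature can be read straight off its leading two bytes. The function name is hypothetical, and Ed25519 is only a stand-in for the platform's real signature format.

```javascript
const crypto = require('crypto');

const NODE_BITS = 10; // 1024 nodes, as in the example above

// Read the first 16 bits as a big-endian number, then keep the top 10.
function nodeIdFor(signatureBytes) {
  return signatureBytes.readUInt16BE(0) >>> (16 - NODE_BITS);
}

// Hand-built prefixes to show the mapping.
const lowest = nodeIdFor(Buffer.from([0x00, 0x40]));  // bits 0000000001 -> node 1
const middle = nodeIdFor(Buffer.from([0x80, 0x00]));  // bits 1000000000 -> node 512
const highest = nodeIdFor(Buffer.from([0xff, 0xc0])); // bits 1111111111 -> node 1023

// A real signature works the same way, and there is no need to hash it again.
const { privateKey } = crypto.generateKeyPairSync('ed25519');
const signature = crypto.sign(null, Buffer.from('some content'), privateKey);
const home = nodeIdFor(signature);

console.log({ lowest, middle, highest, home });
```

Every participant applies the same rule, so a writer and a later reader independently arrive at the same node for a given piece of content.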
JAMISON:
Sure.
MATT:
So in a sense, each node on the network is responsible for storing a certain subset of the total amount of data that’s shared on the network. And everybody knows what the rules are for figuring out which nodes should be in charge of which pieces of data.
JAMISON:
And to be clear, is this the system that EveryBit uses to distribute the data across clients? Or the eventual system you want to use, I should say?
DANN:
Yes. So, the system that we’re describing right now is the system that, or is very similar at least to a system that EveryBit will eventually be using when it is in its fully distributed form. So, we’ve done some limited use case, limited tech demos early on. One of the first things we did as we were building this was to build out a system that used the P2P network exclusively for transferring and storing data to get a sense of how that would work and what we would need to accomplish in order to take that to its full form. That was using a complete graph. So, in a complete graph you don’t have to worry about where to get the data because you always know where to go to get it.
JAMISON:
It means everyone’s connected to everyone else, right?
DANN:
Exactly. Everyone’s connected to everyone else. If you’re using WebRTC, this isn’t really sustainable. So, we got somewhere between 10 and 20 connections on that order and it worked great. But it’s not really sustainable at the level of 2 million connections, [chuckles] right? You can’t be connected to all 2 million other nodes on the network.
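The arithmetic behind that limit is easy to check. The sketch below compares how many connections each node must hold in a complete graph with a logarithmic scheme like the one described for the DHT. The function name is made up for illustration.

```javascript
// Connections each node needs to maintain under the two topologies.
function connectionsPerNode(nodeCount) {
  return {
    completeGraph: nodeCount - 1,                // linked to everyone else
    logarithmic: Math.ceil(Math.log2(nodeCount)) // roughly log2(n) links
  };
}

const demoSized = connectionsPerNode(20);
const networkSized = connectionsPerNode(2000000);

console.log(demoSized);    // { completeGraph: 19, logarithmic: 5 }
console.log(networkSized); // { completeGraph: 1999999, logarithmic: 21 }
```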
JAMISON:
Sure.
DANN:
So, there are definitely, there’s a lot of work left to do on the road to the fully distributed form. We need to set up this mesh network where you’re only connected to a certain number of nodes on the network. We need to have the DHT infrastructure on top of that. And then we have to deal with the fact that the nodes in our DHT are neither fully trusted nor entirely available, right? So, we have to have a lot of duplicate information. For any particular node, that node is actually going to be log(n), so say 100 different clients. And some of these clients will be headless browsers. They’re not all people connecting using their browsers. Some of them will be on the server, running perpetually. And they’ll be highly available. And there may be a system in place for rewarding them.
But what we have right now is the core kernel that works in a non-distributed way but doesn’t take advantage of that centralization. So, we’ve worked really hard to make sure that the API that we’re building is adaptable to either a centralized version or a decentralized version. And that we won’t have to change the core API significantly as we move down this road to decentralization.
MATT:
Yeah.
JAMISON:
Yeah, that was my next que-… oh, go ahead.
MATT:
Yeah, I was just going to jump in there and talk a little bit about how we did that. And also mention that we do have peer-to-peer tools in EveryBit.js right now. And people can take advantage of those. In terms of what we’ve done here to make sure that this should work as seamlessly as possible, we’re not storing any additional metadata about the content. So, we do have this data stored on our server that gives good availability and we don’t have to worry right now that the flow of nodes in and out of the network might make data unavailable. But we don’t have any additional metadata that we’re storing about the data that is special or private in any way. So, you can pass around these pieces of content right now in a peer-to-peer way without worrying that you’re getting an incomplete picture.
And we also have mechanisms for doing the trust systems so that everything including the username system and the records and all of the updates are signed, so that anyone else on this peer-to-peer network can verify it. And in effect what we’ve got is individual ledgers or chains that you have for example for the username records. So, anyone can trace these back to their origination and verify that everything is above board once we’re fully in the decentralized mode.
JAMISON:
So, just to clarify this. Does this mean that if I were using your servers now, it shouldn’t affect the way I build things if you eventually move to distribute all this data to clients?
DANN:
The API that you’re using as you build applications on this should be the same, regardless of whether the backend is centralized or decentralized.
JAMISON:
Sure.
DANN:
Having said that, the API is versioned of course, so that we can expand it as we start to understand more of the issues involved in doing this, right?
JAMISON:
Yeah, that makes sense.
DANN:
Yeah. But from the testing that we’ve done and from the design perspective, we’ve been working really hard to design this in a way that is adaptable to the distributed setting. And that limits us in some ways. And we’ve accepted those limitations and tried to make them as positive as they can be. [Laughs] Oh dear.
MATT:
It’s a challenge.
DANN:
It is a challenge, yeah.
TIM:
I have a question. So, your DHT or whatever you’re using to split things up. What are the identifiers for your objects? Is it just literally a hash like the hash used in the signature? Or is there some other identifier generator for each item?
DANN:
Yeah, it’s literally the signature.
TIM:
Okay. So, it’s also content-addressable in a way. I guess if you knew their public key.
DANN:
That’s right, yeah. Yeah, yeah. The DHT is really for determining the public key of the user that you are interested in verifying a message from. Does that make sense? Or sending a message to, right?
TIM:
Okay.
DANN:
Yeah. Someone mentioned Space Monkey. [Chuckles]
DANN:
So yeah, this is very similar in a lot of ways to things like Space Monkey or MaidSafe or any of these systems that allow you to distribute data in a private, secure way across the cloud. But our vision goes a little bit further in that we’re also tackling the user management problem and sharing users between applications.
JAMISON:
So, this is incredibly fascinating. It sounds like some really interesting mad science. If I’m a humble JavaScript developer and I write my Angular or Backbone or Ember, I’m writing basic web applications. Why do I care about this? Or is this for a different audience?
MATT:
No, it’s absolutely for that audience. I think you should care because what you want to do as a developer is focus on the core strengths and differentiators of your particular application, the things that make it more awesome than others, the GUI, the user experience, the rich client that you’re building out. What you don’t necessarily want to worry about is implementing another username system. You don’t want to worry about enticing people to sign up for yet another account. And you don’t necessarily want to worry about the hassles of data storage, especially when it comes to scaling up your application. So, by using EveryBit.js, well we hope it’s a nice, clean API that you can use to access functions that are related to users and publishing content.
But you also, as we move more and more towards the peer-to-peer model here, get to piggyback on your users for storage and for CPU. Part of what we’re doing here is wrapped up in a vision where the client is king, in a sense. And most of the processing and thinking and storage actually happen on individual users’ devices. So, as a developer you might like the idea that as your network of users grows, as your client base grows, instead of you having to bring on more and more servers and scale up and worry about issues related to databases or databases even worse across multiple servers and maintaining that state, instead you can rely increasingly on your users to provide that. To provide the storage and in a sense the thinking that powers your network.
JAMISON:
Sure, that makes sense.
CHUCK:
I kind of want to change tactics a little bit and talk a little bit more about the nuts and bolts technology-wise. Let’s say that I want to build an app around this and take advantage of some of the things you’re talking about. Where am I getting started? Is it Node.js? If it’s in the browser, are there frontend technologies that you’re using?
DANN:
The platform itself is frontend agnostic. So, it’s going to be running in the browser. The idea is that you are essentially building your entire application, or as much as you can in the client itself, right? And then you may have some server-side that you’re connecting to. You may have third-party APIs that you’re connecting to. But in terms of how you’re using the platform, it’s entirely frontend architecture agnostic, in the sense that instead of making an API call back to the server, an Ajax call to collect some data, you’ll be making that call to the EveryBit platform. So, it’s as though you actually have the server component of your application injected into the client in some sense. And it’s doing all of the user management and all of the data management for you.
So, to give you something in particular, you can say EveryBit.postMessage and you give it some message content. And then it will send that message out onto the network, rather than making an Ajax call for example that sends some information down to the server and having it inject that into a database, which then is a message in a sense that other people who connect to that database can see. What EveryBit will be doing is actually sending that message out to the users that it is addressed to directly. Does that make sense?
CHUCK:
Kind of. I’m…
DANN:
[Chuckles]
CHUCK:
I’m still wrapping my head around the way that this is distributed and the way that it works. So, when you say that it sends the information back to EveryBit, that’s not a centralized service, is it?
Or is it?
DANN:
Yeah. Let’s try to separate something here. As the application developer, you’re going to include EveryBit.js.
CHUCK:
Okay.
DANN:
In the application that you’re building on the client side. So, what gets sent up to your end user’s browser is your application and also the EveryBit.js platform. And then your application is going to make calls into the EveryBit.js platform to do all of its user management and all of its data management. So far, so good?
CHUCK:
Mmhmm.
DANN:
And then what the EveryBit platform connects to is essentially whatever it’s connecting to. So, at the moment it’s connecting to a centralized server. And if you send out a private message for example, it gets encrypted on your client, sent to the central server. And then when the recipient of that message comes on and opens up their client using your application, it’s going to, the EveryBit platform is going to connect to that central server, pull up the private message, and decrypt it in their browser. So, that centralized server knows in a way nothing about the private information that’s stored on it. It can’t decrypt it. Your keys aren’t stored there. We don’t know anything about it.
In the decentralized form, it’s going to send that message out onto the network and it’s going to be stored exactly the same way, but on client machines that are out on the network, and also on headless browsers that are serving as long-term highly available propagators of information on the network.
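The private-message flow can be sketched as follows. This is an illustration only: the transcript does not say which cipher EveryBit uses, so RSA-OAEP from Node's `crypto` module is swapped in, and `serverInbox`, `sendPrivate`, and `readInbox` are hypothetical names. The point is that the store in the middle only ever holds ciphertext.

```javascript
const crypto = require('crypto');

// The recipient's key pair lives on the recipient's own device.
const recipient = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// Stand-in for the central server (or, later, other nodes on the network).
const serverInbox = new Map();

// Sender side: encrypt on the client, then hand the ciphertext to the store.
function sendPrivate(recipientName, recipientPublicKey, plaintext) {
  const ciphertext = crypto.publicEncrypt(recipientPublicKey, Buffer.from(plaintext, 'utf8'));
  const inbox = serverInbox.get(recipientName) || [];
  inbox.push(ciphertext.toString('base64'));
  serverInbox.set(recipientName, inbox);
}

// Recipient side: pull the ciphertext and decrypt it locally.
function readInbox(recipientName, recipientPrivateKey) {
  return (serverInbox.get(recipientName) || []).map((stored) =>
    crypto.privateDecrypt(recipientPrivateKey, Buffer.from(stored, 'base64')).toString('utf8')
  );
}

sendPrivate('dann', recipient.publicKey, 'meet at noon');

const storedOnServer = serverInbox.get('dann')[0];
const decrypted = readInbox('dann', recipient.privateKey);

console.log(storedOnServer.includes('meet at noon')); // the store sees only ciphertext
console.log(decrypted);
```

Swapping the `Map` for a set of peers changes where the ciphertext sits but not who can read it, which is the property that lets the same flow work in both the centralized and decentralized forms.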
TIM:
So, once it’s distributed, as a client you’ll be storing other people’s data in your browser?
DANN:
That’s correct.
TIM:
How significant do you think that will be on storage and performance?
DANN:
Yeah, it’s a really good question. So at the moment, local storage limits are something like five to 10 megabytes unless you put in special things into your browser to extend that, right?
TIM:
Oh, local storage is terrible. It [inaudible]
DANN:
Yeah, local storage is awful. Right, right. So, there’s a bunch of tradeoffs in terms of storing things in the client and how much you can store there. And exactly what things you want to store there, right? So for, let me answer your question slightly indirectly. [Chuckles]
TIM:
Okay.
DANN:
By walking a little more into the actual nuts and bolts of the technology, which I think is where we’re trying to head anyway. Part of the EveryBit platform is this idea that if we’re going to be doing user management and data management for users, we really should as much as possible allow the application itself that’s using the EveryBit platform to determine the meaning of that data that’s being passed around, right? So, this is in… there’s a tension here between allowing the application to determine the meaning of the content and having a fixed structure for that content that the platform itself can manipulate.
So, we’ve resolved this tension by having a fixed structure at the upper level of the object, the message object. So, it has a set of seven fixed properties that have to be there. And there can’t be any extra properties at that level. But one of those properties is the payload. And inside of the payload you can put extra properties. This is where you put the metadata about the information that you’re publishing. So, you in your application can determine the meaning of that metadata. And you can also determine by determining the meaning of that metadata, you can determine the relationships between data. And you can determine the importance of data.
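A validator for that shape might look like the sketch below. The transcript confirms only that there are seven fixed top-level properties and that one of them is `payload`. The other six names used here are placeholders invented for the example, not the platform's actual field names.

```javascript
// Hypothetical field names. Only `payload` is named in the discussion above.
const FIXED_KEYS = ['username', 'version', 'type', 'created', 'previous', 'payload', 'sig'];

// The top level must have exactly the fixed keys; the payload is free-form.
function validateMessage(message) {
  const keys = Object.keys(message);
  const missing = FIXED_KEYS.filter((key) => !keys.includes(key));
  const extra = keys.filter((key) => !FIXED_KEYS.includes(key));
  const payloadOk = typeof message.payload === 'object' && message.payload !== null;
  return { valid: missing.length === 0 && extra.length === 0 && payloadOk, missing, extra };
}

const wellFormed = {
  username: 'dann',
  version: '1',
  type: 'text',
  created: 1420070400000,
  previous: null,
  // Application-defined metadata belongs inside the payload.
  payload: { content: 'hello', tags: ['intro'], parents: [] },
  sig: 'placeholder-signature'
};

// Same message, but with metadata wrongly placed at the top level.
const { payload, ...rest } = wellFormed;
const malformed = { ...rest, payload: { content: 'hello' }, tags: ['intro'] };

const goodResult = validateMessage(wellFormed);
const badResult = validateMessage(malformed);

console.log(goodResult, badResult);
```

Keeping the outer shape rigid lets the platform route, sign, and store any message, while the payload stays open for each application to interpret.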
So, there are two things that you, or there’s more than that, but there’s two things that are relevant to this conversation that you are injecting into the platform from your application. One is adding new relationships. So, this is a little function that you’ll write that says given two messages, there is a relationship between them… how do I explain this succinctly. [Chuckles]
So, to explain it succinctly you have to understand that there’s a graph database in the platform. So, it’s a client-side in-memory graph database. And the messages that are active in the client at that moment, the ones that it knows about, are part of that graph database. They’re nodes in it. So, as an application you can determine what the relationships are in a very particular way where you’re actually adding edges which have properties to this graph database, and then walking the graph database to get the messages out that you are interested in.
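The idea of application-supplied relationships can be sketched as a tiny in-memory graph. This is a minimal illustration under stated assumptions, not EveryBit's API: the class name, the `Relationship` signature, and the `parent` payload field are all hypothetical.

```typescript
interface Msg { id: string; payload: { [key: string]: unknown }; }
interface Edge { from: string; to: string; label: string; props: { [key: string]: unknown }; }

// A relationship looks at two messages and returns edge properties,
// or null when the two messages are unrelated.
type Relationship = (a: Msg, b: Msg) => { [key: string]: unknown } | null;

class MessageGraph {
  private nodes: Msg[] = [];
  private edges: Edge[] = [];
  private relationships: { label: string; test: Relationship }[] = [];

  // Register relationships before adding messages; this sketch does not
  // re-scan messages that are already in the graph.
  addRelationship(label: string, test: Relationship): void {
    this.relationships.push({ label, test });
  }

  // Adding a message makes it a node and tests it against every known node.
  addMessage(msg: Msg): void {
    for (const other of this.nodes) {
      for (const rel of this.relationships) {
        const forward = rel.test(msg, other);
        if (forward) this.edges.push({ from: msg.id, to: other.id, label: rel.label, props: forward });
        const backward = rel.test(other, msg);
        if (backward) this.edges.push({ from: other.id, to: msg.id, label: rel.label, props: backward });
      }
    }
    this.nodes.push(msg);
  }

  // Breadth-first walk along edges with the given label.
  walk(startId: string, label: string): string[] {
    const seen: string[] = [startId];
    const queue: string[] = [startId];
    while (queue.length > 0) {
      const current = queue.shift() as string;
      for (const e of this.edges) {
        if (e.from === current && e.label === label && seen.indexOf(e.to) === -1) {
          seen.push(e.to);
          queue.push(e.to);
        }
      }
    }
    return seen.slice(1); // everything reached, excluding the start
  }
}
```

For example, a "reply" relationship could return `{ kind: "reply" }` when `a.payload.parent === b.id`, and walking "reply" edges from a message would then yield the thread above it.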
TIM:
Right.
DANN:
So, the other side of this is that you’re also adding heuristics that based on the metadata and based on the relationships between messages in your system, determine how important that particular message is to your application. And then there’s a resource allocator component of EveryBit that uses the available storage space that you have on that browser. So, in the case of local storage it’s five to 10 megabytes maybe. As an application you’ve asked the user to give you some file system space, maybe you’re using IndexedDB as a backing store. But we also have to think about RAM. How much RAM is going to be used by the client?
So currently, the two storage mechanisms that we have in place are local storage and RAM. And both of those have caps on them. I think RAM is currently set to maybe 200 megabytes and local storage is set to five megabytes, just for testing purposes. So, the resource allocator determines which messages are going to be stored in those places and which ones are going to be dropped off when it goes to store them. Does that make sense?
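A resource allocator of this kind can be sketched as a capped store that keeps the most important messages and drops the rest. The names and the summing of heuristics below are assumptions for illustration; the transcript only says that application heuristics produce an importance and that each store has a cap (roughly 200 MB for RAM and 5 MB for local storage in testing).

```typescript
interface StoredMessage {
  id: string;
  sizeBytes: number;
  payload: { [key: string]: unknown };
}

// An application-supplied heuristic scores one message.
type Heuristic = (msg: StoredMessage) => number;

// Assumption: overall importance is the sum of all heuristic scores.
function importance(msg: StoredMessage, heuristics: Heuristic[]): number {
  let total = 0;
  for (const h of heuristics) total += h(msg);
  return total;
}

class CappedStore {
  private items: { msg: StoredMessage; score: number }[] = [];
  private usedBytes = 0;
  private capBytes: number;

  constructor(capBytes: number) {
    this.capBytes = capBytes;
  }

  // Stores the message, then evicts the least important entries until the
  // store fits under its cap. Returns whatever was evicted, which may
  // include the message that was just added.
  put(msg: StoredMessage, score: number): StoredMessage[] {
    this.items.push({ msg, score });
    this.usedBytes += msg.sizeBytes;
    this.items.sort((a, b) => b.score - a.score); // most important first
    const evicted: StoredMessage[] = [];
    while (this.usedBytes > this.capBytes) {
      const dropped = this.items.pop()!;
      this.usedBytes -= dropped.msg.sizeBytes;
      evicted.push(dropped.msg);
    }
    return evicted;
  }

  ids(): string[] {
    return this.items.map((item) => item.msg.id);
  }
}
```

One store per backing mechanism, each with its own cap, would give the behaviour described: the same scoring decides what stays in RAM and what stays in local storage.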
TIM:
Yeah.
DANN:
Okay.
TIM:
I’m not quite sure how that interacts with the distributed hash table where you tell someone, “Hey, you should store this because this is your location.”
DANN:
Yeah, yeah. That’s a good question.
TIM:
I guess worst case you’ll just have to host hardware that augments the network.
DANN:
Yes. Part of the idea is that as you’re building up an application, initially there will be, almost certainly there will be some server costs associated with that where you have for example headless browsers running on servers that have access to a lot of space that are making sure that no messages get dropped on the floor. As you build your application out and as it grows, rather than scaling in a way that is… where your server cost is essentially linear in the number of users (but the number of users that you’re acquiring ideally is exponential, so your server cost is increasing exponentially), you can actually start to wind some of those servers down because you have availability for the messages that your application is dealing with.
So, there’s an interesting property here where rather than building an application and having it grow really fast and having that kill your business because you haven’t found a way to monetize it and you haven’t been able to get investments in time, and you run out of runway which is constantly shortened as you get more users, you have the opposite property where more users actually make your application more powerful and give you access to more data and more computational processing ability.
TIM:
Okay. I have a slightly different question. Where is this library served from?
DANN:
Yeah, it’s a good question. So, currently it’s…
TIM:
Because if that’s tampered, the whole thing’s insecure.
DANN:
[Chuckles] Right.
MATT:
[Chuckles] Right, and we do have that over HTTPS. And right now I guess there’s a couple answers to that question. The library itself is on GitHub. So, anybody can go on there. And if you trust GitHub in general, then you can download it from there and include it in your application. As far as the API calls go, those are done over HTTPS to our server. Right now the specific API calls are, for example, to look up a username in the database and get their public key. But that’s a good question, and we are structuring, and trying to continue to structure, things in a way so that you will be able to build up the consensus you need in a decentralized world to trust the place that you’re going to get the public key of a user from, so that that record can be verified and traced back to its root. You can trace back the signatures all the way back to the beginning.
TIM:
Right. That’s the root of difficulty of all public key. You get the GPG model and the SSL model and they’re both hard.
DANN:
Yeah, definitely.
JAMISON:
I don’t know what you’re talking about, Tim. Can you explain those things? [Laughter]
TIM:
Sure.
JAMISON:
I know what SSL is. I don’t know what you mean by the GPG versus SSL model.
TIM:
Or PGP, whatever. I’ll give you the high level. So, the SSL model is there are central authorities that everyone just trusts, even though they shouldn’t. And it’s a hierarchy. So, if you want to make an HTTPS site, you buy a certificate from someone that’s trusted. And the browsers just ship a list of all these certs they trust. It’s a very, very centralized model but extremely convenient.
The other side is it’s just a distributed mesh of friends who know each other and there are these algorithms for how much you trust your friends. And to get on this system, you have to meet someone in real life and exchange keys. And as you build your network, you’re able to trust more people via your friends. So, both models have issues. But there’s, as far as I know that’s the two main models of getting public keys.
MATT:
We are ultimately hoping to do something that we’re tentatively calling proof of presence here, where the consensus is built up across the whole distributed network by in essence each person pointing out the other people in the environment that they trust, because they’ve built up trust over time by providing records that validate and continuing to maintain a ledger.
So, if you think about a system like bitcoin where you have one centralized ledger that everybody is contributing to, and the way that the consensus is maintained is through, essentially through CPU or I guess GPU and now I guess ASIC power. So, whoever has the most hash power in essence or the combination, the decentralized combination of people with hash power, is the driving force for maintaining these centralized, not centralized, this individual single ledger with the entire history of state of the universe.
And what we’re trying to build up here is a system in which every individual record both is a chain unto itself with a starting point but also embeds some piece of information about the rest of the state of the network in it, so that you are able to have these individual threads or these individual chains of content that you can trace back and that are internally consistent. But inside each chain that particular individual ledger is also responsible for storing a piece of information about the rest of the network as a whole. So, you end up with all of these individual chains. But inside them you have embedded information about other ones.
And then that is how you can build up a consensus without having to rely on, say mining or this one gigantic multi-gigabyte ledger that everybody has to download and then have their computer grind through to come up with the current state of the universe by processing every single transaction in the history of time.
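To make the shape of that idea concrete, here is a minimal sketch of per-user chains in which each entry links to its predecessor and also records the head of some other chain. It is only an illustration of the structure Matt outlines: the field names are invented, and `toyHash` is a deliberately non-cryptographic stand-in so the example runs without dependencies.

```typescript
// NOT cryptographic: a toy rolling hash used purely so the sketch is
// self-contained. A real system would use a proper hash function.
function toyHash(text: string): string {
  let h = 7;
  for (let i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) >>> 0;
  }
  return h.toString(16);
}

interface Entry {
  body: string;
  prev: string | null;      // hash of the previous entry in this chain
  witnessed: string | null; // head hash of some other chain, seen at write time
  hash: string;
}

function hashEntry(body: string, prev: string | null, witnessed: string | null): string {
  return toyHash(JSON.stringify([body, prev, witnessed]));
}

// Appends an entry, linking it to the current head of the chain.
function append(chain: Entry[], body: string, witnessed: string | null): Entry[] {
  const prev = chain.length > 0 ? chain[chain.length - 1].hash : null;
  const entry: Entry = { body, prev, witnessed, hash: hashEntry(body, prev, witnessed) };
  return chain.concat([entry]);
}

// A chain is internally consistent when every entry's hash matches its
// contents and every entry points at the one before it.
function isConsistent(chain: Entry[]): boolean {
  let expectedPrev: string | null = null;
  for (const entry of chain) {
    if (entry.prev !== expectedPrev) return false;
    if (entry.hash !== hashEntry(entry.body, entry.prev, entry.witnessed)) return false;
    expectedPrev = entry.hash;
  }
  return true;
}
```

Because each chain can be checked on its own, a client only needs the chains it cares about, while the `witnessed` values give it small pieces of evidence about the rest of the network.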
TIM:
Yeah, that doesn’t scale too well.
DANN:
[Chuckles] Right, yeah. As we’re seeing actually, the bitcoin ledger is up to 30 gigabytes or something at this point. And we certainly don’t want everyone to download 30 gigabytes into their client. This is untenable. And to go back to the two models that you stipulated. What we’re really hoping to achieve is a decentralized SSL model where everyone who’s using the system can completely trust the user records because they’re self-verifying in a particular way. So, you don’t have this network of trust that you have with PGP or GPG.
This is an open research problem. And a lot of what we’re doing right now is on the cutting edge of research that’s happening. And ideally there will be things coming along over the next six months or year or year and a half that we can take advantage of. We’re poised to take advantage of all of the research that’s coming out on distributed systems, on cryptographic user management, in these ways. But if this fails, and the research doesn’t appear and we’re not able to make those breakthroughs, we also have fallbacks in place.
So, one way to fall back is to have a set of trusted centralized servers that are serving the user record space in the meantime, while we’re trying to solve this hard problem of fully distributed user records.
TIM:
Right. I’ve been up against this problem a lot. I worked at a peer-to-peer place for a while. And we had this weird hybrid system where we say, “Well, if you trust the GitHub API or the Facebook API, you just pick something you trust and then it’s all signatures from there.”
DANN:
Yeah, yeah, right.
TIM:
Kind of a hybrid system where you bootstrap off the existing world.
DANN:
Exactly, yeah. And we’re essentially playing both sides right now. On the one hand we’re saying this is a hard messy problem and here are ways that people are solving it right now. And we can use those to do this right now. And on the other hand we’re saying ideally someone will come along with a really good solution for this. And then we can just plug it directly into the system and go.
MATT:
To jump in here, there are precedents for doing this. It’s not like this is completely uncharted territory. There are ways that people have set up for verifying identity in a decentralized context. There are projects like Namecoin and others. They do just generally tend to rely on very large ledgers.
TIM:
Right.
MATT:
And mining. And that’s what we ultimately want to avoid. To pick up too on this idea of self-verifying, the idea with this system is that when a person registers a username, as long as that username doesn’t already exist within the system then they are signing their own initial record and entering it into the general record of overall names. So, we’re creating a system where once people want to add these new usernames, they don’t necessarily have to ask permission first and they self-sign it. So, either you trust that username or you don’t. But everything after that initial signature is going to have to be signed with a key that can be verified against that initial record.
TIM:
So, my other concern is you’re in a browser. And people, when I was working on this I got so many complaints that said it’s a lost cause. You can’t ever guarantee the JavaScript implementing your algorithms is safe. And just saying it’s over HTTPS, we know that that can be backdoored.
DANN:
Right, so…
TIM:
And if you backdoor the code running on the client, you can just put in code that sends all their private, unencrypted data to you or whatever.
DANN:
Yes. There’s always a tradeoff between your trust of the system that you’re using and ease of use. So, the most trusted system is going to be crypto-code that you download, you read the source of it, and you compile it yourself, right? And if you’re really paranoid and you’re worried that your compiler is doing some kind of interesting tricks, then maybe you also compile the compiler on a fresh machine or something. So, that’s on the completely paranoid side of the spectrum. And you really, really want to ensure trust.
We’re honestly on the completely opposite end of the spectrum. The use case here is for users who want private communication but don’t have the ability to go out and install things from source themselves. So, I want to send, if I’m the average user and I’m not a developer, I want to send some private photos to my family. I don’t want these to leak all over the internet. Where do I put them? If I put them on a third-party service like Apple or Facebook or something, they have the unencrypted form sitting on their server. So, any employee there or anyone who manages to hack into it is getting that unencrypted form and they can see it. This is not a good situation, right?
So, what we’re offering is better than that, because you’re encrypting on the client and you’re decrypting on the client. And everything else after it leaves your machine is only seeing the encrypted form. But there are certainly trust issues that need to be managed here in a careful way. And as far as I know, there’s no silver bullet for this. If one emerges, we will definitely put it into the system immediately. But we’re…
TIM:
And, yeah. [Chuckles]
DANN:
We’re playing a tricky tradeoff with these trust issues. And I think anybody working in this space, as you mentioned, anybody working in web-based crypto is playing a tricky tradeoff with trust issues. And we’re in the same space.
TIM:
One idea I had is some sort of cache invalidation trick with app cache where it’s impossible to get an updated version of the code and you just have to version your URLs or something.
DANN:
Yeah, yeah, yeah.
TIM:
So that if you’re, so if the web server ever got compromised, they couldn’t send down malicious code to clients who had the cached version.
DANN:
Right.
TIM:
It will only affect new clients.
DANN:
Right, yeah.
TIM:
Which greatly reduces the incentive to hack it.
DANN:
Right [chuckles], exactly.
TIM:
You can’t protect everyone.
DANN:
Yeah.
TIM:
But if you reduce the incentive enough, then…
DANN:
We’ve also thought about maybe having a browser plugin that serves just to verify the hash of the EveryBit platform so that if…
TIM:
Oh, like it, yeah.
DANN:
Yeah, yeah. So, but you can see where this is going. The user has to pay a little more price in order to get a little more security, right? And so, you’re always walking back and forth along this continuum of ease of use versus trust in some sense, right? But there are lots of things that we can do to push it more toward the trusted side. But they reduce ease of use in some sense.
JAMISON:
So, I’d like to ask you a little bit about implementing these cryptographic primitives in JavaScript. Is that something you’ve done or have you used some other libraries?
DANN:
We’re using another library for that. We’re using…
TIM:
Which one?
DANN:
Yeah, we’re using bitcoinjs-lib.
TIM:
Okay, I haven’t seen that one.
DANN:
Yeah. So, we’ve actually built our system in such a way that we have an abstraction layer between the rest of the platform and the cryptographic library that we’re using. And we’ve done this intentionally so that we can change out the cryptographic library if new ones come along that are better than the one that we’re using. And so, we can switch versions without huge amounts of pain. Crypto-code is funny because it’s really easy to write crypto-code. If you’re comfortable with the mathematics and you’ve got a decent big int implementation that runs reasonably fast, once you can do modular arithmetic…
TIM:
[Chuckles] That’s the key.
DANN:
That’s the key, yeah exactly. But once you can do modular arithmetic on big integers, the crypto algorithms aren’t hard. But unlike almost all of the other code that we write, just writing code that works and tests fine isn’t good enough. It’s all of the other things that you have to be worried about. It’s the timing attacks and all of these side-channel attacks and everything else. So, we’re relying on people who really think about all of the different ways that this can be attacked and try to write their code in a way that prevents that, to write that code for us. Because…
MATT:
And people who have a very strong financial incentive to make sure that that code is secure, because they have thousands, millions of dollars’ worth of bitcoin wrapped up in the transactions that are being signed using this same library.
TIM:
Right.
DANN:
Yeah. So, we’ve tied ourselves to the bitcoin chain. Or sorry, not to the chain, [chuckles] to the bitcoin elliptic curve in a sense. And so, progress that’s made on the bitcoin side, we can immediately pull into this on a cryptographic level, since we’re using the same curve that they are. In particular, things like key management. This is a really tricky issue. And there’s a lot of research going on right now in key management. And we’ll be able to pick up on all of that and reap the benefits of it.
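The abstraction layer Dann mentions, a thin seam between the platform and whichever cryptographic library sits behind it, might look roughly like the sketch below. The interface and class names are hypothetical, and the two providers are insecure fakes that exist only to show the swap; a real provider would wrap an audited library rather than implement anything itself.

```typescript
// The only surface the rest of the platform is allowed to depend on.
interface CryptoProvider {
  name: string;
  sign(message: string, privateKey: string): string;
  verify(message: string, signature: string, publicKey: string): boolean;
}

// INSECURE placeholder: the "signature" is just a tagged string and the
// public key equals the private key. It only demonstrates the wiring.
function makeFakeProvider(name: string): CryptoProvider {
  const tag = (message: string, key: string): string => name + "|" + key + "|" + message;
  return {
    name,
    sign: (message, privateKey) => tag(message, privateKey),
    verify: (message, signature, publicKey) => signature === tag(message, publicKey),
  };
}

interface SignedContent { content: string; signature: string; provider: string; }

class Platform {
  private crypto: CryptoProvider;

  constructor(crypto: CryptoProvider) {
    this.crypto = crypto;
  }

  // Swapping libraries touches this one line of state, not the callers.
  useProvider(next: CryptoProvider): void {
    this.crypto = next;
  }

  publish(content: string, privateKey: string): SignedContent {
    return { content, signature: this.crypto.sign(content, privateKey), provider: this.crypto.name };
  }

  accept(item: SignedContent, publicKey: string): boolean {
    return this.crypto.verify(item.content, item.signature, publicKey);
  }
}
```

Recording which provider produced a signature, as `SignedContent.provider` does here, is one way to keep older content verifiable after a switch; whether EveryBit does this is not stated in the conversation.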
JAMISON:
I don’t know what key management means.
MATT:
By that, I think what Dann is getting at is that there are, so we don’t use passwords in our system in the sense that all of the content that is being distributed in this system is signed on the client side. There are no passwords that are sent over the network. And all of the authentication is done by signing these pieces of content. Even updating the username records you sign a piece of content that says, “Update my record in this way.” and then anyone on the network can validate this signature.
So, what you have are these keys, these private keys, that you are storing. And this is a system that’s a little bit trickier in some ways than passwords. It’s something that people aren’t used to. And there are security concerns with this. So, people within the overall bitcoin ecosystem have come up with hardware devices that will store your keys. So, everything from a watch or a little dongle or whatever it is that often can be air gapped. And that way, you can take your private keys with you wherever you go. And you don’t have to necessarily store them in a way that’s insecure, in say a plain text file on your computer.
These advances I think are paving the way for the kind of authentication that people would very much I believe like, which is that they don’t have to worry about passwords at all, that their device and maybe this is a device that’s completely air gapped, comes up with a private key for them. And then it generates the public key for that. And then that becomes part of your user record. But the private key never leaves your, it never leaves your wrist, say. Or it never leaves some kind of protected partition space. And there is lots of work going on in that ecosystem to provide these kinds of solutions.
Our own system has various levels of keys that we’ve implemented so that people are able, and this goes to the issue of trust and how much you want to trust the application that you are downloading to your computer and using in your browser. We have a system where we have various levels of keys so that even if the lowest level of key is maliciously taken, you can still reset that using a higher level of key that you’re only going to be inputting in, basically in case of emergency.
JAMISON:
Sure.
TIM:
You have backup keys.
MATT:
Yes, you have backup keys and particularly you’ve got a hierarchical system where you can have these levels of trust. So, I trust this website with this particular default key, this lowest level of key, to publish content on my behalf. But I don’t trust them to manage my username record certainly. And that is combined with the fact that the username system is hierarchical itself.
So, it’s very easy for a developer to create sub-users. And I don’t necessarily have to give away my top level user to a website or give away the kingdom, in a sense. I don’t have to give away full access to this top level user that I’ve created in order for this website to be able to use this shared system. We can create on the fly sub-users. And then if I am no longer happy with what that website is doing, if they go rogue, if they’re publishing content on my behalf, then I can chop off that sub-user in the same way that if you are say, within the Twitter ecosystem and you’ve authorized Hootsuite or some other agent to post on your behalf and that agent goes rogue, you can discontinue their access to your account.
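The sub-user idea can be sketched as a small tree in which a parent can cut off any user beneath it. This is an illustration only: the class, the method names, and the dotted `parent.label` naming are assumptions, not EveryBit's actual username format or API.

```typescript
class UserTree {
  private parentOf = new Map<string, string | null>();
  private revoked = new Set<string>();

  registerTopLevel(name: string): void {
    if (this.parentOf.has(name)) throw new Error("username taken: " + name);
    this.parentOf.set(name, null);
  }

  // Creates a sub-user on the fly, e.g. one per website the user tries out.
  addSubUser(parent: string, label: string): string {
    if (!this.isActive(parent)) throw new Error("parent is not active: " + parent);
    const name = parent + "." + label; // hypothetical naming scheme
    if (this.parentOf.has(name)) throw new Error("username taken: " + name);
    this.parentOf.set(name, parent);
    return name;
  }

  // Only a user higher up the same branch may revoke a sub-user.
  revoke(actor: string, target: string): void {
    if (!this.isAncestor(actor, target)) throw new Error(actor + " cannot revoke " + target);
    this.revoked.add(target);
  }

  // A user is active only if it and every ancestor exist and are unrevoked.
  isActive(name: string): boolean {
    let current: string | null | undefined = name;
    while (current !== null) {
      if (current === undefined || !this.parentOf.has(current) || this.revoked.has(current)) {
        return false;
      }
      current = this.parentOf.get(current);
    }
    return true;
  }

  private isAncestor(actor: string, target: string): boolean {
    let current = this.parentOf.get(target);
    while (current !== null && current !== undefined) {
      if (current === actor) return true;
      current = this.parentOf.get(current);
    }
    return false;
  }
}
```

Revoking the sub-user handed to a misbehaving site leaves the top-level user, and any other sub-users, untouched.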
JAMISON:
I know enough about crypto to know I have no business implementing it. So, I appreciate you explaining all these cryptographic ideas so that it’s helpful. [Chuckles]
DANN:
Oh, I was just going to say that we have tried very hard to minimize the amount of crypto that we’ve implemented as well and rely heavily on people who really are experts in this. We did end up implementing our own version of Elliptic Curve Diffie–Hellman, which again is a fairly easy simple algorithm, just because we couldn’t find one that worked. And then we got it audited by the people who do know their crypto stuff to make sure we did a good job, yeah.
TIM:
So, these sub-keys are basically the equivalent of OAuth tokens, which web developers know very well nowadays?
DANN:
Yeah, sub-users are kind of the equivalent of OAuth tokens, yeah.
TIM:
Sub-users, yeah.
DANN:
That’s right, yeah. For the key hierarchy, we spent quite a while figuring out exactly what the actions are that one would want to perform with keys. And then splitting those out into separate keys so that you had this hierarchy that correlated to the atomic actions that you would want to perform. So for example, signing a piece of content, or adding a sub-user, or changing a key. And so, we actually ended up with three keys. A key for your content, a key for administration, and then a root key which essentially is the owner key of that username.
And then each sub-user has their own copy of the three keys. So, there was, man probably 20 use cases that we ended up going through as we were developing this. What if the user wants to do this? What if this kind of attack happens? It came down to only three keys, which is really nice. But at the same time three keys is more cumbersome to manage than one key. So, this is one of those areas where you really want to minimize the surface area that you’re exposing.
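A three-level key hierarchy like this can be modelled as a table from actions to the minimum key level they require. The transcript names the three keys and a few example actions but not the exact mapping, so the table below, and the rule that a higher key may stand in for a lower one, are assumptions made for illustration.

```typescript
type KeyLevel = "content" | "admin" | "root";
type Action =
  | "signContent"
  | "addSubUser"
  | "rotateContentKey"
  | "rotateAdminKey"
  | "rotateRootKey";

// Higher rank means more authority.
const RANK: { [L in KeyLevel]: number } = { content: 0, admin: 1, root: 2 };

// Assumed mapping: each key can only be replaced using a key above it
// (the root key replaces itself), so a stolen content key cannot lock
// the owner out.
const REQUIRED: { [A in Action]: KeyLevel } = {
  signContent: "content",
  addSubUser: "admin",
  rotateContentKey: "admin",
  rotateAdminKey: "root",
  rotateRootKey: "root",
};

// True when the presented key is at least as powerful as the action needs.
function isAllowed(action: Action, presented: KeyLevel): boolean {
  return RANK[presented] >= RANK[REQUIRED[action]];
}
```

Under this mapping a website holding only the content key can publish on the user's behalf but cannot add sub-users or change any key, which matches the "in case of emergency" role described for the higher keys.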
TIM:
An interesting experiment I saw. Do you remember the Firefox Sync Tool used to be decentralized? Where it would generate a key on the browser and you had to have both laptops online to sync with a new browser? Did you ever see that?
DANN:
No. When did this come out?
TIM:
It was a while ago. And it wasn’t super popular or very user-friendly, because when I would log into a new machine and I wanted to set up my Firefox bookmarks sync, I never had the other machine with me.
DANN:
Right, right, yeah.
TIM:
And Firefox stored nothing on their central servers because they didn’t want the responsibility.
DANN:
Yes. Yeah, yeah.
TIM:
Well, after about a year they gave up. It’s now all centralized with a password.
DANN:
[Laughs]
TIM:
Because usability mattered more.
DANN:
Yeah.
MATT:
Indeed. And one of the primary concerns for what we’re doing here in terms of the development is to make it as easy as possible, and in a sense as invisible as possible from the point of view of the end user. And users shouldn’t have to think about key management and hierarchy of keys and all of these things. So, the EveryBit library takes care of those functions. And then we expose different levels of that API to developers so that they can do things in a way that makes things as simple as possible or they can delve deeper in the API to have more fine-grained control over what’s going on in the setting of the different keys and so forth.
CHUCK:
Alright. Well, I think we’ve pretty much exhausted our time. This is really interesting. [Chuckles] And I wish we could just sit and hash over it, maybe open up the code and talk through it for another hour. But we just can’t.
DANN:
Yeah. I would love that also. [Chuckles]
CHUCK:
So, if people want to know more, where do they go?
MATT:
Absolutely. So, they can go to EveryBit.com. We also have a GitHub repository and we’ll throw up a link there; we recently published a whitepaper. It’s still in draft form, but it lays out many more of the technical details. It goes into the algorithms that we use for secure communication and talks about other things. So, they can take a look at that. We’ll make sure that that’s linked from our main GitHub repository, which itself is linked from EveryBit.com.
CHUCK:
Alright. Well, let’s go ahead and do some picks. Jamison, do you want to start us off?
JAMISON:
Yup. I have three picks today. So, my first pick is a Twitch stream called Kylelandrypiano. It’s this classically trained musician that just gets on and plugs his keyboard into the input on his computer and just jams out on the piano. He’s really good. And it’s just fun to have on in the background and hear him play stuff.
My next pick, so my other picks are going to be flatter-the-guest picks.
DANN:
[Laughs]
JAMISON:
Dann gave a talk at Strange Loop called ‘Visualizing Persistent Data Structures’. And it’s really good. So, persistent data structures are a hot topic in JavaScript right now, even though they’ve been around for a long time. And it does a great job of building intuition on how they work.
And then Matt actually has a blog called Statistics Blog, which is a great resource for learning more about cool statistics stuff. So, those are my picks.
MATT:
Thanks.
DANN:
My name is Dann and I approve those picks. [Laughter]
CHUCK:
Tim, what are your picks?
TIM:
Yeah, I forgot all about picks. But I’ve been busy trying to teach my son programming. And I’ve tried the angle of robotics, because everyone likes to turn lights on and off.
DANN:
Nice.
TIM:
And I discovered a company that makes parts for this called Seeed Studio. I think they’re based in China because the shipment takes forever to get here. But they have really, really good stuff. And Adafruit and SparkFun are good as well, if you want things a little more local. But I’ve loved all the products on Seeed Studio so far.
DANN:
Cool. That’s good to know.
TIM:
I’ll post some links to those.
DANN:
Nice. My daughter’s getting close to the age where it’s time to start moving in that direction. And so, I’m excited to start capturing all of this information. She just turned two yesterday. [Chuckles] But she’ll be there soon. [Chuckles]
TIM:
So, one pro tip. What I do is I order a bunch of stuff and it takes forever to get here. And then I store it in what I call a local store.
DANN:
[Chuckles]
TIM:
And then whenever he wants something, he can buy it from my store for a small markup. [Laughter]
DANN:
Nice.
TIM:
So, I just shop a couple of weeks ahead and like, “He’s going to want one of these.” And I order some wires or some buttons or some blinking LEDs. And that way, he doesn’t have to wait a month when he decides he wants to do something.
DANN:
And then he makes money by rewiring all the lights in your house?
TIM:
No, no. That’s a very different line of work.
[Chuckles]
JAMISON:
He holds the lights hostage.
DANN:
[Laughs]
TIM:
There we go.
DANN:
Right.
CHUCK:
Alright. I just have one pick this week. I just finished a book called ‘American Sniper’. It’s by an ex-Navy SEAL named Chris Kyle. And it was just a really interesting view into the way that the war in Iraq went and some of the things that were going on there. He debunks some of the ideas that the media and others put out there. And so, anyway, it was a really interesting read. It’s not really in my opinion about your politics or what have you, but more about just understanding what the war was about and things like that. Though he does express a strong opinion. And there was also a movie made based on the book, but I haven’t seen it. So, just to put that out there.
Dann, what are your picks?
DANN:
Yeah. Have you guys seen this Relay and GraphQL thing that came out of the React Conference a week ago?
JAMISON:
It’s so cool.
DANN:
Yeah. I was just catching up with the videos. So, the idea is that your React components themselves, and we’ve been using React for almost a year now. I think it was a really early design decision back in maybe March of last year to start using React on all of the interfaces that we’re building. The platform’s agnostic, but our interfaces into it are built on React. Anyway, so your components can register the information that they’re interested in receiving. This gets bundled up to the parent component, which then sends that data request back to the server. And then from the server it receives exactly what each of its child components is interested in and distributes that information back down the tree. So, the end result is that you can write in your component both the rendering function and also the data that it’s requesting from the server. This is really interesting. And it actually plays in some nice ways with EveryBit.
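The flow Dann describes (components declare what they need, the parent bundles the requests, the server answers with exactly those fields, and the data is handed back down) can be sketched in plain TypeScript. This is not Relay's or GraphQL's actual API; every name below, including the fake server and its record, is made up for illustration.

```typescript
interface ViewComponent {
  name: string;
  needs: string[];            // fields this component wants to receive
  children: ViewComponent[];
}

type Row = { [field: string]: unknown };

// Bundle the needs of a component and all of its descendants, no duplicates.
function collectNeeds(component: ViewComponent): string[] {
  const all = component.needs.slice();
  for (const child of component.children) {
    for (const field of collectNeeds(child)) {
      if (all.indexOf(field) === -1) all.push(field);
    }
  }
  return all;
}

// Stand-in for the server: returns only the requested fields.
function fakeServer(fields: string[]): Row {
  const record: Row = { name: "Ada", avatarUrl: "/ada.png", email: "ada@example.com" };
  const result: Row = {};
  for (const field of fields) result[field] = record[field];
  return result;
}

// Hand each component exactly the fields it asked for, keyed by component name.
function distribute(
  component: ViewComponent,
  data: Row,
  out: { [name: string]: Row } = {}
): { [name: string]: Row } {
  const props: Row = {};
  for (const field of component.needs) props[field] = data[field];
  out[component.name] = props;
  for (const child of component.children) distribute(child, data, out);
  return out;
}
```

The point of the pattern is that the data requirement lives next to the rendering code, so adding a field to a child component changes the bundled request without touching the parent.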
I just wanted to plug the ClojureScript ecosystem as my second pick which has some really exciting things happening right now.
And then Michael Fogus’s Read-Eval-Print-λove is my third pick. And I’ve got actually five more picks but I’ll trim it off at three. [Chuckles]
CHUCK:
Alright. Matt, what are your picks?
MATT:
I have a React related pick as well. Unfortunately I wasn’t at the recent React.js Conference so I don’t have my hands on the code. But they have recently released a version, React Native, for mobile developers. And given our experience here as Dann mentioned implementing React.js for the frontend of I.CX, the secure messaging, I’m very excited about that. And after watching the videos, very excited about their particular approach to creating native code and cleaning up many of the problems with CSS. And in general, the approach of building things out in these components and the flow that it, in a sense, forces developers to have for the data as it moves through the system. So, I’m very excited and I suggest that people go and take a look at the videos that show off React Native.
CHUCK:
Alright. Well, I don’t think we have anything else. Thanks for coming and talking to us about this.
It’s always fascinating to hear about a different thing that’s going on in JavaScript.
JAMISON:
Yeah, thank you. This was great.
MATT:
Well, thank you.
DANN:
Yeah, it was our pleasure. Thanks guys.
CHUCK:
Yeah. We’ll wrap up the show and we’ll catch you all next week.
[This episode is sponsored by React Week. React Week is the first week-long workshop dedicated entirely to learning how to build applications in React.js. Because React is just the V in MVC, you’ll also learn how to build full applications around React with the Flux architecture, React Router, Webpack, and Firebase. Don’t miss this opportunity to learn React.js from Ryan Florence, one of the industry’s leading React developers. If you can’t make it out to Utah they’re also offering a React Week online ticket. Go check it out at ReactWeek.com]
[Have you noticed that a lot of developers always land the job they interview for? Are you worried that someone else just landed your dream job? John Sonmez can show you how to do this with the course ‘How to Market Yourself as a Software Developer’. Go to DevCareerBoost.com and sign up using the code JJABBER to get $100 off.]
[This episode is sponsored by MadGlory. You’ve been building software for a long time and sometimes it gets a little overwhelming. Work piles up, hiring sucks, and it’s hard to get projects out the door. Check out MadGlory. They’re a small shop with experience shipping big products. They’re smart, dedicated, will augment your team and work as hard as you do. Find them online at MadGlory.com or on Twitter at MadGlory.]
[Hosting and bandwidth provided by the Blue Box Group. Check them out at Bluebox.net.]
[Bandwidth for this segment is provided by CacheFly, the world’s fastest CDN. Deliver your content fast with CacheFly. Visit CacheFly.com to learn more.]
[Do you wish you could be part of the discussion on JavaScript Jabber? Do you have a burning question for one of our guests? Now you can join the action at our membership forum. You can sign up at JavaScriptJabber.com/jabber and there you can join discussions with the regular panelists and our guests.]