108 iPS Synchronizing Documents & Offline Handling with Mike Ash - iPhreaks

[This episode is sponsored by Hired.com. Every week on Hired, they run an auction where over a thousand tech companies in San Francisco, New York and L.A. bid on iOS developers, providing them with salary and equity upfront. The average iOS developer gets an average of 5-15 introductory offers and an average salary offer of $130,000/year. Users can either accept an offer and go right into interviewing with a company or deny them without any continuing obligations. It’s totally free for users, and when you're hired they also give you a $2,000 signing bonus as a thank you for using them. But if you use the iPhreaks link, you’ll get a $4,000 bonus instead. Finally, if you're not looking for a job but know someone who is, you can refer them on Hired and get a $1,337 bonus as thanks after the job. Go sign up at Hired.com/iphreaks]

[This episode is sponsored by DevMountain. DevMountain is a coding school with the best, world-class learning experience you can find. DevMountain is a 12-week full time development course. With only 25 spots available, each cohort fills quickly. As a student, you’ll be assigned an individual mentor to help answer questions when you get stuck and make sure you are getting most out of the class. Tuition includes 24-hour access to campus and free housing for our out-of-state applicants. In only 12 weeks, you’ll have your own app in the App store. Learn to code, it’s time! Go to devmountain.com/iphreaks. Listeners of iPhreaks will get a special $250 off when they use the coupon code iPhreaks at checkout.]

JAIM:

Hello everybody and welcome to episode 108 of the iPhreaks show. Today on our panel we have Andrew Madsen.

ANDREW:

Hi from Salt Lake City.

JAIM:

I'm Jaim Zuber from Minneapolis and we have a new panelist. Please welcome Mike Ash.

MIKE:

Hi, this is Mike. I'm from Fairfax, Virginia.

JAIM:

Mike, we've had you on the show before but never as a panelist. A lot of people know who you are. You've been doing a lot of cool things in the Mac and iOS world but can you tell us a little bit about yourself?

MIKE:

Sure. I work at Plausible Labs. We do various things including PLCrashReporter – maybe our most famous thing. We also have a product called VoodooPad that we work on. We do some consulting on the side. I personally write a technical blog called NSBlog, home of Friday Q&A, where I talk about various crazy antics that I get up to over there – lots of fun low-level stuff. I guess that's about it.

ANDREW:

I've picked articles from your blog many times and I've picked the whole blog a couple of times. I do want to say to people out there listening: it's one of my favorite blogs because you go into depth on things that other people don't really touch. I've just learned a lot from it over the years. If you have not checked out Mike's blog, you definitely should.

MIKE:

It's what I try to go for. I think the basics – how to build windows and things like that – nothing wrong with that but it's covered. I wanted to try to bring something different.

ANDREW:

Yeah. I really appreciate the stuff you write over there. I have been reading it for – I don't know – as long as I can remember knowing about it. At least 4 or 5 years, I think.

MIKE:

Well, excellent. It's a lot of fun to write. It takes up some time but I love the audience. I always get great comments on it and it's always fun to have an excuse to do crazy things.

ANDREW:

And I must say your last Friday Q&A which I think you – are you only doing them once a month or did you skip one?

MIKE:

I try to do it once every two weeks. It depends on how it all works out.

ANDREW:

Yeah. Well. That's completely understandable. Your article about fuzzing with afl-fuzz was pretty cool.

MIKE:

Yeah.

JAIM:

Tell us a little bit about the Friday Q&A. What is that?

MIKE:

Yeah. Basically, the idea is that I wanted to write a blog – how it came about, I wanted to write a blog but I couldn't think of anything to really write about. I had one. I set it up. I wrote a few articles but it was really sporadic and how do you come up with stuff. So I finally decided that I'll just have people tell me what to write about. What do you want me to write about?

That's basically what it is. People write in with suggestions, I pick what I like, I write something down. It used to be every Friday, then every other Friday as long as I can get it out on time. That's basically what it is. I try to pick really interesting, more advanced kinds of topics, preferably things like reverse engineering or assembly language.

One of the things I like to do is rebuild system APIs or system classes from scratch to get the idea of how they would work internally and demystify things. I feel a lot of people build apps and they have this idea with APIs that they're doing something magical. A lot of them are actually very simple once you dig into them. I think rebuilding them is the best way to show that.

ANDREW:

Yeah. I really like that aspect of the blog. I've been doing a little teaching lately and I'm teaching people who are complete beginners to iOS. Particularly, some of them are complete beginners to programming, in general. It's hard for me and I really try to drill into them that all of this stuff that they're using is not some magical thing. People like them wrote it and they could do the same with enough knowledge and skill.

JAIM:

Unless you’re writing Lisp then it's just magic.

ANDREW:

[Chuckles] Well. Yeah, right.

MIKE:

It's not magic. You just need a large store of parentheses.

ANDREW:

[Chuckles] Yeah. Just got to keep the parentheses warehouse well-stocked.

MIKE:

Exactly. Not many people can afford that kind of resources. It's an exclusive club. Yeah. I actually managed to win the International Obfuscated C Code Contest one year with a Lisp program. That's a little connection there.

ANDREW:

Oh, that's interesting.

MIKE:

I wrote a Lisp interpreter – not really Lisp, but a very, very limited subset of Lisp. I wrote it in C and I wrote it so that there was also a Lisp program that printed itself out if you ran it from within itself.

ANDREW:

I'd like to see that. You should put a link in the show notes.

MIKE:

I will do that. Yes I do. Then, later I wrote a Tetris program that had no control constructs in it but that one didn't win. There's the Lisp program. It's kind of unreadable by nature.

ANDREW:

[Chuckles] Yeah.

JAIM:

Yeah. I scored pretty highly on that contest one year but those were entirely unintentional. [Chuckles]

MIKE:

There's a link to the Tetris program in case anybody is wondering. I think that one's even more unreadable. It looks a lot more structured but it's really just – I had a lot of difficulty with that one because they have a length limit on these submissions and it was really difficult to compress it down enough to get it in there. I actually tried to write a little script that would go through all the identifiers and do a frequency analysis on them so that I could properly dedicate all the one-character identifiers to where they would do the most good in shrinking the code.
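
A rough sketch of that frequency-analysis idea, written in Python for illustration. The function name `plan_renames`, the abbreviated keyword list, and the regex are all assumptions made here; the original script is not shown in the episode. The regex is crude and would also pick up words inside string literals and comments.

```python
import re
from collections import Counter
from string import ascii_letters

# Keywords cannot be renamed, so they are left out of the count.
# (A deliberately short list, for illustration only.)
C_KEYWORDS = {"int", "char", "return", "if", "else", "for", "while", "void"}


def plan_renames(source: str) -> dict[str, str]:
    """Map the most frequent multi-character identifiers to one-letter names."""
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    counts = Counter(
        name for name in identifiers
        if name not in C_KEYWORDS and len(name) > 1
    )
    # One-letter names already present in the source are not free to reuse.
    taken = {name for name in identifiers if len(name) == 1}
    free = [letter for letter in ascii_letters if letter not in taken]
    # most_common() yields identifiers from most to least frequent, so the
    # scarce one-letter names go to the identifiers that occur most often.
    return {
        name: letter
        for (name, _count), letter in zip(counts.most_common(), free)
    }


source = "int counter = 0; int total = counter + counter; return total + counter;"
print(plan_renames(source))
```

Ranking by `count * (len(name) - 1)` instead of raw count would measure the bytes saved more directly; plain frequency is used here because that is what the conversation describes.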

ANDREW:

That's fun. You have very – I guess you don't really have any that you defined that are longer than one character.

MIKE:

Right.

ANDREW:

Yeah. It's cool. We were talking before the show about what you're working on for your work right now. I thought it would make a good topic. Do you want to introduce that for us?

MIKE:

Sure. We're looking at how to enhance VoodooPad for the future. One of the things that we're really interested in is synchronization and collaborative editing. Right now, we save documents on Dropbox. We don't do that necessarily but you can. The app is built to work with that to an extent. You can save a document on Dropbox; you get it on your computer; and you can edit it here and edit it there.

The typical approach to that is to store things in a file and Dropbox syncs them. The trouble comes in when you're editing things from more than one place at once. If you have two people on two different computers, for example, or if it's just you on two computers and you happen to have one that drops offline and doesn't sync up before you edit on the other one, you get sync conflicts and it's a pain.

I've been experimenting with synchronization solutions that work cleanly with third-party providers like Dropbox and avoid all of that mess and make it all work cleanly.

ANDREW:

Okay. This whole idea of syncing documents or syncing data that can potentially be opened and being edited on two machines at once is certainly not a new problem, right? It's something that --.

MIKE:

Yeah. That's been out there forever. Apple's now doing it with iCloud. Lots of Apple apps and third-party apps are doing it.

ANDREW:

Explain to us some of the – obviously, the big issue is that you have to somehow deal with the document being edited on both machines at once. Explain to us why that can be such a tough thing to deal with.

MIKE:

Right. The easy situation to deal with is one where every time you make a change, it goes to the server and you never make a change without having the latest stuff. That's how the iCloud model works. That's how most of these apps work because it's a relatively easy way to approach it. It's like the Subversion model, if you're looking at version control for source code, applied more generally. You're working with the document on one machine, you make a change, it goes up. Then you work with that document on another machine, you get the latest version and apply your changes. Everything flows linearly that way.

Then, the problem comes if I make a change on one computer. Say I add a word to the first paragraph of this document, and then you make a change on your computer without grabbing my copy first because maybe there was a network problem or you're on an airplane in the middle of the ocean or there was just some bug that you hit in a service. Anyway, let's say you add a word to the second paragraph without having my change first. Now at some point, those two changes have to be reconciled.

As human beings, we can look at these changes and we can typically understand the intent of them and say “Well, you added a word here and he added a word there and they're not really related changes so we can just apply them both,” and end up with a copy that neither one of us actually saw when we were making the changes but that makes sense given what we did. It's just very difficult to get a computer to apply that intelligence.

ANDREW:

The thing that it brings to mind to me having never implemented this in an app is that you fundamentally have the same problem with source control or version control systems?

MIKE:

Yes. It's very much the same problem. It's just applied to other situations. In fact, something like Git makes a very nice way to control text documents in general as well.

ANDREW:

That leads to my next question which is – it sounds like you're implementing this system yourself but, before you started, did you look at existing solutions?

MIKE:

Yeah. I poked around to see what's out there. Since I'm very much in the experimental phase with this, some of the intent was to just see what I could get done and see what I could come up with as well. I looked at existing version control systems. I also looked at something called differential synchronization, which a guy named Neil Fraser came up with; I'm posting the link to it in the chat. The stuff that he wrote is mostly what I'm building this off of.

The basic idea – because this reconciliation process is something that depends on the structure of the document, you can't just take my bytes and your bytes and combine them into some third set of bytes. What if this document is actually an image? That might not make any sense. What happens is that Dropbox essentially just gives up on the whole idea. If you ever end up in this situation, it just takes your copy and my copy and saves them both and marks one of them as “This is your conflicted copy and here is what we think is the most recent copy. Figure it out.”

In order to make this work, you either need some smart centralized server that can understand the contents – not just look at everything as an unstructured bag of bytes. Or you need some way to do this that deals with everything as separate files so that something like Dropbox doesn't come in and try to reconcile things that way. Since I want everything to work with Dropbox or any other service like that, I've been taking that approach. The idea is basically to log each computer's actions into a separate file that's bundled together and then the master document is actually the combination – the logical combination – of all those files rather than being an actual single file on the disk.

JAIM:

So that's how VoodooPad stores information, it's a collection of files?

MIKE:

That's how it works right now but this is going a little bit deeper now. Currently, the way VoodooPad works is every page is a file. The way my experimental system works is every computer's edits to a page are a separate file. So if I work on a page A from one computer, that goes into one file. Then if I work on that same page A from another computer, that goes into a separate file. Now you have two files whose contents are both contributing to the final page. That reconciliation happens on the fly, essentially. That allows the individual computers to apply merging logic to what's been going on without requiring the syncing service like Dropbox to get involved. All it has to do is to make sure that those files exist everywhere.
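
A minimal sketch of the one-file-per-computer layout described above, in Python. The on-disk format (one JSON object per line), the `.log` file naming, and the `counter` field used to order edits across devices are all assumptions for illustration; the episode does not describe VoodooPad's actual format or ordering rule.

```python
import json
from pathlib import Path


def append_edit(page_dir: Path, device: str, edit: dict) -> None:
    """Append one edit to this device's own log.

    Each device only ever writes to its own file, so a file-syncing service
    never sees two machines modifying the same file.
    """
    page_dir.mkdir(parents=True, exist_ok=True)
    with open(page_dir / f"{device}.log", "a", encoding="utf-8") as log:
        log.write(json.dumps(edit) + "\n")


def load_all_edits(page_dir: Path) -> list[dict]:
    """Combine every device's log into one ordered list of edits."""
    edits = []
    for log_path in sorted(page_dir.glob("*.log")):
        device = log_path.stem
        for line in log_path.read_text(encoding="utf-8").splitlines():
            edit = json.loads(line)
            edit["device"] = device
            edits.append(edit)
    # Assumed ordering rule: a per-edit counter, device name as tie-breaker.
    return sorted(edits, key=lambda e: (e["counter"], e["device"]))
```

The page itself is then whatever results from applying the combined list, so the "master document" exists only logically, as the transcript puts it.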

ANDREW:

Interesting. At some point, do you flatten those down? At some point, it seems like those files that describe edits would build up and build up and you've got lots and lots of files lying around.

MIKE:

Exactly. That's one of the things I'm working on figuring out good solutions for. Right now, what happens is as you work, every so often it takes a snapshot because, right now, what those files contain is not actually the page but just the edits that you applied to that page. So if you type out, for example, “hello world”, it'll record that you typed the letter 'h' and it records that you typed the letter 'e'. The idea is that you have this sequence of events that you can play back and you can play them back in different orders and things like that. That's how you can reconcile conflicting edits afterwards. It gets expensive to play that back every time, so every so often it takes a snapshot that says, “This is the current state as I see it, as I understand it,” and you can start from there.

Eventually, it'll still build up. You would want to maybe throw away old data. Part of the idea is that this allows you to essentially keep history as well so that you can go back in time in case you delete data that you wanted to keep or something like that. Keeping all this stuff around forever might actually be a feature rather than a bug but, on the other hand, people may want to get rid of the old stuff eventually as it clutters things up. So I'm looking at options for maybe making it optional; maybe cleaning old stuff out after a while. It's hard to figure out exactly what people would need just yet.
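
To make the "sequence of events you can play back" concrete, here is a small Python sketch. The `Edit` record and its fields are hypothetical names chosen for this example, not the format used by the experimental system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    kind: str        # "insert" or "delete"
    position: int    # character offset the edit applies to
    text: str = ""   # inserted text; unused for deletes


def apply_edit(document: str, edit: Edit) -> str:
    """Return a new document with one edit applied."""
    if edit.kind == "insert":
        return document[:edit.position] + edit.text + document[edit.position:]
    if edit.kind == "delete":
        # Removes the single character at the given position.
        return document[:edit.position] + document[edit.position + 1:]
    raise ValueError(f"unknown edit kind: {edit.kind!r}")


def replay(edits: list[Edit], start: str = "") -> str:
    """Rebuild a document by playing back edits in order."""
    document = start
    for edit in edits:
        document = apply_edit(document, edit)
    return document


# Typing "hello" one keystroke at a time, as in the transcript's example.
typed = [Edit("insert", i, ch) for i, ch in enumerate("hello")]
print(replay(typed))
```

Replaying from an empty document touches every edit ever made, which is the cost the snapshots are meant to avoid.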

Those are some of my ideas.

JAIM:

As you do snapshots, do you build off the most recent snapshots? Or do you build from the beginning of time when you're building these things together?

MIKE:

The way I have it right now is when the app starts up, it finds the most recent snapshot and then applies any edits that come after that to come up with the current state of the document. Then, as you work, it creates new snapshots at intervals; saves them back to a file. Then those become visible to anybody else that comes along that opens the document.
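
The startup step described above might look something like this in Python. The `Snapshot` and `Insert` types and the use of a sequence number to decide which edits come "after" a snapshot are assumptions made for the sketch.

```python
from typing import NamedTuple


class Snapshot(NamedTuple):
    seq: int     # sequence number of the last edit baked into this snapshot
    text: str    # full page text at that point


class Insert(NamedTuple):
    seq: int       # sequence number of this edit (assumed to start at 1)
    position: int  # character offset to insert at
    text: str      # text to insert


def load_document(snapshots: list[Snapshot], edits: list[Insert]) -> str:
    """Start from the newest snapshot and apply only the edits after it."""
    latest = max(snapshots, key=lambda s: s.seq, default=Snapshot(0, ""))
    text = latest.text
    for edit in sorted(e for e in edits if e.seq > latest.seq):
        text = text[:edit.position] + edit.text + text[edit.position:]
    return text


edits = [Insert(i + 1, i, ch) for i, ch in enumerate("hello")]
snapshots = [Snapshot(2, "he"), Snapshot(4, "hell")]
print(load_document(snapshots, edits))
```

Because only edits with a higher sequence number than the newest snapshot are read, everything older could be pruned without changing the result.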

JAIM:

Okay. So your last snapshot's the source of truth and you make changes to that. Those snapshots are created then those become the source of truth that you build off of.

MIKE:

Right, exactly.

JAIM:

Which avoids things like rebuilding everything that's been done since you started this document which could be huge and very expensive.

MIKE:

Right. The way I have it right now, it's still saving all of that history so if you wanted to go back and see how things were, then you could conceivably do that – I haven't built the code yet – but all that data is still there. If you wanted an on-going history of how this page used to look, it's all in there. If you didn't – because the snapshots are used – you could ditch everything before the latest snapshot and still have everything you needed to construct the most recent state.

JAIM:

What are some common approaches for document sync? You talked about your approach where you're creating different files for each change. Are there some other ways that people are doing this?

MIKE:

Well, the most common approach, like I said, is to basically give up on the idea. [Chuckles] Essentially, if you edit based off of something that's not the latest, then it just gives up and says “Well, now you have two copies. Figure it out.” That's how I've seen iCloud do it. If you open an iCloud document in Pages or Keynote or something like that; you make a change while you're offline and make a change on another computer at the same time. Then once everything gets back online, you'll end up and it'll say “Hey, which copy do you want to keep?” This is really easy to implement. It's obviously not the nicest for the user. We are typically connected a lot of the time these days so it doesn't come up that often but it is a rough point.

Another approach is to just keep everything online. For example, I think Google Docs does this. You can collaboratively edit a single Google document from multiple computers and all your edits show up live and things like that. Basically, the way that works is that you're just online and that makes sure everybody's always up to date.

What's different with our approach is we really wanted not to have our own service. With something like Google Docs, they have their own service, with servers running fancy code that manages all this stuff. Apple is running iCloud. They have probably 10 million servers running all this code. What we'd really like to do is not require people to sign up for, say, our own stuff but allow people to keep using whatever services they're currently using – whether it's Dropbox or Google Drive or Microsoft SkyDrive, whatever – and build our solution so that it still works well on other people's stuff.

ANDREW:

A lot of what you've talked about so far is something you can do in VoodooPad because VoodooPad is essentially text files, right?

MIKE:

Right.

ANDREW:

It seems some of the stuff breaks down if you're, instead, editing images or editing audio files or something like that where it's not so easy.

MIKE:

Right. That's a really interesting question. For something like images, it's hard to see how this could work because an image is fundamentally a non-linear thing, right? It's a two-dimensional thing and you could apply your edits and I could apply my edits and they're typically global edits, right? If you're applying a filter or something like that to the entire image, it's hard to see how you could allow that to work.

It's interesting to think about it in the context of audio editing because audio editing is a lot more like text editing. A lot of times you’re slicing things out; maybe re-arranging things; inserting effects. I could see an audio editor potentially using an approach like this. Say, you kept a list of edits and tried to apply them intelligently.

JAIM:

Once that's gone out to a WAV file, that's not going to work. But, definitely, if you're keeping track of effects and what the effect is, that type of thing can be replayed.

MIKE:

Right. I could see editing a podcast, for example. I imagine a lot of it is chopping out bits that sound stupid or chopping out long bits of silence, things like that. You could definitely have something like that where you're just recording parts that you don't want anymore. That could all get applied to the document after the fact. Reconciled on the fly like that.

ANDREW:

That's interesting.

JAIM:

You have iPhreaks Remix.

ANDREW:

Yeah.

JAIM:

I like it.

ANDREW:

In any case, it sounds like for this solution to work, whatever program is doing the merge or the synchronization – whatever you want to call it – has to know something about the format in question.

MIKE:

Yes, I think so. In the case of text, my current approach is to essentially record some of the context where you make the changes. If you're adding a word in the middle of a paragraph, it'll record the before and after. That way, when it goes to reapply that edit event to a slightly changed document, because maybe it already applied your edits, it can take that context and try to figure out where the edit fits best now. To find that fit, you'd definitely have to know about the structure of the documents.
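
A minimal sketch of recording "the before and after" and using it to re-place an edit, in Python. The names (`ContextInsert`, `record_insert`, `locate`, `apply_insert`), the fixed context width of 8 characters, and the exact-substring matching are assumptions for illustration; a real implementation would presumably use fuzzier matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextInsert:
    before: str  # text just before the insertion point when the edit was made
    text: str    # the inserted text
    after: str   # text just after the insertion point when the edit was made


def record_insert(document: str, position: int, text: str, width: int = 8) -> ContextInsert:
    """Capture an insert along with a little surrounding context."""
    start = max(0, position - width)
    return ContextInsert(document[start:position], text, document[position:position + width])


def locate(document: str, edit: ContextInsert) -> int:
    """Find where the edit fits best in a possibly changed document."""
    # Best case: both sides of the context are still adjacent.
    index = document.find(edit.before + edit.after)
    if index != -1:
        return index + len(edit.before)
    # Otherwise fall back to whichever side can still be found.
    if edit.after and (index := document.find(edit.after)) != -1:
        return index
    if edit.before and (index := document.find(edit.before)) != -1:
        return index + len(edit.before)
    return len(document)  # no context left at all: append at the end


def apply_insert(document: str, edit: ContextInsert) -> str:
    position = locate(document, edit)
    return document[:position] + edit.text + document[position:]


base = "The quick fox. It ran home."
mine = record_insert(base, 10, "brown ")     # before "fox"
yours = record_insert(base, 22, "quickly ")  # before "home"
print(apply_insert(apply_insert(base, mine), yours))
print(apply_insert(apply_insert(base, yours), mine))
```

Both orders produce the same text here because the two edits touch unrelated parts of the page, which is the easy case the approach is meant to handle without user involvement.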

With audio data, you could apply the same context idea but you would need to understand that your audio samples are two bytes wide or four bytes wide or whatever they are, how to actually get at that stuff. The cool thing is that only the client has to know about it – not the server.

ANDREW:

Yeah. That's interesting. The server is now just a regular dumb file server.

MIKE:

Right. As long as the server can make files appear on multiple computers as they're changed, then we're good.

ANDREW:

Well, there's one scenario that I don't think we've mentioned yet that I'm curious about. That is, say, you've got a document with two paragraphs and one user edits – completely changes the first sentence of that paragraph to say something new or changes a few words in that sentence. Then, the other user changes the exact same words to something else. So you've got a conflicting change. How do you deal with that?

MIKE:

Right. That's a really interesting question. I think, from a theoretical point of view, basically there's no right answer. Somebody has to win essentially. Somebody's edits have to go away. The way my code currently handles that is it will apply all of the edits and it will just try to figure out the best place to put them. In the scenario as you gave it, what will likely happen is you'll end up with both sentences, I think – it'd be an interesting bit of a torture test to see what it comes up with right now.

Essentially, what it's going to do is, for every edit it goes through, it takes that context and tries to figure out where in the current document best matches the context of the edit as it came in. If you're typing something, say, you deleted that first sentence and you're typing something new, then that context is going to say, for example, “This came right before the second sentence.” So as you're applying those, you apply that first edit. Maybe that applies cleanly because you're the first one there and then you're applying the second edit which happened simultaneously with the first one but now you've got the extra content there. The context that it'll figure out will probably be like before or after the new content. So it'll say “Well, this seems to fit best because here's a context match, right? We recorded that it was before the second sentence and now we'll just put it in before the second sentence again.”

You definitely need some human oversight. The idea is not to be perfect but to apply things such that things – you want it to succeed where it can and at least do something not catastrophic where it can't.

I think in the long run if we run into something like that, we'll probably either – if it conflicts too much, I imagine we might ditch a copy; might save it and present both, something like that. The idea is to make sure that it only happens if you're doing something that really truly conflicts, like you both edit the same sentence – not where you edited a sentence over here and I edited a sentence over there, and it's a conflict just because they happen to be on the same page.

ANDREW:

It seems like, in this case where you do have true conflicts, this whole problem moves out of the realm of an interesting sync problem into a plain old app design, UI problem.

MIKE:

I think there's always going to be a limit where the computer can only help you so much. In the end, imagine if we're collaborating on a document. I decide this sentence is terrible and it really should say this. And you might have a completely different idea about what it should say. Maybe we get into an argument about that. [Chuckles] The computer can't really solve that for you. At some point, people have to work things out.

JAIM:

That's when we get into the area of merge strategy.

MIKE:

Right.

JAIM:

And how much context you can provide about the application and what the users are trying to do can help you with that. What are the common approaches? I'm sure it's something that people have been thinking about for years. When talking about merging, how do people do this?

MIKE:

Yeah. I'm not as familiar with this as I probably should be. It basically boils down to trying to find a way to replay history so that everything is consistent. You might have multiple edits that don't depend on each other. The idea is you can apply them in different orders. You might search for an order that makes sense or you might just bail out and tell the user, “Here's a conflict.” You can try to use your history to a varying degree to figure out something that makes sense.

There are ideas about using semantic information especially if you're doing code with version control. If you can apply more information about what the code actually means instead of just the textual representation, you can do a lot better. For example, you might have – most version control systems out there work on a line basis so if I edited the line and you edited the line, that'll be a conflict. But if you have semantic information about it, you might be able to tell that, for example, I edited the variable name whereas you edited the variable's type. Those two could be applied simultaneously without having any trouble.
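
The variable-name versus variable-type example can be sketched as a three-way merge in Python. This is a toy under stated assumptions: `merge_lines` treats each line as an atomic unit and assumes no lines were added or removed, and `merge_declaration` assumes the declaration has already been parsed into `type` and `name` fields. Neither is how any particular version control system is implemented.

```python
class MergeConflict(Exception):
    """Both sides changed the same unit in different ways."""


def merge_value(base, mine, yours):
    """Three-way merge of a single unit (a line, a field, ...)."""
    if mine == yours:
        return mine       # same change on both sides, or no change at all
    if mine == base:
        return yours      # only the other side changed it
    if yours == base:
        return mine       # only this side changed it
    raise MergeConflict(f"{mine!r} vs {yours!r}")


def merge_lines(base: list[str], mine: list[str], yours: list[str]) -> list[str]:
    """Line-based merge: the whole line is the unit of comparison."""
    return [merge_value(b, m, y) for b, m, y in zip(base, mine, yours)]


def merge_declaration(base: dict, mine: dict, yours: dict) -> dict:
    """Semantic merge: each field of the declaration is merged on its own."""
    return {key: merge_value(base[key], mine[key], yours[key]) for key in base}


# I rename the variable; you change its type.
try:
    merge_lines(["int count;"], ["int total;"], ["long count;"])
except MergeConflict as conflict:
    print("line-based merge conflicts:", conflict)

print(merge_declaration(
    {"type": "int", "name": "count"},
    {"type": "int", "name": "total"},
    {"type": "long", "name": "count"},
))
```

The same two edits conflict when the line is the unit and merge cleanly when the fields are, which is the gain from knowing what the text means.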

JAIM:

That's just built on the specifics of your app or your document type.

MIKE:

Right. For something like that, you need to know what it all means. It's interesting to think about that as it applies to a more general app like VoodooPad or a text editor, something like that where the language in use is not a formal language but it's something like English. It's interesting to ponder the possibilities. What if you gave the computer some understanding of the language? Could you use that to improve merge strategies? I don't know what the answer is but it's interesting.

JAIM:

I think you'd have to pick a logical language.

MIKE:

Right. That would certainly help. Maybe we could go back to Lisp and we can do everything with S-expressions instead of English. Then this whole problem becomes really easy.

JAIM:

If we could only speak that way, then we'll be all set. [Chuckles]

MIKE:

I'm not going to try. I'm tempted but I'm not going to try.

ANDREW:

[Chuckles] Yeah. It just reminded me about Star Trek: The Next Generation episode where they meet some aliens that speak in binary. I don't think they ever explain anything beyond “They speak in binary.”

MIKE:

Yeah.

ANDREW:

Who knows what that means.

MIKE:

Yeah. That's one of those things that almost sounds like it makes sense until you think about it.

ANDREW:

[Chuckles] Right. We've, during this whole discussion, been talking about a scenario where multiple users are editing. You worry about two users that are both editing a document on two different machines. For VoodooPad, in particular, is that a scenario that you actually see being the common one? Or is it just that the user has a desktop and a laptop?

MIKE:

It's more typically the one user with two computers case. I find it easier to think about the case where it's multiple people working on a single document. But when you really get down to it, they're more or less equivalent because I ran into sync conflicts with my documents that I've never shared with anybody. The way it works is I made an edit on my desktop computer and then I had to go somewhere. I grab my laptop. I wander off. I pop open my laptop and I make a change there not realizing that it never connected to the network. I'm still offline and I'm working off an old copy. Now I have a sync conflict that I have to deal with.

From the computer's perspective, that's really no different from the scenario where you actually have two live humans working on the same thing. It's easier to not try to think about computers and humans as being different entities here. It's all just a device – whatever it is – editing the document. It doesn't matter whose behalf it's on once you get down to that level. If you can solve the problem of multiple people all working on the same document, then you also solve the problem of one person working on the same document from different computers with network connectivity problems or whatever might be going on.

ANDREW:

It seems like if we just always had a 100% reliable, zero-latency internet connection, all of this would get a lot easier.

MIKE:

That really would solve a lot. I'm definitely all for that solution for sure. I think that might be a little harder though.

ANDREW:

Yeah [chuckles]. Unfortunately, I don't think that's in the near future.

JAIM:

Write your congressman.

MIKE:

Yes, there you go.

ANDREW:

Oh, I'm sure they'll be the ones to fix it.

MIKE:

I do see a lot of apps seem to assume that though. It's a little shocking and annoying. There are so many apps that if you launch them and you don't have a good network connection, they'll just lock up because they're trying to contact the mothership before they actually do anything.

ANDREW:

Yeah. That's actually what I was getting at which is that for the average app developer that might be listening to this, it's easy to forget that there's a difference between you in your house with a 100 Mbit connection and really reliable Wi-Fi, and the actual users of your app who're going to be out on a poor cellular connection or often on an iPad with no Wi-Fi connection at all or whatever.

JAIM:

With an actual device because we're testing in a simulator, right?

ANDREW:

Yeah. I actually don't – the apps I work on, we can't test in a simulator because they use audio APIs or the iPod library APIs.

MIKE:

I definitely think that's a key point because it really is something that a lot of apps miss. I think my favorite example of this is: AT&T has an app called Mark the Spot. It's an app that you can use to report network problems, basically. If you're in an area where you think you should have good data coverage or something like that but you don't, you can use this app to report it. It's hilarious because it calls home as a blocking thing. The entire UI will lock up when you launch it if you don't have a good connection.

[Chuckles]

ANDREW:

And the whole point is to report a bad connection.

MIKE:

The whole point is you only use this in places where you have a bad connection. Yes. It usually does get through eventually but it's very painful to use the app because you'll start it and you have to sit there and look at it. And it's got all these buttons like “Select what you want to do.” and they won't work.

ANDREW:

I've used that app before [chuckles] and basically all I do is just mark my own house because the reception is so bad here. We actually just finally broke down and bought a microcell from AT&T and it was $75 which I feel a little ripped off having to pay for their poor network coverage. You pay even more.

MIKE:

Yeah, I always found that a little weird.

ANDREW:

It has worked very nicely.

MIKE:

Well, that's definitely a bonus. I'm lucky to have good coverage here but there are a few areas I go to out and about where less often it stops working. I report it slowly.

ANDREW:

Yeah.

MIKE:

I see this in a lot of different apps, even Siri, for example. If you trigger the thing and you start talking to it and you let go, and if you have a bad network connection, it'll spin for a while and it'll say “I'm sorry. I couldn't do that.” It completely forgets what you told it. There's no retry button or anything. What would be ideal is if it would automatically retry, for example; or maybe you could have a button that says “Hey, I think I might have a better connection. I'll try it again”. But instead, you have to remember what you just said and repeat it. It's a really weird failure mode. The thing is definitely capable of remembering what you said until it actually can communicate back to Apple, but it just doesn't. They seem to have assumed that connections are much more reliable than they really are.

ANDREW:

I think this is advice for the people listening. Don't do this in your own app. Think about the scenario where somebody does not have a good network connection or doesn't have one at all. The apps I work on pretty much require a persistent internet connection, but we spend a lot of time and go out of our way to think about how they should behave when there's not a network connection.

MIKE:

Right. Apple has a Network Link Conditioner which can be really useful for this. It basically lets you simulate a bad network without actually having to drive out in the middle of nowhere or whatever it takes. I guess go to your house. Another fun thing to do is – you typically will have a bottleneck for network activity in your app somewhere. Every HTTP request, for example, might go through one chunk of code. A lot of apps are designed that way. What can be really interesting and entertaining and useful is to just put a little bit of something in that code where every time it makes a call out, it randomly adds a delay between 1 and 10 seconds, or 20% of the time it returns an error.

Just put that in there and put some comments around it to make sure you don't accidentally commit it. At least keep it turned off. Run the app that way and use it and see how it behaves when everything is failing inconsistently. Obviously, if your app depends on the network, it doesn't have to be useable but it should, at least, fail gracefully.
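
As a rough illustration of the trick Mike describes, here is a minimal sketch in Python (used only as a language-neutral notation; the panel's own apps would be Objective-C or Swift). It assumes the app already funnels every request through one function. The names `make_flaky` and `SimulatedNetworkError` are hypothetical, and the 1 to 10 second delay and 20% failure rate are the figures from the conversation.

```python
import random
import time


class SimulatedNetworkError(Exception):
    """Raised by the wrapper in place of a real network failure."""


def make_flaky(send, *, min_delay=1.0, max_delay=10.0, failure_rate=0.2,
               rng=None, sleep=time.sleep):
    """Wrap the app's single request function with random delays and failures.

    DEBUG ONLY: keep this switched off in anything you commit or ship.
    """
    rng = rng or random.Random()

    def flaky_send(*args, **kwargs):
        # Stall for a random interval, as a slow network would.
        sleep(rng.uniform(min_delay, max_delay))
        # Fail a fixed fraction of calls without touching the network.
        if rng.random() < failure_rate:
            raise SimulatedNetworkError("injected failure")
        return send(*args, **kwargs)

    return flaky_send
```

In a debug build you would replace the real function once, for example `send_request = make_flaky(send_request)`, then use the app normally and watch how it copes.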

ANDREW:

Yeah. I think that's the key. For some apps, a network connection is just absolutely vital to whatever it is they do.

MIKE:

Right.

ANDREW:

It's okay to throw up a screen that says “You don't have a network connection. Come back when you do.”

MIKE:

But don't lock up the UI or something like that.

ANDREW:

Right.

MIKE:

And definitely try to handle it somehow.

ANDREW:

Don't lose user data that they've already spent the time to input or whatever.

MIKE:

Yeah. That's really the worst. There are so many apps out there where you type in something and then you hit send and then it hits an error. Then you go back and it's all gone. Then you hate the world.

JAIM:

Rage quit.

MIKE:

Yes. Rage tweet except it's usually the Twitter clients doing this.

ANDREW:

[Chuckles] Yeah. I have seen several of them do that. Also, if you use the Network Link Conditioner, which we should remember to link in the show notes, don't forget that you have it active.

MIKE:

Yeah.

JAIM:

You figure that out pretty quick, though. You go to Google and nothing shows up.

MIKE:

I think it depends on how badly you've turned it down. It can be advantageous to test on a moderately bad connection, right? What if you set it to a 512 Kbps DSL connection with 5% packet loss? Then you're just going to be angry all day and maybe not realize that it's broken.

JAIM:

Okay. Do you find much use out of those? I've used the Network Link Conditioner for the bad connections and I don't find that they really weed out any bugs. I usually have it either offline or full on. I don't find that I get much use out of doing the bad connections. It seems this stuff just works.

MIKE:

I prefer to do my own. Like I said, if there's a bottleneck, put in some code to artificially delay things. Where the bugs really hit, in my experience, are especially loading images and things like that where they come in out of order. If you end up with a large delay between requests and it's an inconsistent delay so that request 2 comes back first and then request 1 comes back 10 seconds later that can reveal caching bugs and things like that.

It's really common for things like table view cells to be implemented so that when the table view cell is first displayed, it triggers a download. Then, once the download is complete, it puts the image in there. There are a lot of apps out there, a lot of code out there, where you might do that. And the table view cell goes out of view because the user scrolled it away, and then it gets reused for something else. Then the image download from the first time finally completes, and you get the wrong image in there.

For the Link Conditioner, it's harder to trigger things like that. It's definitely a realistic scenario but it's rare. The Link Conditioner doesn't necessarily trigger it very frequently so I like to do my own delay code for that. I think you can use them both; try it out; see what pops up. It's a good exercise. At the very least, even if you don't encounter any bugs, you can at least see what it's like to use your app on a bad connection and maybe get some ideas for improvements that way even if there are no bugs.
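
The cell-reuse bug Mike describes, and one common guard against it, can be sketched as follows. This is Python used as neutral notation, not UIKit code: `Cell`, `configure` and the `start_download` callback are hypothetical stand-ins. The idea is that the cell remembers which URL it currently represents and ignores any completion that arrives for an older one.

```python
class Cell:
    """Stand-in for a reusable table view cell (hypothetical, not UIKit)."""

    def __init__(self):
        self.image = None
        self.current_url = None  # what this cell is supposed to show right now

    def configure(self, url, start_download):
        """Called each time the cell is (re)used for a row."""
        self.current_url = url
        self.image = None  # clear whatever the previous row left behind

        def on_complete(image):
            # A late response for an earlier row must not overwrite this one.
            if self.current_url != url:
                return
            self.image = image

        start_download(url, on_complete)
```

Pairing a check like this with a randomized delay on downloads makes the out-of-order case easy to reproduce on demand.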

JAIM:

It'd be an interesting open source project to start messing with your requests and just randomly dump one. You wouldn't need to do it in your code. You'd have a framework that you just wire up that causes havoc.

MIKE:

Yes. That would be a fun little project. It's a good idea. Then everybody can have the benefit of trouble.

[Chuckles]

I have to make sure to put in some code that warns you if you try to ship it or something.

JAIM:

For app developers who are trying to implement some basic sanity required in being offline, how do you detect that you're offline? Or how do you know if you're online or on WIFI?

MIKE:

Ideally, don't try to detect it but just try to make your requests and then gracefully handle a failure. If you're talking to a server, then just try to talk to the server. It'll work or it won't work. If it doesn't work, you might be online but the server is down, or you're online but there's something in the way. The server's been blocked because it got taken over by a botnet or something like that.

The best way to find out if something will work is to try it. If it will work, then it does work. If it won't, it doesn't. If you really need to find out ahead of time, you can use Apple's connect – sorry – Reachability API. This will give you the basic 'go', 'no go' as to “Is there actually a network connection available at all? Does that network connection maybe look like it leads to the internet somehow?”

ANDREW:

I think Reachability is especially useful for telling when the device is definitely not online because it's pretty easy to tell if there is just no WIFI connection or no cellular service. But I've seen people make the mistake of thinking that because Reachability says the device is online, that everything is fine when it could be that the server they're trying to talk to is down.

JAIM:

It tricked you.

MIKE:

Right. Or you might just get a network which can't connect to the internet for some reason. There's a ton of different failure modes but it's great to have – to specialize your error messages that way because if you're not on a network at all, you don't want to say “Couldn't connect to the server.” You'd like to be able to say “Hey, get on some WIFI.” or something.
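
A small sketch of that kind of message specialization, in Python as neutral notation. The function name, the wording of the messages, and the `has_network` flag (which would come from something like a reachability check) are all assumptions for illustration.

```python
def describe_failure(has_network, status_code=None):
    """Pick a user-facing message for a failed request."""
    if not has_network:
        # No local network at all: the server is not the problem.
        return "You're offline. Get on some Wi-Fi or cellular and try again."
    if status_code is not None and status_code >= 500:
        # We reached the server and it reported an error of its own.
        return "The server is having trouble. Try again in a bit."
    # On a network, but the request never got a usable answer.
    return "Couldn't connect to the server."
```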

JAIM:

Yeah. Reachability makes itself a little bit confusing that way because you can initialize it with a URL saying “Here's my server. This should be up if reachability is up.” That's not how it works.

MIKE:

Yeah. It's very much a local thing and it doesn't always make that clear. But it's still really useful for sure.

JAIM:

Regarding the other tests you were talking about, just making requests, throwing it out there and seeing if it fails. If the server's down, hopefully, you're getting a 500, something like that. But if the server is slow and you're getting a slow response, how do you configure your request so you can do that reasonably? You configure “Okay, something's not happening right.” We don't have the connection we need without sitting there for – what's the default timeout? It’s minutes.

MIKE:

Right. The first thing to do is make sure that all of your requests are asynchronous. You don't want to be blocking anything on that. Your UI should remain responsive. You want to make sure that the user can cancel whatever it is that you are doing if they get fed up. You can't tell whether the user is willing to wait for 10 seconds or 10 minutes. If you can let them say “This sucks. I'm going to come back and try again later.”, then that'll help them a lot to be able to say that.

Then, as far as your own code goes, if you can figure out some sort of reasonable timeout – you don't want to make it too short. If people are on a bad connection that they're willing to put up with, you don’t want to say “Well, I'll only wait for 10 seconds.” because maybe they wanted to wait more. If they're willing to put up with their bad connection, they will get upset with you for not putting up with it.
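
One way to picture "asynchronous, cancellable, with a generous timeout" is the sketch below, written with Python's `asyncio` as neutral notation. The function name `fetch_with_user_control` and the result tuples are hypothetical; `request` is assumed to be a function returning a coroutine, and `cancel_event` is assumed to be set when the user taps a cancel button.

```python
import asyncio


async def fetch_with_user_control(request, cancel_event, timeout=None):
    """Run `request()` until it finishes, the user cancels, or `timeout` elapses.

    `timeout=None` means wait as long as the user is willing to.
    """
    task = asyncio.ensure_future(request())
    cancel = asyncio.ensure_future(cancel_event.wait())
    done, _ = await asyncio.wait({task, cancel}, timeout=timeout,
                                 return_when=asyncio.FIRST_COMPLETED)
    if task in done:
        cancel.cancel()
        return ("ok", task.result())
    # The request did not finish: stop it and report why.
    task.cancel()
    cancel.cancel()
    return ("cancelled", None) if cancel in done else ("timed_out", None)
```

Nothing here blocks the event loop, so the interface stays responsive while the request is outstanding, and the caller decides what to show for each outcome.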

JAIM:

Yeah. A project manager says “60 seconds is too bad.” But the user may know they're on a bad connection and say “I need this information. I'll wait.”

MIKE:

Right.

JAIM:

It's good to give the power – keep the power in the user's hand. It’s a good pattern.

MIKE:

Yeah. I've seen internet connections out there, like weird cellular connections, where the ping time is literally two minutes. Everything still works but it literally takes two or three minutes to get anything done. In a case like that, if you send off a command, it would be great if it could just wait. Really, I think giving the user power to decide that on their own is really a good way to go. If your connection times out, if you're using the default system timeouts, then you'll get an error back that way and you definitely need to handle it. But just being asynchronous and giving the user control is really helpful.

I also wanted to mention the Reachability stuff – getting back to that. Another thing it's really good for is finding out when to retry a request, because not only can Reachability tell you “Yes, you appear to be on a network.” or “No, you're not.” but it can also call you back when that changes. If the user initiates the request, for example, and it can't connect and you realize you're not even on a network, then you can have Reachability tell you “Hey, you're actually online now. Maybe you could try that request at this point.” without having to force the user to do it on their own.

It's another nice thing to improve the experience.
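
The retry-on-reconnect idea can be sketched like this, in Python as neutral notation. It does not use Apple's Reachability API: `RetryOnReconnect`, `is_online` and `connectivity_changed` are hypothetical names, and the last of these is assumed to be called by whatever mechanism notifies the app of network changes.

```python
class RetryOnReconnect:
    """Holds requests that failed while offline and replays them on reconnect."""

    def __init__(self):
        self._pending = []

    def submit(self, request, is_online):
        """Try `request` now; if it fails while offline, park it for later."""
        try:
            return request()
        except OSError:
            if is_online():
                raise  # online but failing: a different problem, surface it
            self._pending.append(request)
            return None

    def connectivity_changed(self, online):
        """Hook this up to whatever notifies you of network changes."""
        if not online:
            return
        queued, self._pending = self._pending, []
        for request in queued:
            try:
                request()
            except OSError:
                self._pending.append(request)  # still failing; keep it queued
```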

ANDREW:

One of the apps we work on, if the network connection – you queue up a bunch of stuff to be done and it potentially, depending on how many – It’s analyzing songs but there's a web component of that and depending on how many you've queued up, it could take hours. If your network connection drops right in the middle, it pauses that. But if we can tell that the network connection has come back online, we resume that. Because a lot of times, users walk away – this is a Mac app, of course, not an iOS app – they walk away. They're not even near their computer to know that anything's gone wrong. It's a bummer to come back and find out that the whole thing stopped 10 minutes after you walked away and never started itself up again.

MIKE:

Yeah. That thing is really good. Think about how your users are going to use it and being able to recover when things go wrong can be really nice. Because if the connection drops for 10 seconds, and then that just kills the whole process, that's no good.

Reachability can really help like that. Even if you ignore that, just retrying failed requests can be a really helpful thing – if you just retry things automatically that can make the user experience a lot nicer.

JAIM:

How do you determine how often to retry?

MIKE:

[Chuckles] That is a really good question. I pretty much just wing it. My usual approach is to go for an exponential backoff thing. If it failed, retry within a second. If that fails, try again 10 seconds later. If that fails, maybe try a minute later. And if that fails, you're out of luck at this point. Depending on what it is. If it's an interactive thing, then you don't want to take too long. If it's something where – like with the song analysis stuff; if it might be taking a long time, then you definitely want to keep retrying.

But the idea with that is, if the connection's been down for five minutes or whatever the problem is – the server's unreachable for five minutes – then there's no point in trying once a second because you've already been dead for five minutes. If you're trying once a minute at that point, you're not wasting too much time relative to how long you've been waiting. If you back off a little bit like that, you can pick things up reasonably quickly without hammering everything too much.
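
A minimal sketch of that backoff schedule, in Python as neutral notation. The function name is hypothetical; the default delays of 1, 10 and 60 seconds are the ones Mike mentions, and treating `OSError` as "the network failed" is an assumption of this example.

```python
import time


def retry_with_backoff(operation, delays=(1, 10, 60), sleep=time.sleep):
    """Call `operation`; after each failure wait the next delay and try again.

    Makes at most len(delays) + 1 attempts and re-raises the last error.
    """
    for delay in delays:
        try:
            return operation()
        except OSError:
            sleep(delay)
    return operation()  # final attempt: let any error propagate
```

For a long-running background job you would pass a longer list of delays; for an interactive request a shorter one, so the user is not left waiting.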

ANDREW:

I've noticed the Slack app handles this fairly well. If the Slack Mac app can't connect, it puts a little banner that says “Unable to connect. Retrying.” in some number of seconds. It counts down and it does this exponential backoff. The first time, I think, it retries five seconds later. But then they also add a button that says “Retry now.”

MIKE:

Yeah. Again, allowing the user control over that stuff, because they may know what's going on better than you do, especially if you're backing off towards a minute or two between requests. Maybe the user is like “Oh, well, you can't connect because my cable modem crashed and I just rebooted it.” so now you can connect. If you don't make them wait, you may make them happier.

JAIM:

Yeah. That's a pretty solid pattern. Gmail does the same thing.

MIKE:

Right.

JAIM:

You see the power to try it again if you want.

MIKE:

With a web page, there's always the refresh button. For apps, you have to do that yourself.

ANDREW:

Is there anything else about syncing or network connectivity planning that we should talk about?

JAIM:

I wanted to talk about – going back to the document syncing – how does Dropbox make things more difficult?

MIKE:

The main problem that I have with Dropbox – and this is not really their fault exactly. It's just because they try to work with everything – is just that they make no attempt at merging. If you edit the same document from two places at once – say one of them is offline – then you just end up with two copies and it says “One of them is conflicted” and that's it.

Rather than try to deal with these conflicted copies, that's why I decided to basically go with every computer gets their own file. Instead of trying to work with Dropbox, its conflict resolution – maybe that's different from Google Drive or Microsoft stuff, just try to bypass that.

Really, what's difficult about Dropbox isn't so much what they do. It's just the fact that they are not really a synchronization service exactly. They're more of a – not really sure how you would describe it – essentially, they're just trying to replicate files. From the level where they work, they can't really do anything fancier. The fact that they are replicating files means that you have to work at the file level, essentially. That restricts how you can approach things.

ANDREW:

Are there any differences? You talked about wanting to support other things like Dropbox; like Microsoft's – is it called OneDrive? I guess, probably, iCloud Drive as well. Are there any different considerations for those or do you treat them pretty much the same way?

MIKE:

My goal is to just have the whole thing not care what you're using and just have it all work. I think – because they all ultimately do the same basic thing. What they do is they take a file and then make sure it goes everywhere. The only place where they could really differ is what happens in the event of a conflict. I'm not sure how these other services deal with conflicts. I imagine it's probably similar to Dropbox in that they'll just give you both copies and let you work it out. But my idea is that if I can just avoid ever creating a conflict at the level that they can see, then all those differences should disappear. You'll just see a service that makes sure that the same data exists everywhere.

Then, because everybody, every computer's dealing with a different file, you'll never conflict at the file level. Then, at the document level which, in this case, is conceptually a collection of files, then you can deal with conflicts more intelligently in our own software which understands what's going on more. My hope is that if I do it right, I don't have to care what the differences are because they won't apply.
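
The one-file-per-computer layout can be sketched as below, in Python as neutral notation. This is not VoodooPad's actual format: the JSON layout, the function names, and especially the "newest entry wins" merge rule are assumptions chosen to keep the example short. The point is only that each device writes to its own file, so the file-replication service never sees two writers on one file, and merging happens in the app.

```python
import json
from pathlib import Path


def save_device_state(doc_dir, device_id, entries):
    """Write this device's entries to its own file; never touch anyone else's."""
    doc_dir = Path(doc_dir)
    doc_dir.mkdir(parents=True, exist_ok=True)
    (doc_dir / f"{device_id}.json").write_text(json.dumps(entries))


def load_merged(doc_dir):
    """Combine every device's file into one view of the document.

    Each entry is {"value": ..., "modified": <number>}; the newest wins.
    That rule is a placeholder for a real, content-aware merge.
    """
    merged = {}
    for path in sorted(Path(doc_dir).glob("*.json")):
        for key, entry in json.loads(path.read_text()).items():
            if key not in merged or entry["modified"] > merged[key]["modified"]:
                merged[key] = entry
    return {key: entry["value"] for key, entry in merged.items()}
```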

ANDREW:

Cool. I'll be interested to hear how that actually works out.

MIKE:

Yes. Me too, actually.

ANDREW:

I read – this is an aside but I read a blog post by – I think it was actually a podcast with Gus Mueller yesterday. The interviewer asked about VoodooPad and he said “Oh, I don't really know what the status is.” But you were working – that he thought you were working on it. I'm glad to hear that you are working on it. I think there are a lot of people who like VoodooPad and are eager for an update.

MIKE:

Yes. We had a period where we couldn't dedicate much time to it but now we're into the swing of it. We're hoping to get a point update into people's hands pretty soon here and bigger things to come.

ANDREW:

Yeah. Well, that's good news. I think we all know that app development is an unpredictable beast. You never quite know how long it's going to take.

MIKE:

Yes. That is true but we all try our best for sure.

JAIM:

Well, it always takes until money runs out.

MIKE:

Oh boy.

[Chuckles]

ANDREW:

I think Mike is hoping that's not anytime soon. [Chuckles]

JAIM:

Oh, no.

MIKE:

I'm hoping we can get something out there before the money runs out. That would be cool.

JAIM:

That's good.

ANDREW:

Anything else you guys want to talk about?

MIKE:

I would just like to very briefly propose that we all agree that the past tense of sync as in document sync should be sunk – not synced.

[Chuckles]

ANDREW:

I like that.

MIKE:

Alright.

JAIM:

That's my reputation after the app doesn't work like it's supposed to.

MIKE:

There we go.

JAIM:

After you sunk your app – Yeah, this has been some really good conversation on document sync and handling offline cases in your apps. We're ready to get to the picks.

ANDREW:

Sounds good to me.

JAIM:

Alright. Mike, your inaugural picks as an iPhreaks panelist.

MIKE:

Alright. Well, since I just posted the link anyway, how about that differential synchronization stuff? That's a really interesting webpage with a lot of interesting information about how to diff text and merge text and things like that. Even if you're not doing anything like that, it's still very educational.

JAIM:

Very cool. Andrew.

ANDREW:

Yeah. I've got a couple of picks today. The first one is a company called MailRoute. I think a lot of people have probably heard of them. It's a server-side spam filtering service. I had one particular email address, out of the 10 or so that I use actively, that had recently just become inundated with spam. Changing the spam filtering settings on the host didn't fix anything. I got fed up, so I signed up for MailRoute, which is not free but has completely eliminated my spam problems so far for the last week or so. I'm pretty pleased with it. It was not hard to set up. They've been very good customer service-wise, too. That's MailRoute.

Then, my second pick is a – I have a Mac II that has recently had its hard drive go bad, which is not too surprising because it's the same hard drive that was put in the machine when it was new in 1987. The problem is it's a little bit tough to come by good new 40 MB hard drives for a Mac II. I found this project called SCSI2SD. It's open-source hardware but the guy is also selling the boards. It's basically just a circuit board that's roughly the same dimensions as a hard drive, but instead of being a hard drive, it has a SCSI port and a microSD slot so you can use this to put a microSD card in your old computer. It's specifically meant for old SCSI computers including old Macs but also Amigas and probably some Atari computers and things like that. I do not have one yet but I'm hoping to get it soon. I'll report back with my findings.

Those are my picks.

MIKE:

You really have me wishing I had an old computer that I could use this in now.

ANDREW:

Yeah. I've got a Mac II and a Mac Plus. But the Mac Plus is actually okay. Hopefully this will resurrect the Mac II. I luckily have a backup of everything that was on the hard drive that I pulled off before it died. I hope I can get back up to where it was. It's running System 6.

MIKE:

Very nice.

JAIM:

Alright. For my picks, I'm going to carry on the tradition and do WWDC picks. For those who are travelling out there, we don't have Pete this year. He was our resident Bay Area expert. Mike, are you going to the conference?

MIKE:

No, I'm not. I couldn't make up my mind as to whether I wanted to try to get in and time makes fools of us all at that point.

ANDREW:

I did the same thing. I couldn't decide if I could afford it or not, which meant I could not afford it. So I'm not going.

JAIM:

That's too bad. I'll be there. Alondo's there. I'll actually be speaking at AltConf on Tuesday morning so you can check us out there. Definitely, if you're in the area, hit us up on Twitter. Alondo's there; I'm there.

A couple of picks that Pete did last year, I enjoyed. There's a food truck, Curry Up Now, that is in – it moves around, but the area where I found it was in the GFood Lounge, which is a bit of a walk from the conference area but doable within an hour. The GFood Lounge has three or four food trucks out there that have a different selection of food. If you have people who can't decide what they want to eat, it's a good choice. It's a nice walk on the way to the Giants' stadium.

Another pick I have is Calzone's, which is in Little Italy. I had my Airbnb cancelled over two weeks before the event last year so I had to scramble to find a place to stay. I ended up staying in Russian Hill, which meant I had to commute down to the Moscone area. One night I walked home and wandered to Little Italy and stopped by at Calzone's and had some amazing Italian food. There's a pastry and candy shop across the street where I had some amazing cannolis. I can definitely say “Get a cannoli from Little Italy.” if you can pull it off. SoMa is great but there are a lot of really cool areas in San Francisco.

Those are my picks for WWDC this year. If you're like me, you're probably listening to a podcast on a plane on the way there. Possibly – hopefully your podcast is handling your offline stuff reasonably well. That's all I had.

Anything else from the peanut gallery?

ANDREW:

Are you going to wear your iPhreaks T-shirt?

JAIM:

I will. I've got it. You'll see the light pink long sleeve shirt. I shouldn't be hard to spot.

ANDREW:

No. These pink iPhreaks T-shirts definitely don't help you blend into the crowd.

JAIM:

Nope. That's how it goes. So, yeah, great show. I learned a ton. I think we pry into some really good information. Welcome to the show Mike. We're glad to have you on.

MIKE:

Well, thanks very much. I'm looking forward to more.

JAIM:

Yeah. We'll see you all next week.

[This episode is sponsored by MadGlory. You've been building software for a long time and sometimes it gets a little overwhelming. Work piles up, hiring sucks and it's hard to get projects out the door. Check out MadGlory. They're a small shop with experience shipping big products. They're smart, dedicated, will augment your team and work as hard as you do. Find them online at MadGlory.com or on Twitter @MadGlory.]

[Hosting and bandwidth provided by the Blue Box Group. Check them out at BlueBox.net.]

[Bandwidth for this segment is provided by CacheFly, the world’s fastest CDN. Deliver your content fast with CacheFly. Visit cachefly.com to learn more]

[Would you like to join a conversation with the iPhreaks and their guests? Want to support the show? We have a forum that allows you to join the conversation and support the show at the same time. You can sign up at iphreaksshow.com/forum]
