[This episode is sponsored by Frontend Masters. They have a terrific lineup of live courses you can attend either online or in person. They also have a terrific backlog of courses you can watch including JavaScript the Good Parts, Build Web Applications with Node.js, AngularJS In-Depth, and Advanced JavaScript. You can go check them out at FrontEndMasters.com.]
[This episode is sponsored by Codeship.com. Don’t you wish you could simply deploy your code every time your tests pass? Wouldn’t it be nice if it were tied into a nice continuous integration system? That’s Codeship. They run your code. If all your tests pass, they deploy your code automatically. For fuss-free continuous delivery, check them out at Codeship.com, continuous delivery made simple.]
[This episode is sponsored by Hired.com. Every week on Hired, they run an auction where over a thousand tech companies in San Francisco, New York, and L.A. bid on JavaScript developers, providing them with salary and equity upfront. The average JavaScript developer gets 5 to 15 introductory offers with an average salary offer of $130,000 a year. Users can either accept an offer and go right into interviewing with the company or deny them without any continuing obligations. It’s totally free for users. And when you’re hired, they also give you a $2,000 bonus as a thank you for using them. But if you use the JavaScript Jabber link, you’ll get a $4,000 bonus instead. Finally, if you’re not looking for a job and know someone who is, you can refer them to Hired and get a $1,337 bonus if they accept a job. Go sign up at Hired.com/JavaScriptJabber.]
[This episode is sponsored by Component One, makers of Wijmo. If you need stunning UI elements or awesome graphs and charts, then go to Wijmo.com and check them out.]
CHUCK:
Hey everybody and welcome to episode 136 of the JavaScript Jabber Show. This week on our panel, we have AJ O’Neal.
AJ:
Yo, yo, yo, coming at you live from the very southern bowels of Provo.
CHUCK:
I’m Charles Max Wood from DevChat.TV.
I just want to give you a quick reminder. If you text JSREMOTECONF to 38470, then you can get information about the JavaScript Conference that I am pulling together, or you can go to JSRemoteConf.com and sign up. It will be after work if you’re in the US. And it should be a good time. I’ve got some awesome speakers lined up.
We also have a special guest this week, and that is Eduardo Lundgren.
EDUARDO:
Hello.
CHUCK:
Do you want to introduce yourself really quickly?
EDUARDO:
Yeah, awesome. My name is Eduardo Lundgren. I currently work at a company called Liferay. It’s based in L.A., although I live in Brazil, and we take care of the branches down there. We have four different branches there, one in Recife, which is where I live, and another in São Paulo. I have been contributing to open source for a few years, since 2007. I’ve been involved mostly in Java projects and PHP projects. And then around that time I started to like JavaScript and frontend development.
So, I started to contribute to projects like jQuery and jQuery UI. I was responsible for dragging, sorting, resizing, and all those little pieces of jQuery UI. And it was very good; I learned a lot during that time. In 2008 I released a library called jQuery Simulate, basically because we didn’t have a good way to test frontend applications in the browser. I like to tell the browser, “Click here. Drag and drop this node from X five pixels to Z,” whatever. And this library allows you to simulate and create a real event in the browser, as if someone were really pressing a key or clicking a mouse. So, it’s called jQuery Simulate.
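[For reference, a minimal sketch of the kind of event simulation jQuery Simulate enables, assuming jQuery plus the simulate plugin are loaded; the element IDs are hypothetical:]

```js
// Fire a synthesized click, as if a user really pressed the mouse.
$('#save-button').simulate('click');

// Simulate a keypress on an input field (Enter key).
$('#search').simulate('keydown', { keyCode: 13 });

// The 'drag' helper comes from the extended simulate plugin used in
// jQuery UI's own tests: drag the element 50px to the right.
$('#slider .handle').simulate('drag', { dx: 50, dy: 0 });
```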
In 2009, I moved on to the YUI library from Yahoo. And we built a library called AlloyUI. It’s a framework built on top of YUI 3, and it has around 50 widgets, like a scheduler, modals, date pickers and so forth, that you can use in your application. And yeah, that’s it. And now I have the Tracking.js project, which is the computer vision library that we are going to talk about today.
CHUCK:
Very cool. It looks like Joe Eames also joined us.
JOE:
Hey.
CHUCK:
Anyway, so Tracking.js is a computer vision library. Now, when I think of computer vision it’s mostly looking at images or videos and picking out particular things. Is that what this does or is it more of a drawing and physics library?
EDUARDO:
Yeah, so Tracking.js includes a bunch of algorithms you can use to create solutions for computer vision. In other terms, this means we have algorithms to find keypoints in a picture or in a scene from the camera. Keypoints are important, invariant parts of the scene, and you can use that information to match against other frames. So, we also have utilities for matching those keypoints.
So, let’s say you have two different frames in a video and you find all the keypoints, the important information, in one frame. Then in the second frame of the video, we need to match them in order to combine the variables and resolve the math necessary to track or find elements in the scene. So yeah, Tracking.js combines all these little pieces, techniques that you can use to find objects in a video, or to find faces, mouths, ears, eyes, these parts of the scene.
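[A hedged sketch of the detect-then-match flow Eduardo describes, using the low-level FAST corner detector and BRIEF descriptor helpers that Tracking.js documents; the grayscale pixel arrays for the two frames are assumed to come from elsewhere:]

```js
// grayA and grayB are grayscale pixel arrays for two consecutive frames.
var cornersA = tracking.Fast.findCorners(grayA, width, height);
var cornersB = tracking.Fast.findCorners(grayB, width, height);

// Compute a descriptor for each keypoint so keypoints can be compared.
var descriptorsA = tracking.Brief.getDescriptors(grayA, width, cornersA);
var descriptorsB = tracking.Brief.getDescriptors(grayB, width, cornersB);

// Pair each keypoint in frame A with its best mutual match in frame B.
var matches = tracking.Brief.reciprocalMatch(
  cornersA, descriptorsA, cornersB, descriptorsB);
```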
CHUCK:
Very cool.
AJ:
I’m playing with the demo right now and this is freak-amazing. I knew this kind of stuff was happening, but I did not realize that it had progressed this far. This is on par with the C library that everybody uses, I think.
EDUARDO:
Yeah. There is a very famous library in C called OpenCV.
AJ:
Yeah, that one.
EDUARDO:
Yeah, OpenCV is very famous although it doesn’t run on the web.
AJ:
Yeah. It looks like you’ve got just tons of really great features in Tracking.js.
EDUARDO:
Yeah, I listed them here. We currently have utilities to detect features in a scene, like the important parts of the scene. We also have feature descriptors, which are what match information between scenes. And there are examples on the website; I will put the links in the picks later on. We have helpers for image convolution, so we can process images, like blurring an image or transforming an image to grayscale, things like that. We also have the Sobel technique, which extracts the edges of an image. So, let’s say you have your picture. I can extract just the edges, like a kid drawing your face on paper. So, we can extract that.
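[A small sketch of those image helpers, based on the signatures in the Tracking.js docs; the canvas ID is hypothetical and the source image is assumed to be drawn into the canvas already:]

```js
var canvas = document.getElementById('canvas');
var ctx = canvas.getContext('2d');
var frame = ctx.getImageData(0, 0, canvas.width, canvas.height);

// Convert the RGBA buffer to grayscale intensities.
var gray = tracking.Image.grayscale(frame.data, canvas.width, canvas.height);

// Extract edges with the Sobel operator (takes the RGBA buffer directly).
var edges = tracking.Image.sobel(frame.data, canvas.width, canvas.height);
```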
We have implemented Viola–Jones. Viola–Jones is a very popular face detection algorithm, an object detection algorithm, which is popular because it’s very fast. It’s also the one used inside those old digital cameras. Remember a few years ago when a camera would focus on a person, it showed little rectangles on their faces? That was Viola–Jones implemented in the hardware. And they could do that because it’s a very efficient algorithm. And that’s the one we picked to bring to the web, because of these characteristics.
We also combine all those little utilities into color tracking and object tracking. With color tracking, as the name already says, you can find colors, and you can define the colors you want to track. And with the object tracker, you can use Viola–Jones training data to detect certain objects. Currently we detect faces, eyes, and mouths. But there are other trainings available in the OpenCV project. We can take those trainings, the data OpenCV trained from their image database, and import them into Tracking.js as well.
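[A hedged sketch of constructing the two high-level trackers; the 'face', 'eye', and 'mouth' classifiers ship as extra data files in the Tracking.js build:]

```js
// Track a set of named colors in the scene.
var colorTracker = new tracking.ColorTracker(['magenta', 'cyan', 'yellow']);

// Track faces, eyes, and mouths with the bundled Viola-Jones training data.
var objectTracker = new tracking.ObjectTracker(['face', 'eye', 'mouth']);

// Attach a tracker to a <video> element; camera: true requests the webcam.
tracking.track('#video', objectTracker, { camera: true });
```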
CHUCK:
So, was there a project or something that prompted you to work on this or was it just a, “Hey, people could really use something like this”?
EDUARDO:
Yeah, that’s a long story. So, I started a Masters in Computer Science, but I’m not a computer scientist; I am an Electronics Engineer. When I started the Computer Science Masters, I was trying to find something interesting to do. I didn’t have an idea of which area to go into. And then I met these people at the university who were involved with augmented reality and computer vision. And this was very interesting because I liked that kind of subject. You can use math in algorithms, which is fun.
And I started to play around with no intention of building anything. But it ended up that, in the end, I had a few techniques implemented in JavaScript on the web, because I realized most of the techniques were in C or C++. And they were very, very hard to use. The APIs are very verbose and there’s no documentation. It’s not the kind of API web developers are used to using. And I wanted to build this in a simpler way. And JavaScript was my choice.
And back then, it was very risky because we didn’t even have getUserMedia. getUserMedia is the spec the W3C is working on that lets you access the camera and audio from JavaScript without any plugin or third-party installation. So, we didn’t have anything, and I used to test all the algorithms only on Canvas. And then I remember when they released getUserMedia in Chromium, I was so happy, because I could take the algorithms that only applied to Canvas and actually run them on videos. And that’s how the library started, based on these little pieces implemented on the web.
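[For context, a minimal getUserMedia sketch using today’s promise-based API; at the time of this episode, browsers exposed prefixed navigator.getUserMedia variants instead, and the element ID here is hypothetical:]

```js
navigator.mediaDevices.getUserMedia({ video: true, audio: false })
  .then(function (stream) {
    // Feed the camera stream into a <video> element.
    var video = document.getElementById('video');
    video.srcObject = stream;
    video.play();
  })
  .catch(function (err) {
    console.error('Camera access denied or unavailable:', err);
  });
```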
JOE:
So, you said that when you’re looking at the language, you chose JavaScript. Could you give us some indication why you chose JavaScript? For image processing, it’s obviously not going to be the most popular language because it’s not your classic computer science scientific type language.
So, why choose JavaScript?
EDUARDO:
Yeah, that’s a very good question. First, I think JavaScript is a great language because of one characteristic: it can run on different devices and different platforms without too much effort. But of course, we know JavaScript has a lot of performance issues. And those two combined, the good part, which is portability, and the bad part, which is performance, made this project very challenging for me. And that was pretty much the idea. I wanted something portable that runs everywhere. You write the code one time and it runs on devices, on cellphones, on notebooks.
And also, I wanted a very simple API, because JavaScript people are used to having simple APIs for doing complex things. jQuery is a good example of it. You can traverse the DOM, you can do Ajax, you can do a lot of operations on a website with a very simple API. So why was this not true for computer vision? Using OpenCV or OpenGL is very complicated. Bringing those techniques to the web with a very simple API was another inspiration.
CHUCK:
So, one thing that I’m wondering about. It looks like there are two things here. So, one is the tag friends or it’ll pick a face out of the image or off your webcam. But then you got iron spheres and…
AJ:
It’s augmented reality.
CHUCK:
Oh, I see.
AJ:
Yeah, so if you move your head to the right, it detects it. It’s mirrored, unfortunately, which I guess means the web API needs a mirrored-equals-true option or something like that. But if you turn your head to the right, it moves left. If you turn your head to the left, it moves right. If you move up, it goes down. And if you move down very slowly, it goes up. It’s like this AR realm. And the same thing with the racing car. You can take some blue tape, like blue masking tape, put it on a box that you’ve got lying on your desk, and then move the box, and it steers with the blue tape dots.
CHUCK:
Oh, I see.
EDUARDO:
Yeah.
CHUCK:
So, the computer vision component is that you’re controlling things with computer vision. The others, like the iron spheres or the actual racecar, are written with some other Canvas library or something.
EDUARDO:
Yeah. So, those are examples of how to apply these techniques to real applications, like augmented reality applications. We use three.js to render the 3D part. And we control the coordinates and the operations we need to apply in three.js based on the coordinates we extract from the world using Tracking.js.
AJ:
Whoa. Leap Motion controller, what is that? I’ve heard about that. Is that something that integrates with Tracking.js?
EDUARDO:
No, Leap Motion is hardware. It’s a device you can attach [inaudible] to your computer. And it tracks, from the 3D scene, the coordinates of your hands and your fingers. And then you can control the computer using that.
AJ:
Okay.
CHUCK:
Yeah, it’s really cool.
EDUARDO:
It’s very nice.
AJ:
So, that’s just part of the HexGL demo. That’s not part of the Tracking.js demo?
EDUARDO:
Oh, for the HexGL demo you are seeing here, you don’t need the device. You can use some colors. If you have two colors there…
AJ:
Yeah.
EDUARDO:
If you have blue and red, you can actually play with it.
AJ:
Yeah, I see it has options for gamepad, keyboard, leap motion controller. So, I guess APIs for the leap motion controller are built into Chrome? I mean, this may be a little off topic, but I’m curious since I’m seeing that on the demo here.
EDUARDO:
It seems like you can install some plugin to access the Leap Motion API.
AJ:
Okay.
EDUARDO:
In this game, we are only using the camera one to control it. So, [inaudible]
AJ:
Right, which is the, that’s the wheels control.
EDUARDO:
Yes. This is the one we implemented on top of this game. The game is by a guy named Thibaut Despoulain; I don’t know where he’s from. We took his game and added these controls using Tracking.js to control the car in the 3D scene.
CHUCK:
Oh, that is cool.
AJ:
I’m wearing a blue shirt, so I’m having a hard time with it because it [chuckles].
EDUARDO:
[Chuckles] Yeah. That’s actually a funny thing, because for the first presentation I made about Tracking.js, I was thinking about which color I should pick so it wouldn’t conflict with any existing color in nature, in the environment I’d be in. And I picked magenta. You don’t find the color magenta very easily. And when I arrived at the event, all the theming and the entire background of the event was magenta.
CHUCK:
[Laughs]
EDUARDO:
It was crazy because I didn’t have any other object. I had set up all my code by then to use magenta. And it was a nightmare.
CHUCK:
[Chuckles] You had to go get a backdrop that was white or something? [Chuckles]
EDUARDO:
[Chuckles] Yeah, someone had to hold a towel behind me or something.
CHUCK:
There you go.
JOE:
So, you’d mentioned before the good parts about JavaScript, the bad parts, and you mentioned performance. Could you give us some idea about performance characteristics for this sort of thing? You specifically mentioned it running on a phone, so I’m interested. What’s it like when it’s running on a phone? How does it compare to C++ libraries performance-wise? What kinds of things are really slow and difficult and what kind of things work fairly fast even in the browser?
EDUARDO:
Yes, very good question. I extracted some numbers for that a few months ago, and it’s getting a lot better. So, you’ll see we currently have the object detection algorithms and the color detection algorithms. With the object detection algorithms, we can get very, very close to 50 to 60 fps when you have a low number of faces and a frame size of about 400 pixels wide by 300 high.
So, if you have a normal size for your camera capture on the web, the object tracker can apply to the scene pretty much in real time. It can find your face, it can find your mouth at a very good rate. When you start adding more faces to the camera, let’s say your friend joins the video, when you have two faces or two mouths, the performance starts to decrease. My numbers showed me that up to 5 faces, you can reach 25 frames per second, which is relatively good and still considered real time. When you go beyond that, when you have 10 faces, it’s not good. It starts to get very slow.
So, when you have a language like JavaScript, you start to have this kind of problem. If you are in C or C++, maybe it will not be 5 faces; it’s going to be 15 or 10. But in reality, on the web, we also have some advantages. Let’s say we can predict where the face of the user is, how far the face is going to be from the notebook or the computer. And this helps us speed up the algorithms a little bit. So, we may improve this even further at some point.
We also did a lot of research on typed arrays and all the kinds of elements: float, int, unsigned, all of those kinds. And then we found whatever was best for the algorithms and picked those instead of regular arrays.
And we learned a lot of crazy things in this research. Let’s say you have a typed array and you have all numbers inside this array. In some browsers, when you have a value that is not, for example, a float, or you have an undefined value mixed in with the other values, your array becomes very, very slow. We were flagging an array with some information: sometimes we put numbers, sometimes we put undefined. That’s not a good idea, so it started to be very, very slow. So then we had to optimize all the arrays we pass between the algorithms: if it’s a float array, only use floats; if it’s an unsigned array, make sure you are only putting unsigned numbers inside. We learned that by making the mistake.
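[An illustration of the typed-array pitfall described here; this is general JavaScript engine behavior, not Tracking.js code:]

```js
// Bad: a plain array mixing floats and undefined forces the engine into
// a slow, generic representation.
var slow = [1.5, undefined, 0, 2.75];

// Good: a Float32Array can only hold floats, so the engine stays on its
// fast path. Use a sentinel value (e.g. -1) instead of undefined.
var fast = new Float32Array([1.5, -1, 0, 2.75]);
```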
Another thing we had to do in JavaScript, which you may not need to do in C or C++, is optimize multiplication and division operations. Sometimes we had to replace a division or a multiplication with a binary shift, a right shift or left shift operation. That speeds things up a lot as well. So yeah, we had to do all these tricky things in JavaScript for the performance to be decent. And it’s still not even close to C. But it’s a beginning. Chrome is evolving a lot. The new Chrome version coming in a few weeks is going to have some very good improvements in Canvas. The way it talks to the GPU is going to be optimized, even for 2D contexts. And I’m pretty excited to keep testing these numbers and see how it goes in the future.
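[The shift trick in question, which only works for powers of two:]

```js
var i = 200;
var half = i >> 1;   // right shift by 1: same as Math.floor(i / 2)
var index = i << 2;  // left shift by 2: same as i * 4 (RGBA stride per pixel)
```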
AJ:
So, what about Firefox with asm.js?
EDUARDO:
Asm.js?
AJ:
Yeah, do any of your optimizations, can they be applied to that? Do you know?
EDUARDO:
I don’t know this sm.js thing, Firefox. Oh, asm.js. I see. Oh, okay. I see what you mean. Asm, right?
AJ:
Asm, yeah, asm.
EDUARDO:
Asm, yeah, perfect. I misunderstood. Okay. Asm.js is a pretty nice project. Currently we don’t do any optimization using it in those algorithms. But definitely, if you do, they’re going to be very fast.
AJ:
Okay.
CHUCK:
I see all your examples. Do you know of websites out there that are actually using this?
EDUARDO:
Yes. I picked some links for you guys. Let me paste them here. The first one is Burning Head. Some guy made this pretty useful demo where you can record yourself, and in the recording it places an animated gif that sets your head on fire. Then you can save it and tweet it. This is one example.
The second one is a video. I took a video of this Burning Head as well. You have to click Allow at the top of your browser to let the camera capture you.
CHUCK:
Now I have to try it. [Laughs]
EDUARDO:
Yeah, try it out.
CHUCK:
Okay, I told it to record, convert to a gif.
EDUARDO:
Yeah.
CHUCK:
Yeah, it just recorded me. I thought it was supposed to set my head on fire.
EDUARDO:
[Chuckles] It’s your head, just your head. Do you see your head on fire?
CHUCK:
No, I just see my face.
EDUARDO:
Mm, you should see something like this. Check this second link here. This link has an example of this.
CHUCK:
There’s music. We’re going to hear it.
EDUARDO:
You can go to the end of this video.
CHUCK:
Oh, there we go. That’s funny.
EDUARDO:
Yeah. And a second example, this guy created this example to scan Nespresso capsules. Do you know the Nespresso, the espresso machine, the coffee machine?
CHUCK:
No.
EDUARDO:
Nespresso. It’s a machine for making coffee. And this machine takes little capsules, and those capsules come in different colors; the color is the flavor of the coffee. And in the second gif here, I’m going to paste it again, this person created an application that runs on the phone and also on the desktop. The application can detect the capsule you are trying to identify and give you all the information about it, like the color, the name, the flavors.
CHUCK:
Very cool. So, you just implemented some fairly well-known algorithms for this, right?
EDUARDO:
Yeah.
CHUCK:
You didn’t create your own?
EDUARDO:
Yeah. And to finish, the last example is Slipknot, the Slipknot band. Have you heard of them?
CHUCK:
Yeah, the Slipknot?
EDUARDO:
Yeah, the guys who wear the masks. They released this website for their new album, called ‘The Gray Chapter’. With the album CD, they have this masks application where you can put the singers’ masks on yourself and try it out. So, if you go to the website I pasted, for ‘The Devil In I’, you click on one of the band members, like Mick or Craig, and it will ask you to allow your camera. You allow your camera on Chrome or Firefox, and then you can wear the masks and take pictures of yourself wearing the singers’ masks.
CHUCK:
Huh. That’s fun. I could see some fun things for kids and stuff, where you enable the camera and then it does fun stuff around them and things like that.
EDUARDO:
Yeah. One more application that could be useful is real-time video editing. For instance, in this video here, I’m going to paste the link for you, there’s a video running and you can extract the yellow parts or magenta parts or green parts from the scene. So, you can cut out all the green background in my video and apply some nature scene or a different background if you want.
CHUCK:
Oh, so it’s like a green-screen except it’s real-time.
EDUARDO:
Exactly.
CHUCK:
Yeah, this looks like something you could have a lot of fun with.
EDUARDO:
Yeah. The demos on the website are very basic. So, I’m curious to see what people are going to do with them at some point.
CHUCK:
So, one thing that I’m wondering about a little bit is, are there any limitations to this as far as what platform you can run it on? Or do you see that it tends to work better with certain browsers or anything like that?
EDUARDO:
Currently, the algorithms work in any browser that supports Canvas. So, if your browser supports Canvas, which means IE 9+, Firefox 31+, Chrome, Safari, Opera, iOS Safari, the Android browser, Chrome for Android, all those browsers already support Canvas, you can apply the techniques to pictures, to Canvas and everything. When you go to video capturing, you rely on getUserMedia. And getUserMedia and the stream API are still not available on IE and Safari. Chrome, Firefox, Opera, even the Android browser and Chrome for Android have them. But IE and Safari can’t capture your camera. So, you can partially use Tracking.js in all browsers. But when it comes to capturing video, you need this spec to be supported.
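[A hedged feature-detection sketch matching the support story above:]

```js
// Canvas: available wherever the 2D context exists.
var hasCanvas = !!document.createElement('canvas').getContext;

// Camera capture: check the standard API plus the prefixed variants
// browsers shipped at the time.
var hasGetUserMedia = !!(
  (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) ||
  navigator.getUserMedia ||
  navigator.webkitGetUserMedia ||
  navigator.mozGetUserMedia
);

if (hasCanvas && !hasGetUserMedia) {
  // Fall back to tracking static images instead of live video.
}
```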
CHUCK:
Hmm, good to know. Does it work on Chrome for iOS?
EDUARDO:
Yes.
AJ:
This is interesting. I get different results in Firefox than in Chrome. It’s detecting my image slightly differently.
EDUARDO:
In which demo?
AJ:
In the F0 demo, the racecar demo.
EDUARDO:
Oh.
AJ:
Before, it was detecting red on an image on the wall behind me. And now, it’s detecting red on my face.
EDUARDO:
Yeah.
CHUCK:
What’s on your face?
AJ:
My nose. [Chuckles] It’s actually putting a nose, like a little clown nose. [Chuckles]
EDUARDO:
Yeah, the color algorithm has this characteristic of being very sensitive. So sometimes, when your lighting situation changes or…
AJ:
Oh, maybe the position of the sun is a little bit different than ten minutes ago when I was trying.
EDUARDO:
Yes, exactly. It captures everything like this. So, when you use these techniques to do something real, you have to start applying other techniques to make sure it doesn’t get too sensitive. So for example, when you detect a face, blur the image a little bit before detecting the face. It works better.
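[The stabilizing trick Eduardo mentions, sketched with the blur helper from the Tracking.js docs (signature assumed: pixels, width, height, diameter); the canvas ID is hypothetical:]

```js
var canvas = document.getElementById('canvas');
var ctx = canvas.getContext('2d');
var frame = ctx.getImageData(0, 0, canvas.width, canvas.height);

// Soften the frame before detection to reduce noise sensitivity.
var softened = tracking.Image.blur(frame.data, canvas.width, canvas.height, 3);
// ...then run the face detector on `softened` instead of the raw pixels.
```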
AJ:
Oh, interesting.
EDUARDO:
Yeah, things like that. We don’t combine things like this for you, because it depends on how you are applying the techniques to create your augmented reality or vision application. So, we leave it open. You have to start playing around and combining techniques to see what makes sense together and what doesn’t. Yeah, but that happens.
CHUCK:
So, I was going to ask about running this on Node.js.
EDUARDO:
Yes?
CHUCK:
I guess it would work a little bit differently because you don’t have a Canvas element to work off of.
EDUARDO:
Yeah.
CHUCK:
Does it still work or are you better off using a library that’s compiled against OpenCV?
EDUARDO:
No, it works. And we are already using this in an internal project where we upload all the pictures from the users. We have a scheduled task that, using Node.js, grabs all the pictures and indexes the faces of all the users. So then you can do something like Facebook’s tag friends. Now, in Node you don’t have Canvas. And because you don’t have Canvas, you cannot easily draw an image into a Canvas and then apply Tracking.js to give you the faces.
But the API of Tracking.js is very decoupled. The techniques can be used by passing in arrays of pixels instead of the Canvas itself. So, using Node, you can read an image’s pixels; there are many libraries to do that. You take those image pixels and invoke the low-level algorithms in Tracking.js to extract the rectangles for the faces. Our tests do it like that: the way we test the APIs in Node, we fake these arrays manually in order to apply the algorithms. But you can actually use those libraries to extract pixels from real images and use them.
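[A hedged sketch of that Node workflow. get-pixels is one of the many npm libraries that can decode an image into a raw RGBA buffer; both that choice and the 'tracking' module name are illustrative assumptions:]

```js
var tracking = require('tracking');   // module name assumed
var getPixels = require('get-pixels'); // illustrative pixel-decoding library

getPixels('photo.png', function (err, pixels) {
  if (err) throw err;
  var width = pixels.shape[0];
  var height = pixels.shape[1];

  // Feed the raw RGBA array to the low-level algorithms, no Canvas needed.
  var gray = tracking.Image.grayscale(pixels.data, width, height);
  var corners = tracking.Fast.findCorners(gray, width, height);
  console.log('found ' + corners.length / 2 + ' keypoints');
});
```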
CHUCK:
Where was this when I was in college? My senior project was that we had to build a robot. It had a camera mounted on top and it would basically follow a marked path with white on both sides across a lawn. And so, it had to be able to see that the path turned and then follow the path. Anyway, if the white on both sides disappeared, it had to go straight until it found the path further on. And so, it couldn’t deviate or anything else, which was interesting in different sunlight and things like that. But anyway…
EDUARDO:
Yeah.
CHUCK:
This would have been really handy for that because we were writing a lot of stuff by hand to try and get it to work.
EDUARDO:
Yes, I know what you mean. There is also in Tracking.js a tracker interface, which would be perfect for what you just described. This tracker interface gives you all the low-level infrastructure and only requires the developer to implement one single method that looks at the pixel array, finds whatever they want, and returns it. So you can implement custom trackers like the one you described very easily. On the documentation page, TrackingJS.com/docs, there’s an example of using the tracker interface.
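[The custom tracker pattern, following the example shape shown in the Tracking.js docs:]

```js
// A custom tracker only has to implement track(pixels, width, height).
var MyTracker = function () {
  MyTracker.base(this, 'constructor');
};
tracking.inherits(MyTracker, tracking.Tracker);

MyTracker.prototype.track = function (pixels, width, height) {
  // Look at the RGBA pixel array, find whatever you want...
  var result = {}; // e.g. the position of a white-edged path
  // ...and hand it back through the track event.
  this.emit('track', { data: result });
};

// Use it exactly like the built-in trackers.
tracking.track('#video', new MyTracker(), { camera: true });
```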
CHUCK:
Very nice. So, do you have to know much about how the algorithms work in order to build apps on Tracking.js? Or do you just say, “I care about colors that look like this” or, “I’m looking for a head and if it tilts this way,” or things like that?
EDUARDO:
Yeah, with Tracking.js you don’t need to know the algorithms in order to use them. And this was also one of the inspirations for the library. I didn’t understand why everything in computer vision, when I got to the university, was so complicated to use. And I said, “I need to build something where people don’t need to spend three weeks learning how Viola–Jones works or how to track some color.” I just want them to use it and have fun with it. And Tracking.js provides a very simple interface.
So for example, when you tell Tracking.js to find faces in a video or a Canvas or even an image, what Tracking.js gives you is one event, a track event that fires. You just listen for the track event and it gives you a data payload. And this payload gives you rectangles. So, four coordinates…
CHUCK:
Mmhmm.
EDUARDO:
For a face. So, if I have three of those, it means you have three faces. It’s very easy. If you are tracking colors, it’s exactly the same interface. It fires a track event, the track event gives you a data payload, and the payload gives you the coordinates of the colors found, plus the color itself and everything. So, you don’t need to know the algorithms.
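[Reading that payload looks roughly like this, per the Tracking.js docs; one rectangle arrives per face found:]

```js
var tracker = new tracking.ObjectTracker('face');

tracker.on('track', function (event) {
  console.log(event.data.length + ' face(s) found');
  event.data.forEach(function (rect) {
    // rect.x, rect.y, rect.width, rect.height locate one face.
    console.log(rect.x, rect.y, rect.width, rect.height);
  });
});

tracking.track('#video', tracker, { camera: true });
```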
CHUCK:
Now, can you define something that’s not a face that you want it to find in the same way that you find faces? So for example, if I was looking for a particular, I don’t know, particular letters on the screen or a particular word, if I wanted to use image capture for that. I don’t know why, but you know.
EDUARDO:
Mmhmm. Yeah, so for example when you try to find a word, it depends on what you have in this word. For instance, if you have a color on this word, I would recommend you use the color tracking instead…
CHUCK:
Right.
EDUARDO:
Of the object detection, because object detection is more complex at finding objects than color is. You can also use one of the little techniques we have, like feature detection and feature description, which can find information in the scene that varies very little. So, for example, if you have different frames, they will always find the same points. And you can use the same points found between the frames of your video to confirm it’s the word you are trying to find.
But if you really want to train the object detection to find an arbitrary object, you have to use some other library, because the training part of Viola–Jones is a totally different algorithm, which OpenCV implements. With OpenCV, you can position some object in front of the camera for a few seconds and do some movements that it asks you to do. And this will output an XML file. That XML is the training data extracted from the recording. Say you want to train a pen: you take your pen or your pencil and you start moving it around in front of the camera, and those coordinates get extracted into the training data. Tracking.js understands this training data. So, you can import it into Tracking.js and then use your new training on the web. Was that clear?
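[A heavily hedged sketch of what injecting external training data can look like. Tracking.js registers its Viola–Jones classifiers as typed arrays by name; the array below is an empty placeholder rather than real training data, which would be converted from an OpenCV XML cascade, and the 'pen' name is hypothetical:]

```js
// Register a new classifier under a custom name.
tracking.ViolaJones.classifiers.pen =
  new Float64Array([/* values converted from an OpenCV cascade */]);

// The object tracker can then be asked to find it.
var penTracker = new tracking.ObjectTracker('pen');
tracking.track('#video', penTracker, { camera: true });
```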
CHUCK:
Yeah, mostly.
EDUARDO:
[Chuckles] Yes.
CHUCK:
It is a little bit complicated I think. But yeah, what I gathered was that it’s easier to track things if they’re colored differently or have some other defining feature other than just the shape of the letters. If you’re going to be doing word recognition then you’re probably better off finding something that is good at word recognition. But if that word is going to be a different color every time it occurs, then Tracking.js can find it and identify it and all that stuff.
EDUARDO:
Yes. And the object tracker which is available in Tracking.js uses the Viola–Jones algorithm. That’s why I was mentioning Viola–Jones and the OpenCV stuff.
CHUCK:
Mmhmm.
EDUARDO:
Because when you use object detection, the object tracker, you are actually using the Viola–Jones algorithm. And this means you can take any training data for this algorithm and inject it into Tracking.js, and Tracking.js will find those objects.
CHUCK:
So, if I had a video of my son or a video of racecars or something like that, then I could say, I care about the red car or my son is wearing a blue shirt, so it’ll identify all of the kids out on the basketball court with blue shirts.
EDUARDO:
Yeah. The example you gave is currently easy to do. You track the video and say you want the blue color. And it will find all the rectangles of blue that it found. Then you can loop through them and do some painting over the kids’ t-shirts and do things like that.
CHUCK:
[Laughs]
EDUARDO:
[Chuckles]
CHUCK:
It sounds like you could have a little bit of fun with this.
EDUARDO:
Yeah.
CHUCK:
I’m just trying to think of anything else I should ask about this. So, are these techniques the same kinds of techniques that Google Effects in Google Hangouts uses?
EDUARDO:
Yes, they are similar. Google Hangouts combines two techniques in order to find you and place masks on you. Like the blur-the-image-before-detecting-faces example I gave, you can combine other kinds of algorithms. There is one algorithm that is very famous because it speeds up your detection. Basically, when you find some point, let’s say I want to track your nose, so I click on your nose with the mouse, then when you start moving your face, the frames of the video give you an optical flow that makes sense. And this ‘makes sense’ means I can follow it. So, I can use these optical flow techniques, which are very fast, to follow your nose.
But what’s the problem with using only this technique? After some point, it loses precision. It will stop pointing at your nose and start pointing, for example, at your cheek or your eye, because it’s not very precise. It’s not very smart; it just uses the flow of the video to follow. But this is very good because you can combine it with face detection. So, let’s say you do a face detection that finds exactly where your face is. I don’t need to do that on every frame, because doing it every frame spends computation without need. Instead I can combine them: every five frames, I find your real face, and between those five frames, I only go with the flow using the optical flow. This gives you smoother tracking.
Currently, Tracking.js doesn’t have optical flow. It’s actually finding your face on every frame, which is very heavy. [It’s not like how people do]. But the algorithm is available to you. You can combine it with other techniques and, for example, detect only every 10 or 15 frames. It’s up to you. In Google Hangouts, they detect your full face only occasionally and use optical flow to follow where your face is going. That’s why, when you move very fast in Google Hangouts, the masks don’t move at exactly the right moment. They get attracted magnetically to your face when you start moving.
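[A sketch of the detect-every-N-frames idea described here; detectFaces and followWithOpticalFlow are hypothetical helpers standing in for Viola–Jones and an optical flow step, since Tracking.js itself doesn’t ship optical flow:]

```js
var DETECT_EVERY = 5; // run the heavy detector once per 5 frames
var frameCount = 0;
var lastFaces = [];

function onFrame(pixels, width, height) {
  if (frameCount % DETECT_EVERY === 0) {
    // Precise but expensive: full face detection on this frame.
    lastFaces = detectFaces(pixels, width, height);
  } else {
    // Cheap but approximate: nudge the last rectangles along the flow.
    lastFaces = followWithOpticalFlow(lastFaces, pixels);
  }
  frameCount++;
  return lastFaces;
}
```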
CHUCK:
Yeah, that makes sense. And I’ve played with that and it does. It follows your face as opposed to staying on your face.
EDUARDO:
Yeah, because they do that to be faster.
CHUCK:
Well, I don’t know if I have any more questions. Do you guys have any other questions, AJ and Joe?
JOE:
I guess I was curious, while you were building this and doing your research, you’ve obviously been around this space. Have you seen anything that other libraries may have done, other implementations that haven’t really been done yet on the web because they’re just too big and you haven’t had time, but that you’re excited to see somebody else put together?
EDUARDO:
Yeah, there are some examples. There is one very famous library called ARToolKit. That library is in C, actually, or C++, something like that. And some guy made a port for the web called FLARToolKit; the port uses Flash. And then another guy took FLARToolKit in Flash and ported it to JavaScript as JSARToolKit. So, it was ported from C to Flash, and from Flash to JavaScript.
And this project gives you marker tracking. What that means is you can use fiducial markers in the scene. I don’t know if you’ve seen these augmented reality examples. They have these black squares or patterns that people print in order to track, or to position some element on top of. This library is specialized in that kind of tracker, a 2D marker tracker. And it was the first one I saw on the web, back then. I got very excited; I found it very nice. But again, it was hard to use and it only supported fiducial marker tracking, which I didn’t like too much. I wanted something more realistic, like positioning some color or your face or any object, and using the information that is naturally in the scene to extract information.
FLARToolKit is one. There is another one, closed source, called Unifier Viewer, from a company called Metaio. This was used in General Electric’s marketing campaign a few years ago, where you could position a magazine in front of your computer and General Electric placed a 3D scene with some electric components on top of it. So, that was another example that runs on the web, and it was pretty popular in the past. But it’s closed source. Other than that, I don’t know many others.
JOE:
Cool. I do have one other question. If somebody was interested in this and wanted to just try screwing around but wanted something very basic, besides just tutorials, do you have any suggestions for an interesting little project that could be done in a short amount of time to learn and play around with Tracking.js?
EDUARDO:
Right. So, in the Tracking.js repo, in the examples folder, we have around 15 to 20 examples. And those examples try to be a little bit realistic. For example, when one finds a color, it plots a rectangle or something on top of that color. So, if you go to the Tracking.js GitHub, I’ll paste the link as well, it’s github.com/eduardolundgren/tracking.js, you can find examples of things like that inside. You can also check the Burning Head demo from the guy who has been using Tracking.js, Konrad Dzwinel. His name’s kind of complicated, but I will put the links in the picks later. And you can use this open source example to see how he integrates with the face tracker.
But other than that, I don’t have anything. But feel free, any of you who are listening, if you want to play and have questions or suggestions for how I can show more demos or examples that would help you build stuff on top of it, let me know.
JOE:
Awesome. Good answer.
CHUCK:
This definitely looks like something I want to play with. I don’t know if I will have time to play with it.
Anyway, well let’s go ahead and do the picks. Joe, do you want to start us with picks?
JOE:
Oh, I have to be first, huh? Yes, I will do my picks. So, last week, I think on the Adventures in Angular show, I picked a book called ‘Strong Fathers, Strong Daughters’. And I want to pick that book again. I’ve read a lot more and I just found it to be super, super eye-opening as far as the purpose and value of being a father if you have daughters. It’s one of those books where I feel like if you have a daughter, you absolutely should read it. It should be required reading for anybody who has a daughter. It’s written by a medical doctor who spent her entire career dealing with teens, and it has all the corroborating statistical evidence. Just a really awesome book about being a father. So, I really liked it. It wasn’t a very religious book. It was more of a practical book, but still filled with good advice about being a father and raising daughters. So, that’s going to be my first pick.
And my second pick is going to be a board game. It’s actually a card game called Munchkin Loot Letter, which is a variation of another game called Love Letter. It’s a great game, very fun, two to four people. They just wanted a different take on Love Letter, so they produced a version that uses the Munchkin property or IP, with some funny different names, and changed the card names around. It’s pretty much the same game but with a funnier feel to it. It’s a super fun game, easy to learn. You can play a round in five minutes or something. It plays up to five people, and it’s just a great game for when you’ve got a short amount of time and a few people who want to play a fun game. So, that’ll be my second and final pick.
CHUCK:
Awesome. AJ, what are your picks?
AJ:
So to start off, I’m going to pick the same book that ten other people have picked because after nine other people picked it I finally bought it and started reading it. And that is ‘Ready Player One’. It is a page turner. I am a person who only likes certain books that are really good. [Chuckles] And this is one of them. I’m all the way to chapter ten after just a couple of hours of reading it on a couple of evenings.
Next, I’m going to pick ‘The Legend of Zelda: A Link Between Worlds’ and the Nintendo 3DS. The Nintendo 3DS because it actually does 3D without glasses in a way that is kind of cool. And the games utilize the 3D in a way that actually enhances the experience. The depth perception is more clear. It’s easier to tell where there might be a little hidden area or something like that because of the depth perception. It’s basically maybe as powerful as the Nintendo 64, but the way they use the 3D elements to enhance the gameplay is just very Nintendo-esque. And I like it.
And ‘The Legend of Zelda: A Link Between Worlds’ because it takes me back to my glory days of the Super Nintendo but with a different storyline. And so, I haven’t gotten quite that far yet because I just wanted to explore the map and find all the little hidden places where I could put bombs and [inaudible] and whatever. But it’s a very fun game. And if you liked the Super Nintendo ‘The Legend of Zelda: A Link to the Past’, then you will probably also very much enjoy ‘The Legend of Zelda: A Link Between Worlds’. And I just love Nintendo. I think they’re the best game company. That is all.
CHUCK:
Awesome. I just have to plus one all those picks. I still need to read ‘Ready Player One’. Anyway, so I’ve got some picks.
The first one is a new show that I’ve picked up. It’s called ‘Life on Fire’ by Nick Unsworth. He’s just got a ton of great content and interviews some terrific people on there. So, I’ve really been enjoying it. And they actually had a bundle you could pick up where you got a whole bunch of training, and all the money went to building a school in Guatemala. So, it was kind of a cool deal because you could give the money to them and it went to a good cause, and you got all this awesome stuff back. I’m not sure if that bundle is still available. But ‘Life on Fire’ has been great to listen to. I’ve really been enjoying the content there.
I’m also in the middle of reading a book called ‘In the Plex’, which is about Google and how Google got started, which is really interesting. The thing I find inspiring about it is that it basically started off as a mental exercise and then turned into something really handy. I don’t know what else to say about it other than it’s been interesting to see how it grew, where the company went, the decisions they made, and the things they valued. So yeah, I’ve really been enjoying that.
One other thing I want to put out there is that I am looking for work. So, if you have contract work that you need some help with, I am available for that, mostly Thursday and Friday. I’m not completely out of work, but I am looking for two days’ worth of work during the week. So anyway, if you have that, feel free to give me a call or send me an email. My email is chuck@devchat.tv and my phone number’s 801 367 6164. And I don’t know how smart it is to give that out on the show, but I trust our listeners not to abuse my phone number, I guess. So anyway, that’s what I’ve got.
JOE:
I hope you don’t trust the panelists.
CHUCK:
You already have my phone number. So does AJ. [Laughter]
CHUCK:
Not so worried there. If something was going to happen, it would have already happened.
JOE:
Well, now you gave me the idea.
CHUCK:
Darn it. Alright Eduardo, what are your picks?
EDUARDO:
Yeah. I have only one pick, which is a book I read a few years ago when I was building Tracking.js. It helped me a lot to think about how to make it simple, how to make it easy for people to use, because computer vision is not so attractive to people when you get to the code. It kind of scares developers when they play with it. And this book helped me keep this idea of simplicity. It’s called ‘The Laws of Simplicity’ by John Maeda, and this is the link here. And it’s good because it makes some good points.
For example, when you try to make anything simple, it doesn’t need to be software, you need to start shrinking things from it. If it’s a design, you need to keep shrinking parts of the design up to the point where it doesn’t compromise the number of features it has. And when you reach that point, you have to hide the rest. So, when you cannot shrink, you have to hide. And this is very, very simple.
You also have to make sure the quality is very high, because when a mistake is made in a simple application or a simple design, it’s much more noticeable than in a very large, very chaotic application. So, when you make it simple, make it with a lot of quality. And simple is not the opposite of complex; we should make sure we can still have complex parts in our system, and those parts can be hidden. Yeah, it’s a combination of different tips that most of us have thought of already or have heard someone say, but it’s good because it combines everything in the same book. It’s a very small book; you can read it very fast, during a flight or something. That’s my pick for today.
CHUCK:
Awesome. Well, thanks for sharing, and thanks for all the work that you and the other folks have done on Tracking.js. It looks like a fun thing to build stuff on.
EDUARDO:
Yeah, and it’s just the beginning. It’s the first version. We still have to dedicate more time and get feedback from users and so forth. But thanks for having me here on the show. Thanks to Guillermo for recommending me to be here. I’m very happy to have participated in this.
CHUCK:
Alright. Well, I don’t think we have any other announcements. So, we’ll wrap up the show and we’ll catch you all next week.
[This episode is sponsored by MadGlory. You’ve been building software for a long time and sometimes it gets a little overwhelming. Work piles up, hiring sucks, and it’s hard to get projects out the door. Check out MadGlory. They’re a small shop with experience shipping big products. They’re smart, dedicated, will augment your team and work as hard as you do. Find them online at MadGlory.com or on Twitter at MadGlory.]
[Hosting and bandwidth provided by the Blue Box Group. Check them out at Bluebox.net.]
[Bandwidth for this segment is provided by CacheFly, the world’s fastest CDN. Deliver your content fast with CacheFly. Visit CacheFly.com to learn more.]
[Do you wish you could be part of the discussion on JavaScript Jabber? Do you have a burning question for one of our guests? Now you can join the action at our membership forum. You can sign up at JavaScriptJabber.com/jabber and there you can join discussions with the regular panelists and our guests.]