Face Detection in the Browser using TensorFlow.js

Simple face detection using the Blazeface model in TensorFlow.js.

Adarsh Menon
JavaScript in Plain English


TensorFlow.js is a great tool to run ML models in your browser, and comes with lots of pre-trained models for us to work with. Today I will be using the Blazeface model to detect faces from the webcam feed in the browser.

Using this, you can crop images to get just the face, detect faces for tagging people in images, or use it as the first step towards face recognition.

We can break the process down into 3 steps:

  1. Getting access to the webcam video feed
  2. Performing face detection
  3. Displaying the result

To run this code, it would be good to have a local server setup that supports hot reloading. I like using the Live Server plugin on VS Code.

So let’s get started!

Here is the video version, if you prefer that!

Getting access to the webcam video feed

The first step is to get access to the webcam and its video feed.

We will use the browser’s MediaDevices interface which lets you access connected media devices like cameras, microphones, and even screen sharing.

Start off by creating the index.html file, with the basic boilerplate code.

I have added a video tag in the body, which we can use to see the video output from the webcam.

Now let’s create another file called script.js where we will write our JavaScript code, and include it in index.html. I have added it at the end of the body. Scripts can be included in the head tag, but placing them at the end of the body makes sure that the script runs only after the HTML is rendered.
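
Putting that together, a minimal index.html could look something like this. The id and the 600×400 size are placeholder values I’m assuming here, and the autoplay attribute makes sure the webcam stream starts playing once it is attached:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Face Detection</title>
  </head>
  <body>
    <!-- video element that will show the webcam feed -->
    <video id="video" width="600" height="400" autoplay></video>

    <!-- our script, placed at the end of the body -->
    <script src="script.js"></script>
  </body>
</html>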

In script.js, we first get the video tag element and store it in the variable video. The setupCamera function calls the getUserMedia API, where we specify the video dimensions. I have also set audio to false, since we don’t need it.

getUserMedia returns a promise that resolves with a MediaStream object, which we assign to the video element’s srcObject property. Finally, at the end of the script, we call setupCamera.
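
A minimal sketch of that first script.js might look like this; the 600×400 dimensions are placeholder values that should match your video tag:

const video = document.getElementById("video");

// Ask the browser for the webcam stream and attach it to the video element
const setupCamera = () => {
  navigator.mediaDevices
    .getUserMedia({
      video: { width: 600, height: 400 },
      audio: false, // we don't need audio
    })
    .then((stream) => {
      video.srcObject = stream;
    });
};

setupCamera();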

Now, to run this, start your live server or open the index.html file in your browser. Your browser will ask for permission to access the camera; allow it, and you should see the webcam feed on the screen.

Performing Face Detection

To detect faces we will be using the Blazeface model in TensorFlow.js. It is a lightweight model to detect faces in images.

First, let’s import the TensorFlow.js library and Blazeface model in our index.html file using the script tag.

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/blazeface"></script>

Make sure that the above tags are added before importing script.js. This is to make sure that our scripts run only after the required dependencies are loaded.

Next, create a variable model and load the Blazeface model into it.

let model;
model = blazeface.load();

Next, let’s create a function detectFaces in our script that will perform the face detection and give an output.

const detectFaces = async () => {
  const prediction = await model.estimateFaces(video, false);
  console.log(prediction);
};

Now if you call detectFaces and run everything, you will get an error. This is because blazeface.load() is asynchronous: it returns a promise and takes some time to resolve, so model does not yet hold the loaded model when detectFaces is called.

So to solve this problem, we will first wait for the video feed to load from the webcam. Once that is done, we wait for the model to load and only then call the detectFaces function. The video element fires an event called loadeddata once the feed has loaded, which we can listen for.

So, incorporating all of the above, the final script.js file should look something like this:
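
(A minimal sketch: blazeface.load() is awaited inside the loadeddata handler so that detectFaces only runs once both the feed and the model are ready, and the element id and 600×400 dimensions are the placeholder values used earlier.)

const video = document.getElementById("video");

let model;

const setupCamera = () => {
  navigator.mediaDevices
    .getUserMedia({
      video: { width: 600, height: 400 },
      audio: false,
    })
    .then((stream) => {
      video.srcObject = stream;
    });
};

// Run Blazeface on the current video frame and log the predictions
const detectFaces = async () => {
  const prediction = await model.estimateFaces(video, false);
  console.log(prediction);
};

setupCamera();

// Wait for the video feed, then load the model, then detect faces
video.addEventListener("loadeddata", async () => {
  model = await blazeface.load();
  detectFaces();
});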

Now if you check your console, you should be able to see the prediction object, which looks like this:

[
  {
    "topLeft": [186.164306640625, 229.93507385253906],
    "bottomRight": [400.3614807128906, 372.7312927246094],
    "landmarks": [
      [249.68140237033367, 275.46426653862],
      [326.18556171655655, 271.4667320251465],
      [285.10335087776184, 312.4013841152191],
      [290.80926552414894, 337.9300758242607],
      [218.2365596294403, 286.3128364086151],
      [371.4763283729553, 279.08667623996735]
    ],
    "probability": [0.9811193943023682]
  }
]

The prediction variable is a list of objects, where each object corresponds to a face that was detected. Inside each object you can find the top-left and bottom-right coordinates of the bounding box, the 6 face landmarks corresponding to the eyes, ears, nose and mouth, and a probability score.
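
For example, assuming prediction holds the array logged above, the values for the first face could be read like this:

const face = prediction[0];
const [x, y] = face.topLeft;               // top-left corner of the bounding box
const [right, bottom] = face.bottomRight;  // bottom-right corner
const boxWidth = right - x;
const boxHeight = bottom - y;
const score = face.probability[0];         // confidence that this is a face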

Next, we will use these coordinates to draw the rectangle around the face.

Displaying the Results

This step is pretty straightforward. Simply iterate through the predictions array and draw a rectangle for each bounding box, along with points for the face landmarks.

We can use HTML canvas for this. So, create a new canvas tag, in the index.html file, and give it a width and height that matches the width and height of the video feed. Also, we don’t need to display the contents of the video tag anymore, so set its CSS display property to none.

After doing the above, your index.html should look something like this:
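
(The ids and the 600×400 size below are placeholder values; match them to your video feed.)

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Face Detection</title>
  </head>
  <body>
    <!-- the video feed is still used as input, but hidden from view -->
    <video id="video" width="600" height="400" autoplay style="display: none"></video>

    <!-- the canvas is where we draw the video frame and the detections -->
    <canvas id="canvas" width="600" height="400"></canvas>

    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/blazeface"></script>
    <script src="script.js"></script>
  </body>
</html>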

Next, in your script.js file, create another variable to hold the canvas element and get a 2d context from it. Then, in detectFaces, first draw the current video frame onto the canvas, followed by the rectangle and the face landmarks. This can be done using the drawImage and rect functions of HTML canvas.

Also, the detectFaces function is currently only called once. To see the output as a video, we need to keep calling it, say 24 or 30 times a second, depending on the frame rate you want. To keep calling it, we can use the setInterval function.

Finally, your script.js file would look something like this:
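
(A minimal sketch of the full file: the red bounding box, the blue 5-pixel landmark squares, and the 30 frames per second interval are arbitrary choices you can tweak.)

const video = document.getElementById("video");
const canvas = document.getElementById("canvas");
const ctx = canvas.getContext("2d");

let model;

const setupCamera = () => {
  navigator.mediaDevices
    .getUserMedia({
      video: { width: 600, height: 400 },
      audio: false,
    })
    .then((stream) => {
      video.srcObject = stream;
    });
};

const detectFaces = async () => {
  const predictions = await model.estimateFaces(video, false);

  // Draw the current video frame onto the canvas first
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);

  predictions.forEach((prediction) => {
    const [x, y] = prediction.topLeft;
    const [right, bottom] = prediction.bottomRight;

    // Bounding box around the detected face
    ctx.beginPath();
    ctx.strokeStyle = "red";
    ctx.lineWidth = 2;
    ctx.rect(x, y, right - x, bottom - y);
    ctx.stroke();

    // A small square for each of the 6 landmarks
    ctx.fillStyle = "blue";
    prediction.landmarks.forEach(([lx, ly]) => {
      ctx.fillRect(lx, ly, 5, 5);
    });
  });
};

setupCamera();

video.addEventListener("loadeddata", async () => {
  model = await blazeface.load();
  // Keep detecting roughly 30 times a second so the output looks like video
  setInterval(detectFaces, 1000 / 30);
});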

Hopefully, this post was helpful. Connect with me on Twitter and LinkedIn, and check out my YouTube channel.
