Mediapipe: A Machine Learning Solution for Web Developers

Afzal Ansari
JavaScript in Plain English
5 min read · Jul 21, 2023


Mediapipe is an Open-Source project by Google that offers cross-platform Machine Learning and AI solutions.

Mediapipe comes in two parts, Mediapipe Solutions and Mediapipe Framework, which share the same goal: solving machine learning problems across different platforms.

Mediapipe Framework is a low-level component for building machine learning pipelines, while Mediapipe Solutions is easier to get started with and also supports Web/JavaScript.

Mediapipe Tasks

Mediapipe Solutions offers a set of libraries that can be used to integrate pre-trained or customized machine-learning models in a web app.

It provides domain-specific libraries, known as tasks, which are available on NPM: @mediapipe/tasks-vision for computer vision, @mediapipe/tasks-audio for audio processing, and @mediapipe/tasks-text for basic natural language processing.

These libraries load trained models in .tflite format and run inference on them.
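As a rough illustration of that load-then-infer pattern outside of vision, here is a minimal sketch for the text package. It assumes the module object from `@mediapipe/tasks-text` is passed in as `tasksText`; the helper name `createTextClassifier`, its parameters, and the default of three results are hypothetical choices for this example, not part of the library.

```javascript
// Hypothetical helper: `tasksText` stands for the module you would get from
//   import * as tasksText from '@mediapipe/tasks-text';
// It is passed in as a parameter so the sketch does not depend on a bundler.
async function createTextClassifier(tasksText, wasmBaseUrl, modelAssetPath, maxResults = 3) {
  // Resolve the WASM files that the text tasks need.
  const fileset = await tasksText.FilesetResolver.forTextTasks(wasmBaseUrl);

  // Create the task from a .tflite model.
  return tasksText.TextClassifier.createFromOptions(fileset, {
    baseOptions: { modelAssetPath },
    maxResults,
  });
}

// Possible usage in a page (both URLs are placeholders you would fill in):
// const classifier = await createTextClassifier(tasksText, WASM_BASE_URL, MODEL_URL);
// const result = classifier.classify('What a wonderful day!');
```

The vision and audio packages follow the same two steps, only with their own resolver method and task class.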

Pretrained & Customized Models

You can pick one of the supported pre-trained models listed in the documentation, download it, and use it as your primary model. Often, though, you need a model that solves a more specific problem in the same domain.

In that case, you can customize a model for your needs. Model customization is still limited: at the moment, it is available only for Image Classification, Object Detection, Gesture Recognition, and Text Classification.

Example with Pretrained Image Classification Model

Let's look at one example. We will use a pre-trained model to classify images.

Mediapipe is low-code open-source software, so you can do a lot in a few lines of code.

What we will build here is a simple web page with a file input that accepts an image, classifies and previews it, and shows the predicted label with a confidence score.

Install the @mediapipe/tasks-vision library first.

npm i @mediapipe/tasks-vision

Set up the basic markup like below.

<input type="file" id="file-input" />
<h2 id="label"></h2>
<img width="400" height="400" id="preview-image" src="#" />

Then, paste the following code into your script file.

// Import the following APIs
import { ImageClassifier, FilesetResolver } from '@mediapipe/tasks-vision';

// Get the elements
const fileInput = document.getElementById('file-input');
const label = document.getElementById('label');
const previewImage = document.getElementById('preview-image');

// `ImageClassifier` is created asynchronously.
const createImageClassifier = async () => {
  // Fetch the WASM fileset for vision tasks.
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@0.10.0/wasm"
  );

  // Return the created `ImageClassifier`.
  return await ImageClassifier.createFromOptions(vision, {
    baseOptions: {
      // We are using a pretrained image classifier model directly
      // from the googleapis storage.
      modelAssetPath: `https://storage.googleapis.com/mediapipe-models/image_classifier/efficientnet_lite0/float32/1/efficientnet_lite0.tflite`
    },
    // Return only the top result; raise this value to get more categories.
    maxResults: 1,
    // `runningMode` can be either `IMAGE` or `VIDEO`.
    // We are working with still images, so `IMAGE`.
    runningMode: 'IMAGE'
  });
};


document.addEventListener('DOMContentLoaded', () => {
  // Get the `ImageClassifier`.
  createImageClassifier().then(classifier => {
    // When an image has been selected, classify it immediately.
    fileInput.addEventListener('input', () => {
      // Get the selected file/image; do nothing if no file was chosen.
      const file = fileInput.files[0]
      if (!file) return
      const image = new Image()
      // Create an object URL for the image.
      const objectUrl = URL.createObjectURL(file)

      // Once the image is loaded, classify it.
      image.onload = () => {
        // Pass the image and get the result.
        const result = classifier.classify(image)
        // Get the label name from the result.
        const labelName = result.classifications[0].categories[0].categoryName
        // Get the score as a percentage.
        const scorePercentage = Math.round(result.classifications[0].categories[0].score * 100)

        // Preview the image with the label.
        label.textContent = `${labelName} with ${scorePercentage}% Confidence`
        previewImage.src = objectUrl
      }

      // Set the `src` of the image to start loading it.
      image.src = objectUrl
    })
  })
})

Code Explanation:

It’s a very basic setup that lets users upload an image and get the result. You can follow most of it just by reading the code and comments, but the APIs used in the createImageClassifier function deserve a closer look.

FilesetResolver.forVisionTasks loads the WASM loader and binary files that this task needs. ImageClassifier.createFromOptions takes that fileset along with an options object (here baseOptions.modelAssetPath, maxResults, and runningMode) and returns an ImageClassifier object. Its classify method takes the image and returns the prediction result.
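If you raise maxResults above 1, the result carries several categories instead of one. The sketch below shows one way to turn such a result into readable lines. It only relies on the fields the example above already reads (classifications, categories, categoryName, score); the helper name `describeCategories` and the sample labels and scores are made up for illustration.

```javascript
// Hypothetical helper: list the categories of a classification result,
// best score first, as "label: NN%" strings.
function describeCategories(result, limit = 3) {
  // Fall back to an empty list if the result has no categories.
  const categories = result?.classifications?.[0]?.categories ?? [];
  return [...categories]
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(({ categoryName, score }) => `${categoryName}: ${Math.round(score * 100)}%`);
}

// Stand-in result with invented labels and scores, shaped like the
// object that `classifier.classify(image)` returns in the example above.
const sampleResult = {
  classifications: [
    {
      categories: [
        { categoryName: 'tiger cat', score: 0.2 },
        { categoryName: 'tabby', score: 0.75 },
      ],
    },
  ],
};

console.log(describeCategories(sampleResult));
```

In the page, you could join these lines and assign them to label.textContent instead of showing only the first category.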

Note that ImageClassifier.classify is a synchronous function and may block the UI while running inference, so it might be better to run it in another thread, such as a web worker.

After uploading an image on the page, you should get a result like the one below.
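To make the worker idea a little more concrete, here is a minimal sketch of the message handling such a worker could use. The message shape ({ id, bitmap } in, { id, ok, ... } out) and the function name `handleClassifyRequest` are assumptions invented for this example. Whether the Mediapipe package loads cleanly inside a worker depends on your bundler and worker type, so treat the commented wiring as something to verify rather than a recipe.

```javascript
// Hypothetical request handler for a classification worker.
// `classifier` is anything with a synchronous `classify(image)` method,
// such as the object returned by `createImageClassifier()` above.
function handleClassifyRequest(classifier, request) {
  try {
    const result = classifier.classify(request.bitmap);
    const top = result.classifications[0]?.categories[0];
    if (!top) {
      return { id: request.id, ok: false, error: 'No category returned' };
    }
    return { id: request.id, ok: true, label: top.categoryName, score: top.score };
  } catch (error) {
    // Report failures back to the page instead of crashing the worker.
    return { id: request.id, ok: false, error: String(error) };
  }
}

// Possible wiring inside the worker script (not executed in this sketch):
// const classifierPromise = createImageClassifier();
// self.onmessage = async (event) => {
//   const classifier = await classifierPromise;
//   self.postMessage(handleClassifyRequest(classifier, event.data));
// };
//
// Possible wiring on the page:
// const bitmap = await createImageBitmap(file);
// worker.postMessage({ id: 1, bitmap }, [bitmap]);
```

Keeping the handler separate from the worker wiring means it can be exercised with a stand-in classifier before any model is loaded.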

[Image: Image Classification Demo Preview]

Mediapipe Model Maker

Mediapipe Model Maker is a Python library for customizing models for your use case by training them on a dataset of your own. It relies on a popular machine learning technique known as transfer learning: an existing trained model is retrained on new data, so it reuses what it has already learned and stays in the same domain.

Because it is a Python library, you will have to work in Python a little, but since it is low-code this should be easy. You can set up a local environment or use Google Colab to get started, then head over to the docs to learn more about customizing a model.

Mediapipe Studio

Mediapipe Studio is a web app full of demos of all types of solutions available with Mediapipe. It lets you upload your data so you can figure out if you can use an existing model or if you need to train your own.

Mediapipe Studio also lets you upload your own trained model, so you can run inference and test it immediately.

Summary

Mediapipe offers low-code solutions to machine learning problems and is easy to get started with, even if you have no background in machine learning. This makes it easier for JavaScript developers to integrate AI features into their applications.

Being cross-platform and available on the web is one of its best features. On the web, machine learning models are mostly deployed on the server; with Mediapipe, you can load and run the model on the client side instead.

If you liked this article, you can follow me or support me on my BMC page. Thank you for reading.
