The Science Behind The JavaScript Engine: How Machines Read Your Code

Explore the workings of the JavaScript engine, from parsing code to optimizing its performance, along with key techniques like hidden classes and memory management.

Yuri Bett
JavaScript in Plain English

Car Engine that Illustrates JavaScript Engine
Photo by Chris Carzoli on Unsplash

Hi readers!

A few days ago, I published an article discussing the JavaScript runtime environment, which seemed to resonate well with readers.

Given the depth and breadth of this topic, I’ve decided to write a follow-up article, this time going deeper into the JavaScript Engine itself. The goal here is to explore how high-level code is interpreted by our machines.

So, what exactly happens to the code during this process?

While there are several engines available, I’ll primarily focus on the most popular one: V8, which powers both Google Chrome and Node.js.

If you’re someone who doesn’t just enjoy coding, but also appreciates the underlying science, fasten your seatbelts. When it comes to a V8 Engine, the journey is bound to be exhilarating (was that a good pun?).

A Brief History of JavaScript Engines

In the early days of the web, web pages were mostly static. They were a mix of HTML and CSS, with only a sprinkle of JavaScript for basic interactivity. During this time, JavaScript was purely interpreted. This means the code was read and executed line-by-line, without any prior optimization. It was akin to reading a book aloud without first skimming through it. The simplicity of web pages didn’t require super-fast JavaScript execution.

However, as the web evolved, so did its ambitions. Web applications began to rival traditional desktop applications in complexity and functionality. As websites transformed into web apps, the demand for faster JavaScript execution grew. Interpretation wasn’t going to cut it anymore. The web needed something more robust.

Enter the era of modern JavaScript engines. The need for speed gave birth to engines like:

  • V8 from Google, known for its use in Chrome and Node.js.
  • SpiderMonkey, the heart of Mozilla Firefox.
  • JavaScriptCore (or Nitro) from Apple, driving Safari.

Here is a list of all JavaScript Engines: https://en.wikipedia.org/wiki/List_of_ECMAScript_engines

These engines transformed the way JavaScript was executed. They introduced innovations such as Just-In-Time (JIT) compilation, which straddled the line between interpretation and full compilation, aiming to get the best of both worlds. We went from reading the book aloud to first skimming through, identifying the important parts, and then delivering a passionate narration.

Anatomy of a JavaScript Engine

In the world of computing, the ultimate goal of any code, regardless of the language, is to communicate with the machine. To achieve this, our high-level JavaScript code must undergo a transformative journey, translating human-friendly syntax into a format that the computer can understand and act upon.

Imagine you’ve written a beautiful piece of JavaScript code, ready to bring your application to life. But how does this human-readable script get transformed into actionable instructions for a computer?

Let's understand each step, starting by taking a look at the diagram below:

The Flow from a JavaScript file to Machine Readable Code. Steps: Parser, AST, Interpreter, Profiler, and Compiler.

The Parser: Decoding the Language

Before the machine can follow any instructions, it first needs to understand them. That’s where the parser comes into play. As a first step, a scanner (also called a lexer or tokenizer) systematically examines your code, splitting it into discrete elements known as ‘tokens’. Think of this process as breaking down a sentence into individual words before working out its meaning. The parser then checks that these tokens form valid JavaScript.
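To make tokenization concrete, here is a minimal sketch of a tokenizer. It is a toy written for this article, not V8's actual scanner: the tokenize function, the token type names, and the tiny set of keywords and punctuators it supports are all my own assumptions.

```javascript
// Toy tokenizer: a hypothetical sketch, NOT how V8's scanner is implemented.
function tokenize(source) {
  // Alternatives: whitespace | number | word | punctuator.
  // The sticky flag (y) forces each match to start exactly at lastIndex.
  const pattern = /\s+|(\d+(?:\.\d+)?)|([A-Za-z_$][\w$]*)|(=>|[=+\-*\/;(){}])/y;
  const keywords = new Set(["let", "const", "var", "function", "return"]);
  const tokens = [];
  let pos = 0;

  while (pos < source.length) {
    pattern.lastIndex = pos;
    const match = pattern.exec(source);
    if (match === null) {
      throw new SyntaxError(`Unexpected character at position ${pos}`);
    }
    pos = pattern.lastIndex;

    const [, number, word, punctuator] = match;
    if (number !== undefined) {
      tokens.push({ type: "Number", value: number });
    } else if (word !== undefined) {
      tokens.push({ type: keywords.has(word) ? "Keyword" : "Identifier", value: word });
    } else if (punctuator !== undefined) {
      tokens.push({ type: "Punctuator", value: punctuator });
    }
    // Otherwise the match was whitespace, which produces no token.
  }
  return tokens;
}

const tokens = tokenize("let total = 1 + 2;");
console.log(tokens.map((t) => `${t.type}:${t.value}`).join(" "));
// Keyword:let Identifier:total Punctuator:= Number:1 Punctuator:+ Number:2 Punctuator:;
```

A real scanner handles far more (strings, comments, template literals, regular expressions), but the principle is the same: raw characters in, a flat list of labeled pieces out.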

Abstract Syntax Tree (AST): Building the Framework

These tokens then serve as the building blocks for the Abstract Syntax Tree (AST). The AST is analogous to a diagram that represents the syntactical structure of your code. It captures the relationships, hierarchies, and the flow, acting as a bridge between the high-level language and the next steps of execution.

There is a pretty cool website called AST Explorer where you can check how the code is parsed into an AST. Check this out: https://astexplorer.net/
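If you'd like to see the idea without leaving this page, here is a small sketch. The tree below is written by hand in the ESTree style (the node format that many JavaScript tools use) for the expression 1 + 2 * 3. The evaluate helper is hypothetical and only understands the two node types shown; an engine's internal AST has its own, different representation.

```javascript
// Hand-written, ESTree-style AST for the expression `1 + 2 * 3`.
// Note how operator precedence is encoded in the nesting: the
// multiplication sits deeper in the tree, so it is evaluated first.
const ast = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Literal", value: 1 },
  right: {
    type: "BinaryExpression",
    operator: "*",
    left: { type: "Literal", value: 2 },
    right: { type: "Literal", value: 3 },
  },
};

// Hypothetical helper: walks the tree recursively and computes its value.
function evaluate(node) {
  switch (node.type) {
    case "Literal":
      return node.value;
    case "BinaryExpression": {
      const left = evaluate(node.left);
      const right = evaluate(node.right);
      if (node.operator === "+") return left + right;
      if (node.operator === "*") return left * right;
      throw new Error(`Unsupported operator: ${node.operator}`);
    }
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

console.log(evaluate(ast)); // 7
```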

Interpreter: The First Draft of Execution

The AST paves the way for the interpreter (aka Ignition in V8). Rather than executing the source text directly, the interpreter first walks the AST and generates a compact ‘bytecode’, and then executes that bytecode one instruction at a time. This gets your code running quickly, but interpreted bytecode doesn’t fully harness the potential speed of execution.
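To give a feel for what bytecode is, here is a toy sketch that flattens a small expression tree into a list of instructions and then runs them. Everything in it is an assumption made for illustration: the opcode names (PUSH, ADD, MUL), the compile and run functions, and the stack-based design. V8's real bytecode is a different, register-based format with an accumulator.

```javascript
// Map each supported operator to a made-up opcode name.
const OPCODES = { "+": "ADD", "*": "MUL" };

// Toy compiler: flattens an expression tree into a linear instruction list.
function compile(node, code = []) {
  if (node.type === "Literal") {
    code.push(["PUSH", node.value]);
  } else if (node.type === "BinaryExpression" && node.operator in OPCODES) {
    compile(node.left, code);   // operands first...
    compile(node.right, code);
    code.push([OPCODES[node.operator]]); // ...then the operation
  } else {
    throw new Error(`Cannot compile node: ${JSON.stringify(node)}`);
  }
  return code;
}

// Toy interpreter: executes the instructions one by one on a value stack.
function run(code) {
  const stack = [];
  for (const [op, operand] of code) {
    if (op === "PUSH") {
      stack.push(operand);
    } else {
      const right = stack.pop();
      const left = stack.pop();
      if (op === "ADD") stack.push(left + right);
      else if (op === "MUL") stack.push(left * right);
      else throw new Error(`Unknown opcode: ${op}`);
    }
  }
  return stack.pop();
}

// The expression `1 + 2 * 3` as a hand-written tree.
const expression = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Literal", value: 1 },
  right: {
    type: "BinaryExpression",
    operator: "*",
    left: { type: "Literal", value: 2 },
    right: { type: "Literal", value: 3 },
  },
};

const bytecode = compile(expression);
console.log(bytecode.map((instruction) => instruction.join(" ")).join("; "));
// PUSH 1; PUSH 2; PUSH 3; MUL; ADD
console.log(run(bytecode)); // 7
```

The tree is gone after compilation; what remains is a flat sequence that a simple loop can execute, which is the essence of an interpreter.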

Interpreted Code vs. Compiled Code:

Diving deeper into code execution, there are traditionally two paradigms: interpreted and compiled. Interpreted languages like early JavaScript are processed on the go, which means they start executing immediately but can be slower in long-running tasks. Compiled languages take a different route. Before execution, the entire code is transformed into machine-level instructions, making runtime operations swift. However, this compilation phase can introduce a delay at the start.

See the diagram below. While interpreted JavaScript outputs 'Bytecode', the compiled JavaScript outputs 'Machine Code'.

Comparison among high-level language (like a JavaScript code), interpreted ByteCode, and compiled Machine Code

How then, does JavaScript, a language known for its web responsiveness, manage to strike a balance?

JIT: The Best of Both Worlds

Modern JavaScript engines shine with Just-In-Time (JIT) Compilation. JIT is not a static process; it’s dynamic and adaptive. It allows JavaScript to be primarily interpreted for that instant start and then, as the code runs, to be compiled for optimized performance.

Here is where the Profiler comes in.

Profiler: The Monitor

Within the JIT system, the profiler plays an important role. Acting as a monitor, it continuously watches the running code to identify patterns, especially ‘hotspots’: sections that are executed frequently or are computationally intensive.
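Here is a rough sketch of the idea behind hotspot detection. It simply counts calls and flags a function as hot once it crosses a threshold. The createProfiler helper, the wrap method, and the threshold of 1000 are all hypothetical; a real engine gathers much richer feedback (such as the types it has seen) and uses its own heuristics.

```javascript
// Toy profiler: counts calls per function and remembers which ones got "hot".
// The threshold is an arbitrary number chosen for this example.
function createProfiler(threshold) {
  const counts = new Map();
  const hot = new Set();

  return {
    // Wrap a function so that every call is counted under `name`.
    wrap(name, fn) {
      return (...args) => {
        const count = (counts.get(name) ?? 0) + 1;
        counts.set(name, count);
        if (count >= threshold) hot.add(name);
        return fn(...args);
      };
    },
    callCount: (name) => counts.get(name) ?? 0,
    hotFunctions: () => [...hot],
  };
}

const profiler = createProfiler(1000);
const square = profiler.wrap("square", (n) => n * n);
const greet = profiler.wrap("greet", (name) => `Hi, ${name}!`);

greet("reader"); // called once: stays cold

let sum = 0;
for (let i = 0; i < 5000; i++) {
  sum += square(i); // called 5000 times: becomes hot
}

console.log(profiler.hotFunctions()); // [ 'square' ]
```

In a real engine, the functions that end up on the hot list are the candidates handed over to the optimizing compiler.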

The Compiler: Tailoring for Efficiency

Having identified the hotspots, the compiler (aka TurboFan in V8) steps in to elevate performance. It takes the rudimentary bytecode produced by the interpreter and refines it into highly efficient machine code, specifically for these hotspots. This ensures that your application, while starting swiftly, also operates at peak efficiency during its most demanding tasks.

But what kinds of optimizations are we talking about? Let's look at some of them.

Hidden Classes and Optimizations

In JavaScript engines, especially engines like V8, optimization is a continuous effort. One of the lesser-known yet highly influential strategies for speeding up JavaScript execution involves the use of “hidden classes.”

What are Hidden Classes?

  • Definition: At a high level, hidden classes are internal constructs used by some JavaScript engines to streamline and expedite property access in objects.
  • Why they Exist: JavaScript is a dynamic language. You can add or remove object properties on the fly. This dynamism, while powerful, can be a pain for optimization. How can the engine efficiently predict and handle property accesses when the shape of an object keeps changing? Enter hidden classes. They provide a mechanism to represent and track the current “shape” of an object, allowing the engine to make educated guesses and optimizations about property accesses.

Faster Property Access and the Role of Hidden Classes

  1. Object Evolution: When you instantiate an object and start adding properties, the JavaScript engine assigns a hidden class to the object. As you add properties or otherwise change the object’s layout, the engine transitions the object from one hidden class to another, tracking its evolution.
  2. Property Lookup: When accessing a property, instead of searching through the object’s properties, the engine can utilize the hidden class as a roadmap to quickly locate the property’s position.

Example:

let obj = {};
// Hidden Class A assigned.
obj.x = 10;
// Transition from Hidden Class A -> B
obj.y = 20;
// Transition from Hidden Class B -> C

If the engine sees another object following the same pattern (first adding x, then y), it can intelligently predict the object's shape and access properties more rapidly.
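To see why the order of property additions matters, here is a toy model of hidden-class transitions. It is only a sketch under my own assumptions: the Shape and ToyObject classes are hypothetical stand-ins, and a real engine's hidden classes carry far more information than a name-to-slot table.

```javascript
// Toy "hidden class": maps property names to slot indexes and remembers
// which shape you reach by adding one more property.
class Shape {
  constructor(offsets = new Map()) {
    this.offsets = offsets;       // property name -> slot index
    this.transitions = new Map(); // property name -> next Shape
  }

  // Return the shape reached by adding `name`, creating it on first use.
  add(name) {
    let next = this.transitions.get(name);
    if (next === undefined) {
      const offsets = new Map(this.offsets);
      offsets.set(name, offsets.size); // new property takes the next free slot
      next = new Shape(offsets);
      this.transitions.set(name, next);
    }
    return next;
  }
}

const EMPTY_SHAPE = new Shape();

// Toy object: values live in a flat array, the shape says where each one is.
class ToyObject {
  constructor() {
    this.shape = EMPTY_SHAPE;
    this.slots = [];
  }
  set(name, value) {
    if (!this.shape.offsets.has(name)) {
      this.shape = this.shape.add(name); // transition to a new shape
    }
    this.slots[this.shape.offsets.get(name)] = value;
  }
  get(name) {
    const offset = this.shape.offsets.get(name);
    return offset === undefined ? undefined : this.slots[offset];
  }
}

const first = new ToyObject();
first.set("x", 10);
first.set("y", 20);

const second = new ToyObject(); // same order: x, then y
second.set("x", 1);
second.set("y", 2);

const third = new ToyObject(); // different order: y, then x
third.set("y", 2);
third.set("x", 1);

console.log(first.shape === second.shape); // true
console.log(first.shape === third.shape);  // false
```

In this model, the first two objects end up sharing one shape because they followed the same path of transitions, while the third took a different path and got a different shape. That is the intuition behind the common advice to initialize properties in a consistent order.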

Inline Caching (IC)

What is Inline Caching?

  • Definition: Inline Caching is an optimization technique where the results of a specific operation (like property access) are cached at the site of the operation itself (in V8, in a per-function feedback vector that the bytecode refers to), leading to faster subsequent accesses.
  • The Need for IC: Given that many operations in JavaScript, especially property accesses, are repeated often, having to fully compute them every single time is inefficient. IC provides a shortcut, remembering the results of recent operations.

The Co-Work Between Hidden Classes and IC

  • Predictable Shapes: With hidden classes tracking object shapes, the engine can make educated predictions. When a property is accessed, the engine uses the hidden class to swiftly find its position.
  • Caching the Result: Once the property’s position is identified, this information is cached for that access site via IC. The next time this property is accessed on an object with the same hidden class, the engine can bypass the whole lookup process and retrieve the value immediately.

Example:

function getColor(car) {
  return car.color;
}
const myCar = { color: 'red' };
getColor(myCar); // First access: normal lookup
getColor(myCar); // Subsequent access: rapid retrieval via IC

During the first function call, the engine uses hidden classes to locate the color property. With IC, this location is cached, making the second call significantly faster.
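The mechanism can be sketched as a property-access site that remembers the last shape it saw and where the property lived in it. This is a toy model with hypothetical names (createLoadSite, carShape, bikeShape); a real inline cache lives inside the engine and can also remember several shapes at once.

```javascript
// Two hypothetical shapes: each maps property names to slot indexes.
const carShape = { offsets: { color: 0, doors: 1 } };
const bikeShape = { offsets: { gears: 0, color: 1 } };

const makeCar = (color, doors) => ({ shape: carShape, slots: [color, doors] });
const makeBike = (gears, color) => ({ shape: bikeShape, slots: [gears, color] });

// Create one property-access site with its own single-entry cache.
function createLoadSite(name) {
  let cachedShape = null;
  let cachedOffset = -1;
  const stats = { hits: 0, misses: 0 };

  function load(obj) {
    if (obj.shape === cachedShape) {
      // Fast path: same shape as last time, so the offset is already known.
      stats.hits++;
      return obj.slots[cachedOffset];
    }
    // Slow path: do the full lookup, then remember the result.
    stats.misses++;
    cachedShape = obj.shape;
    cachedOffset = obj.shape.offsets[name];
    return obj.slots[cachedOffset];
  }

  return { load, stats };
}

const colorSite = createLoadSite("color");
console.log(colorSite.load(makeCar("red", 4)));    // red   (miss: first lookup)
console.log(colorSite.load(makeCar("blue", 2)));   // blue  (hit: same shape)
console.log(colorSite.load(makeBike(21, "green"))); // green (miss: new shape)
console.log(colorSite.stats); // { hits: 1, misses: 2 }
```

Notice that the cache only pays off while the objects passing through the site keep the same shape. Feeding it differently shaped objects forces the slow path again.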

Real-world Performance Improvements

The combination of hidden classes and IC can lead to great performance boosts. In scenarios like animation loops, DOM manipulations, or data processing where the same properties are accessed repeatedly, IC ensures that the engine operates at peak efficiency, keeping applications smooth and responsive.

Garbage Collection and Memory Management

The garbage collector in JavaScript is a form of automatic memory management. The primary function of the garbage collector is to reclaim memory that is no longer in use by the program, which helps to prevent memory leaks and manage the limited memory resources, especially in web browsers.

The garbage collector follows several algorithms and strategies to determine what memory can be safely reclaimed. One of the primary strategies used in many modern JavaScript engines, such as V8 (used in Google Chrome and Node.js), is called “mark-and-sweep.”

The mark-and-sweep algorithm works in two phases:

  1. Mark: The garbage collector scans through the memory starting from ‘roots’ (global variables, currently executing function’s variables, etc.) and marks all reachable objects. Reachable objects are those that are accessible directly or indirectly from the roots. Each object that can be accessed is marked as ‘in use.’
  2. Sweep: Once all reachable objects have been marked, the garbage collector then sweeps through the memory, identifying objects that have not been marked. These unmarked objects are considered unreachable and therefore not needed by the program. The memory occupied by these unmarked objects is then reclaimed.
Garbage Collection mark-and-sweep strategy animation

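Because only unmarked (unreachable) objects are swept, mark-and-sweep ensures that memory is never reclaimed while the program can still reach it. And because it is based on reachability rather than on counting references, it can also reclaim groups of objects that only reference each other.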
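The two phases can be sketched with a toy heap. This is a simplified model for illustration only: the heap set, the allocate helper, and the refs arrays are my own assumptions, and a production collector like V8's is far more sophisticated than this.

```javascript
// Toy heap: every allocated object is tracked so the sweep can visit it.
const heap = new Set();

function allocate(name) {
  const obj = { name, refs: [], marked: false };
  heap.add(obj);
  return obj;
}

// Mark phase: starting from the roots, flag everything that is reachable.
function mark(roots) {
  const pending = [...roots];
  while (pending.length > 0) {
    const obj = pending.pop();
    if (obj.marked) continue; // already visited (this also handles cycles)
    obj.marked = true;
    pending.push(...obj.refs);
  }
}

// Sweep phase: free everything left unmarked, and reset marks for next time.
function sweep() {
  const freed = [];
  for (const obj of heap) {
    if (obj.marked) {
      obj.marked = false;
    } else {
      heap.delete(obj);
      freed.push(obj.name);
    }
  }
  return freed;
}

const root = allocate("root");
const child = allocate("child");
const grandchild = allocate("grandchild");
const orphanA = allocate("orphanA");
const orphanB = allocate("orphanB");

root.refs.push(child);
child.refs.push(grandchild);
// The two orphans reference each other, but nothing reachable points to them.
orphanA.refs.push(orphanB);
orphanB.refs.push(orphanA);

mark([root]);
console.log(sweep());   // [ 'orphanA', 'orphanB' ]
console.log(heap.size); // 3
```

The two orphans are collected even though they reference each other, because neither can be reached from the root.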

I have also written a separate article about garbage collection and memory leaks, if you are interested.

Best Practices for JavaScript Developers

Understanding the intricacies of the JavaScript engine not only demystifies the magic behind our applications but also equips developers with the knowledge to write more efficient, performant code. With this engine-centric view, let’s see a few practices that can enhance the performance of JavaScript applications.

Writing Engine-Friendly JavaScript

  • Avoid Global Variables: While convenient, excessive use of global variables can slow down property lookups. Stick to local scope as much as possible.
  • Limit Use of Closures: Closures are powerful, but they can also inadvertently prevent garbage collection of objects, leading to memory leaks.
  • Batch DOM Manipulations: Direct DOM interactions are costly. Instead of making multiple small changes, batch them together to minimize reflows and repaints.

Common Pitfalls & Their Avoidance

  • Memory Leaks: Always be cautious when setting up event listeners or storing large data structures. Clean up listeners when they’re not needed.
  • Non-optimized Loops: Avoid repeating the same calculation on every iteration. If a value doesn’t change inside the loop, compute it once before the loop and reuse it.
  • Forcing Synchronous Layouts: Accessing certain properties (like offsetWidth or scrollTop) can force the browser to perform synchronous layout calculations. Be mindful of when and how often you're accessing these properties.
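Two of these pitfalls are easy to show in a few lines. The sketch below uses made-up names (normalizeSlow, normalizeFast, onPing) and the standard EventTarget and Event globals, which are available in browsers and in recent versions of Node.js. Treat it as an illustration of the pattern rather than a benchmark.

```javascript
// 1. Loops: hoist work that doesn't change out of the loop.

// Recomputes the maximum on every iteration, although it never changes.
function normalizeSlow(values) {
  const result = [];
  for (let i = 0; i < values.length; i++) {
    result.push(values[i] / Math.max(...values));
  }
  return result;
}

// Computes the maximum once, then reuses it.
function normalizeFast(values) {
  const max = Math.max(...values);
  return values.map((value) => value / max);
}

console.log(normalizeFast([2, 4, 8])); // [ 0.25, 0.5, 1 ]

// 2. Listeners: keep a reference to the handler so you can remove it later.
const target = new EventTarget();
let calls = 0;
const onPing = () => {
  calls++;
};

target.addEventListener("ping", onPing);
target.dispatchEvent(new Event("ping")); // handled

// Once the listener is no longer needed, remove it so that the handler
// (and anything it closes over) is no longer kept alive by the target.
target.removeEventListener("ping", onPing);
target.dispatchEvent(new Event("ping")); // no longer handled

console.log(calls); // 1
```

Note that removeEventListener only works if you pass the same function reference you registered, which is why the handler is stored in a variable instead of being written inline.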

Conclusion

I know this is a dense topic, and it could go even deeper. Knowing the 'Science of JavaScript Engines' will give you a more in-depth understanding of how everything works under the hood and, consequently, help you write better code that won't trip up the compiler.

If you have any questions, drop a comment and I will be glad to help.

Let’s get connected! You can find me on:
- Medium: https://medium.com/@yuribett
- Linkedin: https://www.linkedin.com/in/yuribett/
- X (formerly Twitter): https://twitter.com/yuribett

Senior Software Engineer | Technical Lead | Technical Writer - I love everything about Javascript, React.js, Next.js, and Node.js