JavaScript: Garbage Collection

JavaScript: Garbage Collection
JavaScript: Garbage Collection

What is Garbage Collection?

Overview

Garbage collection (GC) is a form of automatic memory management. The garbage collector attempts to reclaim memory which was allocated by the program, but is no longer referenced also called garbage. Garbage collection was invented by American computer scientists around 1959 to simplify manual memory management in LISP.
Garbage collection may take a significant proportion of total processing time in a program and, as a result, can have significant influence on performance.

In JavaScript

JavaScript uses Tracing garbage collection strategy in which objects should be garbage collected by tracing which objects that are reachable by a chain of references from certain root objects, and considering the rest as garbage and collecting them.

Memory Lifecycle

  • Allocate memory — memory is allocated by the operating system which allows your program to use it. In low-level languages (e.g. C) this is an explicit operation that you as a developer should handle. In high-level languages, however, this is taken care of for you.
  • Use memory — this is the time when your program actually makes use of the previously allocated memory. Read and write operations are taking place as you’re using the allocated variables in your code.
  • Release memory — now is the time to release the entire memory that you don’t need so that it can become free and available again. As with the Allocate memory operation, this one is explicit in low-level languages.

Memory Management

JS uses STACK for STATIC ALLOCATION & HEAP for DYNAMIC ALLOCATION.

A lot of things are stored in this memory:-

1.All variables and other data used by all programs.
2.The programs’ code, including the operating system’s.

When you compile your code, the compiler can examine primitive data types and calculate ahead of time how much memory they will need. The required amount is then allocated to the program in the call stack space. The space in which these variables are allocated is called the stack space because as functions get called, their memory gets added on top of the existing memory. As they terminate, they are removed in a LIFO (last-in, first-out) order. Heap is a memory area where variables are stored in an unordered fashion.

If we write a code with primitive values like this:

const num1 = 90 
const num2 = 100

When we write a JavaScript script with Object values:-

const obj1 = { val: 80 } 
const obj2 = { val: 90 } 

MARK AND SWEEP ALGORITHM FOR GC IN JS

In order to decide whether an object is needed, this algorithm determines whether the object is reachable.

The Mark-and-sweep algorithm goes through these 3 steps:
1.Roots: In general, roots are global variables which get referenced in the code. In JavaScript for example, a global variable that can act as a root is the “window” object. The identical object in Node.js is called “global”. A complete list of all roots gets built by the garbage collector.
2.The algorithm then inspects all roots and their children and marks them as active (meaning, they are not garbage). Anything that a root cannot reach will be marked as garbage.
3.Finally, the garbage collector frees all memory pieces that are not marked as active and returns that memory to the OS.

EXAMPLE :

function marry(man, woman) { 
woman.husband = man;  
man.wife = woman;
return { father: man, mother: woman } }  
let family = marry({ name: "John" }, { name: "Ann" });

The resulting memory structure:

**If we do: **
family = null;

It's obvious that John and Ann are still linked, both have incoming references. But that’s not enough.
The former “family” object has been unlinked from the root, there’s no reference to it any more, so the whole island becomes unreachable and will be removed.

Some of the optimizations:

  • Generational collection– objects are split into two sets: “new ones” and “old ones”. Many objects appear, do their job and die fast, they can be cleaned up aggressively. Those that survive for long enough, become “old” and are examined less often.
  • Incremental collection– if there are many objects, and we try to walk and mark the whole object set at once, it may take some time and introduce visible delays in the execution. So the engine tries to split the garbage collection into pieces. Then the pieces are executed one by one, separately. That requires some extra bookkeeping between them to track changes, but we have many tiny delays instead of a big one.
  • Idle-time collection– the garbage collector tries to run only while the CPU is idle, to reduce the possible effect on the execution.

V8 ENGINE GENERAL COLLECTION

According to v8 developer, In the vast majority of programs, objects tend to die young: most objects have a very short lifetime, while a small minority of objects tend to live much longer. To take advantage of this behavior, V8 divides the heap into two generations. Objects are allocated in new-space, which is fairly small (between 1 and 8 MB).
Allocation in new space is very cheap: we just have an allocation pointer which we increment whenever we want to reserve space for a new object. When the allocation pointer reaches the end of new space, a scavenge(minor garbage collection cycle) is triggered, which quickly removes the dead objects from new space. Objects which have survived two minor garbage collections are promoted to old-space. Old-space is garbage collected during a mark-sweep (major garbage collection cycle), which is much less frequent.
A major garbage collection cycle is triggered when we have promoted a certain amount of memory to old space. This threshold shifts over time depending on the size of old space and the behavior of the program.

WeakMap

The WeakMap object is a collection of key/value pairs in which the keys are weakly referenced. The keys must be objects and the values can be arbitrary values. If we use an object as the key in it, and there are no other references to that object – it will be removed from memory (and from the map) automatically.

Why WeakMap?

Properties of an object or elements of an array or another data structure are considered reachable and kept in memory while that data structure is in memory. WeakMaps can provide you with more security other collections or data structures can’t.
This benefit of security goes even further if you take into consideration garbage collection. Remove all references to an object and all “sensitive” data associated with that object will be gone as well, sooner or later.
For instance, if we put an object into an array, then while the array is alive, the object will be alive as well, even if there are no other references to it as arrays are objects in javascript so object of array will have reference to the other object.

Example

let john = { name: "John" }; 
let arr = [ john ]; 
john = null; 
// the object previously referenced by john is stored  inside the array 
// therefore it won't be garbage-collected // we can get it as arr[0] 

So we can use WeakMap instead, example:

let john = { name: "John" }; 
let weakMap = new WeakMap();  
weakMap.set(john, "...");  
john = null;  
// overwrite the reference 
// john is removed from memory!

WeakMap Use cases:

  • Additional Data: If we’re working with an object that “belongs” to another code, maybe even a third-party library, and would like to store some data associated with it, that should only exist while the object is alive – then WeakMap is exactly what’s needed.
  • Caching: We can store cache results from a function, so that future calls on the same object can reuse it and can remove that object when not needed.

Example 1
Say you have a DOM element and want to associate some data with it and use a WeakMap for that: weakMap.set(domElement, data);. When the DOM element gets deleted then the entry in the weak map gets deleted too. On the other hand you would not want the data to be deleted as long the DOM element exists, just because there is no other reference to it outside the weak map.

Example 2
With WeakMaps, you can associate previously computed results with objects without having to worry about memory management. The following function countOwnKeys() is an example: it caches previous results in the WeakMap cache.

const cache = new WeakMap();` 
`function countOwnKeys(obj) { 
 if (cache.has(obj)) { 
 return [cache.get(obj), 'cached']; 
 } else { 
 const count = Object.keys(obj).length; 
 cache.set(obj, count); 
 return [count, 'computed']; 
 } 
} 

If we use this function with an object obj, you can see that the result is only computed for the first invocation, while a cached value is used for the second invocation:

> const obj = { foo: 1, bar: 2}; 
> countOwnKeys(obj)
[2, 'computed'] 
> countOwnKeys(obj) 
[2, 'cached'] 

WeakSet

Being “weak” like WeakMap, it also serves as additional storage. But not for arbitrary data, rather for “yes/no” facts. A membership in WeakSet may mean something about the object.
It only supports add(), has() and delete() methods where add() can have only 1 argument i.e object.
EXAMPLE:

let john = { name: "John" };  
const weakSet = new WeakSet() 
weakSet.add(john) 
let result = weakSet.has(john) // prints true 

Note: WeakSet & WeakMap does not support iteration and methods like keys(),values(),entries(), so there’s no way to get all keys or values from it.

Because technically it’s not exactly specified when the cleanup is done by JavaScript engine. It decides that so It may choose to perform the memory cleanup immediately or to wait and do the cleaning later when more deletions happen.
So, technically, the current element count of a WeakMap or WeakSet is not known. The engine may have cleaned it up or not, or did it partially. For that reason, methods that access all keys/values are not supported.

WeakSet Use Case

One simple example can be a login system. You could keep track of users (objects) that are online by adding them to a WeakSet. When any of those users leaves you remove appropriate object. Later, you can use the has() method to check if specific user is still online, appropriate object exists, or not.

Example

Summary

Garbage essentially refers to things which are no longer of use. Javascript. For a high level programming language like JavaScript, the memory management process is automated. The browser takes care of that thing for us. JavaScript uses Tracing garbage collection strategy in which objects should be garbage collected by tracing which objects that are reachable by a chain of references from certain root objects. WeakMap & WeakSet data structures comes to rescue for optimizing memory management which checks properties of an object or elements of an array or another data structure. If they are not reachable then these properties are garbage collected.