Jesse's Software Engineering Blog
Understanding the NodeJS Event Loop
While there are numerous articles and documentation written about the NodeJS event loop, I wanted to write about it myself to help solidify the concept. As a Node developer, it is important to understand what the “event-driven, non-blocking I/O model” paradigm is and how asynchronous functions are executed.
Simply put, the event loop is “an entity that handles and processes external events and converts them into callback invocations”. The event loop can be thought of as a message queue dispatcher and callbacks as event handlers. The event-driven model implies that application flow is determined by events and/or changes of state, meaning that listeners (the callbacks) can be attached to events, and when an event fires (e.g. completion of an asynchronous I/O request) the listener executes the functionality provided.
Mozilla offers a simple image of how the event loop works. The stack is where execution contexts are placed. An execution context can be thought of as the current state or scope of an executing function. The first item on the stack is always the global execution context, or global scope. As functions are called, their contexts are pushed onto the stack, containing information about state and the commands to be executed. Because Node is lexically scoped, execution contexts inherit scope from their parents, with closures ensuring that nested scopes still have access to their parents' scopes even after the parent's execution has completed.
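To make the closure point concrete, here is a minimal sketch (the makeCounter name and counter variable are purely illustrative): the inner function retains access to its parent's scope even after the parent has returned and its context has been popped off the stack.

```javascript
// `count` lives in makeCounter's scope, but the returned function keeps
// access to it even after makeCounter's context has been popped.
function makeCounter() {
  let count = 0;
  return function () {
    count += 1;
    return count;
  };
}

const counter = makeCounter();  // makeCounter has finished executing
console.log(counter()); // 1 -- its scope is still reachable via the closure
console.log(counter()); // 2
```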
For example, when running synchronous code, the execution contexts are placed onto the stack and executed in LIFO order. Writing code like this:
```javascript
function one() {
  two();
  const f = 'one';
  console.log(`inside function ${f}`);
}

function two() {
  const f = 'two';
  console.log(`inside function ${f}`);
}

const f = 'init';
one();
console.log(`executed successfully with ${f} preserved`);
```
will cause the initial context (the global execution context) to be pushed onto the stack, followed by function one's context, then function two's. Once function two completes, its context is popped off the stack and the Node process resumes executing function one's context. Pushing new contexts onto the stack and popping them off is the only execution flow until the stack is empty, i.e. all instructions in the global context have been executed.
Once the stack is emptied, Node begins polling the message queue for tasks (typically I/O tasks) that have completed. Those tasks have been delegated, through the C library libuv, to the system kernel via the operating system's event notification system. Since Node is single threaded, and I/O is typically the slowest operation a program will execute, Node leverages the kernel to run I/O tasks in parallel with the Node process. When a message is pulled off the queue, the message's execution context (scope and state) is pushed onto the stack and Node begins working through the contexts, starting with the message's associated callback. Depending on the nature of the callback, this can push more synchronous execution onto the current stack, or push more messages onto the message queue. Once the stack is cleared, Node polls the message queue again, until there are no more messages on the queue and the stack is empty, at which point the process comes to an end. While this is a bit of an oversimplification, it should demonstrate the fundamentals of how the Node event loop works and how asynchronous I/O tasks are run in parallel with the Node process via the operating system's kernel.
Single Threaded
It is very important to understand that Node is single threaded, meaning the Node process can only execute a single command at a time. While the event loop and the asynchronous nature of non-blocking I/O may appear to be multithreading, Node is actually only executing one task at a time while pushing off certain tasks to the kernel to be run in parallel. Let's take a look at a simple example. The setTimeout function pushes a message onto the message queue after at least the given number of milliseconds has passed:
```javascript
setTimeout(() => {
  console.log('executing first large task...');
  for (var i = 0; i < 5000000000; i++) { }
  console.log('finished first large task...');
}, 0);

console.log('executing second large task...');
for (var i = 0; i < 5000000000; i++) { }
console.log('finished second large task...');
```
with the following output:
```
executing second large task...
finished second large task...
executing first large task...
finished first large task...
```
By using the setTimeout function, we pushed the callback function, along with its execution context, onto the message queue after 0 ms. The Node process moved on to the next instruction in the current context (the global context), which is the “second large task”. Once that task completed, the global execution context had no more instructions, so the stack was cleared. Node then began polling the message queue and found the message and callback for the setTimeout function, even though several seconds had passed despite the 0 ms parameter.
This example demonstrates a couple of key points. First, it shows that there is only a single thread. Since CPU bound operations are not asynchronous, the Node process did not start executing the callback until the current execution context was finished, the stack cleared, and the setTimeout context could be pulled onto the stack. It also demonstrates that even though we defined a timeout of 0, or immediate, the callback does not execute until its message has been pulled off of the message queue. If several “tasks” were all pushed onto the message queue via setTimeout, they would be executed in the order in which they were enqueued (after each ms value has passed), i.e. processed FIFO, demonstrating the nature of the message queue.
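That FIFO behaviour can be sketched as follows (the labels are illustrative): several zero-delay timers are enqueued in order, and their callbacks run in that same order once the stack has cleared.

```javascript
// Each setTimeout enqueues a message; with equal delays they are
// processed first-in, first-out once the stack is empty.
const ran = [];

setTimeout(() => ran.push('first'), 0);
setTimeout(() => ran.push('second'), 0);
setTimeout(() => ran.push('third'), 0);

setTimeout(() => {
  console.log(ran.join(', ')); // first, second, third
}, 0);
```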
Since Node is single threaded, and only I/O tasks can be run in parallel on the OS kernel, writing non-blocking CPU bound tasks in Node is difficult. To demonstrate, we set an I/O bound task onto the message queue on an interval, which should fire every 200 ms:
```javascript
setInterval(function () {
  console.log('io bound work completed');
}, 200);

cpubound();
cpubound();

function cpubound() {
  console.log('executing cpu bound task...');
  for (var i = 0; i < 5000000000; i++) { }
  console.log('finished cpu bound task...');
}
```
Since the CPU bound tasks take so long to complete, the I/O callback and the event loop are blocked until the CPU bound tasks are completed, which takes several seconds:
```
executing cpu bound task...
finished cpu bound task...
executing cpu bound task...
finished cpu bound task...
io bound work completed
io bound work completed
...
```
As another example, consider a web server where one of the routes runs a CPU bound task. While the Node process is handling that request, the web server is unable to accept any more connections:
```javascript
const http = require('http');

const requestHandler = (request, response) => {
  if (request.url === '/block') {
    for (var i = 0; i < 5000000000; i++) { }
  }
  response.end('Response Sent!\n');
};

const server = http.createServer(requestHandler);

server.listen(3000, (err) => {
  console.log('server is listening on 3000');
});
```
A request to the /block route will completely lock the server, not allowing any more requests until the CPU bound task is completed. This can be very disruptive to a web server; or imagine a JavaScript web application that becomes unresponsive to user input because the JavaScript process (event loop) is blocked.
Non Blocking CPU Bound Operations
So what can be done to prevent this? Falling back on the event-driven nature of Node, we can try using an event emitter to get around the blocking nature of CPU bound workloads:
```javascript
const events = require('events');
const emitter = new events.EventEmitter();

emitter.on('cpubound', (data) => {
  console.log('cpubound starting....');
  cpubound();
});

emitter.on('iobound', (data) => {
  console.log('iobound starting....');
  iobound();
});

emitter.emit('cpubound', {});
emitter.emit('iobound', {});
console.log('finished script flow...');

function cpubound() {
  for (var i = 0; i < 5000000000; i++) { }
  console.log('cpu bound work completed');
}

function iobound() {
  setTimeout(function () {
    console.log('io bound work completed');
  }, 0);
}
```
```
cpubound starting....
cpu bound work completed
iobound starting....
finished script flow...
io bound work completed
```
Unfortunately this does not work either: firing an event does not push CPU bound tasks or synchronous code onto the message queue, even when they are wrapped as event listeners. Emitted events are executed synchronously, in the same execution context in which they are emitted.
What about Promises? According to the Mozilla documentation, “The Promise object is used for asynchronous computations”. We can try wrapping the task in a Promise to get the CPU task onto the queue:
```javascript
cpubound().then(d => {
  console.log('cpu bound task completed...');
});

console.log('global context finished...');

function cpubound() {
  return new Promise((resolve, reject) => {
    for (var i = 0; i < 5000000000; i++) { }
    resolve();
  });
}
```
```
global context finished...
cpu bound task completed...
```

Nope. The Promise executor function runs synchronously, so the CPU bound loop still blocks the thread for several seconds before anything else can run. Only the .then callback is deferred (to the microtask queue), which is why “global context finished...” prints first. Unless the Promise wraps an async I/O call, no work is pushed onto the message queue, just as with the event emitter.
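The split between the two halves of a Promise can be sketched like this (variable names are illustrative): the executor runs synchronously on the current stack, while the .then callback is deferred until the synchronous script body has finished.

```javascript
// The executor body blocks like any other synchronous code; only the
// .then handler is deferred until after the current stack clears.
const order = [];

new Promise((resolve) => {
  order.push('executor'); // runs immediately, blocking the thread
  resolve();
}).then(() => {
  order.push('then callback'); // runs after the script body completes
  console.log(order.join(' -> ')); // executor -> script end -> then callback
});

order.push('script end');
```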
By using the setInterval function, it is possible to break CPU bound tasks up by pushing chunks onto the message queue, as opposed to blocking the entire thread with one long task. This allows other requests to be put onto the message queue during large CPU bound tasks:
```javascript
function cpubound() {
  let i = 0;
  let max = 5000000000;
  let batch = 50000000;
  console.log('starting cpu bound');
  var loop = setInterval(() => {
    if (i >= max) {
      console.log('cpu bound done');
      clearInterval(loop);
      return;
    }
    for (var j = i; j < i + batch; j++) { }
    console.log('cpu iteration done');
    i += batch;
  }, 0);
}

cpubound();

setInterval(function () {
  console.log('io bound work completed');
}, 800);
```
Notice the intermixing of the I/O bound tasks in between the CPU bound workload:
```
starting cpu bound
cpu iteration done
io bound work completed
cpu iteration done
cpu iteration done
io bound work completed
cpu iteration done
cpu iteration done
io bound work completed
cpu iteration done
io bound work completed
io bound work completed
...
```
This breaks the large CPU bound task into batches, pushing them onto the message queue so the process can pick up other tasks, like incoming HTTP requests, and so new messages can be enqueued between batch iterations. While this works, it is usually not ideal to have to break large tasks into setInterval callbacks throughout an application.
Conclusion
While having a single threaded, event-driven architecture in Node may seem a bit limiting, it's actually much more efficient than the thread-per-request or process-per-request models used by the Apache web server, which incur the overhead of thread management and are limited by the maximum number of processes or threads per core. The event-driven architecture is very similar to that used by the Nginx web server, and when coupled with Node can form a very robust, scalable web server architecture. As long as developers are conscious of the event loop and avoid running large tasks that block the Node thread, Node is extremely efficient, especially considering that I/O and disk operations can be an order of magnitude slower than data processing and can be run in parallel on the OS kernel.
Hopefully this overview has helped solidify the concept of the event loop and the event-driven paradigm of Node. It should also have demonstrated the importance of understanding how the stack and the message queue work with one another, and how catastrophic it can be to performance if the event loop is blocked by a large CPU bound task. Next I'll dive into more robust solutions for handling large CPU bound tasks, as opposed to wrapping everything in setInterval calls.