Note: From this post on, I am gonna number the concepts for easier organization and reference. The old post titles will also be updated to reflect this.

NodeJS is a platform built on Chrome's V8 JavaScript engine that offers asynchronous, non-blocking I/O operations for implementing a web server. If that whole thing just sounded like a bunch of Greek to you, I feel you. Jargon-filled websites aren't what I like to see, either. So here I am, trying to explain what the asynchronous programming model is (or rather, what I understand of it) and what it means in the context of the World Wide Web.

Now this is kinda different from the posts I have made so far on this blog, in that I don't really know a lot of what the hell I'm talking about right now. It's not a concept like algorithms, where it is actually possible to visualize and understand the theoretical stuff despite having... ehh, not much practical knowledge in the scheme of things. This is something I can only learn from experience, and there most definitely have been (and will be) mistakes.

However, on the other hand, this kind of post is EXACTLY what I started this blog for. I'm no algorithm tutor. This blog is meant for me to share what I know, figure out how I think, articulate myself, and record my mistakes here for posterity. So if I have made any mistakes, or have forgotten something, please, please correct me in the comments. I will update the post to reflect the error, and put in the necessary explanations. Plus, you win a free virtual cookie.

Okay, enough rambling. Time to get on with the topic.

Prerequisites

You will need to know some basic JavaScript to follow this post. It would be helpful if you have done some basic webdev and at least know what a server and a client are.

Asynchronous Development: The Basics

A lot of people find it hard to work with asynchronous systems, simply because they are so different from normal, sane, synchronous systems. So what exactly is the difference? Consider the following code in JavaScript (delay here is a made-up function for illustration, not something built into the language).

console.log("Hey");
delay(1000); // delays execution for 1000 ms
console.log("What");
delay(1000);
console.log("Up");

With synchronous code, we can expect the following output:

After 0 ms:

Hey

After 1000 ms:

Hey
What

After 2000 ms:

Hey
What
Up

However, with async code, we will see the following:

After 0 ms:

Hey
What
Up

Wait. What?

Yes, all three outputs arrive at (almost) the same time, immediately when the code is run. There is no waiting for those delays. In fact, what happens behind the scenes is this:

> register that "Hey" has to be printed, start printing
> register that a delay of 1000 ms has to be done, start the delay
> register that "What" has to be printed, start printing
> register that a delay of 1000 ms has to be done, start the delay
> register that "Up" has to be printed, start printing

Execution
---------

> 0 ms: print completed: "Hey"
> 0 ms: print completed: "What"
> 0 ms: print completed: "Up"
> 1000 ms: delay completed
> 1000 ms: delay completed

Does the above make sense? No? Well, the process is simple. The server acknowledges that an action has to be done when it encounters the line of code that starts that action, and it always starts that action immediately. When the action completes, its results are printed or its state changes are applied, depending on what the function does. In the above example, printing is an (almost) instantaneous task, and hence the print jobs finish immediately. Each delay takes longer to complete and thus runs in the background, so both delays end (nearly) simultaneously, about 1000 ms after they were registered.
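If you want to see this for yourself, here is a minimal sketch using setTimeout, which is a real timer function available in both browsers and NodeJS (unlike the made-up delay above). The say helper and the log array are just names I picked for this example.

```javascript
// `log` records every message in order so the sequence can be inspected.
const log = [];

function say(message) {
  log.push(message);
  console.log(message);
}

say("Hey");
// setTimeout registers a timer and returns immediately; it does not pause the script.
setTimeout(() => say("first delay completed"), 1000);
say("What");
setTimeout(() => say("second delay completed"), 1000);
say("Up");

// Printed right away:       Hey, What, Up
// Printed about 1 s later:  first delay completed, second delay completed
```

The three words show up immediately, and the two "delay completed" lines follow roughly a second later, which matches the timeline above.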

Again, wait, what?

Okay, I admit. Maybe I still don't make any sense. In this case, the right thing to do would be to introduce a graphic.

The above diagram shows something called the Event Loop. What the event loop does is keep track of multiple actions that are in progress in the background at the same time. When a function is called, the server registers it in the event loop. When the function is done executing, its results are displayed as required. Note that the timeline displayed is not to scale: the actions under the 0 ms mark execute almost simultaneously, as do the actions under the 1000 ms mark.
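Here is a small sketch of that idea, again leaning on setTimeout to stand in for "some action that takes a while". The record helper, the events array and the timings are all made up for illustration. The point is that results come back in the order the actions finish, not the order they were registered.

```javascript
const events = [];

function record(message) {
  events.push(message);
  console.log(message);
}

// Register the slow action first and the fast one second.
setTimeout(() => record("slow action done (300 ms)"), 300);
setTimeout(() => record("fast action done (100 ms)"), 100);
record("both actions registered, main code finished");

// Output order:
// both actions registered, main code finished
// fast action done (100 ms)
// slow action done (300 ms)
```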

So what is the point of all this? Why code like this? Doesn't this seem like unnecessary confusion for not much of a purpose?

Well, the true use of asynchronous programming is its ability to run multiple actions simultaneously. A little thought can expose the advantages of such a method in the context of web servers. The web server is not required to wait for anything while an action executes in the background: for example, a database query or a file retrieval. Consider the following series of steps in a web server:

- Client requests HTML file a (100 ms)

- Client performs database query to obtain a string for a file b (300 ms)
- > Server uses string to search for file b  and returns it (200 ms)

- Client requests HTML file c (100 ms)

- Client requests JPEG file d (200 ms)

The above series of steps will be executed as such on a synchronous server (note that steps depending on preceding steps are prefaced with a >):

However, on an asynchronous server, the results are quite different:

As plainly visible, the above steps complete a lot faster on the async server, with the total latency being just 500 ms compared to the 900 ms required by the synchronous server. The synchronous server runs every step back to back (100 + 300 + 200 + 100 + 200 = 900 ms), while the async server only has to wait for the longest chain of dependent steps, the database query followed by the search for file b (300 + 200 = 500 ms). This means faster loading times and no requirement to wait for text and other content to load while the page queries the database for some banner ad or something. When two commands do not depend on each other, they can be executed in parallel, which reduces latency and loading time by a lot.
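The 900 ms and 500 ms figures can be checked with a few lines of code. This is only a model of the arithmetic, not a real server: it assumes an idealized async server where independent requests overlap perfectly, and the function names are my own.

```javascript
// Each inner array is a chain of steps that must run one after another.
// Durations are in milliseconds and come from the list of requests above.
const chains = [
  [100],       // HTML file a
  [300, 200],  // database query, then the search for file b
  [100],       // HTML file c
  [200],       // JPEG file d
];

const sum = (numbers) => numbers.reduce((total, n) => total + n, 0);

// Synchronous server: every step waits for the one before it.
function synchronousLatency(chains) {
  return sum(chains.map((chain) => sum(chain)));
}

// Asynchronous server (idealized): chains overlap, so the longest chain decides.
function asynchronousLatency(chains) {
  return Math.max(...chains.map((chain) => sum(chain)));
}

console.log(synchronousLatency(chains));  // 900
console.log(asynchronousLatency(chains)); // 500
```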

What about when the next command does depend on the current one? In that case, we rely on callbacks to execute our code in the correct order, as we want to run some things only after certain conditions are met. For example, the search for file b above can only start AFTER the database query has completed. In this case, and in others that require sequential execution, callbacks get the code to work in the intended manner.

Callbacks work quite like the following example:

In the above example, the following function set is being executed:

X(function Z() {});
Y();

According to the above code, Z is registered as the callback function of X. In other words, when X completes execution, the event loop starts execution of Z immediately. This way, we can ensure that Z is executed only after X is done.
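To make that concrete, here is a runnable sketch of the X, Y, Z arrangement. I am assuming X is some slow operation, faked here with a 100 ms setTimeout, and the order array exists only so we can look at the sequence afterwards.

```javascript
const order = [];

// X is a stand-in for any slow operation; it calls `callback` when it finishes.
function X(callback) {
  order.push("X started");
  setTimeout(() => {
    order.push("X finished");
    callback();
  }, 100);
}

function Y() {
  order.push("Y ran");
}

X(function Z() {
  order.push("Z ran");
  console.log(order.join(" -> "));
  // X started -> Y ran -> X finished -> Z ran
});
Y();
```

Notice that Y runs before Z even though Z appears first in the code: Y does not depend on X, so it does not wait for it.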

However, if we have a lot of sequential code, this can become worrisome if we have to write it using callbacks. Quite soon, your code will become a clustered mess of curly braces, as you will be nesting function inside function inside function.

Not to fear. NodeJS has tons of libraries for that purpose. For example, there's the async library. Libraries like this allow you to list functions within blocks that execute sequentially. However, it is recommended to use asynchronous operations wherever possible, as they increase the performance of the NodeJS server.
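To give a feel for what such a helper does without pulling in any library, here is a tiny hand-rolled version. To be clear, runInSeries is a hypothetical function I wrote for this post and not the API of async or any other package, and queryDatabase and findFile are fake steps with made-up timings.

```javascript
// Hypothetical helper: runs Node-style tasks one after another.
// Each task receives a callback of the form (error, result).
function runInSeries(tasks, done) {
  const results = [];

  function next(index) {
    if (index === tasks.length) {
      return done(null, results);
    }
    tasks[index]((error, result) => {
      if (error) {
        return done(error); // stop at the first failure
      }
      results.push(result);
      next(index + 1);
    });
  }

  next(0);
}

// Simulated slow steps (names and timings are made up).
const queryDatabase = (cb) => setTimeout(() => cb(null, "name of file b"), 300);
const findFile = (cb) => setTimeout(() => cb(null, "contents of file b"), 200);

runInSeries([queryDatabase, findFile], (error, results) => {
  if (error) throw error;
  console.log(results); // logs both results, in the order the tasks were listed
});
```

The steps are written as a flat list instead of being nested inside each other, which is the whole appeal.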

How does Node implement Async?

Not all operations in NodeJS are async. This is a common source of confusion for a lot of people using Node (including yours truly, when I first started out). For example, the following code will run just fine:

a = b+c;
d = a+c;

If all code ran in an async manner, the above code would use the old value of a while calculating d. This, however, does not happen. The code waits for the first statement to execute before running the second.

However, all time-consuming operations like database and file accesses are generally async in nature. It is often helpful to look at the API of any library you use. As a rule of thumb, functions with callbacks specified as parameters in the API will be asynchronous in nature, though you should always try and test it first to be sure.
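Here is why the "test it first" part matters. Both functions below take a callback, but only one of them is asynchronous. Array.prototype.forEach and setTimeout are real built-ins; the trace array is just there to record the order of events.

```javascript
const trace = [];

// forEach takes a callback but runs it synchronously, right away.
[1, 2].forEach((n) => trace.push("forEach " + n));
trace.push("after forEach");

// setTimeout also takes a callback, but runs it later, even with a 0 ms delay.
setTimeout(() => {
  trace.push("timeout callback");
  console.log(trace.join(", "));
  // forEach 1, forEach 2, after forEach, after setTimeout, timeout callback
}, 0);
trace.push("after setTimeout");
```

So a callback parameter is a good hint, but not a guarantee, that a function is async.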

That's all on asynchronous operations and NodeJS for now. I know there have been no real NodeJS examples on this, but this post has been aimed primarily at tackling the concept of async development and code. Expect further posts in the future about NodeJS and related technology.

Writer's note: I know this has come two days late, and I sincerely apologize. I have been busy traveling and I've not really had enough time to articulate my thoughts for this purpose. I am currently lagging behind schedule by FOUR posts (Gasp!). I'll try to get back on track over the next few days on this, so stay tuned.

Written with StackEdit.

The Programmer [In Training]


Thursday, 4 December 2014

Concept 6: Asynchronous Programming (nodejs)
