Testing nodejs streams
The asynchronous computation model makes nodejs a good fit for heavy computations while keeping a relatively low memory footprint. The stream API is one of those computation models; this article explores how to approach testing it.
In this article we will talk about:
- The difference between Readable/Writable and Duplex streams
- Testing Writable streams
- Testing Readable streams
- Testing Duplex or Transform streams
Even though this blog post was designed to offer complementary materials to those who bought my Testing nodejs Applications book, the content can help any software developer to tune up their working environment. You can use this link to buy the book.
Show me the code
//Read + Transform + Write stream processing example
var fs = require('fs'),
    zlib = require('zlib'),
    route = require('express').Router();
//getter() reads a large file of songs metadata, transforms it and sends back scaled-down metadata
route.get('/songs', function getter(req, res, next){
    let rstream = fs.createReadStream('./several-tb-of-songs.json');
    rstream
        .pipe(new MetadataStreamTransformer())
        .pipe(zlib.createGzip())//a gzip stream is single-use, so create one per request
        .pipe(res);
    // forwarding the error to the next handler
    rstream.on('error', error => next(error));
});
//Transform stream example
const inherits = require('util').inherits,
    Transform = require('stream').Transform;
function MetadataStreamTransformer(options){
    if(!(this instanceof MetadataStreamTransformer)){
        return new MetadataStreamTransformer(options);
    }
    // enforces object mode chunks
    this.options = Object.assign({}, options, {objectMode: true});
    Transform.call(this, this.options);
}
inherits(MetadataStreamTransformer, Transform);
MetadataStreamTransformer.prototype._transform = function(chunk, encoding, next){
    //minimalistic implementation
    //@todo process chunk by adding/removing elements
    let data = JSON.parse(typeof chunk === 'string' ? chunk : chunk.toString('utf8'));
    //random() is assumed to be an id generator defined elsewhere in the module
    this.push({id: (data || {}).id || random()});
    if(typeof next === 'function') next();
};
MetadataStreamTransformer.prototype._flush = function(next) {
    this.push(null);//signals that no more data will be pushed
    if(typeof next === 'function') next();
};
The example above provides a clear picture of the context in which Readable, Writable, and Duplex (Transform) streams can be used.
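As a quick refresher, the snippet below is a stand-alone, illustrative sketch and is not part of the application code: it constructs the three flavors directly from the stream module, where fs.createReadStream() plays the Readable role above, the HTTP response plays the Writable role, and MetadataStreamTransformer plays the Duplex (Transform) role.
//Illustrative snippet: the three stream flavors side by side
const { Readable, Writable, Transform } = require('stream');
// Readable: produces chunks (the role fs.createReadStream() plays above)
const readable = new Readable({ read(){} });
readable.push('{"id": 1}');
readable.push(null);
// Writable: consumes chunks (the role the HTTP response plays above)
const writable = new Writable({
    write(chunk, encoding, callback){
        console.log('received:', chunk.toString());
        callback();
    }
});
// Duplex/Transform: consumes, transforms and re-emits chunks (the role MetadataStreamTransformer plays above)
const transform = new Transform({
    transform(chunk, encoding, callback){
        callback(null, chunk.toString().toUpperCase());
    }
});
readable.pipe(transform).pipe(writable);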
What can possibly go wrong?
Streams are particularly hard to test because of their asynchronous nature, and streams that wrap I/O against the filesystem or third-party endpoints are no exception. It is easy to fall into the integration testing trap when testing nodejs streams.
Among other things, the following are challenges we may expect when (unit) testing streams:
- Identifying areas where it makes sense to stub
- Choosing the right mock object output to feed into stubs
- Mocking stream read/transform/write operations
There is a separate article dedicated to stubbing stream functions, so the current text will not go into the details of the stubbing parts.
Choosing tools
If you haven't already, reading the “How to choose the right tools” blog post gives insights into the framework we used to pick the tools suggested here. Following our own “Choosing the right tools” framework, the tools below are not a prescription; they are simply the ones that made sense to complete this article:
- We can choose amongst a myriad of test runners, for instance jasmine (jasmine-node), ava or jest. mocha was appealing in the context of this writeup, but choosing any other test runner does not make this article obsolete.
- The stack mocha, chai and sinon (assertion and test doubles libraries) is worth a shot.
- node-mocks-http is the framework used for mocking HTTP Request/Response objects.
- Code under test is instrumented to make coverage reporting possible. The test coverage tool we adopted, also widely adopted by the mocha community, is istanbul.
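To follow along, the whole stack can be installed as development dependencies; the package names below are the ones published on npm.
# Installing the testing stack as devDependencies
$ npm install --save-dev mocha chai sinon node-mocks-http istanbul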
Workflow
It is possible to generate coverage reports as the tests progress. The latest versions of istanbul ship under the nyc name.
# In package.json at "test" - add the next line
> "istanbul cover _mocha -- --colors --reporter mocha-lcov-reporter specs"
# Then run the tests using
$ npm test
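For newer setups where nyc has replaced the istanbul CLI, an equivalent script is sketched below; the lcov reporter choice mirrors the example above and is illustrative.
# Equivalent "test" script on setups where nyc replaces istanbul
> "nyc --reporter=lcov mocha specs"
# Then run the tests using
$ npm test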
Show me the tests
If you haven't already, read the “How to write test cases developers will love” article.
We assume we are approaching the testing of a fairly large nodejs application from a real-world perspective, and with refactoring in mind. A good way to think about large scale is to focus on smaller pieces and how they integrate (expand) with the rest of the application.
The philosophy of test-driven development is to write failing tests, followed by code that resolves the failing use cases, then refactor, rinse and repeat. In most real-world projects, writing tests may start at any given moment, depending on multiple variables, one of which is the pressure and timeline of the project at hand. It is not a new concept for some tests to be written after the fact (characterization tests), for instance when dealing with legacy code or a simply ill-tested code base. That is the case we are dealing with in our sample code.
The first thing to do is read the code and identify areas of improvement before we start writing tests. The clear improvement opportunity is to eject the function getter() out of the router. Our new construct looks like the following: route.get('/songs', getter); which allows us to test getter() in isolation.
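As a sketch, assuming the handler stays in the same module as the router (the file layout and export style are illustrative, not prescriptive), the ejected getter() can be exported so that the test file requires it directly:
//songs-router.js — illustrative layout; fs, zlib, route and MetadataStreamTransformer are required as in the earlier listing
function getter(req, res, next){
    let rstream = fs.createReadStream('./several-tb-of-songs.json');
    rstream
        .pipe(new MetadataStreamTransformer())
        .pipe(zlib.createGzip())
        .pipe(res);
    rstream.on('error', error => next(error));
}
route.get('/songs', getter);
module.exports = { getter };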
Our test skeleton looks a bit like the following lines.
describe('getter()', () => {
    let req, res, next, sessionObject;
    beforeEach(() => {
        next = sinon.spy();
        sessionObject = { ... };//mocking the session object
        req = { params: {id: 1234}, user: sessionObject };
        res = { status: (code) => ({ json: sinon.spy() }) };
    });
    //...
});
Let's examine the case where the stream is actually going to fail.
Note that we lack a way to get a handle on the stream object, as the handler does not return any object to tap into. Luckily, the response and request objects are both instances of streams, so good mocking can come to our rescue.
//...
let eventEmitter = require('events').EventEmitter,
    httpMock = require('node-mocks-http');
//...
it('fails when no songs are found', done => {
    var self = this;
    this.next = sinon.spy();
    //the method and url mirror the route under test
    this.req = httpMock.createRequest({method: 'GET', url: '/songs'});
    this.res = httpMock.createResponse({eventEmitter: eventEmitter});
    getter(this.req, this.res, this.next);
    this.res.on('error', function(error){
        assert(self.next.called, 'next() has been called');
        done();
    });
});
Mocking both request and response objects makes more sense in our context. Likewise, to exercise the success case, the reader stream's fs.createReadStream() has to be stubbed so that it ejects a stream of fake content; this time, this.res.on('end') will be used to make assertions.
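The following sketch shows what such a success-case test could look like; the fake payload, test description and assertion are assumptions made for illustration, while httpMock, eventEmitter, getter and assert are the same objects used in the failing-case test above. fs.createReadStream() is stubbed with sinon to return an in-memory readable stream, and the assertions run once the mocked response emits 'end'.
//Illustrative success-case sketch
const fs = require('fs'),
    sinon = require('sinon'),
    { Readable } = require('stream');
it('returns scaled down metadata when songs are found', done => {
    var self = this;
    // a fake readable stream standing in for the multi-terabyte songs file
    let fakeStream = new Readable({ read(){} });
    fakeStream.push('{"id": 9, "title": "fake song"}');
    fakeStream.push(null);
    // the stub makes getter() consume the fake stream instead of hitting the filesystem
    let stub = sinon.stub(fs, 'createReadStream').returns(fakeStream);
    this.next = sinon.spy();
    this.req = httpMock.createRequest({method: 'GET', url: '/songs'});
    this.res = httpMock.createResponse({eventEmitter: eventEmitter});
    getter(this.req, this.res, this.next);
    this.res.on('end', function(){
        assert(!self.next.called, 'next() should not be called on success');
        stub.restore();
        done();
    });
});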
Conclusion
Automated testing of streams can be quite intimidating for newbies and veterans alike. There are enough use cases in the book to get you past that mark.
In this article, we reviewed how testing tends to be more of an art than a science. We also stressed the fact that, as in any art, practice makes perfect ~ testing streams is particularly challenging, especially when reads/writes are involved. There are additional complementary materials in the “Testing nodejs applications” book.