Simple Engineering

There is a never-ending war in any developer community about the best tools to use. Some tools fit well with the project at hand, but when they are sunset for various reasons, remorse starts creeping in.

In this blog, we present a simple framework to adopt when choosing any software development tool, and discuss how to manage the risk that comes with the tools we adopt.

In this article we will talk about:

  • Choosing a (testing/software development) framework
  • Why big-bang adoption is fatal, and how to use the adoption curve to manage adoption risk

This article provides complementary materials to the Testing nodejs Applications book. It is designed to help those who bought the book to set up their working environment, as well as the wider audience of software developers who need the same information.

The framework

We tend to use tools we love, and tend to impose those tools on the teams or customers we work with. This approach works sometimes, but it can also breed tribalism, which most of the time results in push-back from our peers. Adopting a tool for a group of people requires thinking and looking beyond our own preferences. The question is how we get there.

This framework for choosing tools shifts the focus from our sentiments to the problems we are trying to solve instead. The algorithm is simple: instead of starting from a tool suggestion, start from the problem, check whether an existing solution already covers it, then compare how various tools solve the problem at hand.

Focusing on the problem

It sounds easy to transition from suggesting tools to focusing on the problem. From experience though, choosing the next problem to tackle first can get just as heated as suggesting tools.

One can argue that the most important issue becomes evident when a problem is utterly urgent, such as a database being down, or downtime caused by running out of memory or CPU. When everything is operational, however, more subtle factors come into play.

The first tool applies to systems that already have emergency data collection in place: the 80/20 rule. There is a well-known observation in the economics world known as [The Pareto principle](https://en.wikipedia.org/wiki/Pareto_principle). In the computing world, it shows that “roughly 80% of the effects come from 20% of the causes”. From Microsoft's early days, “by fixing the top 20% of the most-reported bugs, 80% of the related errors and crashes in a given system would be eliminated”.

The quotes in the previous paragraph all come from the Wikipedia Pareto principle page.

The result of this first exploratory experiment is a cluster of problems that need to be heuristically classified in order of importance. Any arbitrary classification can work. That is where yet another tool comes in handy to identify immediate problems to work on.

The Eisenhower Box is a decision-making tool that helps determine which problem to take on next. It is a 2x2 matrix, with two columns [urgent, not urgent] and two rows [important, not important]. The cross-section of [urgent x important] has to be done right away. [not urgent x important] can be scheduled for later. [not important x urgent] can be delegated for immediate resolution. And anything that is [not important x not urgent] can be dismissed. But since all the clusters being analyzed belong to the 20% that need fixing, the [not important x not urgent] items can be re-introduced in the next work iterations.

The new, smaller cluster in [urgent x important] can itself be ranked in order of importance using a weighted matrix; that tool is discussed a little more below when evaluating tools. It remains useful when the [urgent x important] quadrant holds on the order of 10 or 100 items and a consensus cannot be reached.

|               | urgent   | not urgent |
| ------------- | -------- | ---------- |
| Important     | do       | schedule   |
| Not important | delegate | eliminate  |

Table: The Eisenhower box
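
As a rough illustration, the classification above can be expressed in a few lines of JavaScript. The shape of the issue object is hypothetical, chosen only to make the quadrants explicit.

// Maps an issue's [urgent, important] flags to an Eisenhower quadrant action.
function eisenhower({ urgent, important }) {
  if (urgent && important) return 'do';        // tackle right away
  if (!urgent && important) return 'schedule'; // plan for later
  if (urgent && !important) return 'delegate'; // hand off for immediate resolution
  return 'eliminate';                          // or re-introduce in a later iteration
}

const issues = [
  { name: 'database down', urgent: true, important: true },
  { name: 'flaky test suite', urgent: false, important: true },
];

issues.forEach((issue) => console.log(issue.name, '->', eisenhower(issue)));

Example: classifying problems with the Eisenhower box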

Choosing the tool

The variety of choices for any tool (libraries, frameworks, etc.), especially in the JavaScript space, is staggering. That is a good thing in a sense, but it unfortunately comes with a hefty price to pay: choice paralysis.

To build a data-driven consensus, a weighted decision matrix can be used. The tool with the highest score becomes ipso facto the tool of the team/project. The problem is how to identify the criteria, and how to weigh the influence each criterion exercises on our decision-making as a collective.

There are things we all agree on, amongst those are:

  • taste ~ it is OK to like a tool simply for the sake of liking it
  • trendiness ~ if the tool is really popular, that influence has to be counted as well
  • learning curve ~ how long it takes to learn/debug versus benefiting from it
  • integration (plug & play) ~ how easy it is to integrate with the existing testing framework
  • community/help ~ how good the documentation is, and whether the product has a dedicated support team
  • openness/affordability ~ whether the technology is closed, open-source, or free
  • stability ~ how active the backing community is
  • completeness ~ how well the testing framework is maintained (LTS, etc.)

Even though there is a wide range of pretty good tools to choose from, and taste alone can be tempting, the set of factors above is a more balanced basis for selecting your killer tool.

Once the adoption criteria are identified, a weight can be assigned to each (heuristically or objectively). In case of dissent, a vote can resolve the issue really fast. When there are not enough data points, or folks have limited knowledge outside their comfort zone, there are avenues where this kind of information can be gathered: technology radars, existing tools in our codebase, marketplaces, GitHub, tech blogs, etc.

The following example shows how to choose a testing framework, after identifying that the main issue is that tests are not being written due to a poor overall testing experience.

| Products / criteria | taste | integration | completeness | Score | Rank |
| ------------------- | ----- | ----------- | ------------ | ----- | ---- |
| Weight ->           | 4     | 3           | 2            |       |      |
| ava                 | 1x4=4 | 1x3=3       | 0x2=0        | 7     | 2    |
| jest                | 1x4=4 | 0x3=0       | 1x2=2        | 6     | 3    |
| mocha               | 1x4=4 | 1x3=3       | 1x2=2        | 9     | 1    |
| assert              | 0x4=0 | 1x3=3       | 1x2=2        | 5     | 4    |
| jasmine             | 1x4=4 | 1x3=3       | 0x2=0        | 7     | 2    |

Table: Weighted Decision Matrix

The score on each criterion can be a yes/no, or a vote count. We used 0 and 1 based on gut feeling (or majority votes). In case the results are not conclusive, we can use a different scoring scale, or eliminate the low performers and start over.
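
For the sake of illustration, here is a minimal sketch of the weighted decision matrix above in JavaScript; the weights and 0/1 marks simply mirror the table.

// Weights and 0/1 marks mirror the weighted decision matrix table above.
const weights = { taste: 4, integration: 3, completeness: 2 };

const candidates = {
  ava:     { taste: 1, integration: 1, completeness: 0 },
  jest:    { taste: 1, integration: 0, completeness: 1 },
  mocha:   { taste: 1, integration: 1, completeness: 1 },
  assert:  { taste: 0, integration: 1, completeness: 1 },
  jasmine: { taste: 1, integration: 1, completeness: 0 },
};

const ranking = Object.entries(candidates)
  .map(([tool, marks]) => ({
    tool,
    score: Object.keys(weights)
      .reduce((total, criterion) => total + marks[criterion] * weights[criterion], 0),
  }))
  .sort((a, b) => b.score - a.score); // highest score becomes the team's tool

console.log(ranking); // mocha: 9, ava: 7, jasmine: 7, jest: 6, assert: 5

Example: computing the weighted decision matrix scores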

Managing Adoption risks

Buyer's remorse is one of the risks associated with adopting new tools. In addition, a product we just adopted can be abruptly sunset, or change its policy (a price increase, a decision to stop serving a category of the industry, etc.).

We need a strategy that allows us to adopt a tool while limiting the damage caused by its shortcomings, especially when the tool proves to be not as advertised, when we are caught in the middle as a product is sunset, or when we pay any other price that comes with adopting an open-source tool (such as unresolved bugs).

Adopting a tool in steps, instead of in a big bang, proves to be effective when it comes to managing risk. On our adoption graph, we have time on the x-axis and adoption on the y-axis. The three phases are the exploratory phase, the expansion phase, and organization-wide adoption. The adoption curve is an S-shaped curve. The opposite of the adoption curve is the sunset curve, where we gradually pull the plug on the existing tool.

Example: How Big Technical Changes Happen at Slack

Conclusion

Adopting new tools requires looking beyond tribalism; it is hard to imagine any developer giving up on their beloved tool. Adding a new tool to a shared toolkit goes beyond an individual choice, and comes with a shared risk that has to be managed as a collective.

In this blog, we discussed a simple, adaptable framework that can be used when choosing any software development tool.

References

Full-stack visibility ~ defined as the possibility of having a timely (time-series) bird's-eye view of logs from clients (Web/Mobile/SDKs) to middleware to databases. Logs ~ pre-defined trackable events that can be tied to metrics measuring overall software health.

In this article we will talk about:

  • Anomaly detection in nodejs applications
  • Full-stack visibility data collection tools
  • Full-stack visibility data visualization tools
  • Security risks when logging sensitive data

Even though this blog post was designed to offer complementary materials to those who bought my Testing nodejs Applications book, the content can help any software developer to tune up their working environment. You can use this link to buy the book.

Logging tools

Distributed logs

Analyzing logs can be a daunting task in a nodejs environment. Some strategies to detect and correct issues found via logs can be found in the following articles.

This article is unfinished business; more content will be added as I run into logging problems, or find some interesting use cases in the SRE community.

Conclusion

In this article, we revisited how to aggregate nodejs logs for full-stack visibility, with the aim of detecting hidden anomalies. We used the term full-stack to refer to adjacent applications such as middleware and databases, alongside client applications such as SDKs, widgets, mobile and web apps. There are additional complementary materials in the “Testing nodejs applications” book.

References

This blog explores common use cases that make expressjs applications, and for that matter some nodejs applications, slow. We take a different approach: we only list known facts, and provide quick fixes whenever possible.

In this article we will talk about:

  • Module loader is slow
  • Modular layout improves performance
  • Serving requests under 9 seconds

Even though this blog post was designed to offer complementary materials to those who bought my Testing nodejs Applications book, the content can help any software developer to tune up their working environment. You can use this link to buy the book.

nodejs Module Loader is slow

  • Starting the nodejs server is slow, or starting test cases is really slow: the code structure may be to blame. The leading cause of memory leaks under garbage collection is the use of anonymous callbacks. In case anonymous callbacks really are to blame, a quick and dirty technique to save a couple of milliseconds is to name all, if not most, anonymous functions: promises, callbacks, and common handlers included.
  • require() — the nodejs module loader is really slow and may be to blame. require() is synchronous, which means it blocks while loading each module. The quick and dirty technique is to use as few require() statements as possible. Alternatively, adding a module caching mechanism can solve the issue. There is a quick drop-in require replacement that Gleb Bahmutov put together: cache-require-paths.
  • Some utility functions are simply slow. To figure out the culprit, we either have to run a benchmarking test, or we are lucky enough that the community has already identified some utilities that are really slow. That is the case of url-parse (~> url-parse is deadly slow). Quick and dirty trick: find a performant alternative to the library that is really slow. In the case of url-parse, there is a faster alternative: faster-url-parser.
  • Use debug() from the debug nodejs debugging utility, instead of console.log(), for debugging purposes. Quick and dirty trick: replace all console.log(x) with debug(x) wherever possible, and make sure to get rid of other slow synchronous loggers (a short sketch follows this list). > The major difference between debugging and logging is at the persistence level, or the way the application is supposed to use the data it harvested during an execution period. Log messages are permanent and used for forensic purposes; debug messages are ephemeral and used for debugging purposes.
  • In case console.log() has been used for logging, using winston or bunyan for logging purposes, instead of dumping messages to the console with console.log(), may be a wise alternative. Quick and dirty trick: replace console.log() calls with an asynchronous, stream-based logger.log().
  • The list above is just the beginning. Here is a list of other articles that may help fixing some performance issues: 1) On IBM – Blog: Tips optimizing slow code in nodejs 2) On expressjs – Blog: Best practice performance. 3) Four nodejs Gotchas that Operations Teams Should Know about
  • Before any performance improvement, it is better to start with a profiling session. Profiling code gives insights into how functions interact and helps identify which functions occupy the CPU the most.
  • Profiling is hard. Luckily, there are good visualizations that put the collected data into graphs the rest of us can read and understand. For that though, you will need to know how to read a flame chart. There are more introductory writings about Flame Graphs and Flame Charts on the Mozilla Developer Network. The next article provides more insights on CPU flame graphs. > There is a technique used to log profiles and visualize the results in a sunburst diagram.
  • Last but not least, there is Profiling slow nodejs apps using v8-profiler and Paul Irish's quick introduction on How to debug nodejs with Chrome DevTools.
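
The following sketch illustrates two of the quick fixes above: naming callbacks and switching from console.log() to debug(). It is a minimal, illustrative example; the 'app:server' namespace and port are assumptions, not prescriptions.

const debug = require('debug')('app:server');
const http = require('http');

// Named handler instead of an anonymous callback: named functions are easier
// to spot in profiles, flame charts, and heap snapshots.
function onRequest(req, res) {
  debug('incoming request %s %s', req.method, req.url); // ephemeral, off by default
  res.end('ok');
}

const server = http.createServer(onRequest);

server.listen(3000, function onListen() {
  debug('server listening on port 3000'); // enable with: DEBUG=app:* node server.js
});

Example: naming callbacks and using debug() instead of console.log()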

Is it possible to include performance analysis in the development workflow?

Conclusion

In this article, we revisited reasons why some parts of a nodejs project may be slow. We also suggested measures to take to mitigate some of the common use cases. The format and size of this blog post barely scratch the surface. However, there are additional complementary materials in the “Testing nodejs applications” book that can help in understanding more about the subject.

References

There is always a discrepancy between the latest version of the codebase and its documentation. What if it were possible to keep the code in sync with its documentation? This blog is about eliminating some barriers that prevent the documentation from evolving at the same pace as the codebase.

In this article we will talk about:

  • Automating release notes
  • Automating Version updates
  • Continuously documenting API
  • Continuously documenting REST APIs
  • Continuously documenting release notes

Even though this blog post was designed to offer complementary materials to those who bought my Testing nodejs Applications book, the content can help any software developer to tune up their working environment. You can use this link to buy the book.

The documentation done ahead of code change becomes outdated as soon as any new feature implementation is completed.

Documentation written after a code change is, in a way, late documentation. While the change is in progress, there is a possibility that critical information will be missing until the change is complete and the documentation is updated.

The right documentation strategy to adopt is the one that reduces the disconnect between the documentation and the code it is supposed to document. Ideally, documentation goes hand in hand with the code change as it happens, because it comes in at the right time and reflects the actual status of the code.

Down below, we present key areas with opportunities to automate the documentation process or to move documentation in sync with code change.

Release notes

Automating release notes makes it possible to continuously deliver release notes as each code version goes out the door.

Versioning

To automate version updates, there is quite a catalog of tools that help bump the version every time a new tag becomes available, or a merge lands on the branch tracked as the main branch.

API Documentation

Continuously documenting the API starts with the top comment on every function/method. It is possible to put some effort into updating the documentation as new sections of code get introduced into the codebase. Automation and documentation generation tools help to achieve this. Tools such as documentation.js extract the code comments into human-readable, well-formatted documents, making continuous documentation a reality.
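
As a minimal sketch, a JSDoc-style top comment like the following is the kind of input documentation.js can turn into readable documents; the function itself is purely illustrative.

/**
 * Computes the total price of the items in a cart.
 *
 * @param {Array<{price: number, quantity: number}>} items - cart line items
 * @param {number} [taxRate=0] - tax rate expressed as a fraction, e.g. 0.07
 * @returns {number} the total price, tax included
 * @example
 * total([{ price: 10, quantity: 2 }], 0.07); // 21.4
 */
function total(items, taxRate = 0) {
  const subTotal = items.reduce((sum, item) => sum + item.price * item.quantity, 0);
  return subTotal * (1 + taxRate);
}

module.exports = { total };

Example: a JSDoc top comment that documentation generators can extract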

REST API Documentation

To continuously document REST APIs, there are multiple tools and methodologies that have been in place for a very long time. Tools such as swagger help model, as well as document, REST APIs as they are being developed. REST API documentation has to be beautiful, informative, and updated as new endpoints are created or versions are updated. Slate is one of the tools that can help achieve this goal.
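
A minimal sketch of that workflow, assuming the swagger-jsdoc and swagger-ui-express packages, keeps the OpenAPI annotations next to the routes and rebuilds the spec on every start; the metadata and paths below are placeholders.

const express = require('express');
const swaggerJsdoc = require('swagger-jsdoc');
const swaggerUi = require('swagger-ui-express');

// Build the OpenAPI spec from annotations found in the route files.
const spec = swaggerJsdoc({
  definition: {
    openapi: '3.0.0',
    info: { title: 'Orders API', version: '1.0.0' }, // placeholder metadata
  },
  apis: ['./routes/*.js'], // files containing @openapi / @swagger annotations
});

const app = express();
app.use('/docs', swaggerUi.serve, swaggerUi.setup(spec)); // docs stay in sync with the code
app.listen(3000);

Example: serving continuously generated REST API documentation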

Conclusion

Above are areas that present opportunities to automate the documentation process. How to approach documentation, and the choice of tools to use, remain the responsibility of the code maintainer. This article is meant to serve as food for thought.

In this article, we revisited tools and methods that help documentation evolve alongside codebase iterations. There are additional complementary materials in the “Testing nodejs applications” book regarding good practices around documentation.

References

In this blog, we talk about challenges and opportunities associated with the deployment of nodejs applications on a production server — in a non-cloud environment as well as in a cloud-native environment.

The technique we are exploring is to run on a production server as we do on localhost, but expose the application to the world using an nginx reverse proxy server.

This blog post is a follow-up to the following blog posts: “how to install nginx server”, “how to configure nginx as a nodejs reverse proxy server”, and “Easy nodejs deployment”.

In this article we will talk about:

If you are looking for how to achieve nodejs zero-downtime deployments, read “How to achieve zero downtime deployment with nodejs” instead.

  • Basic configurations
  • Manual deployment on a live server
  • Deploying on cloud infrastructure
  • Leveraging push-to-deploy technique
  • Leveraging git WebHook to deploy new versions
  • Atomic deployment
  • Canary deployment
  • Blue/Green deployment

Even though this blog post was designed to offer complementary materials to those who bought my Testing nodejs Applications book, the content can help any software developer to level up their knowledge. You may use this link to buy the book.

nodejs configuration

There is a series of operations that takes place before the code hits the production environment and lands in the hands of customers. One of those operations is packaging, and it will be discussed in the next sections.

While keeping in mind that some of those operations may be of interest to the reader, we also have to be mindful that they cannot all be covered in this single piece. The good news is that those steps have been covered in the following blog posts:

Now that we have an idea of how configuration works, let's revisit some release strategies.

Reducing Friction

The “reducing friction” idea comes from the need to make releases, packaging, and deployments easy, repeatable processes, and to make the whole release pipeline easy to automate.

Reducing friction at deployment time involves reducing the steps it takes from getting the binaries to making the application available to the world for use. It is by now quite a mantra in this series to use divide and conquer when simplifying a complex process. The following paragraphs highlight the friction-reducing strategy at various stages of the deployment process.

Reducing friction when releasing new software involves reducing the steps it takes from raw code to assembling a deployable package.

One way to reduce friction when releasing new software is to package the application into bundles. A bundle is defined here as application code bundled together with its dependency libraries to form one unified software entity that can be deployed as a whole. The same approach is used in stacks written in Java, via .jar/.war file-based releases.

The packaging strategy makes sense in a nodejs context when the application is either destined to be deployed offline, or in environments where downloading dependencies either constitutes a bottleneck or is outright impossible. When that is the case, a .tar package is produced as a single versioned independent release.

Alternatively, the release can be a tag that tells the installer how and where to get the application and its dependencies ~ the npm and yarn package managers are examples that use this approach. When releasing such packages, dependency libraries are not bundled together with the application; rather, the package manager knows how to get the dependencies and how to install them at deployment time.

As it stands, steps similar to the ones described below can be followed:

  • Download source code ~ git/wget in the case of a tarball package, or npm/yarn in the case of discoverable dependency packages. Discoverable packages using orchestration tools such as kubernetes or docker swarm use an approach similar to the one discussed in the previous two cases.
  • Install source code ~ Download and binary installation can be combined into one step. The current step switches global directories so that baseline services can find them at the same address. For that, symlink directories such as /logs, /configs, and the application's top-level directory.
  • Switch servers on ~ This step restarts the underlying services that make the application visible to the world. Such services include, but are not limited to, the database service (mongod, pg, etc.), redis, nginx, and finally the application service itself.
  • Validation ~ This step involves checking for failures in the system post-release. It expects us to have a rollback strategy in case something goes wrong at the last minute. In case of unacceptable failures, the strategy described above prescribes rolling the symlinks back to the previously deployed version and restarting services. This guarantees that the existing application keeps running, and the release is technically called off (a minimal health-check sketch follows this list). In the case of containerized software, orchestration makes it even easier, as failing nodes can simply be switched off while the existing application keeps running as usual.
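
Below is a minimal sketch of the validation step: a small nodejs probe that checks a health endpoint after the symlink switch and exits non-zero so the pipeline can roll back. The /healthz path and port are assumptions, not prescriptions.

const http = require('http');

http.get('http://localhost:3000/healthz', (res) => {
  if (res.statusCode === 200) {
    console.log('post-release check passed');
    process.exit(0);
  }
  console.error(`post-release check failed: HTTP ${res.statusCode}`);
  process.exit(1); // a non-zero exit lets the pipeline roll the symlinks back
}).on('error', (err) => {
  console.error('post-release check failed:', err.message);
  process.exit(1);
});

Example: a post-release health check that can trigger a rollback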

Automated release strategies

Releases are like historical records of the state of the software at any given release date. The state of the software stands in for feature status and bug fixes.

As an example, if delivery is due every 1st day of the month, the date alone makes it hard to reference a particular software package in any discussion. A version number associated with the software state/change-log makes it possible to refer to a particular delivery in the history of a project.

The widely adopted versioning follows the Semantic Versioning (SemVer) specification. The three parts of SemVer are MAJOR.MINOR.PATCH. We anticipate breaking changes being introduced into the system when MAJOR is incremented. We anticipate new features and enhancements being introduced when MINOR is incremented. And we anticipate security patches and bug fixes when PATCH is incremented.
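
A small sketch using the semver package shows how these three parts can drive a consistent reaction to a version bump.

const semver = require('semver');

console.log(semver.major('2.4.1')); // 2 -> breaking changes expected on increment
console.log(semver.minor('2.4.1')); // 4 -> new features and enhancements
console.log(semver.patch('2.4.1')); // 1 -> security patches and bug fixes

console.log(semver.inc('2.4.1', 'patch')); // '2.4.2'
console.log(semver.inc('2.4.1', 'minor')); // '2.5.0'
console.log(semver.inc('2.4.1', 'major')); // '3.0.0'

// A deployment policy can react differently depending on the bump:
console.log(semver.satisfies('2.5.0', '^2.4.1')); // true  -> candidate for automated update
console.log(semver.satisfies('3.0.0', '^2.4.1')); // false -> plan an upgrade window

Example: reading MAJOR.MINOR.PATCH with the semver package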

These conventions present an opportunity to design a plan of attack, or reaction, when deploying a patch, a minor, or an upgrade to a new major version of the application, in a consistent and predictable way: reducing the friction when releasing new software, and automating the updates that do not harm the system.

You broke my code: understanding the motivations for breaking changes in APIs

Build servers

The role of build servers is to provide a deployable build every time there is a push to a tracked branch. The build can be a bundle as well as a managed package. Build servers coupled to a [version control system](https://en.wikipedia.org/wiki/Version_control) constitute the backbone of a Continuous Integration pipeline.

Non-exhaustive list of CI servers, and services: Distelli, Magnum and Strider

Automated deployment strategies

Making the application available to the world

The final step in working on any project is the ability to deploy it and see it shine, or crash. Deployment is moving code from a development environment to a production environment.

A deployment can be as simple as adding assets to a static assets server, and as complex as upgrading a database engine server. The downtime caused by complex deployments ranges from sub-system disruption to an entire system outage. The key factors are the number of system changes involved in the deployment and the width of the deployment window.

The following deployment strategies can be leveraged alone, or in combination, to deliver a deployment experience with less friction.

Friction is defined by the number of moving parts in the system. The fewer systems involved while delivering a new software version, the less friction, and the better.

The following are some of the deployment strategies that remove the need to have a rollback strategy.

  • Atomic deployment
  • Canary deployment
  • Blue/Green deployment

The following are some of the tools that make deployment automation feasible.

  • Using Push-to-deploy
  • Using git WebHook to deploy new applications
  • Using a deployment server
  • Using asynchronous scheduled deployment jobs

When strategies and tools from these two groups are put together, they constitute a baseline for a continuous deployment model that is cheap or free.

[How to install nginx server](), [Why coupling nodejs server to an nginx proxy server](), [How to configure nginx as a nodejs application proxy server]()

Push to deploy

The push-to-deploy model is a classic signature of Heroku. We should note that this technique is made possible by git.

In the push-to-deploy model, a push to a designated branch, say live or master, triggers a task responsible for initiating the deployment sequence on another remote server. There are two sides of the coin we have to look at to make this model work: the server side, and a post-receive hook shipped with the code it is supposed to deploy. The role of the post-receive hook is to detect the end of the git push and run the symlink + restart steps.

## Server Side
#first time on server side  
apt-get update
apt-get install git

#updating|upgrading server side code
apt-get update

#create bare repository + post-receive hook 
#first time initialization
cd /path/to/git && mkdir appname.git
cd appname.git
git --bare init

Example: source

The post receive hook can be similar to the following snippet.

## Post-receive hook
cd /path/to/git/appname.git/hooks

#create the post-receive hook with the checkout command
cat > post-receive <<'EOF'
#!/bin/sh
GIT_WORK_TREE=/path/to/git/appname git checkout -f
EOF

#change permission to make it an executable file
chmod +x post-receive

The push-to-deploy idea removes the middleman otherwise necessary to move software from the development environment to any production environment. We should take “production” with a grain of salt; the production environment is relative to the system's end user. A production environment may in fact be UAT if we take testers as the prescribed users. Beta, alpha, and live environments are all production environments from a customer standpoint.

This model may look attractive, but it can also be chaotic when hundreds of developers ship on a push keystroke. However, that may not be an issue if the deployment targets a shared development environment.

Git WebHook

Manual deployment on a live server

Manual deployment has multiple facets. The obvious one is using ssh to log into a remote server and executing the deployment steps by hand. There are also some CLI tools that make it possible to connect and execute deployment steps from a development environment. This model works, but it is not scalable, especially when multiple servers have to be managed.

Deploying on cloud infrastructure

Almost all cloud players provide infrastructure software. This makes it easy to download and deploy software for our application. The main downside of deploying in the cloud is its pricing model.

Here are options that are available in the industry:

  • Heroku ~ notoriously known for the push-to-deploy model. A push to a tracked branch triggers an automatic deployment. This service provides most of the configuration needs out of the box as well.
  • AWS ~ One of the major players in cloud space, makes it possible to deploy manually by uploading a .tar/.jar file. It also gives a CLI tool that can be turned into a full-fledged pipeline.
  • Google ~ has a line of offerings similar to Amazon's AWS. The main differences are its reliance on open-source software tools and its pricing model.
  • PCF, OpenShift, and other OpenStack platforms offer the same or similar capabilities as described in previous sections.

Conclusion

In this article, we revisited strategies to deploy a nodejs application. There are additional complementary materials on this very subject in the “Testing nodejs applications” book.

References

Most data-driven applications are backed by at least one database, if not many. Schema changes and data migrations are the driving forces of the data reconciliation nightmare. This article puts a name on commonly known culprits, proposes some remediations, and provides resources that can help find a solution.

There is a more specific “How to version mongodb models” article that looks at the problem from a model versioning perspective.

In this article we will talk about:

  • Better versioning of mongoose models.
  • Safe procedures for database engine upgrades
  • Release cycles that do not collide schema change with database engine upgrades
  • Synchronizing release(patch vs minors) with database engine releases(patch vs minors)
  • Upgrading mongoose model versions in a production environment
  • Upgrading to a new mongodb database version in a production environment
  • Aligning continuous deployment with upgrades of the underlying infrastructure (database engines, database drivers, model upgrades, data access frameworks, etc.)

Even though this blog post was designed to offer complementary materials to those who bought my Testing nodejs Applications book, the content can help any software developer to tune up their working environment. You can use this link to buy the book.

Introduction

The first iteration of an application is always easy to pull off. When subsequent iterations bring schema changes, data migrations, or some major upgrades, problems start to creep in. This article names what to anticipate after deploying a database to production, and how to mitigate issues related to running a database in production.

In this article, at least, the term database migration is seen from the following perspectives:

  • Patching database engines (patch versions)
  • Updating database engines (minor versions)
  • Upgrading database engines (major versions)
  • Patching database driver/ORM/ODM (patch versions)
  • Updating database driver/ORM/ODM (minor versions)
  • Upgrading database driver/ORM/ODM (major versions)
  • Automated backups and restore
  • Data migration from one database engine version to the next
  • Model schema change on same database engine versions
  • Model schema change on different database engine versions
  • State vs Migration driven database delivery

Database vendors have pre-determined release cycles. That makes it easy to anticipate when a new release will be available, and to align application development with the supporting software's release cycles: aligning risky feature changes with the cycles in which we do not expect huge changes from the database vendor. Such a model should work well within a continuous deployment environment as well.

Strategy

The database provider's release cycle suggests that decoupling would make things a little easier, especially in data-driven applications where schema change is inevitable and frequent.

As database engine changes have their own upgrade time, model changes should also have their own model version update time. With this knowledge, we have two granular scenarios: one regarding underlying software upgrades, and another regarding code change.

For a database engine version upgrade, we can expect the reverse operation, which is to downgrade to the previous version. For model revision updates, we can expect rollback as the reverse operation.

It is better to have upgrades and model updates at separate times.

There are state-driven and migration-driven database migration strategies. “State vs migration-driven database delivery” helps to understand more about those two strategies.

Challenges when upgrading database engines

To clear things up, we will have to answer the following questions:

When is the right time to migrate the database? When is the right time to switch the application to use the new version of the database?

When upgrading mongodb versions, a challenge is how to achieve that in a production environment. The gravity of the problem ranges from no expected issue at all to more serious scenarios such as incompatibility in data formats, or changes in data access APIs.

Database engine upgrade presents three opportunities to do the upgrade:

  • Patching database engines (patch versions)
  • Updating database engines (minor versions)
  • Upgrading database engines (major versions)

Each opportunity has its own challenges, and offers different procedures. We always have to strive for safe procedures for database engine upgrades.

What are the implications of a driver API change when migrating to a new database version?

The database engine upgrade (major version) is the one that truly qualifies for the title of “database upgrade”. The following list has additional resources about the database upgrade subject:

Challenges when versioning mongoose models

Model versioning goes hand in hand with data migration, first of all because a change in the model schema requires the data structures to reflect the new definitions on the model.

However, given the inconsistencies that can be introduced in minor and major versions, it makes sense to do model versioning (mongoose schema changes) when the risk of database engine related issues is really low: meaning only when patches are being released, before a new minor or major release.

Another issue to figure out is making sure transformations are part of booting from backups. That way, applying schema changes becomes a ritual, reducing the risk of having corrupt data at any single point in time.

What kind of problems should we expect when forced to run one migration script twice or thrice? Is the state going to be changed or preserved? Keyword: the scripts should be re-runnable, as in the sketch below.
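
Here is a minimal sketch of such a re-runnable migration with mongoose; the User model, the fullName split, and the schemaVersion field are illustrative assumptions.

const mongoose = require('mongoose');
// assumes mongoose.connect(...) has been called elsewhere

const userSchema = new mongoose.Schema({
  fullName: String,
  firstName: String,
  lastName: String,
  schemaVersion: { type: Number, default: 1 },
});
const User = mongoose.model('User', userSchema);

async function migrateToV2() {
  // Only touch documents still on version 1: running the script twice
  // or thrice leaves the state unchanged after the first run.
  const candidates = await User.find({ schemaVersion: 1 });
  for (const user of candidates) {
    const [firstName, ...rest] = (user.fullName || '').split(' ');
    user.firstName = firstName;
    user.lastName = rest.join(' ');
    user.schemaVersion = 2;
    await user.save();
  }
}

module.exports = { migrateToV2 };

Example: a re-runnable (idempotent) mongoose migration script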

Release policy that aligns with database engine releases

The database in our context is mongodb, but tactics described here can be applied to any other kind of database.

Adopting a release policy that aligns with database engine releases, instead of colliding with them, helps make operations a little smoother.

  • Release cycles that do not collide schema change with database engine version bumps
  • Synchronizing release(patch vs minors) with database engine releases(patch vs minors)

Automated backups/restore

mongodb provides two mechanisms that we can tap into to automate backups and restores: mongodump/mongorestore and mongoexport/mongoimport.

Our task is to figure out how to fit transform operations when executing mongoimport or mongorestore operations.

What are the known problems that the mongodump + mongorestore utilities may face when migrating to a new version of mongodb?

What are the known problems that the mongoexport + mongoimport utilities may face when migrating to a new version of mongodb?

Additional resources about backup and restore:

Key tasks to execute when upgrading database engine

  • [ ] task one ~ backup and lock writes
  • [ ] task two ~ upgrade packages
  • [ ] task three ~ stop/restart the database server
  • [ ] task four ~ execute migration scripts
  • [ ] task five ~ booting the database server from new version

These tasks do not include containerized environments, such as docker or orchestrated environments such as kubernetes.

Finding and updating Packages

Every operating system has its own way of finding and upgrading its packages. The next example showcases how this is accomplished in a Linux environment, more specifically Ubuntu.

# Fetches latest packages from PPA
$ apt-get update

# Upgrades and install latest versions.
$ apt-get upgrade
$ apt-get dist-upgrade

# Installing `mongodb` alone 
$ apt-get install -y mongodb-org

# Reloading the newly installed package
$ service mongod reload

Example:

Running MongoDB

Most of the time, the following commands will help start, restart, or stop mongodb when it runs as a service. The commands are usable on both Ubuntu and macOS, and may be executed right after updating the mongodb packages.

# not required all the time
$ killall mongod && sleep 3 && service mongod start
$ service mongod start

# macOS
$ mongod # running mongod directly in the foreground is up to you, but it is not advised

# Finding if a service, in this case mongod, is available
$ service --status-all

Example:

redis Maintenance and Migration

Key takeaway

Or things that are good to know

  • To avoid “loss of fidelity” in migration tasks, use BSON instead of JSON
  • mongodb can detect if a collection or database exists and take action accordingly.
  • Run migration during non-peak hours: migrations are CPU/Memory/Disk intensive tasks. They can bring the system down.
  • Lock writes, to preserve data consistency across shards+datacenters(non-peak hours|maintenance window)
  • db.fsyncLock() + db.fsyncUnlock() to lock and unlock database writes during migration tasks, for example chunk migration (see the sketch after this list)
  • Migration based on Export + Import functions do not guarantee data integrity
  • Migration based on mongodump and mongorestore guarantee data integrity
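
The lock/unlock dance mentioned above looks roughly like this in the mongo shell; mongodump itself runs from a separate terminal while writes are locked.

// inside the mongo shell
db.fsyncLock();   // flush pending writes to disk and block new ones

// ... run `mongodump --out /backups/pre-upgrade` from another shell ...

db.fsyncUnlock(); // resume normal write operations

Example: locking writes around a backup with db.fsyncLock()/db.fsyncUnlock()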

Conclusion

In this article, we revisited the data reconciliation issues associated with schema changes, database engine versions, and data migration, in development and production settings. There are additional complementary materials in the “Testing nodejs applications” book.

References

nodejs servers have flaws capable of causing downtimes at any time. In this article, we will explore early incident detection techniques, with a special focus on preventing deployment time service disruption.

This blog post is a follow-up to the “Easy nodejs deployment” article, which covers deployment on traditional servers, and “Deploying nodejs applications”, which covers deployment in native and non-native cloud environments. Future iterations of this blog may include leveraging containers (Docker and Kubernetes) to achieve zero-downtime deployment.

In this article we will talk about:

  • How to deploy a new nodejs application
  • How to serve nodejs applications via nginx proxy server
  • How to leverage cluster to mitigate downtimes
  • How to leverage nginx streams to mitigate downtimes
  • Early incident detection using monitoring tools — self healing application architecture
  • How to table a good rollback strategy

Even though this blog post was designed to offer complementary materials to those who bought my Testing nodejs Applications book, the content can help any software developer to level up their knowledge. You may use this link to buy the book.

nodejs configuration and deployment

There is a series of operations that takes place before the code hits the production environment and lands in the hands of customers, for battle-hardening purposes.

While keeping in mind that those operations may be of interest to the reader, we also have to be mindful that we cannot cover everything in this piece. The good news is that those steps have been covered in the following blog posts:

Now that we have an idea of how deployment and configuration work, let's identify the sources of downtime.

Genesis of downtimes

Downtimes come for a variety of reasons. We will revisit intentional and unintentional downtimes, as well as resilient highly available systems.

Intentional downtimes result from an upgrade, or an update, that requires a system shutdown. Traditionally, new deployments require shutting down the system that is about to be replaced so that the new system can be booted, especially when the two systems use the same port.

Unintentional downtimes are unexpected, resulting either from fatal unhandled errors or from crashes caused by running out of resources (disk/CPU/RAM). In either case, in multi-process systems, for the downtime to happen all of the server's faulty processes have to die and never be replaced with new healthy processes.

Resilient systems are designed so that when some faulty processes crash, a self-healing mechanism boots up new, healthier processes. This can be achieved by leveraging the cluster API, a combination of reverse proxying/load balancing to healthier instances, or, more recently, containerization.

Monitoring

Monitoring, custom alerts, and notifications systems

Monitoring overall system health makes it possible to take immediate action as soon as something unusual happens. Key metrics to look at are CPU usage, memory availability, disk capacity, read/write failure rates, particular errors/failures, and software error rates.

Monitoring systems make it easy to detect, identify, and eventually repair or recover from a failure in a reasonable time. When monitoring production applications, the aim is to be quick to respond to incidents. Sometimes, incident resolution can also be automated; for instance, a notification system can trigger a prescribed script that remediates the problem. These kinds of systems are also known as self-healing systems.
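
As a minimal sketch, a nodejs process can expose such key metrics over HTTP for an external monitor to poll; the /metrics path and port are arbitrary choices for illustration.

const http = require('http');
const os = require('os');

function metrics() {
  return {
    uptimeSeconds: process.uptime(),
    memory: process.memoryUsage(),  // rss, heapUsed, heapTotal, ...
    loadAverage: os.loadavg(),      // 1, 5, and 15 minute CPU load
    freeMemoryBytes: os.freemem(),
  };
}

http.createServer((req, res) => {
  if (req.url === '/metrics') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    return res.end(JSON.stringify(metrics()));
  }
  res.writeHead(404);
  res.end();
}).listen(9100);

Example: exposing basic health metrics for a monitoring tool to poll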

For more on how to install reporting tools, please read “How to install reporting tools”. More on customizing monitoring tools: “How to monitor deployed applications for reporting and quick response time”, “Notifications via email or custom scripts”, logging for issue discovery and traceability, and “Monitoring nodejs applications”.

It is a good idea to use a monitoring tool outside the application's environment (or server). This strategy bails us out when the downtime affects the same rack of servers, or an entire data center. However, monitoring tools deployed on the same server have the advantage of taking a better pulse of the environment the application is deployed on. A winning strategy is deploying both solutions, so that notifications can go out even when an entire data center has a downtime.

Other tools monitoring Uptime ~ Monitoring-dashboard

Recovering from failure

Recovering from a failure is really a broad term. One of the best ways to recover from a failure is to have no failure at all. Alternatively, it is possible to spread the risk of failure across multiple layers, and design a recovery mechanism around each individual layer.

To elaborate on what was stated above: a database server crash creates a domino effect that makes the application server crash as well. When the database server is in fact a cluster of servers, it becomes hard for all the database servers in the pool to crash at the same time. The risk of a database server crash is spread across multiple servers, and is therefore minimal, consequently reducing the chances of crashing the entire service at once. The remaining vulnerability of this approach is linked to the region, or data center, the database servers are hosted in.

Another example in the nodejs world: instead of having a single-process server, spawn multiple processes. That way, when one server process dies, the remaining server processes take over while waiting for the crashed process to recover, or for the system to spin up a new server process. This is made easy with the cluster API, or by leveraging an nginx load balancer to redirect traffic to processes that are still alive.
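
A minimal sketch of the cluster approach follows: the master respawns any worker that dies, so a single crashing process does not take the whole service down.

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // one worker per CPU core
  os.cpus().forEach(() => cluster.fork());

  cluster.on('exit', (worker, code) => {
    console.log(`worker ${worker.process.pid} died (code ${code}), starting a new one`);
    cluster.fork(); // self-healing: replace the dead worker
  });
} else {
  http.createServer((req, res) => res.end('ok')).listen(3000);
}

Example: self-healing worker processes with the cluster API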

Related reading: How to leverage the cluster API to avoid deployment-time downtimes. Configuring nginx for resiliency discusses how to reboot nginx every time it dies for some reason. How to configure nginx to serve nodejs applications. How to leverage streams to avoid deployment downtimes.

Rollback Strategy

How to rollback a deployment from a version that fails to a version that works

The best strategy to roll back a failing deployment is obviously not to have a failing deployment in the first place. But when that happens, the backup plan is to catch those failures as early as possible and flip the switch back to the systems that work. This kind of strategy was made popular by the blue/green deployment strategy.

It is possible to run a canary version, where a less mature product gets a taste of the harsh environment it will be running in, on a limited number of customers. The cohort enrolled in running the canary build should be tolerant of the shortcomings of the canary product they are using. It is quite reasonable to call these customers beta testers as well.

Another alternative is to release and deploy versioned products. This is especially the case for static assets (SPAs and API endpoints, to name a few). In case a version is broken, the client can simply switch to a version that works while waiting for a patch to land and fix the broken version.

Conclusion

In this article, we revisited how to achieve zero downtime by leveraging tools that either already exist in the nodejs runtime or are available free of charge as open-source software. There are additional complementary materials in the “Testing nodejs applications” book.

References

It is quite a challenge to add good-looking code snippets to documentation, presentations, or books.

When adding code to documentation, the classic move is to take a screenshot, use tool X to add extra pictograms to the code, upload the picture, and hope for the best that there is no bug in our example. To solve the versioning issue, some copy/paste from gist into the content, just to avoid the penalty that comes with using iframes. Neither approach is sustainable in terms of workflow.

This article will explore scalable, sane ways to share good-looking code snippets, with the possibility of editing the shared code, without the fear or hassle of keeping track of all the places the code has been shared.

In this article you will learn about:

  • Tools available to make good code snippets to add in various presentation media
  • Techniques to leverage tools without breaking the piggy bank

Even though this blog post was designed to offer complementary materials to those who bought my Testing nodejs Applications book, the content can help any software developer to tune up their working environment. You can use this link to buy the book.

Using screenshots. Screenshots are the most obvious and ubiquitous code snippet sharing mechanism. The image produced from a snippet can easily be embedded in any medium, be it a social media post, a blog post, a presentation, or even a book. However, when the time comes to add something to the code, or to correct an error, the scalability/sustainability problem of this practice manifests itself. It is not effective to snap another screenshot and re-share it everywhere the previous one has been shared. Moreover, the code hardly looks good inside the editor: developers' editors are most of the time optimized for work, not for presentations.

Using carbon. Carbon is a really powerful tool. It makes it possible to add, edit, and share quite stunning representations of code, as pictures/SVGs. It allows formatting the code to the author's taste, and makes the code easily shareable as SVG or image elements. Since the platform that generates the code is hosted on somebody else's servers, it becomes quite a challenge to keep track of code changes in every shared snippet.

Using gist. Gist allows hosting/tracking, editing, and sharing code snippets with the world. Sharing the code within blog posts and heavy presentations comes with a downside: gist uses inline iframes, which sometimes do not look good on mobile, and are also a real pain performance-wise.

There are solutions for sharing code snippets around the internet, but some platforms do not allow their users to embed them, such as gist embeds.

Synopsis

Quite often, code snippets are posted on this and other dev platforms. Graphical annotations (drawings and text) on top of snippets make the code easier to digest, therefore conveying more knowledge to readers.

Hopes

Codepen.io gives a way to take and export a screenshot. There are also highlighter libraries like highlight.js, in addition to plugins that allow highlighting in presentations and text processors.
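
As a small sketch, assuming the highlight.js package, a raw snippet can be turned into highlighted HTML that is easy to drop into a blog post or slide deck.

const hljs = require('highlight.js');

const snippet = "function greet(name) { return 'hello ' + name; }";

const { value, language } = hljs.highlightAuto(snippet); // auto-detects the language
console.log(language); // e.g. 'javascript'
console.log(`<pre><code class="hljs ${language}">${value}</code></pre>`);

Example: generating highlighted HTML from a raw snippet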

Examples

This gem comes from the teropa article “refactoring angular apps”.

Problem

From the previous examples and statements, one may ask: “How do people add graphical annotations on top of code snippets?”. In case of an error in a code sample, we tend to go back and fix the example. This forces authors to go back and edit the graphical annotations before updating the documentation with the right snippets. Is it possible to reduce the time it takes to make such modifications?

Tools

Non-exhaustive list of some products:

Non-exhaustive list of some presentation software, other than Keynote, Google Slides, and Microsoft PowerPoint (if that is still a thing):

References

I read most of these links without luck.

Related Community Questions

#snippets #CodeAnnotations #carbon #codepen #pens #jsfiddle #fiddles #HowTo #TestingNodejsApplications

This article revisits essentials on how to install upstart, an event-based daemon for starting/stopping tasks on development and production servers.

This article provides complementary materials to the Testing nodejs Applications book. It is designed to help both those who already bought the book and the wider audience of software developers to set up a working environment. You can grab a copy of the book on this link.

There is a plethora of task execution solutions, for instance systemd and init, that are rather complex to work with. That makes upstart a good alternative to such tools.

In this article you will learn about:

  • Tools available for task execution
  • How to install the upstart task execution daemon
  • How to write a basic upstart task

Installing upstart on Linux

It is always a good idea to update the system before starting work. There is no exception, even when a daily task automatically updates the binaries. That can be achieved on Ubuntu and aptitude-enabled systems as follows:

$ apt-get update # Fetches the list of available updates
$ apt-get upgrade # Upgrades currently installed packages
$ apt-get dist-upgrade # Upgrades packages, handling changed dependencies as needed

Example: updating aptitude binaries

At this point, most packages should be installed or upgraded, except packages whose PPA has been removed or is no longer available in the registry. Installing software can be done by installing binaries directly, or by using the Ubuntu package manager.

Installing upstart on Linux using apt

Installing upstart on macOS

upstart is a utility designed mainly for Linux systems. However, macOS has its equivalent, launchctl, designed to start/stop processes before/after the system restarts.

Installing upstart on a Windows machine

Whereas macOS and Linux systems are quite relaxed when it comes to working with system processes, Windows is a beast in its own way. upstart was built for *nix systems, but Windows has its own equivalent: the Service Control Manager. It basically has the same ability to check and restart processes that are failing.

Automated upgrades

Before we dive into automatic upgrades, we should consider the nuances associated with managing this kind of system daemon. The updates fall into two major, quite interesting, categories: patch updates and version upgrades.

Following SemVer ~ aka the Semantic Versioning standard, it is recommended that only even (pair) minor versions be considered for version upgrades. This is because minor versions, as well as major versions, are subject to introducing breaking changes or incompatibilities between two versions. Patches, on the other hand, do not introduce breaking changes; those can therefore be automated.

For a critical infrastructure piece of the process state management calibre, we expect breaking changes when a new version introduces, or drops, a configuration setting between two successive versions. Upstart provides backward compatibility, so the chances of breaking changes between two minor versions are really minimal.

We should highlight that it is always better to upgrade at deployment time; the process is even easier in a containerized context. We should also automate only patches, so as not to miss security patches.

In the context of Linux, we will use the unattended-upgrades package to do the work.

$ apt-get install unattended-upgrades apticron

Example: install unattended-upgrades

Two things to fine-tune to make this solution work are: one, a blacklist of packages we do not want to update automatically, and two, the particular packages (origins) we would love to update on a periodical basis. That is captured in the following configuration files.

Unattended-Upgrade::Allowed-Origins {
//  "${distro_id}:${distro_codename}";
    "${distro_id}:${distro_codename}-security"; # upgrading security patches only 
//   "${distro_id}:${distro_codename}-updates";  
//  "${distro_id}:${distro_codename}-proposed";
//  "${distro_id}:${distro_codename}-backports";
};

Unattended-Upgrade::Package-Blacklist {
    "vim";
};

Example: fine-tune the blacklist and whitelist in /etc/apt/apt.conf.d/50unattended-upgrades

The next step is necessary to make sure the unattended-upgrades download, install, and cleanup tasks have a default period: once or twice a day, or once a week.

APT::Periodic::Update-Package-Lists "1";            # Updates package list once a day
APT::Periodic::Download-Upgradeable-Packages "1";   # download upgrade candidates once a day
APT::Periodic::AutocleanInterval "7";               # clean week worth of unused packages once a week
APT::Periodic::Unattended-Upgrade "1";              # install downloaded packages once a day

Example: tuning the tasks parameter /etc/apt/apt.conf.d/20auto-upgrades

This approach works on Linux (Ubuntu), especially when deployed in production, but not on Windows or macOS. The last issue is being able to report problems when an update fails, so that a human can intervene whenever possible. That is where the second tool from the first paragraph, apticron, comes in. To make it work, we specify which email to send messages to, and that is all.

EMAIL="<email>@<host.tld>"

Example: tuning reporting tasks email parameter /etc/apticron/apticron.conf

Conclusion

In this article we revisited ways to install upstart on various platforms. Even though configuration was beyond the scope of this article, we managed to get everyday quick refreshers out.

References

#nodejs #homebrew #UnattendedUpgrades #nginx #y2020 #Jan2020 #HowTo #ConfiguringNodejsApplications #tdd #TestingNodejsApplications

This article revisits essentials on how to install the monit monitoring system on production servers.

This article provides complementary materials to the Testing nodejs Applications book. It is designed to help both those who already bought the book and the wider audience of software developers to set up a working environment. You can grab a copy of the book on this link.

There is a plethora of monitoring and logging solutions around the internet. This article will not focus on any of those; rather, it provides alternatives using tools already available in Linux/UNIX environments that may achieve nearly the same capabilities as any of those solutions.

In this article you will learn about:

  • Difference between logging and monitoring
  • Tools available for logging
  • Tools available for monitoring
  • How to install monitoring and logging tools
  • How to connect end-to-end reporting for faster response times.

Installing monit on Linux

It is always a good idea to update the system before starting work. There is no exception, even when a daily task automatically updates the binaries. That can be achieved on Ubuntu and aptitude-enabled systems as follows:

$ apt-get update # Fetches the list of available updates
$ apt-get upgrade # Upgrades currently installed packages
$ apt-get dist-upgrade # Upgrades packages, handling changed dependencies as needed

Example: updating aptitude binaries

At this point, most packages should be installed or upgraded, except packages whose PPA has been removed or is no longer available in the registry. Installing software can be done by installing binaries directly, or by using the Ubuntu package manager.

Installing monit on Linux using apt

Installing monit on macOS

In case homebrew is not already available on your Mac, this is how to get it up and running. On its own, homebrew depends on the ruby runtime being available.

homebrew is a package manager and software installation tool that makes most developer tools installation a breeze.

$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Example: installation instruction as provided by brew.sh

Generally speaking, this is how to install/uninstall things with brew

$ brew install wget 
$ brew uninstall wget 

Example: installing/uninstalling wget binaries using homebrew

We have to stress the fact that Homebrew installs packages into their own directory and then symlinks their files into /usr/local.

It is always a good idea to update the system before starting work, even when a daily task automatically updates the system for us. macOS can use the homebrew package manager for maintenance matters. To update/upgrade or check for outdated packages, the following commands help.

$ brew outdated                   # lists all outdated packages
$ brew cleanup -n                 # previews the list of things that would be cleaned up

$ brew update                     # updates brew itself and the list of available formulae
$ brew upgrade                    # upgrades all outdated packages on the system
$ brew upgrade <formula>          # upgrades one formula

$ brew install <formula@version>    # installs <formula> at a particular version
$ brew tap <user/repo>              # adds a third-party repository of formulae

# untap/re-tap a repo when a previous installation failed
$ brew untap <user/repo> && brew tap <user/repo>
$ brew services start <formula>@<version>

Example: key commands to work with homebrew cli

For more information, visit: Homebrew ~ FAQ.

Installing monit on macOS using homebrew

It is hard to deny the supremacy of monit on *NIX systems, and that does not exclude macOS. Installing monit on macOS using homebrew aligns with the homebrew installation guidelines. Following the templates above, the next example shows how easy it is to have monit up and running.

$ brew install monit        # Installation of latest monit
$ brew services start monit # Starting latest monit as a service 

Example: installing monit using homebrew

Installing monit on a Windows machine

Whereas macOS and Linux systems are quite relaxed when it comes to interacting with processes, Windows is a beast in its own way. monit was built for *nix systems, but Windows has its own equivalent: the Service Control Manager. It basically has the same ability to check and restart processes that are failing.

Automated upgrades

Following SemVer ~ aka the Semantic Versioning standard, it is not recommended to consider minor/major versions for automated upgrades. One of the reasons is that these versions are subject to introducing breaking changes or incompatibilities between two versions. Patches, on the other hand, are less susceptible to introducing breaking changes, hence ideal candidates for automated upgrades. Another reason, among others, is that security fixes are released as patches to a minor version.

For a critical infrastructure piece such as monitoring, we expect breaking changes when a new version introduces, or drops, a configuration setting between two successive versions. Monit is well-thought-out software that provides backward compatibility, so the chances of breaking changes between two minor versions are really minimal.

We should highlight that it is always better to upgrade at deployment time; the process is even easier in a containerized context. We should also automate only patches, so as not to miss security patches.

In the context of Linux, we will use the unattended-upgrades package to do the work.

$ apt-get install unattended-upgrades apticron

Example: install unattended-upgrades

Two things to fine-tune to make this solution work are: one, a blacklist of packages we do not want to update automatically, and two, the particular packages (origins) we would love to update on a periodical basis. That is captured in the following configuration files.

Unattended-Upgrade::Allowed-Origins {
//  "${distro_id}:${distro_codename}";
    "${distro_id}:${distro_codename}-security"; # upgrading security patches only 
//   "${distro_id}:${distro_codename}-updates";  
//  "${distro_id}:${distro_codename}-proposed";
//  "${distro_id}:${distro_codename}-backports";
};

Unattended-Upgrade::Package-Blacklist {
    "vim";
};

Example: fine-tune the blacklist and whitelist in /etc/apt/apt.conf.d/50unattended-upgrades

The next step is necessary to make sure the unattended-upgrades download, install, and cleanup tasks have a default period: once or twice a day, or once a week.

APT::Periodic::Update-Package-Lists "1";            # Updates package list once a day
APT::Periodic::Download-Upgradeable-Packages "1";   # download upgrade candidates once a day
APT::Periodic::AutocleanInterval "7";               # clean week worth of unused packages once a week
APT::Periodic::Unattended-Upgrade "1";              # install downloaded packages once a day

Example: tuning the tasks parameter /etc/apt/apt.conf.d/20auto-upgrades

This approach works on Linux (Ubuntu), especially when deployed in production, but not on Windows or macOS. The last issue is being able to report problems when an update fails, so that a human can intervene whenever possible. That is where the second tool from the first paragraph, apticron, comes in. To make it work, we specify which email to send messages to, and that is all.

EMAIL="<email>@<host.tld>"

Example: tuning reporting tasks email parameter /etc/apticron/apticron.conf

Conclusion

In this article we revisited ways to install monit on various platforms. Even though configuration was beyond the scope of this article, we managed to get everyday quick refreshers out.

Reading list and References

#nodejs #homebrew #UnattendedUpgrades #monit #y2020 #Jan2020 #HowTo #ConfiguringNodejsApplications #tdd #TestingNodejsApplications
