Simple Engineering

mongodb

Sometimes, changes in code involve changes in models. Fields can be added or removed depending on the requirements at hand. This blog post explores some techniques to make versioning work with mongodb models.

There is a more general Database Maintenance, Data Migrations, and Model Versioning article that goes beyond mongodb models.

In this article we will talk about:

  • Model versioning strategies
  • Avoiding model versioning colliding with database engine upgrades
  • Migration strategy for model upgrades with schema change
  • Migration strategy for models with hydration
  • Tools that make model migrations easier

Even though this blog post was designed to offer complementary material to those who bought my Testing nodejs Applications book, the content can help any software developer tune up their working environment. You can use this link to buy the book.

Show me the code

These snippets illustrate the evolution of one fictitious UserSchema. A schema describes what a model will look like once compiled and ready to be used with the mongodb database engine.

// Each snippet below replaces the previous one as the model evolves.
var mongoose = require('mongoose');

// Data Model Version 1.0.0
var UserSchema = new mongoose.Schema({name: String});

// Data Model Version 1.0.1
var UserSchema = new mongoose.Schema({name: String, email: String});

// Data Model Version 1.1.0
var UserSchema = new mongoose.Schema({
    name: String,
    email: {type: String, required: true}
});

// Data Model Version 2.0.0
// Address and Order are sub-schemas assumed to be defined elsewhere.
var UserSchema = new mongoose.Schema({
    name: {first: String, last: String},
    addresses: [Address],
    orders: [Order]
});

module.exports = mongoose.model('User', UserSchema);

Example: Evolution of a mongoose data model

What can possibly go wrong?

It is common to execute software updates in bulk, especially when the application is a monolith. The term bulk is used for lack of a better word, but the idea behind it can be summed up as the need to update data models, hydrate existing data into the new models, and possibly update the database engine, all at the same time.

It becomes clear that, when we have to update more than two things at the same time, complex operations get involved, and the more complex the update gets, the nastier the problems become.

When trying to figure out how to approach a migration, whether from one model version to the next, from one low/high level ORM/ODM (mongoose, knex, sequelize) version to the next, from one database engine version to the next, or from one database driver version to the next, we should always keep in mind some of these challenges (questions):

  • When is the right time to do a migration?
  • How do we automate data transformations from one model version to the next?
  • What is the difference between an update and an upgrade in our particular context?
  • What are the bottlenecks (moving parts) for the current database update/upgrade?
  • How can we align model versioning and data migrations with database updates/upgrades/patches?

The key strategy to tackle difficult situations, at least in the context of this blog post series, has been to split big problems into sub-problems, then resolve one sub-problem at a time.

Update vs Upgrade

Database updates and patches are released on a regular basis. They are generally safe and do not cause major problems when the time comes to apply them. From a system maintenance perspective, it makes sense to apply patches as soon as they come out, and on a regular, repeatable basis. For example, every Friday at midnight, a task can apply a patch to the database engine. At this point, there is one issue off our plate. How about database upgrades?

Upgrades

Avoiding model versioning colliding with other database-related upgrades ~ Any upgrade can carry breaking changes: some are minor, others are really serious, such as data format incompatibility. Since upgrades can cause harm, it makes sense NOT to do upgrades at the same time as model version changes. The upgrades in question are ORM/ODM, database driver, and database engine upgrades. Since they are not frequent, they can be planned once every quarter, as a part of sprint cleaning, depending on the schedule of the software we are talking about. That leaves a window to execute, test, and adapt if necessary.

Model versioning strategies

As expressed in the sample code, the evolution of data-driven applications goes hand in hand with schema evolution. As the application grows, some earlier decisions turn out to be detrimental and need corrective measures in later iterations. Some new features also require revisiting the schema. In all cases, the model schema will have to change to adapt to new realities. The complexity of a schema change depends on how complex the addition or removal turns out to be.

To reduce complexity and technical debt, every deployment should involve steps to apply schema changes and re-hydrate data into the new models. When possible, features that require a schema change can be moved to a minor (Major.Minor.Patch) release, whereas everyday releases (in continuous delivery mode) can be just patches. Similarly, major releases can include ORM/ODM upgrades, database driver upgrades, database engine upgrades, and data migration from an old system to a new one. It is NOT good to include model changes in a major release; we can keep those in minor releases.
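The release rules described in this section can be sketched as a small helper. This is only a minimal sketch, not an established tool: the change kinds (`schema`, `odm`, `driver`, `engine`) and the function name `releaseTypeFor` are hypothetical names chosen for illustration.

```javascript
'use strict';

// Hypothetical change kinds, named for this illustration only.
const UPGRADE_KINDS = ['odm', 'driver', 'engine'];

// Returns the release type ('major', 'minor' or 'patch') suggested by the
// rules above, and refuses plans that mix model changes with upgrades.
function releaseTypeFor(changes) {
  const hasUpgrade = changes.some((kind) => UPGRADE_KINDS.includes(kind));
  const hasSchemaChange = changes.includes('schema');

  if (hasUpgrade && hasSchemaChange) {
    throw new Error('Do not ship model changes together with ODM, driver or engine upgrades');
  }
  if (hasUpgrade) return 'major';
  if (hasSchemaChange) return 'minor';
  return 'patch';
}

console.log(releaseTypeFor(['bugfix']));           // patch
console.log(releaseTypeFor(['schema', 'bugfix'])); // minor
console.log(releaseTypeFor(['engine']));           // major
```

Throwing on a mixed plan is a design choice made for this sketch: it turns the "never at the same time" rule into a check that can run before a release is cut.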

Migration strategy for model upgrades with schema change

From the previous sections, it makes sense to keep model upgrades with schema changes as a minor release task, whether they imply data hydration or not.

Migration strategy for model upgrades with data hydration

Data hydration is necessary when the data structure has changed to remove fields, split fields, or add embedded documents. Data hydration may not be necessary when the schema change only relaxes validation or availability constraints. However, if a field becomes required, then it makes sense to add a rehydration strategy. It is better to execute hydration every time there is a minor release, even when not strictly necessary.
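As an illustration, a hydration step for the fictitious UserSchema, moving from version 1.1.0 to 2.0.0, could look like the following. It is a minimal sketch working on plain objects rather than on a live database: the function name `hydrateUserToV2` and the whitespace-based name split are assumptions made for this example.

```javascript
'use strict';

// Re-hydrates a user document from the 1.1.0 shape ({name: String, email})
// into the 2.0.0 shape ({name: {first, last}, addresses, orders}).
// Splitting the name on whitespace is an assumption made for illustration.
function hydrateUserToV2(user) {
  // Already in the 2.0.0 shape: return unchanged so the step is safe to repeat.
  if (user.name !== null && typeof user.name === 'object') {
    return user;
  }

  const parts = String(user.name || '').trim().split(/\s+/);
  const first = parts.shift() || '';
  const last = parts.join(' ');

  return {
    _id: user._id,
    name: { first: first, last: last },
    addresses: [], // new embedded documents start empty
    orders: []     // email is dropped because version 2.0.0 no longer declares it
  };
}

// name becomes { first: 'Ada', last: 'Lovelace' }
console.log(hydrateUserToV2({ _id: 1, name: 'Ada Lovelace', email: 'ada@example.com' }));
```

Because the function leaves already-migrated documents untouched, running it on every minor release, even when nothing changed, does no harm.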

Tools that make model upgrade easy

There are some libraries that can be used to execute data migration/hydration as a part of a model upgrade operation. node-migrate is one of them. Advanced tools for relational databases, such as flywaydb, can also be used. When it comes to model upgrades, a consistent, repeatable strategy gives more bang for your buck than a full-fledged solution in the wild.
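To make "consistent, repeatable" concrete, here is a minimal sketch of a hand-rolled migration runner. It is not the node-migrate or flywaydb API: the names `runMigrations` and `compareVersions`, the `state.applied` bookkeeping array, and the in-memory `users` list are all hypothetical stand-ins for a real collection and a real migrations log.

```javascript
'use strict';

// Compares two 'Major.Minor.Patch' strings numerically.
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// Applies pending migrations in version order and records what ran,
// so running the same list twice does not repeat any step.
function runMigrations(migrations, state, data) {
  const pending = migrations
    .filter((m) => !state.applied.includes(m.version))
    .sort((a, b) => compareVersions(a.version, b.version));

  for (const migration of pending) {
    migration.up(data);
    state.applied.push(migration.version);
  }
  return pending.map((m) => m.version);
}

const migrations = [
  { version: '2.0.0', up: (users) => users.forEach((u) => { u.addresses = []; }) },
  { version: '1.0.1', up: (users) => users.forEach((u) => { u.email = u.email || null; }) }
];
const state = { applied: [] };
const users = [{ name: 'Ada' }];

console.log(runMigrations(migrations, state, users)); // [ '1.0.1', '2.0.0' ]
console.log(runMigrations(migrations, state, users)); // []
```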

Conclusion

In this article, we revisited how to align schema versioning with mongodb releases, taking into consideration data migration and hydration, as well as tools to make data handling easier. There are additional complementary materials in the “Testing nodejs applications” book.

tags: #mongodb #mongoose #migration #data-migration #model-migration #nodejs

This article revisits the essentials of installing mongodb, one of the leading NoSQL databases, on development and production servers.

This article has complementary materials to the Testing nodejs Applications book. It is designed to help those who bought the book set up their working environment, as well as the wider audience of software developers who need the same information.

In this article we will talk about:

  • How to install mongodb on Linux, macOS and Windows
  • How to stop/start mongodb automatically before/after system restarts

We will not talk about:

  • How to configure mongodb for development and production, as that is the subject of another article worth visiting
  • How to manage mongodb in a production environment, in either containerized or standalone contexts
  • How to load mongodb on Docker and Kubernetes

Installing mongodb on Linux

It is always a good idea to update the system before starting work. There is no exception, even when a daily task automatically updates binaries. That can be achieved on Ubuntu and other apt-enabled systems as follows:

$ apt-get update        # Fetch the list of available updates
$ apt-get upgrade       # Upgrade currently installed packages
$ apt-get dist-upgrade  # Upgrade packages, also installing or removing dependencies as needed

Example: updating packages with apt-get

At this point most packages should be installed or upgraded, except packages whose PPA has been removed or is not available in the registry. Installing software can be done by installing binaries, or by using the Ubuntu package manager.

Installing mongodb on Linux using apt

Updating/upgrading mongodb, or installing it fresh for the first time, can follow the next scripts.

sudo may be skipped if the current user has permission to write and execute programs.

# Add the public key used by apt to verify further updates
# gnupg should be available in the system
$ sudo apt-get install gnupg
$ wget -qO - https://www.mongodb.org/static/pgp/server-3.6.asc | sudo apt-key add -

# Create and add the sources list for mongodb (version 3.6 here; the release codename and architecture vary from system to system)
$ echo "deb [ arch=amd64 ] https://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.6 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.6.list

# Update the package lists and do the actual install
$ sudo apt-get update
$ sudo apt-get install -y mongodb-org

# To install a specific version of mongodb (3.6.17 in our example), the following command helps
$ sudo apt-get install -y mongodb-org=3.6.17 mongodb-org-server=3.6.17 mongodb-org-shell=3.6.17 mongodb-org-mongos=3.6.17 mongodb-org-tools=3.6.17

Example: adding mongodb PPA binaries and installing a particular binary version

It is always a good idea to upgrade often. Breaking changes happen on major/minor binary updates, but are less likely on patch upgrades. Release series go by even numbers, so 3.2, 3.4, 3.6, etc. A transition that skips an intermediate version may be catastrophic. For example, to upgrade from an older 3.x to 3.6, there should first be an intermediate upgrade from 3.x to 3.4, after which the upgrade from 3.4 to 3.6 becomes possible.

# Part 1
$ apt-cache policy mongodb-org               # Check the installed and candidate MongoDB versions
$ sudo apt-get install -y mongodb-org=3.4    # Install the 3.4 version (use the full version string reported by apt-cache policy)

# Part 2
# Restart mongodb on the new binaries
$ sudo killall mongod && sleep 3 && sudo service mongod start
$ sudo service mongodb start                 # Only if the service is registered as mongodb rather than mongod

# Part 3
$ mongo                                      # Open the mongo CLI

# Compatibility mode: enable 3.4 features before moving on to 3.6
> db.adminCommand( { setFeatureCompatibilityVersion: "3.4" } )
> exit

# Part 4
$ sudo apt-get install -y mongodb-org=3.6    # Upgrade to the latest 3.6 version
# Restart the server as in Part 2.

Example: updating mongodb binaries and upgrading to a version
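The "one release series at a time" rule can be sketched as a small planning helper. This is a minimal sketch under stated assumptions: the `SERIES` list and the function name `upgradePath` are hypothetical, and the list should be adapted to the versions that matter for a given deployment.

```javascript
'use strict';

// Release series in upgrade order. This list is an assumption for the
// example; extend it to match the versions relevant to your deployment.
const SERIES = ['3.0', '3.2', '3.4', '3.6'];

// Returns every series to install, in order, to get from `from` to `to`
// without skipping an intermediate release series.
function upgradePath(from, to) {
  const start = SERIES.indexOf(from);
  const end = SERIES.indexOf(to);
  if (start === -1 || end === -1 || end < start) {
    throw new Error('Unsupported upgrade: ' + from + ' -> ' + to);
  }
  return SERIES.slice(start + 1, end + 1);
}

console.log(upgradePath('3.2', '3.6')); // [ '3.4', '3.6' ]
```

Each entry in the returned list corresponds to one pass through the install, restart and setFeatureCompatibilityVersion steps shown above.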

Installing mongodb on macOS

In case homebrew is not already available on your mac, this is how to get it up and running. Homebrew itself depends on the ruby runtime being available.

homebrew is a package manager and software installation tool that makes installing most developer tools a breeze. We should also highlight that homebrew requires xcode to be available on the system.

$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Example: installation instruction as provided by brew.sh

Generally speaking, this is how to install and uninstall things with brew:

$ brew install wget 
$ brew uninstall wget 

Example: installing/uninstalling wget binaries using homebrew

We have to stress the fact that Homebrew installs packages into their own directory and then symlinks their files into /usr/local.

It is always a good idea to update the system before starting work, even when we have a daily task that automatically updates the system for us. macOS can use the homebrew package manager for maintenance matters. To update/upgrade or check outdated packages, the following commands help.

$ brew outdated                   # List all outdated packages
$ brew cleanup -n                 # Dry run: show what would be cleaned up

$ brew update                     # Update brew itself and its formula definitions
$ brew upgrade                    # Upgrade all outdated packages
$ brew upgrade <formula>          # Upgrade one formula

$ brew install <formula>@<version>  # Install <formula> at a particular version
$ brew tap <user>/<repo>            # Add a third party repository (tap)

# untap/re-tap a repo when a previous installation failed
$ brew untap <user>/<repo> && brew tap <user>/<repo>
$ brew services start <formula>@<version>

Example: key commands to work with homebrew cli

For more information, visit: Homebrew ~ FAQ.

Installing mongodb on a Mac using homebrew

$ brew tap mongodb/brew 
$ brew install mongodb-community@3.6
$ brew services start mongodb-community@3.6 # start mongodb as a mac service 

Example: Installing and running mongodb as a macOS service

Caveats ~ We have extra steps to take in order to start/stop mongodb automatically when the system goes up/down. This step is vital when doing development on macOS, which does not necessarily need Linux production-bound task runners.

# To have launchd start mongodb at login
# (paths assume the formula is named mongodb; adjust them to the formula actually installed):
$ ln -s /usr/local/opt/mongodb/*.plist ~/Library/LaunchAgents/

# Then to load mongodb now:
$ launchctl load -w ~/Library/LaunchAgents/homebrew.mxcl.mongodb.plist

# To unregister and stop the service, use the following command
$ launchctl unload -w ~/Library/LaunchAgents/homebrew.mxcl.mongodb.plist

# When launchctl is not wanted/needed, this command works fine
$ mongod

Example: Stop/Start when the system stops/starts

Installing mongodb on a Windows machine

Whereas macOS and most Linux distributions come with Python and Ruby already enabled, it takes an extra mile on Windows to make those two languages available. We have to stress the fact that those two languages are somehow required to deploy a mongodb environment on most platforms, especially when working with nodejs.

The bright side of this story is that mongodb provides Windows binaries that can be downloaded and installed in a couple of clicks.

Automated upgrades

Before we dive into automatic upgrades, we should consider the nuances associated with managing a mongodb instance. The updates fall into two major, quite interesting, categories: patch updates and version upgrades.

Following the SemVer ~ aka Semantic Versioning standard, it is not recommended to consider minor/major versions for automated upgrades. One of the reasons is that these versions are likely to introduce breaking changes or incompatibilities between two versions. Patches, on the other hand, are less likely to introduce breaking changes, which makes them ideal candidates for automated upgrades. Another reason is that security fixes are released as patches to a minor version.

We should highlight that it is always better to upgrade at deployment time. The process is even easier in a containerized context. We should also automate patches only, so that security patches are not missed.
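The "automate patches only" rule can be expressed as a small check. This is a minimal sketch, assuming plain Major.Minor.Patch version strings; the function names `parseVersion` and `isPatchUpdate` are hypothetical and not part of any package manager.

```javascript
'use strict';

// Parses 'Major.Minor.Patch' into numbers.
function parseVersion(version) {
  const [major, minor, patch] = version.split('.').map(Number);
  return { major, minor, patch };
}

// True only when the candidate is a newer patch of the same Major.Minor line,
// which is the only kind of change considered safe to automate here.
function isPatchUpdate(current, candidate) {
  const a = parseVersion(current);
  const b = parseVersion(candidate);
  return a.major === b.major && a.minor === b.minor && b.patch > a.patch;
}

console.log(isPatchUpdate('3.6.16', '3.6.17')); // true
console.log(isPatchUpdate('3.4.24', '3.6.17')); // false
```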

In the context of Linux, we will use the unattended-upgrades package to do the work.

$ apt-get install unattended-upgrades apticron

Example: install unattended-upgrades

Two things to fine-tune to make this solution work are: one, to enable a blacklist of packages we do not want to update automatically, and two, to enable the particular packages we would love to update on a periodic basis. That is compiled in the following configuration.

Unattended-Upgrade::Allowed-Origins {
//  "${distro_id}:${distro_codename}";
    "${distro_id}:${distro_codename}-security"; // upgrading security patches only
//   "${distro_id}:${distro_codename}-updates";  
//  "${distro_id}:${distro_codename}-proposed";
//  "${distro_id}:${distro_codename}-backports";
};

Unattended-Upgrade::Package-Blacklist {
    "vim";
};

Example: fine-tune the blacklist and whitelist in /etc/apt/apt.conf.d/50unattended-upgrades

The next step is necessary to make sure the unattended-upgrades download, install and cleanup tasks have a default period: once or twice a day, or once a week.

APT::Periodic::Update-Package-Lists "1";            // update the package list once a day
APT::Periodic::Download-Upgradeable-Packages "1";   // download upgrade candidates once a day
APT::Periodic::AutocleanInterval "7";               // clean up unused package files once a week
APT::Periodic::Unattended-Upgrade "1";              // install downloaded packages once a day

Example: tuning the task parameters in /etc/apt/apt.conf.d/20auto-upgrades

This approach works on Linux (Ubuntu), especially when deployed in production, but not on Windows or macOS. The last issue is to be able to report problems when an update fails, so that a human can intervene whenever necessary. That is where the second tool installed earlier, apticron, comes in. To make it work, we specify which email to send messages to, and that will be all.

EMAIL="<email>@<host.tld>"

Example: tuning the reporting email parameter in /etc/apticron/apticron.conf

Conclusion

In this article we revisited ways to install mongodb on various platforms. Even though configuration was beyond the scope of this article, we managed to get everyday quick refreshers into the article. There are areas where we wish we had added more coverage, but probably not in the first version of this article.

Some other places where people wonder how to install mongodb on various platforms include, but are not limited to, Quora, StackOverflow and Reddit.

#nodejs #homebrew #UnattendedUpgrades #mongodb #y2020 #Jan2020 #HowTo #ConfiguringNodejsApplications #tdd #TestingNodejsApplications