Simple Engineering

data

Sometimes, changes in code involve changes in data models. Fields can be added or removed depending on the requirements at hand. This blog post explores some techniques to make versioning work with mongodb models.

There is a more general Database Maintenance, Data Migrations, and Model Versioning article that goes beyond mongodb models.

In this article we will talk about:

  • Model versioning strategies
  • Avoiding model versioning colliding with database engine upgrades
  • Migration strategy for model upgrades with schema change
  • Migration strategy for models with hydration
  • Tools that make model migrations easier

Even though this blog post was designed to offer complementary materials to those who bought my Testing nodejs Applications book, the content can help any software developer to tune up their working environment. You can use this link to buy the book.

Show me the code

These snippets illustrate the evolution of one fictitious UserSchema. A schema describes what a model will look like once compiled and ready to be used with the mongodb database engine.

var mongoose = require('mongoose');

//Data Model Version 1.0.0
var UserSchema = new mongoose.Schema({name: String});

//Data Model Version 1.0.1 ~ adds an optional email field
var UserSchema = new mongoose.Schema({name: String, email: String});

//Data Model Version 1.1.0 ~ email becomes required
var UserSchema = new mongoose.Schema({
    name: String, 
    email: {type: String, required: true}
});

//Data Model Version 2.0.0 ~ name is split into an embedded document;
//Address and Order are subdocument schemas assumed to be defined elsewhere
var UserSchema = new mongoose.Schema({ 
    name: {first: String, last: String},
    addresses: [Address],
    orders: [Order]
});

module.exports = mongoose.model('User', UserSchema);

Example: Evolution of a mongoose data model

What can possibly go wrong?

It is common to execute software updates in bulk, especially when the application is a monolith. The term bulk is used for lack of a better word; the idea behind it boils down to a need to update data models, coupled with hydrating data into the new models, with a potential update of the database engine, all of those tasks at the same time.

It becomes clear that when we have to update more than two things at the same time, complex operations get involved, and the more complex the update gets, the nastier the problems become.

When trying to figure out how to approach a migration from one model version to the next, from one low/high-level ORM/ODM (mongoose, knex, sequelize) version to the next, from one database engine version to the next, or from one database driver version to the next, we should always keep in mind some of these challenges (questions):

  • When is the right time to do a migration?
  • How do we automate data transformations from one model version to the next?
  • What is the difference between an update and an upgrade in our particular context?
  • What are the bottlenecks (moving parts) of the current database update/upgrade?
  • How can we align model versioning and data migrations with database updates/upgrades/patches?

The key strategy for tackling difficult situations, at least in the context of this blog post series, has been to split big problems into sub-problems, then resolve one sub-problem at a time.

Update vs Upgrade

Database updates and patches are released on a regular basis; they are safe and do not cause major problems when the time comes to apply them. From a system maintenance perspective, it makes sense to apply patches as soon as they come out, and on a regular, repeatable basis. For example, every Friday at midnight, a scheduled task can apply patches to the database engine. At this point, there is one issue off our plate. How about database upgrades?

Upgrades

Avoiding model versioning colliding with other database-related upgrades ~ Any upgrade carries breaking changes; some are minor, others are really serious, such as data format incompatibilities. Since upgrades can cause harm, it makes sense to NOT do upgrades at the same time as model versioning or data migration. Database-related upgrades include ORM/ODM upgrades, database driver upgrades, and database engine upgrades. Since they are not frequent, they can be planned once every quarter, depending on the release schedule of the software in question. It makes sense to have a window to execute, test, and adapt if necessary; once a quarter, as part of sprint cleanup, works well. As a precaution, upgrades should NOT be planned at the same time as model version changes.

Model versioning strategies

As expressed in the sample code, the evolution of data-driven applications goes hand in hand with schema evolution. As the application grows, some earlier decisions turn out to be detrimental and need corrective measures in later iterations, and some new features require revisiting the schema. In all cases, the model schema will have to change to adapt to new realities. The complexity of a schema change depends on how involved the addition or removal turns out to be.

To reduce complexity and technical debt, every deployment should include steps to apply schema changes and re-hydrate data into the new models. When possible, features that require a schema change can be shipped in a minor (Major.Minor.Patch) release, whereas everyday releases (in continuous delivery mode) can be plain patches. Similarly, major releases can include ORM/ODM upgrades, database driver upgrades, database engine upgrades, and data migrations from an old system to a new system. It is NOT a good idea to include model changes in a major release; we keep those in minor releases.
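The release policy above can be sketched as a small helper. This is a hypothetical illustration, not part of the original code; the change-type labels are assumptions chosen for readability:

```javascript
// Hypothetical helper: decide which release level (major/minor/patch)
// a set of changes should land in, following the policy described above.
function releaseLevelFor(changes) {
  // ORM/ODM, driver, engine upgrades and cross-system migrations -> major
  var majorChanges = ['odm-upgrade', 'driver-upgrade', 'engine-upgrade', 'system-migration'];
  if (changes.some(function (c) { return majorChanges.indexOf(c) !== -1; })) {
    return 'major';
  }
  // Schema changes (and any hydration they imply) -> minor
  if (changes.indexOf('schema-change') !== -1) return 'minor';
  // Everything else ships as a patch
  return 'patch';
}

console.log(releaseLevelFor(['schema-change']));  // minor
console.log(releaseLevelFor(['engine-upgrade'])); // major
console.log(releaseLevelFor(['bugfix']));         // patch
```

The point is not the helper itself but the discipline: a release never mixes a model change with a database-related upgrade.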

Migration strategy for model upgrades with schema change

From the previous sections, it makes sense to keep model upgrades with a schema change as a minor release task, whether they imply data hydration or not.

Migration strategy for model upgrades with data hydration

Data hydration is necessary when the data structure has changed to remove fields, split fields, or add embedded documents. Data hydration may not be necessary when the schema change relaxes validity or availability constraints. However, if a field becomes required, it makes sense to add a re-hydration step. It is better to execute hydration on every minor release, even when it is not strictly necessary.
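As a concrete illustration, the 1.x to 2.0.0 change in the sample code splits the single `name` string into `{first, last}`. A hydration step for that change could look like the following sketch; `hydrateUser` is a hypothetical name, and the cursor loop in the comment assumes the `User` model from earlier:

```javascript
// Hypothetical hydration step for the 1.x -> 2.0.0 schema change:
// split the single `name` string into an embedded {first, last} document.
function hydrateUser(doc) {
  if (typeof doc.name !== 'string') return doc; // already hydrated, pass through
  var parts = doc.name.trim().split(/\s+/);
  return Object.assign({}, doc, {
    name: { first: parts[0] || '', last: parts.slice(1).join(' ') }
  });
}

// In a real migration this would run over a cursor, e.g.:
// for await (const user of User.find().lean().cursor()) {
//   await User.updateOne({ _id: user._id }, hydrateUser(user));
// }

console.log(hydrateUser({ name: 'Jane Doe' }));
// → { name: { first: 'Jane', last: 'Doe' } }
```

Keeping the transformation a pure function makes it easy to unit test before pointing it at production data, and the pass-through guard makes it safe to re-run.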

Tools that make model upgrade easy

There are some libraries that can be used to execute data migration/hydration as part of a model upgrade operation. node-migrate is one of them. On relational databases, more advanced tools such as flywaydb can be used. When it comes to model upgrades, a consistent, repeatable strategy gives more bang for your buck than a full-fledged off-the-shelf solution.

Conclusion

In this article, we revisited how to align schema versioning with mongodb database upgrades, taking into consideration data migration and hydration, as well as tools that make data handling easier. There are additional complementary materials in the “Testing nodejs applications” book.


tags: #mongodb #mongoose #migration #data-migration #model-migration #nodejs

Systems monitoring is critical for systems deployed at scale. In addition to traditional monitoring services native to the nodejs ecosystem, this article explores how to monitor nodejs applications using third-party systems in a way that covers the entire stack and provides the overall state in one bird's-eye view.

In this article we will talk about:

  • Data collection tools
  • Data visualization tools
  • Self-healing nodejs systems
  • Popular monitoring stacks

Even though this blog post was designed to offer complementary materials to those who bought my Testing nodejs Applications book, the content can help any software developer to tune up their working environment. You can use this link to buy the book.

Monitoring

Monitoring, custom alerts, and notifications systems

Monitoring overall system health makes it possible to take immediate action when something unexpected happens. Key metrics to look at are CPU usage, memory availability, disk capacity and health, and software errors.

Monitoring systems make it easy to detect, identify, and eventually repair or recover from a failure in a reasonable time. When monitoring production applications, the aim is to respond quickly to incidents. Sometimes incident resolution can be automated: a notification system that actually triggers some sort of script execution to remediate known issues. This sort of system is also called a self-healing system.

Monitoring goes hand in hand with notification ~ alerting the right systems and people either about what is about to happen (early or predictive detection) or about what just happened (near real-time detection) ~ so that remediation action can be taken. We talk about self-healing (or resilient) systems when the system under stress remediates on its own, automatically and without direct human intervention.
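The simplest form of self-healing is a process that detects its own degradation and exits so that a supervisor (pm2, systemd, Kubernetes, and the like) restarts it cleanly. The sketch below illustrates the idea; the heap threshold is an arbitrary assumption, and the interval is left commented out so the snippet stays inert:

```javascript
// Sketch of minimal self-healing: watch the process's own heap and exit
// when it crosses a threshold, letting a supervisor restart it cleanly.
const HEAP_LIMIT_BYTES = 512 * 1024 * 1024; // 512 MB, hypothetical threshold

function isHealthy() {
  // heapUsed is the portion of the V8 heap currently in use
  return process.memoryUsage().heapUsed < HEAP_LIMIT_BYTES;
}

// In a real application this check would run on an interval:
// setInterval(() => { if (!isHealthy()) process.exit(1); }, 30000);

console.log(isHealthy());
```

Exiting on a failed health check only "heals" anything because the supervisor outside the process does the restart; the same check can also back an HTTP health endpoint for external monitors.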

Complex monitoring systems are available for free and for a fee, open source as well as closed source. The following examples provide a few to look into.

It is a good idea to use a monitoring tool deployed outside the application. This strategy falls short when the downtime affects an entire data center or the same rack of servers. However, monitoring tools deployed on the same server have the advantage of taking the pulse of the environment in which the application is deployed. A winning strategy is to deploy both, so that notifications can go out even when an entire data center has downtime.

Conclusion

In this article, we revisited how to achieve a bird's-eye view of full-stack nodejs application monitoring using third-party systems. We highlighted how logging and monitoring complement each other. There are additional complementary materials in the “Testing nodejs applications” book.


#monitoring #nodejs #data-collection #visualization #data-viz