
Add a New Property in Cosmos DB (Millions of Existing Records) Without Downtime

 



Scenario

You have a Cosmos DB collection with 1 million records, and you need to introduce a new property (for example, a discount; the code samples below call it Version2). Your goal is to add this new field without causing downtime, meaning that both the old and the new data should keep working for clients during the transition period.

Step 1: Understand Cosmos DB is Schema-less

Cosmos DB is a NoSQL database, meaning it doesn't enforce a strict schema. Each document (record) can have a different structure. This gives you flexibility in adding or modifying fields without requiring an actual schema migration, unlike relational databases.

In Cosmos DB:

  • Old records can remain as they are (without the new field).
  • New records can be added with the new field.

Your task is to handle this evolution in the application code without downtime.
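
For illustration, here is how old and new documents might coexist in the same container (the field names mirror the Product model used in the next step; the values are made up):

```json
[
  { "id": "101", "Name": "Keyboard", "Price": 25.00 },
  { "id": "102", "Name": "Mouse", "Price": 15.00, "Version2": "some_value" }
]
```

The first document predates the change and the second was written after it; both are valid in the same container.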

Step 2: Update Your Application Model (Add the New Field)

You need to modify your application’s data model (e.g., a C# class) to include the new field (Version2).

using Newtonsoft.Json;  // default serializer for the .NET Cosmos SDK (v3)

public class Product
{
    [JsonProperty("id")]  // Cosmos DB requires a lowercase "id" property on every document
    public string Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }

    // Add the new field (Version2)
    public string Version2 { get; set; }  // New field, null for old records
}
  • Old records will not have the Version2 field.
  • New records will have the Version2 field.

Key Point:

  • Backward Compatibility: Ensure that your code can handle the case where Version2 is null for old records.
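
A minimal sketch of such a fallback, using a hypothetical GetEffectiveVersion2 extension method that substitutes a default when the field is absent:

```csharp
public class Product
{
    public string Id { get; set; }
    public string Version2 { get; set; }  // null for old records
}

public static class ProductExtensions
{
    // Hypothetical helper: treat a missing Version2 as a well-known default
    public static string GetEffectiveVersion2(this Product product) =>
        string.IsNullOrEmpty(product.Version2) ? "default_value" : product.Version2;
}
```

Callers then never need to null-check: old records simply behave as if they carried the default value.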

Step 3: Modify Read and Write Logic in Your Application

Now, you need to ensure that your application works seamlessly with both old and new data (some records have Version2, and some don’t). Update the read and write logic in your application to handle both scenarios.

Read Logic (Backward Compatibility):

When reading from Cosmos DB, check whether the Version2 field exists.

public async Task<Product> GetProductAsync(string id, string partitionKey)
{
    // container is a Microsoft.Azure.Cosmos.Container
    Product product = (await container.ReadItemAsync<Product>(
        id, new PartitionKey(partitionKey))).Resource;

    // Check if the new field (Version2) is present
    if (string.IsNullOrEmpty(product.Version2))
    {
        // Old record: fall back to the pre-Version2 behavior
        Console.WriteLine("Using old data format, without Version2.");
    }
    else
    {
        // New record: apply the Version2 logic
        Console.WriteLine("Using new data format with Version2.");
    }

    return product;
}

Write Logic (Ensure New Records Include Version2):

When writing new records or updating existing records, ensure that the Version2 field is included in the new data.

public async Task SaveProductAsync(Product product, string partitionKey)
{
    // container is a Microsoft.Azure.Cosmos.Container
    // Populate Version2 if it isn't set yet, without overwriting an existing value
    product.Version2 ??= "some_value_for_version2";

    // Upsert creates the document if it doesn't exist, or replaces it if it does
    await container.UpsertItemAsync(product, new PartitionKey(partitionKey));
}

Key Point:

  • This step ensures that your application handles old and new records differently but works seamlessly with both.

Step 4: Gradually Migrate Old Data (Without Downtime)

You can choose one of two methods to gradually populate the Version2 field for the existing 1 million records, without taking your system down.

Option 1: Lazy Update on Read

  • Whenever a record is read by any client, check if the Version2 field is missing.
  • If it's missing, populate the field and update the record in Cosmos DB at the same time.

This way, over time, as records are accessed, they will be updated to include Version2.

public async Task<Product> GetAndUpdateProductAsync(string id, string partitionKey)
{
    // container is a Microsoft.Azure.Cosmos.Container
    Product product = (await container.ReadItemAsync<Product>(
        id, new PartitionKey(partitionKey))).Resource;

    // Check if the new field (Version2) is missing
    if (string.IsNullOrEmpty(product.Version2))
    {
        // Populate the new field
        product.Version2 = "default_value";  // Compute this based on your logic

        // Write the record back to Cosmos DB with the new field
        await container.ReplaceItemAsync(product, product.Id, new PartitionKey(partitionKey));
    }

    return product;
}

This lazy update approach ensures that records are updated incrementally, without needing to run a massive migration upfront.

Option 2: Background Migration Job

  • You can write a background process (or a scheduled job) that iterates through the records in Cosmos DB and updates them with the new Version2 field.
  • The job should run in small batches to avoid overwhelming the system.

public async Task MigrateOldDataAsync()
{
    // container is a Microsoft.Azure.Cosmos.Container
    // Only fetch documents that are still missing Version2
    var query = new QueryDefinition(
        "SELECT * FROM c WHERE NOT IS_DEFINED(c.Version2)");

    using FeedIterator<Product> iterator = container.GetItemQueryIterator<Product>(
        query,
        requestOptions: new QueryRequestOptions { MaxItemCount = 100 });  // small batches

    while (iterator.HasMoreResults)
    {
        FeedResponse<Product> batch = await iterator.ReadNextAsync();

        foreach (Product product in batch)
        {
            // Populate Version2 for old data
            product.Version2 = "default_value";  // Use your business logic to set this

            // Write the document back to Cosmos DB
            await container.UpsertItemAsync(product);
        }
    }
}
  • Run the migration during off-peak hours to minimize performance impact on live clients.
  • This way, you are migrating the data in batches without taking down your application.

Key Point:

  • No Downtime: Both options (lazy update or background migration) ensure that old records are gradually migrated without affecting live clients.

Step 5: Monitor and Ensure Consistency

Monitor the migration process to ensure that the Version2 field is being added to old records as expected. You can log progress and handle errors as part of the migration job.

  • Check how many records have been updated with the new field over time.
  • Indexing: With the default indexing policy, Cosmos DB automatically indexes every property, including the new field, so queries that filter on Version2 stay efficient.
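
One simple way to track progress is to count the documents that are still missing the field; a query like the following (runnable from Data Explorer or via the SDK) returns the number of unmigrated records:

```sql
SELECT VALUE COUNT(1) FROM c WHERE NOT IS_DEFINED(c.Version2)
```

When this count reaches zero, the migration is complete.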

Step 6: Update Client Applications (If Needed)

If your clients are expected to use the new Version2 field, make sure they are updated once the migration is largely complete.

You can use API versioning or feature flags to control when the new field is exposed to clients, ensuring they are not affected by the migration.
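
As an illustration of the feature-flag approach, a hypothetical exposeVersion2 setting (read from configuration or a feature-flag service) can gate whether the field appears in API responses:

```csharp
public class ProductDto
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Version2 { get; set; }  // stays null until the flag is turned on
}

public static class ProductMapper
{
    // exposeVersion2 would come from configuration or a feature-flag service
    public static ProductDto ToDto(string id, string name, string version2, bool exposeVersion2) =>
        new ProductDto
        {
            Id = id,
            Name = name,
            Version2 = exposeVersion2 ? version2 : null
        };
}
```

Flipping the flag after the migration is largely complete lets you expose the field to all clients at once, without a redeploy.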

Summary: Zero Downtime Strategy

Here’s a recap of the steps to introduce the new field in Cosmos DB without downtime:

  1. Add the new field (Version2) to your data model without altering the existing structure.
  2. Update the read and write logic to handle both old (no Version2) and new records (with Version2).
  3. Gradually populate the new field:
    • Option 1: Lazy update on read: Add the field when records are accessed.
    • Option 2: Run a background migration job: Update old records in batches during low traffic times.
  4. Ensure backward compatibility: Your application should work with both old and new data formats throughout the migration process.
  5. Monitor and update clients: Once the migration is largely complete, update your client applications to start using the new field.

By following this step-by-step approach, you can evolve your data schema in Cosmos DB without downtime, ensuring both your application and clients continue working smoothly.
