An issue that came up in the past was that we serialized a huge amount of information in an event. The event contained a structure that had a very innocent-looking property called TimeZoneInfo.
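A minimal sketch of what such a structure might have looked like; every member other than the TimeZoneInfo property is illustrative:

```csharp
using System;

// Illustrative sketch only: the real Cycle had more members, but the
// problematic part is the TimeZoneInfo property.
public struct Cycle
{
    public DateTime StartsOn { get; set; }
    public DateTime EndsOn { get; set; }

    // A full TimeZoneInfo instance, including all of its adjustment rules,
    // ends up in every serialized event that carries a Cycle.
    public TimeZoneInfo TimeZoneInfo { get; set; }
}
```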
After releasing the software, we noticed that the project was taking up an unusually large amount of space. After inspecting a couple of persisted events, we found that each time we used the struct Cycle, we persisted some 6200 lines of serialized JSON, of which about 6000 lines were attributable to the TimeZoneInfo. This severely impacted event serialization and deserialization. The issue came up after we had done the following assignment:
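Something along these lines would trigger it; a minimal sketch, assuming the time zone was resolved through the standard .NET API and reusing the illustrative members from above:

```csharp
// Illustrative example of the kind of assignment that caused the bloat:
// storing a full TimeZoneInfo on the Cycle means the whole object, adjustment
// rules included, is serialized into every event that contains the Cycle.
var cycle = new Cycle
{
    StartsOn = DateTime.UtcNow,
    EndsOn = DateTime.UtcNow.AddMonths(1),
    TimeZoneInfo = TimeZoneInfo.Local
};
```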
We decided that, in order to lower the amount of data, we needed to migrate the event store while keeping the old one live, to avoid downtime.
To that end, we decided to create a single deployable service (let's call it Migrator) that subscribed to the same events as the original application service, but would write the events directly to the new event store. Furthermore, once it booted, the Migrator would be responsible for copying data over from the old event store while applying the needed changes. In our case, we needed to modify all events that had the Cycle in them and replace the TimeZoneInfo with just a TimeZoneId, which is a simple string.
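A rough sketch of that boot-time copy, assuming hypothetical IOldEventStore and INewEventStore abstractions; the actual service, event types, and transformation are not spelled out here:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical abstractions over the two event stores.
public interface IOldEventStore
{
    IAsyncEnumerable<object> ReadAllAsync(CancellationToken ct);
}

public interface INewEventStore
{
    Task AppendAsync(object @event, CancellationToken ct);
}

public sealed class Migrator
{
    private readonly IOldEventStore _oldStore;
    private readonly INewEventStore _newStore;

    public Migrator(IOldEventStore oldStore, INewEventStore newStore)
    {
        _oldStore = oldStore;
        _newStore = newStore;
    }

    // On boot: copy the history into the new store, transforming affected
    // events on the way. New live events arrive through the regular
    // subscription, which is not shown in this sketch.
    public async Task CopyHistoryAsync(CancellationToken ct)
    {
        await foreach (var @event in _oldStore.ReadAllAsync(ct))
        {
            await _newStore.AppendAsync(Transform(@event), ct);
        }
    }

    // Placeholder for the transformation described above (replacing
    // TimeZoneInfo with TimeZoneId on events carrying a Cycle);
    // shown here as a simple pass-through.
    private static object Transform(object @event) => @event;
}
```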
We changed the structure of the Cycle to this:
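A minimal sketch of the reworked struct, keeping the same illustrative members as above; the change the text describes is the TimeZoneId string property:

```csharp
using System;

// Reworked Cycle: only the time zone identifier is stored.
public struct Cycle
{
    public DateTime StartsOn { get; set; }
    public DateTime EndsOn { get; set; }

    // A short string such as "FLE Standard Time" or "UTC".
    public string TimeZoneId { get; set; }
}
```

Since TimeZoneInfo.FindSystemTimeZoneById can rebuild the full object from its Id whenever it is needed, nothing is lost by persisting only the string.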
Data migration is the process of moving data from one system to another, and there are many reasons why a system may require such a move. To name the most common ones:
Natural system evolution, which requires the data to be optimized for performance or maintainability.
Legal issues, where some parts of the data have to be deleted or encrypted.
Bad data created by a bug in the system.
Business reasons, such as when businesses merge or split.
It is important that the business value of the data is not changed during the process.
There are many different strategies for when and how to do data migration. You must plan and execute carefully, because the damage could be significant.
Depending on the data volume, the migration process could take hours, even days. During that time there are many things that could fail and corrupt the data in an irreversible way. To avoid such scenarios, you should always migrate the data into a new storage repository.
Always migrate the data into a new storage repository.
Make sure the migration process does not overwhelm the live system. You should be in control of when the data is being migrated, so you can pause the migration during peak times of the live system. To achieve this, use a separate process to run the data migration, as sketched below. Always keep in mind that migrating data takes resources from your system, and you must account for that.
Use a separate process to run data migration.
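A minimal sketch of such a separate, pausable migration process; the batch size, the pause signal, and the abstract read/write hooks are assumptions (for example, a flag toggled from an operations dashboard and batched reads/writes against the two stores):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public abstract class MigrationWorker
{
    // Hypothetical hooks: how batches are read and written, and how pausing
    // is signalled, depends on the concrete stores and operations tooling.
    protected abstract bool IsPaused();
    protected abstract Task<IReadOnlyList<object>> ReadNextBatchFromOldStoreAsync(int batchSize, CancellationToken ct);
    protected abstract Task WriteBatchToNewStoreAsync(IReadOnlyList<object> batch, CancellationToken ct);

    // Copies the old data in small batches so the process can be paused or
    // stopped at any point without overwhelming the live system.
    public async Task MigrateAsync(CancellationToken ct)
    {
        const int batchSize = 500;

        while (!ct.IsCancellationRequested)
        {
            if (IsPaused())
            {
                // Back off while the live system is under peak load.
                await Task.Delay(TimeSpan.FromMinutes(1), ct);
                continue;
            }

            var batch = await ReadNextBatchFromOldStoreAsync(batchSize, ct);
            if (batch.Count == 0)
            {
                break; // the whole history has been copied
            }

            await WriteBatchToNewStoreAsync(batch, ct);
        }
    }
}
```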
When you are migrating a live system:
Create a separate process which migrates the existing data into the new data repository.
The live system must push any new data to the migration service. This can easily be achieved by sending it to a message broker, as sketched below.
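A broker-agnostic sketch of that hand-off; IMessagePublisher, EventForwarder, and the "event-migration" topic are assumptions standing in for whatever broker and wiring are actually in use (RabbitMQ, Kafka, Azure Service Bus, ...):

```csharp
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical abstraction over the concrete message broker client.
public interface IMessagePublisher
{
    Task PublishAsync(string topic, byte[] payload, CancellationToken ct);
}

public sealed class EventForwarder
{
    private readonly IMessagePublisher _publisher;

    public EventForwarder(IMessagePublisher publisher) => _publisher = publisher;

    // Called by the live system right after it persists a new event locally,
    // so the migration service receives everything written while the
    // historical data is still being copied.
    public Task ForwardAsync(object @event, CancellationToken ct)
    {
        byte[] payload = JsonSerializer.SerializeToUtf8Bytes(@event, @event.GetType());
        return _publisher.PublishAsync("event-migration", payload, ct);
    }
}
```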