Thursday, January 31, 2008

A Schema to Survive the Winds of Change


Today we rejoin our intrepid hero1 in his journey through the land of database schema changes where many dangers and adventures await the unsuspecting traveler. This is the post release phase of the journey - if you wish to refresh your memory of the journey pre-release, please go to my previous post.

Our hero is attending the post-release party. The party is quite post2. The customer button holes our hero and congratulates him: "You've done a great job. All the users are raving about your product. And we have great plans for the future". Even in his drunken stupor, our hero knows what that means - more work. Still, that's better than no work.

The next day, our hero finds out what the customer meant by "great plans". He learns about Sarbanes-Oxley and Basel II compliance; the customer's plan to roll the software out to the poor suckers in the international branches3; and some other functionality they just can't live without. "And bug fixes. There are a few production bugs we4 simply have to fix", says the customer.

Since the customer wants the bugs fixed yesterday, while the new requirements will take a lot longer, there is a mismatch in time lines. There's also the small matter of the customer not willing to accept a bug fix release that includes any partially done new features - he fears for the stability of the application in production and for the safety of his precious data. So our hero is forced to work with branches - one support branch where all the bug fixes go, and the trunk where the code for all the additional features goes. He bravely enters the forbidding jungle of Merge - a vast area marked in all the maps of software development with the legend "Keep Out! Here be Dragons!" - armed only with his trusty sword of Intellect and his favourite merge tool. He also asks his manager for one man day per week to merge changes from the bug-fix branch on to the trunk.

Our hero journeys uneventfully for a few weeks. Production bugs are fixed on the branch. In the fullness of time, these get merged into trunk. Testing effort is doubled for all the production bugs, since they have to be tested on the trunk as well. Thankfully, automated functional tests come to our hero's rescue and reduce the overall testing effort.

While doing the merge one week, our hero notices that there is a new alter script in the production support branch - and it has the same number as an alter script in trunk! Once it gets merged into trunk, there will be two different alter scripts with the same number. This is a problem. Since an alter script should not be run twice while upgrading the schema to a later version, they are written with a guard, like so:
if (select appliedUpto from AppliedAlters) < 1000
begin
alter table customer add initial varchar2(10) null
update AppliedAlters set appliedupto = 1000
end
Two alter scripts with the same number results in only one of them getting run. Definitely A Problem.

Since the problem originated with the branch and the trunk having alter scripts with the same number, our hero first considers reserving a set of numbers for each branch - scripts on the production support branch would have numbers between 1000 and 1100, while the scripts on trunk would have numbers greater than 1100. His manager comes by and asks him to consider all the upgrade scenarios. So our hero takes out his yellow legal pad and notes down these upgrade scenarios:
  1. From an earlier version to a later version on the production support branch
  2. From an earlier version to a later version on trunk
  3. From a version on production support to a version on trunk
He realises that scenario 3 will not work very well, even with the reserved alter numbers. Take, for example, the case of a trunk install, when the DB has had alter number 1300 applied on it, but a later merge from the production support branch introduces alter number 1050 into trunk. The upgrade will fail to apply the alter script with version 1050, since the DB only keeps track of the highest applied alter script number.

Our hero ponders a while and concludes that he has no option but to keep track of all the individual alter script numbers that have ever been applied to the DB. Which means that the alter scripts start looking like this:
if (select count(*) from 
AppliedAlters where appliedNumber = 1000) = 0
begin
alter table customer add initial varchar2(10) null
insert into AppliedAlters values (1000)
end

Does this take care of all the upgrade scenarios outlined earlier? Let's check:
  1. From an earlier version to a later version on the production support branch: Alters applied so far - 0 to 1049. New alter to be applied - 1050. This upgrade is easy. Even the earlier numbering scheme can handle this.

  2. From an earlier version to a later version on trunk: There are two cases here.
    1. Alters applied so far - 0 to 1049 and 1101 to 1250. New alter to be applied - 1251. No problem, even with the earlier numbering scheme.
    2. Alters applied so far - 0 to 1049 and 1101 to 1250. New alter to be applied - 1050. As the number indicates, this new alter came from a merge. This would've broken the old numbering scheme, but the new way of tracking the alter scripts realises that 1050 is not yet applied and so applies it - easy peasy.

  3. From a version on production support to a version on trunk: Alters applied so far - 0 to 1049. New alters to be applied - 1050 and 1101 to 1250. Depending on the order in which the alters were applied, even the old tracking scheme could've handled this. The new numbering scheme will definitely handle this.
So once again, the day is saved, thanks to the Powerpuff Girls our hero.

This article closely reflects what we're currently doing in our team. If you've faced a similar problem, do tell me how you've solved it. If you used this technique or something similar, please do let me know how successful it was for you.



1 That's you. And if you are a she, then please excuse my use of he and hero.
2The customer wanted to make sure that the programmers had really created what he wanted, before he shelled out good money feeding the creators.
3 Which our hero translated to mean localization into Spanish, Tamil and Swahili and multiple currency units based on his prior experience with other customers and his knowledge of the customer's operations.
4 we = you, in this case. I bet it totally irritates you when some one says "we" when they actually mean you, especially in the context of work that needs to be done.

No comments: