Wednesday, September 16, 2009

Slow Rollback, Reboot fixed, why?

I posted this on SQL Server Central, where it will get more attention than it will here, but wanted to put a copy here as well.

Today I ran an 'emergency' script to fix a problem in production. The script does 40,000 updates, each of which fires a trigger that does 1 insert. I should have disabled the trigger first, but didn't (mistake).

This is a very slow (Pentium 4) server, but it has 2 CPUs. MaxDOP is set to 1 (disabling parallel execution), and hyperthreading is turned off in the BIOS.

After 5 minutes, I realized my mistake not dropping the trigger, and canceled my job, effectively rolling back the changes. I waited another 15 minutes, and as expected the CPU usage of the machine was at 50% during this time. I later found out it was not 1 CPU being used 100% as I expected, but both CPUs actually were used 50%, which I still don't understand. During this time, no one was able to use SQL Server on this machine to do anything, which was also unexpected.

I suggested rebooting, saying SQL would continue doing what it is doing after it rebooted, so logically this makes no sense, but I hate just waiting and not doing anything. After the reboot, I expected the database to take forever to recover, and I came back online almost instantly.

Can anyone tell me what happened and why?