T-SQL Tuesday #92 - how missing indexes tore down the systems at the IT University of Copenhagen.
For this month, I want you peers to write about those important lessons that you learned the hard way, for instance something you did and put your systems down or maybe something you didn’t do and took your systems down. It can be also a bad decision you or someone else took back in the day and you’re still paying for it…
I’m going to blog about my experiences as a junior developer/dba working at the IT University of Copenhagen. But before we begin, I want to say that I think that Raul Gonzalez’ topic is great. Learning from other people’s mistakes is learning on at least three different dimensions: First, you learn about something from the harm it does when you do it wrong. Many times, this makes it easier to remember, and your brain can add this experience to your own experiences. Second, you get to see things from a real-life example. Many examples in tutorials and blogposts are stripped of context because they need to illustrate just one concept. But leaving out context can sometimes make the concept seem theoretical (nothing wrong with theory – I spent eight years studying pure mathematics… 😊) and de-attached from the real world. And third, you learn that other people (even your heroes) make mistakes. Makes it a little easier to look at yourself (and your colleagues) when something goes wrong.
My third job in the IT industry was as a programmer at the IT University of Copenhagen (ITU). Out small team of three people developed business processes (and corresponding IT systems to support them) for the university. Systems ran on the not so common LAOS technology stack (Linux, AOLServer, Oracle and StandardML), and we were responsible for everything from server maintenance to usability testing. In between these two extremes was database development. And that was my main responsibility.
Sidenote: The system system is called myITU, and it is still in production more than 15 years later running processes like program enrolment, thesis contracts, course database and course enrolment (we will get back to the latter later).
I was young and in-experienced. We had a lot of ground to cover. And I had never taken a database course in my life. So, it was natural that the responsibility of the database design (including tables, views, stored procedures and functions), landed on me… Well, I got most of it right. Except for one part. One tiny, crucial part: enumeration (state) tables.
Any administrative system dealing with business processes need to model states. An example is the state of a (university) course, which can hold values like draft, published, ongoing, held, deleted. After modelling a few objects with a state property, I realized that we could abstract out states into sort of a EAV model: EnumerationType and EnumerationValue. Now, the course table, the thesis table and the enrolment table would just reference the EnumerationValue table directly. Neat? Not really.
Remember that myITU was implemented on an Oracle database. Unlike on SQL Server, all tables in an Oracle database are physically represented as heaps, unless explicitly specified otherwise. That means no indexes. And I didn’t know anything about database performance back then, so I didn’t add any. Effectively any query against the course table would give a full table scans in the EnumerationType and EnumerationValue tables as a side effect. Fast forward to course enrolments…
At ITU, we had implemented a selection algorithm, which considered both the study program you were enrolled into as well as how early you registered for a course. Early birds got the course, so students would be ready when course enrolment was opened at 12pm. And at 12.05 myITU would start failing with 500 Internal Server Errors.
And the webserver logs showed errors coming from the database running out of memory and then restarting. Over and over again. The state tables were not very big, so why would the database fail so hard?
Well, add to the mix that myITU needed to be bi-lingual (English and Danish), so all descriptive texts needed to be represented in these two languages. Instead of having two attributes in the database tables (one for each language), we created another EAV monster, the Texts and TextValues tables. And of course, our states needed bi-lingual text representations. Yes, the Texts and TextValues tables were also stored as heaps with no indexes. And these tables were not so small – every bi-lingual attribute in the course and thesis systems would add two rows to them for each course/thesis record, and there were quite a few of these attributes. So, for each query involving state of a record, we would do a full table scan of the Text/TextValue tables as well. Auch.
We never solved the performance problem at course enrolment peak times during my employment at ITU. I left after 5 years, started as a database consultant at Miracle, where I worked with some of the best database professionals in Europe. I studied databases in depth, both theory and internals, and learned how to profile a SQL statement using the 10046 trace flag. And a few years later, I returned to ITU to teach my former colleagues about SQL profiling and tuning. It was in that moment that I realized why course enrolments were failing miserably. And we finally fixed those missing indexes…
That’s it for this month’s T-SQL Tuesday.
Read more about T-SQL Tuesday #092 here: http://www.sqldoubleg.com/2017/07/03/tsql2sday-92-lessons-learned-the-hard-way/
Check out the awesome HTTP Status Cats API here:
Learn about the 10046 trace flag on Oracle here: