The odd bit

Once is an accident, twice is a coincidence, three times is an enemy action.


Sitecore database cleanup problems

As your Sitecore site grows you’ll need to perform a bit of maintenance on the Sitecore databases. Our experience tells us it’s primarily the ‘master’ database that needs some work. The ‘master’ database is the one that gets all the editing while the ‘web’ database is only used as a publishing target.

Our initial wakeup call was a steadily growing ‘master’ database months after adding a synchronisation with an external system. The synchronisation data contains images and as we updated objects the database would keep growing. It’s enough material to create another article but to cut a long story short: the Blobs table would keep growing without removing records. tl;dr orphaned blobs.

An investigation into the deep, dark dungeons of Sitecore.Kernel.dll hinted at a CleanupDatabase() method on the Database class. It does quite a bit of cleaning including the removal of orphaned blobs. We scheduled a task and Eureka! … until recently.

A performance review of our Sitecore installation revealed sporadic timeouts in the logs. And it wasn’t the more or less normal request timeout but a rather alarming SQL timeout. Luckily, the offending SQL statement was also in the logs.

Exception: System.Data.DataException
Message: Error executing SQL command:  declare @x bigint set @x = 0 DECLARE @item TABLE(ID uniqueidentifier,parentID uniqueidentifier) INSERT INTO @item (ID,parentID)   SELECT  [ID],[ParentID] FROM [Items]  DECLARE @temp TABLE(ID uniqueidentifier) WHILE (SELECT count(id) FROM @item ) <> @x begin set @x = (SELECT count(id) FROM @item ) delete from @temp; insert into @temp (ID)   SELECT  id FROM @item where parentID  = @nullId update @item SET Parentid =@nullId where Parentid  in (select id from @temp) delete from @item where  id  in (select id from @temp) end UPDATE [Items] SET [Parentid] = @nullId where [ID]  in (select id from @item) ; DELETE from [Items] where [ID] in (select id from @item)

Nested Exception

Exception: System.Data.SqlClient.SqlException
Message: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
Source: .Net SqlClient Data Provider
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, SqlDataReader ds)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean asyncWrite)
at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean asyncWrite)
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
at Sitecore.Data.DataProviders.Sql.DataProviderCommand.ExecuteNonQuery()

Running the query manually proved that it was right on the limit: 4 minutes and 51 seconds on a database with around 44000 items. The standard SQL timeout in Sitecore is 5 minutes. A bit of load on the database server would cause it to go longer than 5 minutes and trigger the timeout error.

Hey, let’s increase the timeout to 10 minutes and we’re done. Wrong! You’re done until the database grows large enough that 10 minutes are no longer sufficient. So I favour a more structural solution: let’s work on that query!

Diving into Sitecore.Kernel.dll once more showed that it was part of the database cleanup, namely the method CleanupCyclicDependences() implemented in SqlServerDataProvider. There are several problems with the query:

  1. A table variable is a bad idea once you’re working with real data (read: more than a couple of rows). If you don’t believe me take a look at the query execution plan. It’s filled with full table scans because SQL Server does not calculate statistics on table variables. The estimated number of rows will always be 1. The query has 2 table variables.
    The right approach would be temporary tables because SQL Server does calculate statistics for temp tables.
  2. Deleting all rows from a table is slower than just truncating that table.
  3. The last two statements are odd: first records are updated and then the same records are deleted. Why update them if the next step is deleting them?
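Before fixing it, it helps to see what the loop in that query is actually doing. Here is a minimal Python sketch of the same logic, not Sitecore's code: it peels off every item that hangs under the null ID, level by level, and whatever is left over can never be reached from the root (cycles and orphans). The names NULL_ID and find_unreachable_items and the sample item IDs are made up for the illustration.

```python
# Stand-in for Sitecore's ID.Null (assumption: any sentinel value works for the sketch).
NULL_ID = "00000000-0000-0000-0000-000000000000"


def find_unreachable_items(items):
    """Return the ids that are never reached from the null-ID root.

    `items` maps item id -> parent id, like the (ID, ParentID) pairs
    the query copies out of the Items table.
    """
    remaining = dict(items)   # plays the role of the @item table
    previous_count = 0        # plays the role of @x

    # Keep going until a pass removes nothing.
    while len(remaining) != previous_count:
        previous_count = len(remaining)

        # Items directly under the null ID (the @temp table).
        peeled = {i for i, parent in remaining.items() if parent == NULL_ID}

        # Re-parent their children to the null ID so the next pass finds them.
        for item_id, parent in list(remaining.items()):
            if parent in peeled:
                remaining[item_id] = NULL_ID

        # Remove the peeled items from the working set.
        for item_id in peeled:
            del remaining[item_id]

    return set(remaining)


if __name__ == "__main__":
    sample = {
        "a": NULL_ID,   # root-level item
        "b": "a",       # child of a
        "c": "b",       # grandchild of a
        "x": "y",       # x and y point at each other: a cycle
        "y": "x",
        "z": "gone",    # parent no longer exists: an orphan
    }
    print(sorted(find_unreachable_items(sample)))  # ['x', 'y', 'z']
```

Every pass touches the whole working set, which is why the quality of the execution plan for those two helper tables matters so much.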

So this is a much better query which will do the same work in 2-4 seconds (+/- 44000 items, depending on server load):

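/* Lines marked "assumed" are missing from the listing as published and were
   filled in from the logged statement above; adjust them to taste. */
DECLARE @x BIGINT                                                   /* assumed */
SET @x = 0
CREATE TABLE #item (id UNIQUEIDENTIFIER, parentid UNIQUEIDENTIFIER) /* assumed */
CREATE TABLE #temp (id UNIQUEIDENTIFIER)                            /* assumed */
INSERT INTO #item (id, parentid)                                    /* assumed */
SELECT {0}ID{1},
       {0}ParentID{1}                                               /* assumed */
FROM   {0}Items{1}
WHILE (SELECT Count(id) FROM   #item) <> @x
  BEGIN                                                             /* assumed */
      SET @x = (SELECT Count(id)
                FROM   #item)
      TRUNCATE TABLE #temp;
      INSERT INTO #temp
      SELECT id
      FROM   #item
      WHERE  parentid = {2}nullId{3}
      UPDATE #item
      SET    parentid = {2}nullId{3}
      WHERE  parentid IN (SELECT id
                          FROM   #temp)
      DELETE FROM #item
      WHERE  id IN (SELECT id
                    FROM   #temp)
  END                                                               /* assumed */
DELETE FROM {0}Items{1}
WHERE  {0}ID{1} IN (SELECT id                                       /* assumed */
              FROM   #item)
DROP TABLE #item                                                    /* assumed */
DROP TABLE #temp                                                    /* assumed */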

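Warning: do not run this query directly against SQL Server because it won’t work! It’s formatted for Sitecore’s data provider: the {0}…{3} placeholders stand in for things like the [ ] brackets and the @ parameter prefix you can see in the logged statement.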

So the correct solution is:

  1. Create a custom data provider that inherits from SqlServerDataProvider.
  2. Override the CleanupCyclicDependences method.
  3. In the method body type
    this.Api.Execute(@"", "nullId", ID.Null);
  4. Paste the above query between the double quotes.
  5. Wire your custom data provider in web.config.

And your database cleanup will be a lot faster.

Counterintuitive or counterproductive?

Do you ever get that feeling where a piece of software seems to make your life as a developer harder instead of easier? Well, I do but I’m not sure whether it’s the software or me.

I’ve worked with an open source content management system written in PHP for close to 6 years. In that timeframe, I got to know the little beast inside out. I knew the strong points of the product but more importantly I also knew the weak points and how to avoid/circumvent them. Anything I do with it just works and if it doesn’t I soon enough find out it was caused by a mistake on my part.

The new content management system is some commercial software written in .NET. I haven’t figured out why, but I seem to be in a constant fight with the software. I can’t have it working perfectly for 2 days straight. I do something, it fails… I finally get it working again and less than a day later it breaks again. Perhaps it’s the software, perhaps it’s me but I’m certainly not used to working with such flaky software. All I know is that it’s pretty frustrating at the moment.

Farewell eZ Publish

It is with great sadness, and a bit of anger, that I’m writing this post but today is the day that we have officially lost the fight for a future with eZ Publish.

It all started in 2002 with the 2.2 version of the package. We shifted to the 3.x series as soon as the very first alphas became available and continued to walk the enlightened path across all of the 3.x releases. The shift to PHP 5 and thus eZ Publish 4.0 also unleashed the power of the eZ Components and it only happened in early spring of this year.

We knew the day of doom would be coming despite our attempts to enlighten others. The only thing we managed to accomplish was to delay the decision by nearly two years. I would happily accept it if we were going to something superior, but the only reason we have to switch is that “it is not .NET”. That’s just sad…

So I’ll take this opportunity to thank eZ Systems for their great product and let’s not forget the wonderful community. I’ll continue to monitor the project to see where it’s heading, but that will be about it.

Oh, and if you’re wondering about the replacement for eZ Publish, our CMS future is now called Sitecore.

So long and thanks for the fish!

The quest for a PHP editor

I’ve had a hate-and-love relationship with a few editors over the past few years, but I’ve never found one that I truly love. Some look promising in the beginning but after some time I really get tired of their limitations.

I thought I had settled with Eclipse (with PDT) because it’s the best I’ve used, but the software update messes up so often it’s not funny anymore. That and those pesky “builders” that keep flagging stuff as errors in places where they’re absolutely useless. So with Eclipse rapidly losing credit I started a new search… without any results so far.

Perhaps my requirements are too steep, but Visual Studio [1] manages to combine them so I don’t think they’re too far-fetched. What I want in a PHP editor (or rather IDE):

  • Projects instead of loose files (think of: Visual Studio solutions or Eclipse projects).
  • Smart intellisense (not just autocomplete, it should be able to parse the project and recognize custom classes – Visual Studio is the reference here).
  • Handle whitespace properly (tabs to spaces, clear trailing whitespace per line, clear empty lines).
  • Formatting (with bonus points if I can configure my own set of rules).
  • A non-cluttered modern interface.

Extra bonus points are awarded to IDEs that can perform small “design-time” checks (e.g. unreachable code, non-returning branch, unused variables, …) and have a couple of refactoring functions/shortcuts (to name two: rename variable/method and implement interface).

So if there’s anyone who knows about a little gem for PHP development, please let me know. Oh, and don’t make me beg 😉

[1]: No, there’s nothing wrong with your eyes. Visual Studio is actually a Microsoft product that I like. It’s simply the best IDE in my very humble opinion.

MOSS 2007: laugh or cry?

When Microsoft released Microsoft Office SharePoint Server (MOSS) 2007, it was touted as The Next Big Thing™ for enterprises. And that’s where the good news ends…

This piece of software has so many issues I don’t even know where to start. My first encounter with it was when I had to evaluate the web content management features of the product. That evaluation period lasted about a week, but was stretched to a month just for the sake of it. When coming from something like eZ Publish, it felt like I had done some time travelling all the way back to the Stone Age [1].

A remarkable feat that Microsoft managed to pull off was to release an anti-developer product. My career is still pretty short, but it was the first time that I encountered a piece of software that was set on making a developer’s life as hard as possible.

After the dust had settled and a couple of holidays had joined other historical facts, it was time for the second encounter. The idea was to give the collaboration features of MOSS 2007 a test run. So my colleagues and I clicked around when we suddenly noticed the “My site” link. A harmless link [2] to a personal site… until we (= 3 persons at that time) managed to click at the same time: one arrived at the personal site, one got an error but managed to proceed and the other one got an error and another one when trying to proceed. Guess which one was me… I was told it was caused by the speed of the network connection. Makes sense? Not to me.

The security settings regularly take on a life of their own and bring all sites down, file uploads go wrong and block all edit actions on document lists, etc. These are just a couple of things, but I could go on for quite a while. It’s at a point where I don’t know whether to laugh when it goes wrong or to cry over the fact that Microsoft managed to produce such a nightmare.

[1]: I know there were no computers back then, but let’s forget that little detail so I can make my point 😉

[2]: Or so it seemed…

IE8 defaults to IE8 now

One of my previous posts mentioned that IE8 would default to IE7 standards mode unless web developers would specifically request the new IE8 mode. Well, there’s some good news coming from Redmond.

The IE team announced that they changed the behaviour. IE8 will now use its most standards-compliant rendering mode for pages that meet the criteria for standards mode. If you, as a web developer, want pages to be rendered using IE7’s standards mode you will have to use the META tag or the corresponding HTTP header. So they did the right thing and made this an opt-in feature: you only have to act if you want to use it.
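For the HTTP header route, here is a rough Python sketch of a small WSGI wrapper that adds the header to every response. It assumes the header carries the same name and value as the meta tag (X-UA-Compatible with IE=7 for IE7 mode). The function names ie_compat_middleware and hello_app are made up for the example.

```python
def ie_compat_middleware(app, mode="IE=7"):
    """Wrap a WSGI app so every response carries an X-UA-Compatible header.

    `mode` is assumed to take the same value as the meta tag's content attribute.
    """
    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            # Drop any header of the same name the app already set, then add ours.
            headers = [(k, v) for k, v in headers
                       if k.lower() != "x-ua-compatible"]
            headers.append(("X-UA-Compatible", mode))
            return start_response(status, headers, exc_info)
        return app(environ, start_with_header)
    return wrapped


def hello_app(environ, start_response):
    """Tiny demo application standing in for a real site."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<!DOCTYPE html><title>demo</title><p>Hello</p>"]


# The wrapped app can be handed to any WSGI server.
app = ie_compat_middleware(hello_app)
```

Doing it at the server level means old pages can be pinned to IE7 mode without touching their markup.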

I’m glad Microsoft listened to the web developer community and did the right thing. This puts the burden on the developers who don’t want to fix their pages and it might persuade them to update their code if it’s broken in a new IE version.

In other IE8 news, Beta 1 of Microsoft’s newest browser is now available. This is only intended for web developers and designers. If you are a regular user, you should skip this release.

IE8 defaults to IE7

This is surreal… I just posted about Microsoft’s latest trick with IE8 and a meta tag and I already have an update that warrants a new post. Through some reading and clicking, I arrived at a blog post by Jeremy Keith on this subject. Without going into the pros and cons of the meta tag, his post shows me three things.

The first thing is the format of the tag. It appears it will take the following form:

<meta http-equiv="X-UA-Compatible" content="IE=8" />

A second point is IE’s default behaviour. In this case, default means without any changes to existing pages. Apparently, Microsoft decided that, without meta tag, a page will be rendered in IE7 mode. I’m going to borrow Jeremy’s words because they are perfect:

Unless you explicitly declare that you want IE8 to behave as IE8, it will behave as IE7.

That’s just plain ridiculous! What’s the point of creating a new version if the default behaviour is to use the old version? I guess this is Microsoft logic.

The third point is closely related to the second one. The right default behaviour would be to use the current browser version. There is a way to activate that option by using IE=edge as… you guessed it, the value for the content attribute of the meta tag. Using that trick is strongly discouraged though.

So essentially this means the meta tag is not an optional step, but rather a mandatory part of creating a web page. To use the mode associated with a browser version beyond 7, you have to specify it. To disable the checks, you also have to specify it.

This is so surreal and such a mess…

Microsoft does it again: IE8 and web standards

I thought this was a really early April Fools’ joke when I first read it, but it is all over the place so it just has to be true. I picked it up from Robert O’Callahan’s blog. I don’t know who keeps inventing these things, but sometimes you just can’t come up with such funny jokes no matter how hard you try.

Microsoft feels they made a mistake when they changed the behaviour of the “Standards compliant” mode between IE6 and IE7. They argue that web developers had implemented hacks to go around the imperfections of IE6’s standards mode (no kidding, the standards mode really didn’t live up to its name). Then IE7 came and it shipped with improved support for standards. But because of IE detection, IE7 received the same content as IE6 and as such the improved standards mode broke more than it fixed (that says more about the web developer though, I haven’t done a lot of fixes to be IE7-compatible).

So the folks in Redmond believe they should do something. Web pages are developed for a particular browser version and they should never break in a newer browser. It should be rendered by the engine it was created for. Microsoft’s solution? Let’s ship different rendering engines!

<insert awkward silence>

The first thing I can think of is maintenance hell. The second thing I can think of is development hell and the third thing is testing hell. So basically, IE is heading to hell. Small clue for certain readers: typing this paragraph made me think of a certain Peanut. But wait… there’s more from the Redmond Beast!

What if you, as a web developer, know a page is compatible with a new IE engine? Just because you know how to do your work shouldn’t leave you with old pages stuck in an old IE version, right? Well, Microsoft agrees and they’ve come up with a way to signal which IE mode you want. You can indicate your page’s compatibility by using a … wait for it … <meta> tag! And it will be a meta tag of the http-equiv type so you can achieve the same effect by sending an HTTP header via the server.

Is it just me or does Microsoft have a nose for picking the worst solution to solve a problem? Instead of fixing their part, they’re putting the burden on the developers once again. Before IE8 comes out, we can all waste hours/days to add their silly meta tag. This is all for Microsoft’s “Don’t break the web” philosophy. Well hello, Microsoft! You broke it in the first place, fix it without harassing us every time you release a new version of Internet Exploder.

Mind-boggling questions sometimes have a really easy and simple answer. How can we make the web a better place? Get rid of IE and leave the web to browsers.

Oh, for the record: this has been made official on the IE Blog.

Choices: the lottery of life

Choices are a common thing in the life cycle of a human being. They’re some kind of puzzle, often with many solutions and not always resulting in a predictable outcome. Sometimes, those “puzzles of life” are easy to solve but they can also be quite hard to solve.

An easy choice was the direction of my IT education. We were the first “generation” to actually have a choice. The education used to be a mix between development and networking/system administration but the main focus was on development. We were offered a choice for the third (and thus last) year of the education: software development or networking/system administration. With two years of software development being part of history, I figured I would have enough background to do fine in software development (and web development in particular) and decided to explore new horizons. Looking back at that decision, I still think it was the right one.

Time warp to the present, a gap of 6 years, and I can see the choice reappearing on the horizon. I’ve spent 5 years doing web development mixed with very brief moments of maintaining two Linux servers. While short in time, those moments have provided a welcome distraction from the development job. In fact, taking them away from me would really piss me off. So I feel I’m rapidly approaching a crossroads and I have no idea what direction to take. I would like to remain active as a web developer but I also feel an increasing desire to do system administration.

Web development is an exciting area when you can live on the proverbial edge: trying new and emerging trends/technologies. But the fun is completely gone when you’re not allowed to create new challenges or to be innovative. If the only goal is to get the job done (that means without any interest in the user experience, the quality of the product, the architecture of the backend, …), web development becomes just as boring as looking at an old sock. The real thrill comes from e.g. playing with new technologies and then incorporating them into your projects ending in a better user experience ultimately making users happier. We are not drones, we’re creative minds!

It’s probably that frustration that drives the system administrator in me. It’s like going from the front lines to the supply lines. You’re not taking commands all the time and you’re fine as long as the supply line doesn’t collapse. Managing servers, a domain, network infrastructure, user policies, security, … are things I really enjoy doing.

Eventually, it all comes down to who I am as an IT guy. I’m not an analyst, I’m a technical guy. I love to get a deep knowledge of the product I’m working with, I love diving in APIs to find hidden features or nifty things, I love pushing things to the limit just to see how far the tech can go, I love trying new stuff and I love looking for other ways to accomplish the same thing. But I can only reach my full potential when I’m given some room to be creative and innovative. And I do have a real-life example to back that up.

And after all that text, I still don’t know what to do…

HOWTO: Feed Component

Version 2007.1beta1 of the eZ Components was released earlier this week. This release features a beta version of the Feed component. The downside is that there is almost no documentation available about this component, but I’m sure that is just a temporary situation that will be resolved before the final release.

Anyway, programmers are a bit like adventurers: to boldly go where no one has gone before. It would be a shame to keep this adventure to myself, so what follows is a small and simple tutorial that will show you how you can display items from an external feed.
