The odd bit

Once is an accident, twice is a coincidence, three times is an enemy action.

The lurking dangers of references in PHP

A while ago, PHP 4.4 was released to address a rather weird memory corruption problem in the PHP 4.3 series. The problem was related to returning by reference: the bug allowed arbitrary expressions to be returned by reference, while it should only be possible to return variables by reference.

To provide users with a PHP 4.4 compatible version of eZ publish, eZ systems released version 3.7, which is a clone of version 3.6 with the reference fixes added to it. Simply changing the 3.6 codebase was not an option, because what was needed in 3.6 to work around the memory corruption would no longer be valid in PHP 4.4. The reverse was also true: what was needed in the PHP 4.4 version would trigger the memory corruption in 3.6.

Unfortunately, the problem doesn’t stop there. There are also a lot of extensions for eZ publish. Some of them don’t need a change while others need to be updated to be PHP 4.4 compatible. I’ll give an example…

If you happen to use e.g. the eZXML library in your extension, you’re among the lucky ones who need to change some code for the newer version ;-). A great example is eZDOMDocument::createElementNode(). That function returns by reference in eZ publish versions up to and including 3.6 and returns by value from 3.7 onwards. This means your code for 3.6 should look like this, where $doc is an instance of the eZDOMDocument class (code A):

$exampleNode =& $doc->createElementNode( 'example' );

And for 3.7 (code B):

$exampleNode = $doc->createElementNode( 'example' );

The difference looks pretty small, a single ampersand, but it can be rather significant. If you use code A for both versions (3.6 and 3.7), the impact should be limited. It’s how it should be for 3.6, while the reference assignment should be ignored in 3.7 because the function returns by value. However, you’re then not following the eZXML API, which could have an unknown impact now or in later PHP versions, and it’s not good coding practice.

One could also suggest using code B for both versions, but then things get worse. You’re alright in version 3.7 (or any PHP 4.4 version for that matter), but in 3.6 you’re simply ignoring the return by reference, which means you’re back at the initial problem: possible memory corruption. And just to be sure, I asked Derick Rethans (PHP and eZ publish developer) if ignoring a return by reference would be a good idea in PHP 4.3. His response:

It’s indeed how it should not be done. Returning by reference from a function, while not assigning the result by reference will not work (of course, as the reference is ignored). As far as I can remember it is exactly this case that caused the memory corruptions in PHP 4.3.x.

So be careful when switching to the new code, because you can’t predict when the memory corruption bug will appear. Count yourself lucky if you never encounter it: I have run into the bug before, and the symptoms make absolutely no sense until you spot the actual cause.
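To make the difference between code A and code B a bit more tangible, here is a minimal sketch of how the two assignment styles behave. It assumes a current PHP version (5 or later), so it only shows the language semantics and does not reproduce the PHP 4.3 memory corruption. The class ExampleStore and its methods are hypothetical stand-ins invented for this illustration; they are not part of eZ publish or the eZXML library.

```php
<?php
// Hypothetical stand-in class, not part of eZ publish or eZXML.
class ExampleStore
{
    private $values = array('example' => 'original');

    // Returns by reference: note the & in front of the function name.
    public function &getByReference($key)
    {
        return $this->values[$key];
    }

    // Returns by value, used here only to inspect the stored state.
    public function peek($key)
    {
        return $this->values[$key];
    }
}

$store = new ExampleStore();

// Code B style: plain assignment. The returned reference is dropped
// and $copy becomes an independent copy of the stored value.
$copy = $store->getByReference('example');
$copy = 'changed via copy';
$afterCopy = $store->peek('example');   // still 'original'

// Code A style: reference assignment. $ref is bound to the stored
// value itself, so writing to it changes the store.
$ref =& $store->getByReference('example');
$ref = 'changed via reference';
$afterRef = $store->peek('example');    // now 'changed via reference'
```

The point is that the assignment operator has to match what the function declares. With a by-reference function, the plain assignment silently throws the reference away, which is the case Derick describes above.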

Note: 3.6 in this post refers to eZ publish version 3.6, which in turn stands for the eZ publish codebase for PHP 4.3. Likewise, 3.7 refers to eZ publish version 3.7, which should be read as the eZ publish codebase for PHP 4.4.

Nasty bug fixed

Good news from the eZ camp: they finally fixed bug #6199. The bug is triggered by enabling the TemplateOptimization setting, which tries to optimise some calls made in compiled templates.

If you assign a content node to a variable called $node in templates, calling $node.object.data_map may fail. The optimisations assume that $node is only used by the content module and not by custom modules or any other templates.

The bugfix will appear in the next point releases, which should be out quite soon.

Regexp datatype

I’ve resumed work on regexpline, one of my personal extensions for eZ publish. The datatype is exactly like the standard “Text line” that comes with the software, except that it validates the input against a regular expression. This makes it possible to restrict the input to a certain format (even requiring a valid IP address is possible).

Today I’ve added support for presets of regular expressions. You can define named regular expressions in an ini file and then pick one of them during class edit. I’ve also fixed the object-level input validation, which started whining when it wasn’t necessary: for an attribute that is both required and an information collector, it would whine during object edit instead of during info collection.
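The idea behind the presets can be sketched roughly as follows. This is not the actual regexpline code: the preset names, the patterns, and the function validateTextLine() are made up for illustration, and a plain PHP array stands in for the values that the extension would read from its ini file.

```php
<?php
// Hypothetical preset table. In the extension these named patterns
// would come from an ini file; an array stands in for it here.
$presets = array(
    // Four dot-separated numbers, each in the range 0-255.
    'ipv4'        => '/^(?:(?:25[0-5]|2[0-4][0-9]|1?[0-9]?[0-9])\.){3}(?:25[0-5]|2[0-4][0-9]|1?[0-9]?[0-9])$/D',
    // Exactly four digits, the first one not being zero.
    'four_digits' => '/^[1-9][0-9]{3}$/D',
);

// Returns true only if the preset exists and the input matches it.
function validateTextLine(array $presets, $presetName, $input)
{
    if (!isset($presets[$presetName])) {
        return false; // unknown preset: reject rather than accept blindly
    }

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($presets[$presetName], $input) === 1;
}

$validIp   = validateTextLine($presets, 'ipv4', '192.168.0.1');
$invalidIp = validateTextLine($presets, 'ipv4', '256.1.1.1');
```

Comparing the result of preg_match() strictly against 1 means that a broken pattern (which makes preg_match() return false) is treated as a failed validation instead of slipping through.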

I’ve also bumped the required eZ publish version up to 3.6.0. That allowed me to use some newer template constructs.