The odd bit

Once is an accident, twice is a coincidence, three times is an enemy action.

The lurking dangers of references in PHP

A while ago, PHP 4.4 was released to address a rather weird memory corruption problem in the PHP 4.3 series. The problem was related to returning by reference: the bug allowed arbitrary expressions to be returned by reference, whereas it should only be possible to return variables by reference.
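To make the distinction concrete, here is a minimal sketch of a function that returns a variable by reference next to one that returns the result of an expression. The Registry class and its methods are hypothetical names made up for this illustration; they are not part of eZ publish. The code is written in PHP 4 style, but it only shows the language rule and does not reproduce the memory corruption itself.

```php
<?php
// Hypothetical class, written in PHP 4 style (var, no visibility keywords).
class Registry
{
    var $items = array('a' => 1);

    // Fine: the return value is a variable (an array element).
    function &getItem($key)
    {
        return $this->items[$key];
    }

    // Problematic: the return value is the result of an expression.
    // PHP 4.4 and later raise a notice when this function is called:
    // "Only variable references should be returned by reference".
    function &getItemPlusOne($key)
    {
        return $this->items[$key] + 1;
    }
}

$registry = new Registry();

// $item is now an alias of $registry->items['a'].
$item =& $registry->getItem('a');
$item = 42;

echo $registry->items['a'], "\n"; // prints 42
```

Writing through $item changes the stored value because both names refer to the same variable. There is no such variable behind the result of getItemPlusOne(), which is why returning it by reference is not meaningful.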

To provide users with a PHP 4.4-compatible version of eZ publish, eZ systems released version 3.7, which is a clone of version 3.6 with the reference fixes added. Simply changing the 3.6 codebase was not an option, because the code needed in 3.6 to work around the memory corruption would no longer be valid in PHP 4.4. The reverse was also true: the code needed for PHP 4.4 would trigger the memory corruption in 3.6.

Unfortunately, the problem doesn’t stop there. There are also a lot of extensions for eZ publish. Some of them don’t need a change while others need to be updated to be PHP 4.4 compatible. I’ll give an example…

If your extension uses, say, the eZXML library, you’re among the lucky ones who get to change some code for the newer version ;-). A good example is eZDOMDocument::createElementNode(). That function returns by reference in eZ publish versions up to and including 3.6, and returns by value from 3.7 onwards. This means your code for 3.6 should look like this, where $doc is an instance of the eZDOMDocument class (code A):

$exampleNode =& $doc->createElementNode( 'example' );

And for 3.7 (code B):

$exampleNode = $doc->createElementNode( 'example' );

The difference looks small, but it can be significant. Using code A for both versions (3.6 and 3.7) should be fairly harmless: it’s correct for 3.6, and in 3.7 the reference assignment should simply be ignored because the function returns by value. However, you’re then not following the eZXML API, which could have an unknown impact now or in later PHP versions, and it’s not good coding practice.

One could also suggest using code B for both versions, but then things get worse. You’re fine in version 3.7 (or with any PHP 4.4 version, for that matter), but in 3.6 you’re simply ignoring the return by reference, which puts you back at the initial problem: possible memory corruption. Just to be sure, I asked Derick Rethans (PHP and eZ publish developer) whether ignoring a return by reference would be a good idea in PHP 4.3. His response:

It’s indeed how it not should be done. Returning by reference from a function, while not assigning the result by reference will not work (ofcourse, as the reference is ignore). As far as I can remember it is exactly this case that caused the memory corruptions in PHP 4.3.x.
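What “the reference is ignored” means can be sketched as follows. NodeFactory and createNode() are hypothetical stand-ins for a factory that returns by reference; they are not the eZXML API. The node is a plain array here, an assumption made so that the copy is visible on current PHP versions, where objects are passed around as handles. The sketch only shows the difference in semantics and does not trigger the PHP 4.3 corruption.

```php
<?php
// Hypothetical stand-in for a factory method that returns by reference.
class NodeFactory
{
    var $lastNode = array('name' => '');

    function &createNode($name)
    {
        $this->lastNode = array('name' => $name);
        // A real variable, so returning it by reference is valid.
        return $this->lastNode;
    }
}

$factory = new NodeFactory();

// Code A style: reference assignment, $a is an alias of $factory->lastNode.
$a =& $factory->createNode('example');
$a['name'] = 'changed';
echo $factory->lastNode['name'], "\n"; // prints "changed"

// Code B style: plain assignment, the reference is dropped and $b is a copy.
$b = $factory->createNode('example');
$b['name'] = 'changed again';
echo $factory->lastNode['name'], "\n"; // prints "example"
```

With the plain assignment the caller works on a copy, so changes never reach the variable the function handed out. On PHP 4.3 that same mismatch is, according to the quote above, the case that could corrupt memory.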

So be careful when switching to the new code. You can’t predict when the memory corruption bug will appear; count yourself lucky if you never encounter it. I have run into the bug before, and the symptoms make absolutely no sense when you first see them.

Note: 3.6 in this post refers to eZ publish version 3.6, which in turn means the eZ publish codebase for PHP 4.3. Likewise, 3.7 refers to eZ publish version 3.7, which should be read as the eZ publish codebase for PHP 4.4.

  • Kristof Coomans says:

    Hi Hans

    In the source code of the eZDOMDocument class, there is a doxygen command \static for the method createElementNode. Shouldn’t it be called as a class member then?

    In the same file, some examples are listed where the method is called on an instance. So either the doxygen command \static isn’t correct or the examples are wrong.

    2 February 2006 at 11:29
  • Hans Melis says:

    Hi kristof

    From a pure PHP point of view, it does not matter whether you call that method statically or not. $this is not used in the function, so a static call works. But a non-static use also works. The documentation should make its usage clear: static, non-static or both.

    But the way you call it has no effect on the return by reference/value.

    2 February 2006 at 11:41
  • Kristof Coomans says:

    Indeed, it has no effect on the return by reference/value. But you mentioned ‘good coding practices’ in your article, and I believe that – from a pure object oriented view – calling static methods on an object isn’t good practice, although it’s possible with PHP.

    Back to the topic then :-) I don’t know what exactly triggered the memory corruption problems in PHP 4.3, and I can’t find anything really useful about it on the web. Do you have some documentation on it? I thought these corruptions could only come up when you return a non-variable from a function that returns by reference (hence the notices in PHP 4.4 when you do this).

    Maybe we should call Derick to join us and bring some clarity? 😉

    2 February 2006 at 12:42
  • Hans Melis says:

    But there’s no direct evidence suggesting it really *is* a static method. The current documentation suggests both access methods so there’s no bad way to call it and there’s no way PHP4 can determine if it’s static or not. PHP5 makes this a moot point of course 😉

    I don’t know of any documentation that clearly explains how to reproduce the problem. It’s unpredictable. Some code might run fine, you copy it into another script and you have the memory corruption. It’s really a weird thing.

    I have encountered it once when I forgot the & when doing $ini = eZINI::instance( 'someini.ini' );. I got a completely different ini file than the one I requested (and eZINI::instance has a completely valid "return $impl;"). Adding the ampersand solved it. Mysterious at the time, but it all made sense once I knew about the memory corruptions.

    2 February 2006 at 13:48
