More PHP Internals: References

By request, a quick post on using PHP references in extensions.

To start, here’s an example of references in PHP we’ll be translating into C:


This will print:

x is 1
called not_by_ref(1)
x is 1
called by_ref(1)
x is 3
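
That output is just the usual difference between passing a value and passing a reference. As a loose analogy in plain C (no Zend API involved, and run_demo is a made-up name for this sketch), passing by value versus passing a pointer produces the same five lines:

```c
#include <assert.h>
#include <stdio.h>

/* Receives a copy of the caller's value; the assignment is invisible outside. */
static void not_by_ref(long arg) {
    printf("called not_by_ref(%ld)\n", arg);
    arg = 2;
    (void)arg;
}

/* Receives the address of the caller's variable; the assignment is visible. */
static void by_ref(long *arg) {
    printf("called by_ref(%ld)\n", *arg);
    *arg = 3;
}

/* Runs the sequence from the post and returns the final value of x. */
static long run_demo(void) {
    long x = 1;
    printf("x is %ld\n", x);
    not_by_ref(x);
    assert(x == 1);           /* the by-value call must not change x */
    printf("x is %ld\n", x);
    by_ref(&x);
    printf("x is %ld\n", x);
    return x;
}
```

Calling run_demo() from a main() prints the lines shown above. The rest of the post is about getting the same behavior out of zvals, where it turns out to be less automatic.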

If you want your C extension’s function to officially have a signature with ampersands in it, you have to declare to PHP that you want to pass in refs as arguments. Remember how we declared functions in this struct?

zend_function_entry rlyeh_functions[] = {
  PHP_FE(cthulhu, NULL)
  { NULL, NULL, NULL }
};

The second argument to PHP_FE, NULL, can optionally be the argument spec. For example, let’s say we’re implementing by_ref() in C. We would add this to php_rlyeh.c:

// the 1 indicates pass-by-reference
ZEND_BEGIN_ARG_INFO(arginfo_by_ref, 1)
ZEND_END_ARG_INFO()

zend_function_entry rlyeh_functions[] = {
  PHP_FE(cthulhu, NULL)
  PHP_FE(by_ref, arginfo_by_ref)
  { NULL, NULL, NULL }
};

PHP_FUNCTION(by_ref) {
  zval *zptr = 0;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
    return;
  }

  php_printf("called (the c version of) by_ref(%d)\n", (int)Z_LVAL_P(zptr));
  ZVAL_LONG(zptr, 3);
}

Suppose we also add not_by_ref(). This might look something like:

ZEND_BEGIN_ARG_INFO(arginfo_not_by_ref, 0)
ZEND_END_ARG_INFO()

zend_function_entry rlyeh_functions[] = {
  PHP_FE(cthulhu, NULL)
  PHP_FE(by_ref, arginfo_by_ref)
  PHP_FE(not_by_ref, arginfo_not_by_ref)
  { NULL, NULL, NULL }
};

PHP_FUNCTION(not_by_ref) {
  zval *zptr = 0;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
    return;
  }

  php_printf("called (the c version of) not_by_ref(%d)\n", (int)Z_LVAL_P(zptr));
  ZVAL_LONG(zptr, 2);
}

However, if we try running this, we’ll get:

x is 1
called (the c version of) not_by_ref(1)
x is 2
called (the c version of) by_ref(2)
x is 3

What happened? not_by_ref used our variable like a reference!

This is really weird and annoying behavior (if anyone knows why PHP does this, please comment below).
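
For what it’s worth, the likely explanation is copy-on-write. PHP 5 does not copy a value when it is passed to a function: it bumps the refcount and lets the caller and the callee share one zval. The engine only splits off a private copy (it "separates" the value) right before a write. A function written in PHP gets that separation for free, but C code that writes with ZVAL_LONG skips it and overwrites the shared value. Here is a minimal standalone sketch of the idea. It is a toy model, not the Zend API: toy_zval, toy_pass_arg, toy_separate and toy_raw_write are all made-up names.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a PHP 5 zval: a value plus sharing bookkeeping. */
typedef struct {
    long value;
    int refcount;  /* how many variables/arguments point at this value */
    int is_ref;    /* 1 if the value is a PHP reference (&) */
} toy_zval;

/* Passing an argument shares the value and bumps the refcount; no copy yet. */
static toy_zval *toy_pass_arg(toy_zval *caller_value) {
    assert(caller_value->refcount > 0);
    caller_value->refcount++;
    return caller_value;
}

/* Roughly what happens before a write from PHP code: split off a private
 * copy if the value is shared and is not a reference. */
static toy_zval *toy_separate(toy_zval *arg) {
    if (arg->refcount > 1 && !arg->is_ref) {
        toy_zval *copy = malloc(sizeof *copy);
        if (copy == NULL) abort();
        copy->value = arg->value;
        copy->refcount = 1;
        copy->is_ref = 0;
        arg->refcount--;  /* the argument no longer points at the original */
        return copy;
    }
    return arg;
}

/* Like ZVAL_LONG in the post: writes in place, with no separation. */
static void toy_raw_write(toy_zval *arg, long v) {
    arg->value = v;
}
```

In this model, toy_raw_write(toy_pass_arg(&x), 2) changes x, while writing to the result of toy_separate() leaves x alone. That matches the surprising output above: our C function did the raw write without the separation step.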

To work around it, if you want non-reference behavior, you have to manually make a copy of the argument.

Our not_by_ref() function becomes:

PHP_FUNCTION(not_by_ref) {
  zval *zptr = 0, *copy = 0;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
    return;
  }

  // make a copy
  MAKE_STD_ZVAL(copy);
  memcpy(copy, zptr, sizeof(zval));

  // set refcount to 1, as we're only using "copy" in this function
  Z_SET_REFCOUNT_P(copy, 1);

  php_printf("called (the c version of) not_by_ref(%d)\n", (int)Z_LVAL_P(copy));
  ZVAL_LONG(copy, 2);

  zval_ptr_dtor(&copy);
}

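Note that we set the refcount of copy to 1. This is because the refcount of zptr is 2: one reference from the calling function plus one from the not_by_ref function, and memcpy copies that count along with the value. We don’t want the copy to have a refcount of 2, because it’s only being used by the current function; with a refcount of 2, zval_ptr_dtor would only decrement the count and the copy would never be freed.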
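
To make the refcount point concrete, here is a small standalone sketch. It is a toy model under my own assumptions, not the Zend API: toy_zval, toy_ptr_dtor and toy_memcpy_copy are hypothetical names that only imitate the behavior described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a PHP 5 zval holding an integer. */
typedef struct {
    long value;
    int refcount;
    int is_ref;
} toy_zval;

/* Drops one reference; frees the value when none remain.
 * Returns 1 if the value was freed, 0 otherwise. */
static int toy_ptr_dtor(toy_zval *z) {
    assert(z->refcount > 0);
    if (--z->refcount == 0) {
        free(z);
        return 1;
    }
    return 0;
}

/* Byte-for-byte copy, like the memcpy in the post; the refcount comes along. */
static toy_zval *toy_memcpy_copy(const toy_zval *src) {
    toy_zval *copy = malloc(sizeof *copy);
    if (copy == NULL) abort();
    memcpy(copy, src, sizeof *copy);
    return copy;
}
```

If the argument has a refcount of 2, a fresh toy_memcpy_copy() of it also claims a refcount of 2, so a single toy_ptr_dtor() returns 0 and the memory stays allocated. Resetting the copy’s refcount to 1 first makes the same call return 1.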

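Also note that memcpy-ing the zval only works because the value is an integer stored directly in the zval: if it were a string, array, or object, the zval would hold a pointer to separately allocated data, and we’d have to use PHP API functions (such as zval_copy_ctor()) to make a deep copy of the original.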
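
The shallow-versus-deep distinction is easy to see outside of PHP. The sketch below is a toy model with made-up names (toy_string, toy_shallow_copy, toy_deep_copy); it only illustrates why a byte-for-byte copy is not enough once the value owns a heap buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy string value: owns a heap buffer, much like a PHP 5 string zval. */
typedef struct {
    char *str;
    size_t len;
} toy_string;

/* Shallow copy: both structs end up pointing at the same buffer,
 * so freeing one of them leaves the other dangling. */
static toy_string toy_shallow_copy(const toy_string *src) {
    toy_string copy;
    memcpy(&copy, src, sizeof copy);
    return copy;
}

/* Deep copy: duplicates the buffer so each struct owns its own. */
static toy_string toy_deep_copy(const toy_string *src) {
    toy_string copy;
    assert(src->str != NULL);
    copy.len = src->len;
    copy.str = malloc(src->len + 1);
    if (copy.str == NULL) abort();
    memcpy(copy.str, src->str, src->len + 1);  /* include the trailing NUL */
    return copy;
}
```

With the shallow copy, destroying the copy would free the original’s buffer too. With the deep copy, each side can be destroyed independently.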

If we run our PHP program again, it gives us:

x is 1
called (the c version of) not_by_ref(1)
x is 1
called (the c version of) by_ref(1)
x is 3

Okay, this is pretty good… but we’re actually missing a case. What happens if we pass in a reference to not_by_ref()? In PHP (using call-time pass-by-reference, which was deprecated in PHP 5.3 and removed in PHP 5.4), this looks like:

function not_by_ref($arg) {
   $arg = 2;
}

$x = 1;
not_by_ref(&$x);
display($x);

…which displays “x is 2”. Unfortunately, we’ve overridden this behavior in our not_by_ref() C function, so we have to special-case it: if the argument is a reference, change its value; otherwise, make a copy and change the copy’s value.

PHP_FUNCTION(not_by_ref) {
  zval *zptr = 0, *copy = 0;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
    return;
  }

  // NEW CODE
  if (Z_ISREF_P(zptr)) {
    // if this is a reference, make copy point to zptr
    copy = zptr;

    // adding a reference so we can indiscriminately delete copy later
    zval_add_ref(&zptr);
  }
  // OLD CODE
  else {
    // make a copy
    MAKE_STD_ZVAL(copy);
    memcpy(copy, zptr, sizeof(zval));

    // set refcount to 1, as we're only using "copy" in this function
    Z_SET_REFCOUNT_P(copy, 1);
  }

  php_printf("called (the c version of) not_by_ref(%d)\n", (int)Z_LVAL_P(copy));
  ZVAL_LONG(copy, 2);

  zval_ptr_dtor(&copy);
}

Now it’ll behave “properly.”
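
Stripped of the Zend macros, the decision the function makes is small. Here is a standalone sketch of it, again as a toy model rather than real extension code: toy_zval and toy_not_by_ref are hypothetical names, and the is_ref field stands in for what Z_ISREF_P reports.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a PHP 5 zval holding an integer. */
typedef struct {
    long value;
    int refcount;
    int is_ref;   /* 1 if the caller passed a reference */
} toy_zval;

/* Toy version of the final not_by_ref(): write through references,
 * otherwise work on a private copy and throw it away. */
static void toy_not_by_ref(toy_zval *arg) {
    assert(arg->refcount > 0);
    if (arg->is_ref) {
        arg->value = 2;          /* caller passed &$x: change it in place */
    } else {
        toy_zval *copy = malloc(sizeof *copy);
        if (copy == NULL) abort();
        *copy = *arg;
        copy->refcount = 1;      /* only this function uses the copy */
        copy->value = 2;         /* invisible to the caller */
        free(copy);
    }
}
```

A value with is_ref set to 0 keeps its old value after the call, and one with is_ref set to 1 comes back as 2. As far as I can tell from the parameter-parsing documentation, the real API also has a shortcut for this: a "/" after a type specifier in the zend_parse_parameters format string (as in "z/") asks for the argument to be separated if it is not a reference. I have not tested that here, so treat it as a pointer for further reading.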

There may be a better way to do this; please leave a comment if you know of one. However, as far as I know, this is the only way to emulate the PHP reference behavior.

If you would like to read more about PHP references, Derick Rethans wrote a great article on it for PHP Architect.

2 thoughts on “More PHP Internals: References”

  1. I read a book a year ago, I think “Extending and Embedding PHP”. I’ll read it again, because I’m sure I found the answer to your question there, but I don’t remember it now. I’ll re-read it and post it!
