By request, a quick post on using PHP references in extensions.
To start, here’s an example of references in PHP we’ll be translating into C:
This will print:
    x is 1
    called not_by_ref(1)
    x is 1
    called by_ref(1)
    x is 3
If you want your C extension’s function to officially have a signature with ampersands in it, you have to declare to PHP that you want to pass in refs as arguments. Remember how we declared functions in this struct?
    zend_function_entry rlyeh_functions[] = {
        PHP_FE(cthulhu, NULL)
        { NULL, NULL, NULL }
    };
The second argument to PHP_FE, NULL, can optionally be the argument spec. For example, let’s say we’re implementing by_ref() in C. We would add this to php_rlyeh.c:
    // the 1 indicates pass-by-reference
    ZEND_BEGIN_ARG_INFO(arginfo_by_ref, 1)
    ZEND_END_ARG_INFO();

    zend_function_entry rlyeh_functions[] = {
        PHP_FE(cthulhu, NULL)
        PHP_FE(by_ref, arginfo_by_ref)
        { NULL, NULL, NULL }
    };

    PHP_FUNCTION(by_ref) {
        zval *zptr = 0;

        if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
            return;
        }

        php_printf("called (the c version of) by_ref(%d)\n", (int)Z_LVAL_P(zptr));
        ZVAL_LONG(zptr, 3);
    }
Suppose we also add not_by_ref(). This might look something like:

    ZEND_BEGIN_ARG_INFO(arginfo_not_by_ref, 0)
    ZEND_END_ARG_INFO();

    zend_function_entry rlyeh_functions[] = {
        PHP_FE(cthulhu, NULL)
        PHP_FE(by_ref, arginfo_by_ref)
        PHP_FE(not_by_ref, arginfo_not_by_ref)
        { NULL, NULL, NULL }
    };

    PHP_FUNCTION(not_by_ref) {
        zval *zptr = 0;

        if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
            return;
        }

        php_printf("called (the c version of) not_by_ref(%d)\n", (int)Z_LVAL_P(zptr));
        ZVAL_LONG(zptr, 2);
    }
However, if we try running this, we’ll get:
    x is 1
    called (the c version of) not_by_ref(1)
    x is 2
    called (the c version of) by_ref(2)
    x is 3
What happened? not_by_ref used our variable like a reference!
This is really weird and annoying behavior (if anyone knows why PHP does this, please comment below).
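As the comments at the end of this post suggest, the likely cause is copy-on-write: when you call a function with a plain argument, PHP doesn’t copy the value. It hands the function the same zval and bumps its refcount. PHP code gets a private copy the moment it writes to the argument, but C code that writes through the pointer skips that step. Here’s a minimal sketch of the idea in plain C. Note that toy_zval, toy_pass_arg, toy_release_arg and toy_naive_not_by_ref are names I made up for illustration; this is a toy model, not the real Zend structures or API.

```c
#include <assert.h>

/* Toy stand-in for a PHP 5 zval. NOT the real Zend struct. */
typedef struct toy_zval {
    long value;
    int refcount;   /* how many holders point at this value */
    int is_ref;     /* 1 if the holders are PHP references (&) */
} toy_zval;

/* Hypothetical helper: model "pass by value" the way a copy-on-write
 * engine would, by sharing the pointer and bumping the refcount. */
toy_zval *toy_pass_arg(toy_zval *caller_var) {
    assert(caller_var != 0);
    caller_var->refcount++;
    return caller_var;
}

/* Hypothetical helper: the callee is done with its argument. */
void toy_release_arg(toy_zval *arg) {
    assert(arg != 0 && arg->refcount > 0);
    arg->refcount--;
}

/* Models the naive not_by_ref(): it writes straight through the
 * shared pointer, so nothing ever separates the two holders. */
void toy_naive_not_by_ref(toy_zval *caller_var) {
    toy_zval *arg = toy_pass_arg(caller_var);
    arg->value = 2;            /* the caller's variable changes too */
    toy_release_arg(arg);
}
```

Starting from a toy_zval holding 1 with a refcount of 1, calling toy_naive_not_by_ref() leaves the caller’s value at 2, which mirrors the surprising output above.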
To work around it, if you want non-reference behavior, you have to manually make a copy of the argument.
Our not_by_ref() function becomes:

    PHP_FUNCTION(not_by_ref) {
        zval *zptr = 0, *copy = 0;

        if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
            return;
        }

        // make a copy
        MAKE_STD_ZVAL(copy);
        memcpy(copy, zptr, sizeof(zval));
        // set refcount to 1, as we're only using "copy" in this function
        Z_SET_REFCOUNT_P(copy, 1);

        php_printf("called (the c version of) not_by_ref(%d)\n", (int)Z_LVAL_P(copy));
        ZVAL_LONG(copy, 2);
        zval_ptr_dtor(&copy);
    }
Note that we set the refcount of copy to 1. This is because the refcount for zptr is 2: 1 ref from the calling function + 1 ref from the not_by_ref function. However, we don’t want the copy of zptr to have a refcount of 2, because it’s only being used by the current function.
Also note that memcpy-ing the zval only works because this is a scalar: if this were an array or object, we’d have to use PHP API functions to make a deep copy of the original.
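To see why memcpy isn’t enough for anything that owns heap memory, here’s a rough sketch using a string instead of a long. Everything in it (toy_str_zval, toy_shallow_copy, toy_deep_copy) is hypothetical and only models the problem; in the real PHP 5 API, I believe zval_copy_ctor() is the usual tool for duplicating a zval’s contents, but check the headers for your PHP version.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a string-holding zval. NOT the real Zend struct. */
typedef struct toy_str_zval {
    char *str;      /* heap-allocated, NUL-terminated buffer */
    size_t len;     /* length without the terminating NUL */
    int refcount;
} toy_str_zval;

/* Shallow copy: like memcpy on a zval, this duplicates the struct
 * but both structs still point at the same buffer. */
toy_str_zval toy_shallow_copy(const toy_str_zval *src) {
    toy_str_zval dst;
    assert(src != NULL);
    memcpy(&dst, src, sizeof(toy_str_zval));
    dst.refcount = 1;
    return dst;
}

/* Deep copy: also duplicates the buffer, so the copy is independent.
 * Returns 0 on success, -1 if the allocation fails. */
int toy_deep_copy(const toy_str_zval *src, toy_str_zval *dst) {
    char *buf;
    assert(src != NULL && src->str != NULL && dst != NULL);
    buf = malloc(src->len + 1);
    if (buf == NULL) {
        return -1;
    }
    memcpy(buf, src->str, src->len + 1);  /* include the NUL */
    dst->str = buf;
    dst->len = src->len;
    dst->refcount = 1;
    return 0;
}
```

With the shallow copy, changing a character through the copy also changes the original, and freeing both would free the same buffer twice. The deep copy avoids both problems at the cost of an allocation.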
If we run our PHP program again, it gives us:
    x is 1
    called (the c version of) not_by_ref(1)
    x is 1
    called (the c version of) by_ref(1)
    x is 3
Okay, this is pretty good… but we’re actually missing a case. What happens if we pass in a reference to not_by_ref()? In PHP, this looks like the following (it uses call-time pass-by-reference, which is deprecated as of PHP 5.3 and removed in 5.4):

    function not_by_ref($arg) {
        $arg = 2;
    }

    $x = 1;
    not_by_ref(&$x);
    display($x);
…which displays “x is 2”. Unfortunately, we’ve overridden this behavior in our not_by_ref() C function, so we have to special-case it: if the argument is a reference, change its value; otherwise, make a copy and change the copy’s value.

    PHP_FUNCTION(not_by_ref) {
        zval *zptr = 0, *copy = 0;

        if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
            return;
        }

        // NEW CODE
        if (Z_ISREF_P(zptr)) {
            // if this is a reference, make copy point to zptr
            copy = zptr;
            // adding a reference so we can indiscriminately delete copy later
            zval_add_ref(&zptr);
        }
        // OLD CODE
        else {
            // make a copy
            MAKE_STD_ZVAL(copy);
            memcpy(copy, zptr, sizeof(zval));
            // set refcount to 1, as we're only using "copy" in this function
            Z_SET_REFCOUNT_P(copy, 1);
        }

        php_printf("called (the c version of) not_by_ref(%d)\n", (int)Z_LVAL_P(copy));
        ZVAL_LONG(copy, 2);
        zval_ptr_dtor(&copy);
    }
Now it’ll behave “properly.”
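If it helps to see the two branches without any PHP macros in the way, here’s the same decision as a small standalone sketch. The toy_zval struct and toy_not_by_ref function are made up for this example and only imitate the refcount and is_ref bookkeeping; they are not the Zend API.

```c
#include <assert.h>

/* Toy stand-in for a PHP 5 zval. NOT the real Zend struct. */
typedef struct toy_zval {
    long value;
    int refcount;
    int is_ref;     /* 1 if this value is held by PHP references (&) */
} toy_zval;

/* Models the final not_by_ref(): write through to the caller only
 * when the argument is a reference, otherwise write to a private copy. */
void toy_not_by_ref(toy_zval *arg) {
    toy_zval local;
    toy_zval *target;

    assert(arg != 0);
    arg->refcount++;            /* the call itself holds one reference */

    if (arg->is_ref) {
        target = arg;           /* reference: the caller should see the write */
    } else {
        local = *arg;           /* plain value: separate into a private copy */
        local.refcount = 1;     /* only this function uses the copy */
        target = &local;
    }

    target->value = 2;

    arg->refcount--;            /* the call is over */
}
```

A plain value holding 1 is still 1 after the call, while a value flagged as a reference ends up as 2. In both cases the caller’s refcount is back where it started.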
There may be a better way to do this; please leave a comment if you know of one. However, as far as I know, this is the only way to emulate the PHP reference behavior.
If you would like to read more about PHP references, Derick Rethans wrote a great article on it for PHP Architect.
I read a book a year ago, I think it was “Extending and Embedding PHP”. I’ll read it again, because I’m sure I found the answer to your question there, but I don’t remember it now. I’ll re-read it and post it!
Found the answer here:
http://www.php.net/manual/en/internals2.ze1.zendapi.php
sections:
Dealing with Arguments Passed by Reference and
Assuring Write Safety for Other Parameters
Basically, because PHP uses references and copy-on-write, you have to isolate your value yourself when params are sent normally (without a PHP reference).
Thanks for the great posts!