More PHP Internals: References

By request, a quick post on using PHP references in extensions.

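To start, here’s an example of references in PHP we’ll be translating into C:

function not_by_ref($arg) {
   echo "called not_by_ref($arg)\n";
   $arg = 2;
}

function by_ref(&$arg) {
   echo "called by_ref($arg)\n";
   $arg = 3;
}

function display($x) {
   echo "x is $x\n";
}

$x = 1;
display($x);
not_by_ref($x);
display($x);
by_ref($x);
display($x);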

This will print:

x is 1
called not_by_ref(1)
x is 1
called by_ref(1)
x is 3

If you want your C extension’s function to officially have a signature with ampersands in it, you have to declare to PHP that you want to pass in refs as arguments. Remember how we declared functions in this struct?

zend_function_entry rlyeh_functions[] = {
  PHP_FE(cthulhu, NULL)
  { NULL, NULL, NULL }
};

The second argument to PHP_FE, NULL, can optionally be the argument spec. For example, let’s say we’re implementing by_ref() in C. We would add this to php_rlyeh.c:

// the 1 is pass_rest_by_reference: any argument not described explicitly is passed by reference
ZEND_BEGIN_ARG_INFO(arginfo_by_ref, 1)
ZEND_END_ARG_INFO();

zend_function_entry rlyeh_functions[] = {
  PHP_FE(cthulhu, NULL)
  PHP_FE(by_ref, arginfo_by_ref)
  { NULL, NULL, NULL }
};

PHP_FUNCTION(by_ref) {
  zval *zptr = 0;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
    return;
  }

  php_printf("called (the c version of) by_ref(%d)\n", (int)Z_LVAL_P(zptr));
  ZVAL_LONG(zptr, 3);
}

Suppose we also add not_by_ref(). This might look something like:

ZEND_BEGIN_ARG_INFO(arginfo_not_by_ref, 0)
ZEND_END_ARG_INFO();

zend_function_entry rlyeh_functions[] = {
  PHP_FE(cthulhu, NULL)
  PHP_FE(by_ref, arginfo_by_ref)
  PHP_FE(not_by_ref, arginfo_not_by_ref)
  { NULL, NULL, NULL }
};

PHP_FUNCTION(not_by_ref) {
  zval *zptr = 0, *copy = 0;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
    return;
  }

  php_printf("called (the c version of) not_by_ref(%d)\n", (int)Z_LVAL_P(zptr));
  ZVAL_LONG(zptr, 2);
}

However, if we try running this, we’ll get:

x is 1
called (the c version of) not_by_ref(1)
x is 2
called (the c version of) by_ref(2)
x is 3

What happened? not_by_ref used our variable like a reference!

This is really weird and annoying behavior (if anyone knows why PHP does this, please comment below).

To work around it, if you want non-reference behavior, you have to manually make a copy of the argument.

Our not_by_ref() function becomes:

PHP_FUNCTION(not_by_ref) {
  zval *zptr = 0, *copy = 0;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
    return;
  }

  // make a copy
  MAKE_STD_ZVAL(copy);
  memcpy(copy, zptr, sizeof(zval));

  // set refcount to 1, as we're only using "copy" in this function
  Z_SET_REFCOUNT_P(copy, 1);

  php_printf("called (the c version of) not_by_ref(%d)\n", (int)Z_LVAL_P(copy));
  ZVAL_LONG(copy, 2);

  zval_ptr_dtor(&copy);
}

Note that we set the refcount of copy to 1. This is because the refcount for zptr is 2: 1 ref from the calling function + 1 ref from the not_by_ref function. However, we don’t want the copy of zptr to have a refcount of 2, because it’s only being used by the current function.

Also note that memcpy-ing the zval only works because this is a scalar: if this were an array or object, we’d have to use PHP API functions to make a deep copy of the original.
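
To make the scalar-versus-pointer distinction concrete, here is a small standalone sketch in plain C. It does not use the PHP headers: toy_str is a made-up stand-in for a string zval, and both function names are hypothetical. (In the PHP 5 API, zval_copy_ctor is, as far as I know, the function normally used to duplicate whatever a copied zval points to.)

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a string zval: the struct owns a heap buffer. */
typedef struct toy_str {
    char *val;
    size_t len;
} toy_str;

/* Shallow copy: both structs now point at the SAME buffer, so freeing
 * one leaves the other dangling. This is what a bare memcpy gives you. */
static void toy_shallow_copy(toy_str *dst, const toy_str *src) {
    memcpy(dst, src, sizeof *dst);
}

/* Deep copy: duplicate the buffer so each struct owns its own memory.
 * Returns 0 on success, -1 if allocation fails. */
static int toy_deep_copy(toy_str *dst, const toy_str *src) {
    assert(src->val != NULL);
    dst->val = malloc(src->len + 1);
    if (dst->val == NULL) {
        return -1;
    }
    memcpy(dst->val, src->val, src->len + 1); /* include the trailing NUL */
    dst->len = src->len;
    return 0;
}
```

After toy_shallow_copy the two structs share one buffer, which is harmless for a long but a double-free waiting to happen for a string. After toy_deep_copy each struct can be freed independently.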

If we run our PHP program again, it gives us:

x is 1
called (the c version of) not_by_ref(1)
x is 1
called (the c version of) by_ref(1)
x is 3

Okay, this is pretty good… but we’re actually missing a case. What happens if we pass in a reference to not_by_ref()? In PHP, this looks like:

function not_by_ref($arg) {
   $arg = 2;
}

$x = 1;
not_by_ref(&$x);
display($x);

…which displays “x is 2”. Unfortunately, we’ve overridden this behavior in our not_by_ref() C function, so we have to special case: if this is a reference, change its value, otherwise make a copy and change the copy’s value.

PHP_FUNCTION(not_by_ref) {
  zval *zptr = 0, *copy = 0;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z", &zptr) == FAILURE) {
    return;
  }

  // NEW CODE
  if (Z_ISREF_P(zptr)) {
    // if this is a reference, make copy point to zptr
    copy = zptr;

    // adding a reference so we can indiscriminately delete copy later
    zval_add_ref(&zptr);
  }
  // OLD CODE
  else {
    // make a copy
    MAKE_STD_ZVAL(copy);
    memcpy(copy, zptr, sizeof(zval));

    // set refcount to 1, as we're only using "copy" in this function
    Z_SET_REFCOUNT_P(copy, 1);
  }

  php_printf("called (the c version of) not_by_ref(%d)\n", (int)Z_LVAL_P(copy));
  ZVAL_LONG(copy, 2);

  zval_ptr_dtor(&copy);
}

Now it’ll behave “properly.”

There may be a better way to do this; please leave a comment if you know of one. However, as far as I know, this is the only way to emulate the PHP reference behavior.

If you would like to read more about PHP references, Derick Rethans wrote a great article on it for PHP Architect.

PHP Extensions Made Eldrich: Classes

This is the final section of a 4-part series on writing PHP extensions.

  1. Setting Up PHP – compiling PHP for extension development
  2. Hello, world! – your first extension
  3. Working with the API – the PHP C API
  4. Classes – creating PHP objects in C

Objects

branch: oop

This section will cover creating objects. Objects are like associative arrays++: they allow you to attach almost any functionality you want to a PHP variable.

You can create an object in much the same way that you’d create an array:

PHP_FUNCTION(makeObject) {
    object_init(return_value);

    // add a couple of properties
    zend_update_property_string(NULL, return_value, "name", strlen("name"), "yig" TSRMLS_CC);
    zend_update_property_long(NULL, return_value, "worshippers", strlen("worshippers"), 4 TSRMLS_CC);
}

If you call var_dump(makeObject()), you’ll see something like:

object(stdClass)#1 (2) {
  ["name"]=>
  string(3) "yig"
  ["worshippers"]=>
  int(4)
}

Classes

branch: cultists

You create a class by designing a class template, stored in a zend_class_entry.

For our extension, we’ll make a new class, Cultist. We want a standard cultist template, but every individual cultist is unique.

I like to give each class its own C file to keep things tidy, but that’s not necessary if it’s more logical to group them together or something. However, it’s my tutorial, so we’re splitting it out.

Add two new files to your extension directory: cultist.c and cultist.h. Add the new C file to your config.m4, so it will get compiled into your extension:

PHP_NEW_EXTENSION(rlyeh, php_rlyeh.c cultist.c, $ext_shared)

Note that there is no comma between php_rlyeh.c and cultist.c.

Now we want to add our Cultist class. Open up cultist.c and add the following code:

#include <php.h>

#include "cultist.h"

zend_class_entry *rlyeh_ce_cultist;

static function_entry cultist_methods[] = {
  PHP_ME(Cultist, sacrifice, NULL, ZEND_ACC_PUBLIC)
  {NULL, NULL, NULL}
};

void rlyeh_init_cultist(TSRMLS_D) {
  zend_class_entry ce;

  INIT_CLASS_ENTRY(ce, "Cultist", cultist_methods);
  rlyeh_ce_cultist = zend_register_internal_class(&ce TSRMLS_CC);

  /* fields */
  zend_declare_property_bool(rlyeh_ce_cultist, "alive", strlen("alive"), 1, ZEND_ACC_PUBLIC TSRMLS_CC);
}

PHP_METHOD(Cultist, sacrifice) {
  // TODO
}

You might recognize the function_entry struct from our original extension (it is the same type we wrote there as zend_function_entry): methods are just grouped into one function_entry table per class.

The real meat-and-potatoes is in rlyeh_init_cultist. This function defines the class entry for cultist, giving it methods (cultist_methods), constants, and properties.

There are tons of flags that can be set for methods and properties. Some of the most common are:

ZEND_ACC_STATIC
ZEND_ACC_PUBLIC
ZEND_ACC_PROTECTED
ZEND_ACC_PRIVATE
ZEND_ACC_CTOR
ZEND_ACC_DTOR
ZEND_ACC_DEPRECATED

Currently we’re just using ZEND_ACC_PUBLIC for our sacrifice function, but this could be OR-ed with any of the other flags (for example, if we decided sacrifice2() had a better API, we could change sacrifice’s flags to ZEND_ACC_PUBLIC|ZEND_ACC_DEPRECATED and PHP would warn the user if they tried to use it).
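
If OR-ing flags together is unfamiliar, the idea can be sketched in plain C. The DEMO_ACC_* values below are invented for illustration (the real ZEND_ACC_* constants live in the Zend headers and have different values), and demo_has_flag is a hypothetical helper, not part of the PHP API.

```c
#include <assert.h>

/* Hypothetical flag values for illustration only; each one occupies its
 * own bit so that several can be combined in a single integer. */
enum {
    DEMO_ACC_STATIC     = 0x01,
    DEMO_ACC_PUBLIC     = 0x02,
    DEMO_ACC_DEPRECATED = 0x04
};

/* Returns nonzero when every bit of `flag` is set in `flags`. */
static int demo_has_flag(unsigned flags, unsigned flag) {
    assert(flag != 0);
    return (flags & flag) == flag;
}
```

With flags set to DEMO_ACC_PUBLIC | DEMO_ACC_DEPRECATED, demo_has_flag reports both of those as present and DEMO_ACC_STATIC as absent, which is the same kind of test the engine can run against a method's flags.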

In cultist.h, declare all of the functions used above:

#ifndef CULTIST_H
#define CULTIST_H

void rlyeh_init_cultist(TSRMLS_D);

PHP_METHOD(Cultist, sacrifice);

#endif

Now we have to tell the extension to load this class on startup. Thus, we want to call rlyeh_init_cultist in our MINIT function and include the cultist.h header file. Open up php_rlyeh.c and add the following:

// at the top
#include "cultist.h"

// our existing MINIT function from part 3
PHP_MINIT_FUNCTION(rlyeh) {
  rlyeh_init_cultist(TSRMLS_C);

  // MINIT returns an int: tell PHP that startup succeeded
  return SUCCESS;
}

Because we changed config.m4, we have to do phpize && ./configure && make install, not just make install, otherwise cultist.c won’t be added to the Makefile.

Now if we run var_dump(new Cultist());, we will see something like:

object(Cultist)#1 (1) {
  ["alive"]=>
  bool(true)
}

Creating a new class instance

We can also initialize cultists from C. Let’s add a static function to create a cultist. Open cultist.c and add the following:

static function_entry cultist_methods[] = {
  PHP_ME(Cultist, sacrifice, NULL, ZEND_ACC_PUBLIC)
  PHP_ME(Cultist, createCultist, NULL, ZEND_ACC_PUBLIC|ZEND_ACC_STATIC)
  {NULL, NULL, NULL}
};

PHP_METHOD(Cultist, createCultist) {
   object_init_ex(return_value, rlyeh_ce_cultist);
}

Now we can call Cultist::createCultist() to create a new cultist.

What if creating new cultists takes some setup, so we’d like to have a constructor? Well, the constructor is just a method, so we can add that:

static function_entry cultist_methods[] = {
  PHP_ME(Cultist, __construct, NULL, ZEND_ACC_PUBLIC|ZEND_ACC_CTOR)
  PHP_ME(Cultist, sacrifice, NULL, ZEND_ACC_PUBLIC)
  PHP_ME(Cultist, createCultist, NULL, ZEND_ACC_PUBLIC|ZEND_ACC_STATIC)
  {NULL, NULL, NULL}
};

PHP_METHOD(Cultist, __construct) {
  // do setup
}

Now PHP will automatically call our Cultist::__construct when we call new Cultist. However, createCultist won’t: it’ll just set the defaults and return. We have to modify createCultist to call a PHP method from C.

Calling method-to-method

branch: m2m

First, add this enormous block to your php_rlyeh.h file:

#define PUSH_PARAM(arg) zend_vm_stack_push(arg TSRMLS_CC)
#define POP_PARAM() (void)zend_vm_stack_pop(TSRMLS_C)
#define PUSH_EO_PARAM()
#define POP_EO_PARAM()

#define CALL_METHOD_BASE(classname, name) zim_##classname##_##name

#define CALL_METHOD_HELPER(classname, name, retval, thisptr, num, param) \
  PUSH_PARAM(param); PUSH_PARAM((void*)num);                            \
  PUSH_EO_PARAM();                                                      \
  CALL_METHOD_BASE(classname, name)(num, retval, NULL, thisptr, 0 TSRMLS_CC); \
  POP_EO_PARAM();                                                       \
  POP_PARAM(); POP_PARAM();

#define CALL_METHOD(classname, name, retval, thisptr)                  \
  CALL_METHOD_BASE(classname, name)(0, retval, NULL, thisptr, 0 TSRMLS_CC);

#define CALL_METHOD1(classname, name, retval, thisptr, param1)         \
  CALL_METHOD_HELPER(classname, name, retval, thisptr, 1, param1);

#define CALL_METHOD2(classname, name, retval, thisptr, param1, param2) \
  PUSH_PARAM(param1);                                                   \
  CALL_METHOD_HELPER(classname, name, retval, thisptr, 2, param2);     \
  POP_PARAM();

#define CALL_METHOD3(classname, name, retval, thisptr, param1, param2, param3) \
  PUSH_PARAM(param1); PUSH_PARAM(param2);                               \
  CALL_METHOD_HELPER(classname, name, retval, thisptr, 3, param3);     \
  POP_PARAM(); POP_PARAM();

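These macros let you call one method implemented in C from another. CALL_METHOD_BASE builds the zim_Class_method symbol that PHP_METHOD defines, so the call goes straight to the C function.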

Add the following to cultist.c:

#include "php_rlyeh.h"

PHP_METHOD(Cultist, createCultist) {
  object_init_ex(return_value, rlyeh_ce_cultist);
  CALL_METHOD(Cultist, __construct, return_value, return_value);
}

this

branch: this

We’ve pretty much just been dealing with return_values, but now that we’re working with objects we can also access this. To get this, use the getThis() macro.

For example, suppose we want to set a couple of properties in the constructor:

PHP_METHOD(Cultist, __construct) {
  char *name;
  int name_len;
  // defaults
  long health = 10, sanity = 4;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s|ll", &name, &name_len, &health, &sanity) == FAILURE) {
    return;
  }

  zend_update_property_stringl(rlyeh_ce_cultist, getThis(), "name", strlen("name"), name, name_len TSRMLS_CC);
  zend_update_property_long(rlyeh_ce_cultist, getThis(), "health", strlen("health"), health TSRMLS_CC);
  zend_update_property_long(rlyeh_ce_cultist, getThis(), "sanity", strlen("sanity"), sanity TSRMLS_CC);
}

Note the zend_parse_parameters argument: “s|ll”. The pipe character (“|”) means, “every argument after this is optional.” Thus, at least 1 argument is required (in this case, the cultist’s name), but health and sanity are optional.

Now, if we create a new cultist, we get something like:

$ php -r 'var_dump(new Cultist("Todd"));'
object(Cultist)#1 (4) {
  ["alive"]=>
  bool(true)
  ["name"]=>
  string(4) "Todd"
  ["health"]=>
  int(10)
  ["sanity"]=>
  int(4)
}

Attaching Structs

As mentioned earlier, you can attach a struct to an object. This lets the object carry around some information that is invisible to PHP, but usable to your extension.

You have to set up the zend_class_entry in a special way when you create it. First, add the struct to your cultist.h file, as well as the extra function declaration we’ll be using:

typedef struct _cult_secrets {
    // required
    zend_object std;

    // actual struct contents
    int end_of_world;
    char *prayer;
} cult_secrets;

zend_object_value create_cult_secrets(zend_class_entry *class_type TSRMLS_DC);
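void free_cult_secrets(void *object TSRMLS_DC);

Then, in cultist.c, set create_object in the existing init function and define the two new functions:

// existing init function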
void rlyeh_init_cultist(TSRMLS_D) {
  zend_class_entry ce;

  INIT_CLASS_ENTRY(ce, "Cultist", cultist_methods);
  // new line!
  ce.create_object = create_cult_secrets;
  rlyeh_ce_cultist = zend_register_internal_class(&ce TSRMLS_CC);

  /* fields */
  zend_declare_property_bool(rlyeh_ce_cultist, "alive", strlen("alive"), 1, ZEND_ACC_PUBLIC TSRMLS_CC);
}

zend_object_value create_cult_secrets(zend_class_entry *class_type TSRMLS_DC) {
  zend_object_value retval;
  cult_secrets *intern;
  zval *tmp;

  // allocate the struct we're going to use
  intern = (cult_secrets*)emalloc(sizeof(cult_secrets));
  memset(intern, 0, sizeof(cult_secrets));

  // create a table for class properties
  zend_object_std_init(&intern->std, class_type TSRMLS_CC);
  zend_hash_copy(intern->std.properties,
     &class_type->default_properties,
     (copy_ctor_func_t) zval_add_ref,
     (void *) &tmp,
     sizeof(zval *));

  // create a destructor for this struct
  retval.handle = zend_objects_store_put(intern, (zend_objects_store_dtor_t) zend_objects_destroy_object, free_cult_secrets, NULL TSRMLS_CC);
  retval.handlers = zend_get_std_object_handlers();

  return retval;
}

// this will be called when a Cultist goes out of scope
void free_cult_secrets(void *object TSRMLS_DC) {
  cult_secrets *secrets = (cult_secrets*)object;
  if (secrets->prayer) {
    efree(secrets->prayer);
  }
  efree(secrets);
}

If we want to access this, we can fetch the struct from getThis() with something like:

PHP_METHOD(Cultist, getDoomsday) {
  cult_secrets *secrets;

  secrets = (cult_secrets*)zend_object_store_get_object(getThis() TSRMLS_CC);

  RETURN_LONG(secrets->end_of_world);
}

Exceptions

branch: exceptions

All exceptions must descend from the base PHP Exception class, so this is also an intro to class inheritance.

Aside from extending Exception, custom exceptions are just normal classes. So, to create a new one, open up php_rlyeh.c and add the following:

// include exceptions header
#include <zend_exceptions.h>

zend_class_entry *rlyeh_ce_exception;

void rlyeh_init_exception(TSRMLS_D) {
  zend_class_entry e;

  INIT_CLASS_ENTRY(e, "MadnessException", NULL);
  rlyeh_ce_exception = zend_register_internal_class_ex(&e, (zend_class_entry*)zend_exception_get_default(TSRMLS_C), NULL TSRMLS_CC);
}

PHP_MINIT_FUNCTION(rlyeh) {
  rlyeh_init_exception(TSRMLS_C);

  // MINIT returns an int: tell PHP that startup succeeded
  return SUCCESS;
}

Don’t forget to declare rlyeh_init_exception in php_rlyeh.h.

Note that we could add our own methods to MadnessException with the third argument to INIT_CLASS_ENTRY, but we’ll just leave it with the default exception methods it inherits from Exception.

Throwing Exceptions

An exception isn’t much good unless we can throw it. Let’s add a method that can throw it:

zend_function_entry rlyeh_functions[] = {
  PHP_FE(cthulhu, NULL)
  PHP_FE(lookAtMonster, NULL)
  { NULL, NULL, NULL }
};

PHP_FUNCTION(lookAtMonster) {
  zend_throw_exception(rlyeh_ce_exception, "looked at the monster too long", 1000 TSRMLS_CC);
}

The 1000 is the exception code; you can set that to whatever you want (users can access it from the exception with the getCode() method).

Now, if we compile and install, we can run lookAtMonster() and we’ll get:

Fatal error: Uncaught exception 'MadnessException' with message 'looked at the monster too long' in Command line code:1
Stack trace:
#0 Command line code(1): lookAtMonster()
#1 {main}
  thrown in Command line code on line 1

Congratulations, now you’ve stared into the abyss!

This tutorial is an ongoing work. I hope you’ve enjoyed it and please comment below if you think I’ve missed any important topics or anything is unclear.

PHP Extensions Made Eldrich: PHP Variables

This is section 3 of a 4-part introduction to PHP extensions:

  1. Setting Up PHP – compiling PHP for extension development
  2. Hello, world! – your first extension
  3. Working with the API – the PHP C API
  4. Classes – creating PHP objects in C

This section is, unfortunately, longer than all of the other sections combined. The upshot is that this section covers 90% of the functions you’ll use when creating extensions.

Using Variables

In the previous sections, we got PHP set up and created our first extension. In this section, we’ll look at how to use more of the PHP API.

Working with input

branch: zend_parse_parameters

Our existing extension is nice, but it isn’t very interactive. We can modify this function to accept variables as arguments using the zend_parse_parameters function:

PHP_FUNCTION(cthulhu) {
    // boolean type
    zend_bool english = 0;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "b", &english) == FAILURE) {
        return;
    }

    if (english) {
        php_printf("In his house at R'lyeh dead Cthulhu waits dreaming.\n");
    }
    else {
        php_printf("Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn.\n");
    }
}

Try re-compiling the extension and calling cthulhu(true); and cthulhu(false);.

If you try calling cthulhu(); (no arguments), you’ll notice that zend_parse_parameters takes care of warning you about it:

$ php -r 'cthulhu();'

Warning: cthulhu() expects exactly 1 parameter, 0 given in Command line code on line 1

A note on return values

zend_parse_parameters and many other PHP API functions return SUCCESS or FAILURE, which are int values. Irritatingly, SUCCESS is 0 (false in C) and FAILURE is -1 (true in C)! So you generally can’t say if (some_php_api_func()); you have to say if (some_php_api_func() == SUCCESS).
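
Here is a tiny standalone sketch of how the truthiness test goes wrong. It compiles without the PHP headers: DEMO_SUCCESS and DEMO_FAILURE just mirror the values described above, and demo_api_call, naive_check and explicit_check are made-up names for illustration.

```c
#include <assert.h>

/* Stand-ins mirroring the values described above, defined locally so the
 * sketch compiles without the PHP headers. */
enum { DEMO_SUCCESS = 0, DEMO_FAILURE = -1 };

/* Hypothetical API call: succeeds only for non-negative input. */
static int demo_api_call(int input) {
    return input >= 0 ? DEMO_SUCCESS : DEMO_FAILURE;
}

/* WRONG: treats the return value as a boolean, so the answer is inverted
 * (0 means success, but 0 is false in C). */
static int naive_check(int input) {
    return demo_api_call(input) ? 1 : 0;
}

/* RIGHT: compares against the named constant. */
static int explicit_check(int input) {
    return demo_api_call(input) == DEMO_SUCCESS;
}
```

For a call that succeeds, naive_check reports 0 and explicit_check reports 1. For a call that fails, it is the other way around.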

zend_parse_parameters input

The parameters passed to zend_parse_parameters are:

ZEND_NUM_ARGS()
The number of arguments passed in (you can hard-code this, but using ZEND_NUM_ARGS() will automatically grab that info for you).
TSRMLS_CC
You’ll see this magic variable all over the place in PHP extensions. It’s a macro that defines “, <thread_info>” (or “” if threading is disabled). Note that, because it includes a comma, there’s no comma between ZEND_NUM_ARGS() and TSRMLS_CC. You don’t have to worry about it or do anything with it, just pass it around.
“b”
This is a string describing the arguments you expect. Common values are:

  • “b”: boolean, expects zend_bool.
  • “s”: string, expects char* and int.
  • “l”: long, expects long.
  • “d”: double, expects double.
  • “a”: array, expects zval*.
  • “o”: object, expects zval*.
  • “z”: any type, expects zval*.

Except for “b”, “l”, and “d”, zend_parse_parameters does not create a copy of the parameter, it just returns the address. Thus, you generally shouldn’t free this memory, as the calling function “owns” it.

The options listed above can be combined. For example, suppose we had a function that took a number of times to append a given string to a given array. We’d expect it to look something like:

PHP_FUNCTION(chant) {
  int str_len;
  long num;
  char *str;
  zval *arr;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "lsa", &num, &str, &str_len, &arr) == FAILURE) {
    return;
  }

  /* function body */
}

Note that you always pass in the address of the variable, not the variable itself.

&english
A list of addresses to use to store passed-in values. zend_bool is just a typedefed numeric type to represent booleans.

Note that you must always use “long” for integers (not int). The long type is a different size on 32-bit and 64-bit machines (except on Windows!), so you’ll get weird segfaults if you use another numeric type on certain platforms.

Zvals

branch: types

You may have noticed above that arrays, objects, and any type are all returned as zvals by zend_parse_parameters. This is because every PHP variable is, under the covers, a C struct called a zval. For example, if you say $x = "foo"; $y = 123; $z = array();, then $x, $y, and $z are all zvals.

If you want to be able to communicate information from C to PHP, it’s important to understand how to work with zvals. A zval is defined as:

struct _zval_struct {
    zvalue_value value;
    zend_uint refcount__gc;
    zend_uchar type;
    zend_uchar is_ref__gc;
};

The main components of this struct are:

value
The actual value of the variable. This is defined as a union of:

typedef union _zvalue_value {
    long lval;
    double dval;
    struct {
        char *val;
        int len;
    } str;
    HashTable *ht;
    zend_object_value obj;
} zvalue_value;

These fields correspond to the following types:

  • lval: Longs and booleans
  • dval: Doubles
  • str: Strings
  • ht: Arrays and associative arrays
  • obj: Objects

refcount__gc
A reference count for garbage collection. When you (or PHP) ask for a zval to be destroyed, the zval destructor decrements the refcount and checks whether it has reached 0. If the refcount is still greater than 0, the destructor just returns. Once the refcount is 0, the zval is actually freed.
type
The type of this zval. This tells PHP which union element to look for and what to do when it finds it. There are human-readable macros for each type:

#define IS_NULL	          0
#define IS_LONG	          1
#define IS_DOUBLE         2
#define IS_BOOL	          3
#define IS_ARRAY          4
#define IS_OBJECT         5
#define IS_STRING         6
#define IS_RESOURCE       7
#define IS_CONSTANT       8
#define IS_CONSTANT_ARRAY 9

This tutorial will only cover working with types 0-6. I’ve found that, with object-oriented PHP, the last three are less useful.

Resources are a way of binding C structs to PHP variables (e.g., passing a database connection around with the defunct mysql extension), but objects provide a nicer way of doing struct attachment. The Developer Zone tutorial goes into quite a lot of detail on resources, if you’re interested.

The field type tells PHP which union element to look at for the zval’s value. For example, if zval_p->type == IS_STRING, the value for the zval should be in the zval_p->value.str field.

This field also determines how PHP interprets the value field. For example, lval does double duty for longs and booleans. So, if you have lval set to 1 and zval_p->type==IS_LONG, it will be displayed as 1. If you have zval_p->type==IS_BOOL, it will be displayed as true.

// add function entries for each function you define
zend_function_entry rlyeh_functions[] = {
  PHP_FE(cthulhu, NULL)
  PHP_FE(makeBool, NULL)
  PHP_FE(makeLong, NULL)
  { NULL, NULL, NULL }
};

PHP_FUNCTION(makeBool) {
    Z_TYPE_P(return_value) = IS_BOOL;
    Z_LVAL_P(return_value) = 1;
}

PHP_FUNCTION(makeLong) {
    Z_TYPE_P(return_value) = IS_LONG;
    Z_LVAL_P(return_value) = 1;
}

// don't forget to add declarations for these functions to your header file, too!

return_value is passed into PHP_FUNCTIONs and holds the value that is returned (it defaults to null).

If you compile this new code and run:

var_dump(makeBool());
var_dump(makeLong());

You’ll see something like:

bool(true)
int(1)

is_ref__gc
Whether this zval is a PHP reference.

Accessing the Contents of a Zval

The internals of a zval are subject to change, so you should always use PHP’s zval macros instead of touching the gooey innards (e.g., don’t actually set a value by putting zval_p->value.lval in your code).

As shown in the example above, you can safely manipulate said innards through these macros:

zval *zval_p;

long l        = Z_LVAL_P(zval_p);
zend_bool b   = Z_BVAL_P(zval_p);
double d      = Z_DVAL_P(zval_p);
char *str     = Z_STRVAL_P(zval_p);
int str_len   = Z_STRLEN_P(zval_p);
HashTable *ht = Z_ARRVAL_P(zval_p);

// objects are a bit complicated... suffice to know that these exist:
Z_OBJPROP_P(zval_p)
Z_OBJCE_P(zval_p)
Z_OBJVAL_P(zval_p)
Z_OBJ_HANDLE_P(zval_p)
Z_OBJ_HT_P(zval_p)
Z_OBJ_HANDLER_P(zval_p, h)
Z_OBJDEBUG_P(zval_p,is_tmp)

As you can see, you can extract each part of a zval’s value using a macro. The ones listed above work on zval pointers (zval*s). If you are using zval or zval**, there are analogous helpers with one fewer or one more P, respectively. For example:

long get_long(zval z) {
  return Z_LVAL(z);
}

// or 

long get_long(zval **zval_pp) {
  return Z_LVAL_PP(zval_pp);
}

Creating Zvals

Before using a zval, you must make sure that its refcount, type, and value are set correctly. For scalar types, you can set value and type using a single macro: ZVAL_type.

ZVAL_NULL(zval_p);
ZVAL_BOOL(zval_p, 0);
ZVAL_LONG(zval_p, 123);
ZVAL_DOUBLE(zval_p, 12.3);

For strings, it is a little trickier because you have to allocate space for the string or let PHP know that you’ve already allocated space for it.

Thus, ZVAL_STRING takes an argument that tells PHP whether or not to make a copy of the string for the zval. Basically, this should be 0 if you’ve already created a special instance of the string for this zval and 1 if you haven’t.

// "bar" is a string literal and zval_p is on the heap, so we
// want the zval to get its own copy of "bar" on the heap
ZVAL_STRING(zval_p, "bar", 1);
// this means "copy" ------^

// copy "bar" to the heap
char *str = estrdup("bar");
ZVAL_STRING(zval_p, str, 0);
// "don't copy" ---------^

Which brings us to the next section, memory management.

Memory Management

branch: mm

PHP uses its own memory pool and allocation/deallocation functions, which you should generally use instead of malloc, free, and friends.

PHP has similar functions to the standard C library, only everything is prefixed with an “e”:

void* emalloc(size_t size);
void* ecalloc(size_t nmemb, size_t size);
void* erealloc(void* ptr, size_t size);

void efree(void* ptr);

char* estrdup(char* str);
char* estrndup(char* str, int len);

If you are used to C programming where you check if memory was successfully allocated (x = malloc(sizeof(*x)); if (!x) return 0;), know that this is not strictly necessary in PHP. PHP’s memory management functions will exit PHP if you run out of memory, so if emalloc returns, it returned some memory.
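
The "returns memory or never returns" contract is easy to picture with a plain C wrapper. This is only a sketch of the idea, not how emalloc is implemented, and xmalloc_or_die is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper (not a PHP function): either returns usable memory
 * or terminates the process, so callers never have to check for NULL. */
static void *xmalloc_or_die(size_t size) {
    /* malloc(0) may legally return NULL, so always request at least 1 byte */
    void *p = malloc(size ? size : 1);
    if (p == NULL) {
        fprintf(stderr, "out of memory (%lu bytes requested)\n",
                (unsigned long)size);
        exit(EXIT_FAILURE);
    }
    return p;
}
```

A caller can write char *buf = xmalloc_or_die(16); and use buf immediately, which is the same convenience you get from the "e" functions.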

Remember how you compiled PHP with debugging enabled at the beginning? Well, here’s the payoff: a debug build (--enable-debug) will let you know about any memory leaks it detects. For example, try adding a function to your extension:

// add to function_entry table and header file, too
PHP_FUNCTION(leak) {
    emalloc(20);
}

Now, if you recompile your extension and run leak(), you’ll see:

[Wed Aug 10 16:34:42 2011]  Script:  '-'
/Users/k/php-5.3.6/Zend/zend_builtin_functions.c(1360) :  Freeing 0x100AA75E0 (3 bytes), script=-
=== Total 1 memory leaks detected ===

This can make tracking down memory leaks much easier. (Getting friendly with valgrind is a good idea, too.)

Creating and Destroying Zvals

Zvals can be created using emalloc, but I’d recommend generally using a different macro: MAKE_STD_ZVAL. This macro not only allocates a zval, but it also sets the refcount and isref fields, so you don’t have to worry about setting those yourself.

zval *zval_p;
MAKE_STD_ZVAL(zval_p);

If you need to destroy a zval, use zval_ptr_dtor, which takes a zval** (not a zval*).

zval *zval_p;
MAKE_STD_ZVAL(zval_p);

zval_ptr_dtor(&zval_p);
// back to square one

zval_ptr_dtor decrements the refcount by 1. If the refcount is still greater than 0, then zval_ptr_dtor will just return. If this makes the refcount 0, it also destroys the current zval. If this zval is a string, array, or object, PHP will take care of freeing the associated memory. Thus, you should generally not call efree directly on a zval (as this will cause leaks: orphaned strings or objects with no zval pointing to them).
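
The decrement-then-maybe-free behavior can be modeled in a few lines of plain C. This is a toy, not the engine's code: toy_val and the toy_* functions are made-up names, and the toy only tracks a long, so there is no string or array payload to clean up.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a zval: just a payload and a reference count. */
typedef struct toy_val {
    long value;
    unsigned refcount;
} toy_val;

/* Allocate a value with refcount 1, the state MAKE_STD_ZVAL hands back. */
static toy_val *toy_new(long value) {
    toy_val *v = malloc(sizeof *v);
    if (v == NULL) {
        abort();
    }
    v->value = value;
    v->refcount = 1;
    return v;
}

/* Register one more holder of the value. */
static void toy_add_ref(toy_val *v) {
    v->refcount++;
}

/* Mirrors the contract described above: drop one reference and free only
 * when none remain. Takes a pointer-to-pointer, like zval_ptr_dtor, and
 * returns 1 if the value was actually freed. */
static int toy_ptr_dtor(toy_val **vp) {
    toy_val *v = *vp;
    assert(v != NULL && v->refcount > 0);
    if (--v->refcount > 0) {
        return 0;
    }
    free(v);
    *vp = NULL; /* toy-only nicety so a stale pointer is easy to spot */
    return 1;
}
```

With two holders, the first toy_ptr_dtor call only decrements and the second one frees, which is why the by-reference branch earlier could add a reference and then unconditionally destroy copy.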

Also, you should always make sure that you have set the zval to the correct type before calling zval_ptr_dtor: if you call it on garbage, it can segfault if it tries to free, say, a string that was actually an invalid pointer.

The Persistence of Memory

branch: persistence

Theoretically, all memory allocated with emalloc is freed after each request (I say theoretically because in my experience, it’s not so much freed as leaked). If you want something to hang around for longer than a single request, you’ll need to use persistent memory. Persistent memory hangs around for longer than one request (generally), up to the lifetime of the PHP process.

To allocate persistent memory, use “pe”-prefixed memory allocation functions, instead of “e”-prefixed.

void* pemalloc(size_t size, int persistent);
void* pecalloc(size_t nmemb, size_t size, int persistent);
void* perealloc(void* ptr, size_t size, int persistent);

void pefree(void* ptr, int persistent);

char* pestrdup(char* str, int persistent);
char* pestrndup(char* str, int len, int persistent);

The “persistent” option lets you choose whether you want to allocate persistent memory (1) or transitory memory (0, normal “e”-allocation behavior).
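
One practical consequence is that the flag you pass when freeing has to match the flag you passed when allocating. Here is a toy model in plain C, assuming nothing about PHP's real allocator: the two counters stand in for two separate heaps, and toy_pemalloc and toy_pefree are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy counters standing in for two separate heaps. */
static size_t toy_request_blocks = 0;
static size_t toy_persistent_blocks = 0;

/* Allocate from the "persistent" or the "per-request" heap. */
static void *toy_pemalloc(size_t size, int persistent) {
    void *p = malloc(size ? size : 1);
    if (p == NULL) {
        abort();
    }
    if (persistent) {
        toy_persistent_blocks++;
    } else {
        toy_request_blocks++;
    }
    return p;
}

/* Free a block; `persistent` must match the value used to allocate it,
 * otherwise the wrong heap's bookkeeping is touched. */
static void toy_pefree(void *p, int persistent) {
    assert(persistent ? toy_persistent_blocks > 0 : toy_request_blocks > 0);
    if (persistent) {
        toy_persistent_blocks--;
    } else {
        toy_request_blocks--;
    }
    free(p);
}
```

In the toy, a mismatched flag just corrupts a counter. With a real allocator the block would be handed to the wrong heap, so keep the flag next to the pointer if there is any doubt about which kind it is.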

Search and Destroy: Finding and Cleaning Up Persistent Memory

Suppose your extension allocates a persistent struct in one HTTP request. How do you find it during the next HTTP request?

There are three steps:

  1. You have to create a type for this memory.
  2. You have to link this type to a destructor, so that PHP knows how to clean up the memory.
  3. You have to insert your allocated memory into PHP’s persistent memory hash.

Persistent Gods

To try out persistent memory, we want a struct that should persist for multiple requests. Great Old Ones are pretty darn persistent, so we’ll create an old_one struct in php_rlyeh.h:

typedef struct _old_one {
    char *name;
    int worshippers;
} old_one;

Now we need to create a type for it. Near the beginning of php_rlyeh.c, add an int, named anything you want. This integer will hold the numeric type for Great Old Ones.

// traditionally these start with "le_", which stands 
// for "list entry"
int le_old_one;

Now we need to link the le_old_one type up to a destructor. We’ll do this when our module is first loaded, in the magical PHP_MINIT_FUNCTION(rlyeh) function:

// add MINIT to the module description:
zend_module_entry rlyeh_module_entry = {
  STANDARD_MODULE_HEADER,
  PHP_RLYEH_EXTNAME,
  rlyeh_functions,
  PHP_MINIT(rlyeh),
  NULL,
  NULL,
  NULL,
  NULL,
  PHP_RLYEH_VERSION,
  STANDARD_MODULE_PROPERTIES
};

// add this to php_rlyeh.c
PHP_MINIT_FUNCTION(rlyeh) {
    le_old_one = zend_register_list_destructors_ex(NULL, rlyeh_old_one_pefree, "Great Old One", module_number);
    // MINIT has to report whether startup succeeded
    return SUCCESS;
}

Also, add a line to php_rlyeh.h:

PHP_MINIT_FUNCTION(rlyeh);

zend_register_list_destructors_ex says, “make a new type for le_old_one. If you have to automatically free something of this type, call rlyeh_old_one_pefree on it.”

Persistent destructors always take a zend_rsrc_list_entry: this is the container PHP uses to hold list entries (which is how we’re storing persistent memory). So, our destructor would look like:

void rlyeh_old_one_pefree(zend_rsrc_list_entry *rsrc TSRMLS_DC) {
    old_one *god = rsrc->ptr;

    // free the char* field, if set
    if (god->name) {
        pefree(god->name, 1);
    }

    pefree(god, 1);
}

Now we are ready to create some Great Old Ones!

Let’s make a new function: getYig(). If there’s already been an old_one created, it’ll return information about it, otherwise it’ll create a new one.

PHP_FUNCTION(getYig) {
    zend_rsrc_list_entry *le;
    char *key = "yig";

    if (zend_hash_find(&EG(persistent_list), key, strlen(key)+1, (void**)&le) == FAILURE) {
        // need to create a new god
        zend_rsrc_list_entry nle;
        old_one *yig;

        yig = (old_one*)pemalloc(sizeof(old_one), 1);
        yig->name = pestrdup("Yig", 1);
        yig->worshippers = 4;

        php_printf("creating a new god\n");

        nle.ptr = yig;
        nle.type = le_old_one;
        nle.refcount = 1;

        zend_hash_update(&EG(persistent_list), key, strlen(key)+1, (void*)&nle, sizeof(zend_rsrc_list_entry), NULL);
    }
    else {
        old_one *god = le->ptr;

        php_printf("fetched %s: %d worshippers\n", god->name, god->worshippers);
    }
}

Note that zend_hash_update and zend_hash_find take the key length + 1. The PHP API is a bit inconsistent about this: the best way to figure out if a function takes length or length+1 is to look at the source or find an example of it being used in another extension.
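
For the length+1 case, here is a minimal sketch in plain C of what the calls above are being handed. The helper name hash_key_len is hypothetical (it is not part of the PHP API); it just wraps the strlen(key)+1 expression used in getYig().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: the key length the PHP 5.3 zend_hash_find and
 * zend_hash_update calls above expect, counting the trailing NUL byte. */
static size_t hash_key_len(const char *key) {
    assert(key != NULL);
    return strlen(key) + 1;
}
```

For a string literal, sizeof("yig") gives the same value at compile time. Don't use sizeof on a char* variable like key in getYig(), though: that gives you the size of the pointer, not the length of the string.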

If you have a web server set up, add your extension to the php.ini it’s using (warning: this is probably a different php.ini than the command-line client uses). Restart it and load a page that calls getYig() a couple of times. The first time you’ll see “creating”, the next times you’ll see “fetched…”.

In the code above, you may notice that we use hash functions (zend_hash_find and zend_hash_update) to manipulate the EG(persistent_list). EG(persistent_list) is actually a HashTable that you can use to store persistent memory. However, the name reveals something that I find interesting about PHP internals: all HashTables (associative arrays) are lists, too (they keep the elements in order and you can access elements by index or key).

And speaking of hashes and lists…

Arrays

Creating Arrays

branch: array

To create an array or associative array, use array_init().

You can insert new elements to an associative array with one of these functions:

add_assoc_long(zval *zval_p, char *key, long n)
add_assoc_null(zval *zval_p, char *key)
add_assoc_bool(zval *zval_p, char *key, zend_bool b) 
add_assoc_double(zval *zval_p, char *key, double d) 
add_assoc_string(zval *zval_p, char *key, char *str, int duplicate) 
add_assoc_stringl(zval *zval_p, char *key, char *str, int length, int duplicate)
add_assoc_zval(zval *zval_p, char *key, zval *value) 

You can also “push” new elements to the array with related functions:

add_next_index_long(zval *zval_p, long n);
add_next_index_null(zval *zval_p);
add_next_index_bool(zval *zval_p, int b);
add_next_index_resource(zval *zval_p, int r);
add_next_index_double(zval *zval_p, double d);
add_next_index_string(zval *zval_p, const char *str, int duplicate);
add_next_index_stringl(zval *zval_p, const char *str, uint length, int duplicate);
add_next_index_zval(zval *zval_p, zval *value);

Let’s use this to fill in the function we started in the zend_parse_parameters section:

PHP_FUNCTION(chant) {
  int str_len, i;
  long num;
  char *str;
  zval *arr;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "lsa", &num, &str, &str_len, &arr) == FAILURE) {
    return;
  }

  // sanity check
  if (num < 0 || num > 100) {
    return;
  } 

  for (i=0; i<num; i++) {
    add_next_index_stringl(arr, str, str_len, 1);
  }
}

If we run this function, we can see that it appends strings to the array correctly.


This should output:

derp
derp
derp
derp
derp
derp

Accessing Array Elements

To find an element in an associative array, use one of the zend_hash functions.

PHP_FUNCTION(findMonster) {
  int monster_len;
  char *monster;
  zval *list, **desc;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "sa", &monster, &monster_len, &list) == FAILURE) {
    return;
  }

  if (zend_hash_find(Z_ARRVAL_P(list), monster, monster_len+1, (void**)&desc) == FAILURE) {
    RETURN_NULL();
  }

  RETURN_STRINGL(Z_STRVAL_PP(desc), Z_STRLEN_PP(desc), 1);
}

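Note that the fourth argument is a pointer to a pointer to a pointer: the array stores zval* values, zend_hash_find hands back the address of the stored slot (a zval**), and because that comes back through an out-parameter, we pass &desc (a zval***, cast to void**). I was skeptical about that for a while, but there it is.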

Also, we are using a couple of new macros for setting the return value. These RETURN_type macros just set the return_value we were manipulating directly earlier.

Now, if we run something like:

 "The Toad God",
           "Yig" => "Father of Serpents",
           "Ythogtha" => "The Thing in the Pit");

var_dump(findMonster("Yig", $a));

?>

We’ll get “Father of Serpents”.

There are also a couple other hash functions you’ll probably find useful for your code:

int zend_hash_find(const HashTable *ht, const char *arKey, uint nKeyLength, void **pData);
int zend_hash_add(HashTable *ht, const char *arKey, uint nKeyLength, void *pData, uint nDataSize, void **pDest);
int zend_hash_update(HashTable *ht, const char *arKey, uint nKeyLength, void *pData, uint nDataSize, void **pDest);
int zend_hash_num_elements(const HashTable *ht);
int zend_hash_exists(const HashTable *ht, const char *arKey, uint nKeyLength);

Note that these functions do not add references to array elements. Thus, you should not, for example, do zend_hash_find and then call zval_ptr_dtor on the element found or your array will be in a weird half-freed state and PHP will try to double-free the element when the array is properly destroyed. Therefore, if you want to use an array element outside of the context of the array, you should add a reference to it, first. (We avoid that in the situation above by returning duplicates of the element’s string value.)
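
To make the ownership rule concrete, here is a toy reference-counting model in plain C. It is a sketch, not Zend's implementation: toy_val and the toy_ functions are hypothetical. In the real API the corresponding calls would be along the lines of Z_ADDREF_PP and zval_ptr_dtor, but check the headers for your PHP version.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_val {
    int refcount;
    long value;
} toy_val;

static int toy_frees = 0; /* how many values have actually been freed */

/* a new value starts with one owner (think: the array holding it) */
static toy_val *toy_new(long value) {
    toy_val *val = malloc(sizeof *val);
    assert(val != NULL);
    val->refcount = 1;
    val->value = value;
    return val;
}

/* claim a share of the value before using it outside its container */
static void toy_addref(toy_val *val) {
    val->refcount++;
}

/* give up one share; the memory is freed when the last share goes */
static void toy_release(toy_val *val) {
    assert(val->refcount > 0);
    if (--val->refcount == 0) {
        free(val);
        toy_frees++;
    }
}
```

If you call toy_addref on an element you found, the array's own toy_release at destruction time leaves the value alive until you release your share. Skip the toy_addref and release anyway, and the array is left holding a freed pointer, which is the double-free described above.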

Iterating Through Arrays

You can iterate through an array, element by element, but it ain’t pretty. Here’s the standard for-loop you need:

HashTable *hindex = Z_ARRVAL_P(zval_p);
HashPosition pointer;
zval **data;

for(zend_hash_internal_pointer_reset_ex(hindex, &pointer);
    zend_hash_get_current_data_ex(hindex, (void**)&data, &pointer) == SUCCESS;
    zend_hash_move_forward_ex(hindex, &pointer)) {

  char *key;
  uint key_len, key_type;
  ulong index;

  key_type = zend_hash_get_current_key_ex(hindex, &key, &key_len, &index, 0, &pointer);

  switch (key_type) {
  case HASH_KEY_IS_STRING:
    // associative array keys
    php_printf("key: %s\n", key);
    break;
  case HASH_KEY_IS_LONG:
    // numeric indexes (index is a ulong, so print it with %lu)
    php_printf("index: %lu\n", index);
    break;
  default:
    php_printf("error\n");
  }
}

Now let’s never speak of it again.

Instead, let’s move on to objects!

PHP Extensions Made Eldrich: Hello, World!

This is part 2 of a 4-part tutorial on writing PHP extensions:

  1. Setting Up PHP – compiling PHP for extension development
  2. Hello, world! – your first extension
  3. Working with the API – the PHP C API
  4. Classes – creating PHP objects in C

First we need to think of a name for our extension. I’ve been reading some H.P. Lovecraft, so let’s call it “rlyeh”.

For our first extension, we’ll create a new function, cthulhu(). When we call cthulhu() (tee hee), PHP will print “In his house at R’lyeh dead Cthulhu waits dreaming.”

Cheat Sheet

If you don’t want to copy/paste all of the code, you can clone the Github repo for this tutorial and check out sections as you go.

$ git clone git://github.com/kchodorow/rlyeh.git

This part of the tutorial (Hello, world!) is the master branch. Starting in part 3, each “unit” has a branch: <branchname> at the beginning of the section. You can check out this branch if you want to see the code example in context.

For example, if you see branch: oop, you’d do:

$ git checkout -b oop origin/oop

Then you can compare what you’re doing to the “ideal” example code.

Setting Up

Create a directory for your PHP extension, named “rlyeh”. This is where all of the source code for your extension will live.

$ mkdir rlyeh
$ cd rlyeh

A PHP extension consists of at least three files:

  1. “config.m4”, which contains compilation instructions for PHP
  2. “php_extname.c”: source code
  3. “php_extname.h”: a header file

Creating a config.m4 file is wholly lacking in interest, so just cut/paste the one below.

dnl lines starting with "dnl" are comments

PHP_ARG_ENABLE(rlyeh, whether to enable Rlyeh extension, [  --enable-rlyeh   Enable Rlyeh extension])

if test "$PHP_RLYEH" != "no"; then

  dnl this defines the extension
  PHP_NEW_EXTENSION(rlyeh, php_rlyeh.c, $ext_shared)

  dnl this is boilerplate to make the extension work on OS X
  case $build_os in
  darwin1*.*.*)
    AC_MSG_CHECKING([whether to compile for recent osx architectures])
    CFLAGS="$CFLAGS -arch i386 -arch x86_64 -mmacosx-version-min=10.5"
    AC_MSG_RESULT([yes])
    ;;
  darwin*)
    AC_MSG_CHECKING([whether to compile for every osx architecture ever])
    CFLAGS="$CFLAGS -arch i386 -arch x86_64 -arch ppc -arch ppc64"
    AC_MSG_RESULT([yes])
    ;;
  esac

fi

If you want to call your extension something else, global replace “rlyeh” with your extension’s name.

Now for the actual extension: create a file called php_rlyeh.c with the following content:

// include PHP API
#include <php.h>

// header file we'll create below
#include "php_rlyeh.h"

// define the function(s) we want to add
zend_function_entry rlyeh_functions[] = {
  PHP_FE(cthulhu, NULL)
  { NULL, NULL, NULL }
};

// "rlyeh_functions" refers to the struct defined above
// we'll be filling in more of this later: you can use this to specify
// globals, php.ini info, startup and teardown functions, etc.
zend_module_entry rlyeh_module_entry = {
  STANDARD_MODULE_HEADER,
  PHP_RLYEH_EXTNAME,
  rlyeh_functions,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  PHP_RLYEH_VERSION,
  STANDARD_MODULE_PROPERTIES
};

// install module
ZEND_GET_MODULE(rlyeh)

// actual non-template code!
PHP_FUNCTION(cthulhu) {
    // php_printf is PHP's version of printf, it's essentially "echo" from C
    php_printf("In his house at R'lyeh dead Cthulhu waits dreaming.\n");
}

That’s a whole lotta template, but it’ll make more sense as you go along.

Learning PHP extension programming is sort of like learning Java as your first programming language: “type ‘public static void main’.” “Why? What does that even mean?” “It doesn’t matter, you’ll learn about it later.”

You also have to make a header file, to declare the cthulhu function as well as the two extension info macros used in php_rlyeh.c (PHP_RLYEH_EXTNAME and PHP_RLYEH_VERSION).

Create a new file, php_rlyeh.h, and add a couple of lines:


#define PHP_RLYEH_EXTNAME "rlyeh"
#define PHP_RLYEH_VERSION "0.01"

PHP_FUNCTION(cthulhu);

You can change the version whenever you do a new release. It can be any string. It’s displayed when you do:

$ php --ri rlyeh

(once the extension is installed).

Speaking of, now all that’s left is to compile and install. Make sure that your custom-compiled-PHP is first in your PATH. If it isn’t, put it there before doing the rest of the install.

$ echo $PATH
$PHPDIR/install-debug-zts/bin:/usr/local/bin:/usr/bin
$ phpize
Configuring for:
PHP Api Version:         20090626
Zend Module Api No:      20090626
Zend Extension Api No:   220090626
$
$ ./configure
# lots of checks...
$
$ make
# compile...

Build complete.
Don't forget to run 'make test'.

$ make install
Installing shared extensions:     $PHPDIR/install-debug-zts/lib/php/extensions/debug-zts-20090626/
$

Now, add your extension to your php.ini file. PHP is probably expecting a php.ini file in the lib/ subdirectory of your install directory ($PHPDIR/install-debug-zts/lib/php.ini). It probably doesn’t exist yet, so create a new php.ini file with one line:

extension=rlyeh.so

Now you should be able to use your function from PHP without importing, loading, or requiring anything. Do:

$ php -r 'cthulhu();'
In his house at R'lyeh dead Cthulhu waits dreaming.

Your first PHP extension is working!

Next up: a deep dive into the PHP API.

PHP Extensions Made Eldrich: Installing PHP

A PHP extension allows you to connect almost any C/C++ code you want to PHP. This is a 4-part tutorial on how to write an extension:

  1. Setting Up PHP – compiling PHP for extension development
  2. Hello, world! – your first extension
  3. Working with the API – the PHP C API
  4. Classes – creating PHP objects in C

Almost all of the code examples in this tutorial are available on Github.

Zend Developer Zone has an excellent tutorial on writing PHP extensions. I wrote this tutorial because the DevZone article is getting a little old: it doesn’t cover objects, methods, or exceptions and it uses Zend API artifacts dating back to PHP 3. However, it is still an excellent tutorial and I highly recommend reading through it if you’re interested in writing PHP extensions.

Setting Up PHP

Before you start developing an extension, you should compile PHP from source (it’ll make debugging easier later on). If you hate future-you, though, you can just do which phpize and if it returns something, you can continue to the next section.

Compiling PHP yourself isn’t too scary (unless you’re on Windows, in which case welcome to Hell). First, download the source for the version you want to develop with. The current stable release is a good choice.

Unpack the tarball and change to the PHP source directory:

$ tar jxvf php-5.3.6.tar.bz2
$ cd php-5.3.6/
$ PHPDIR=`pwd` # setting this up so I can refer to $PHPDIR later

Note: this tutorial assumes that you’re using version 5.3.*. The API changes every version, so if you’re not using 5.3, this tutorial is going to be very frustrating.

To install PHP, run:

$ mkdir install-debug-zts # install dir
$ ./configure --enable-debug --enable-maintainer-zts --prefix=$PHPDIR/install-debug-zts
$ make install

I recommend using a custom install prefix ($PHPDIR/install-debug-zts in the example above), to keep it separate from any PHP you might have installed previously.

If you install multiple versions of PHP to the default location (/usr/local), it’ll get really annoying really fast: you always have to re-install if you want to try a different build, package managers often install PHP in /usr (so you may have two PHP installs floating around), and the PHP install is oddly coy about overwriting existing files: sometimes it decides to just leave the old versions there if a file already exists.

Thus, it pays to keep things organized in custom installation folders.

There are a couple of configuration options that you should enable, too, for extension development: --enable-debug (debugging info) and --enable-maintainer-zts (thread stuff and memory tracking).

Once make install is done, you’ve got PHP installed! Add $PHPDIR/install-debug-zts/bin to your path with:

$ # this will only add it to the path for this shell
$ PATH=$PHPDIR/install-debug-zts/bin:$PATH

Now you’re ready to make an extension.

Next up: writing your first extension.

Scaling, scaling everywhere

Interested in learning more about scaling MongoDB? Pick up September’s issue of PHP|Architect magazine, the database issue! I wrote an article on scaling your MongoDB database: how to choose good indexes, help handle load using replication, and set up sharding correctly (it’s not PHP-specific).

If you prefer multimedia, I also did an O’Reilly webcast on scaling MongoDB, which you can watch below:

Unfortunately, I had some weird lag problems throughout and at the end it totally cut my audio, so I didn’t get to all of the questions. I asked the O’Reilly people to send me the unanswered questions, so I’ll post the answers as soon as they do (or you can post them again in the comments below).

MongoDB PHP Driver 1.0.3 Release

Version 1.0.3 was released today.  Everyone should upgrade because there were some weird bugs in 1.0.2 due to a half-complete feature that was added in 1.0.2 and has since been removed.  Unfortunately, because I’ve had to bump up the release date, the big feature that was scheduled for 1.0.3, asynchronous queries, has been pushed to 1.0.4.  Sorry guys.  However, I’m working hard on the asynchronous stuff and I’ll get 1.0.4 out the door ASAP.

The only API change in this release is the addition of client side cursor timeouts.  For example, to create a cursor that will wait 2.5 seconds for queries to complete:

$cursor = $collection->find()->timeout(2500);

Time is specified in milliseconds.  If the query takes longer than the specified timeout, a MongoCursorTimeoutException will be thrown.  Timeouts do not affect MongoDB itself: your query will still be running on the server. The timeout is merely a client side convenience.

Also, array serialisation is significantly faster in this version (only “normal” array serialisation, not associative array serialisation).

Upcoming Talks

Want to learn more about MongoDB?  Here are the places I’ll be speaking in the next month or so:

If your event desperately needs a NoSQL talk, feel free to contact me at kristina at mongodb dot org.

(Woohoo! I’m going to Belgium! …Not that Long Island isn’t exciting, but…)

Replacing $ in the MongoDB PHP Driver

I’ve just added a feature to the Mongo PHP driver (and I plan to add it to the Perl driver soon) to use a character other than $ for special ops.

In Mongo, there are tons of interesting things you can do by using $-prefixed strings:

// run a server-side function as part of a query
db.collection.find({$where : function() { ... }});

// increment a field in an update
db.collection.update({_id : id}, {$inc : {counter : 1}});

// add more cowbell to an array
db.collection.update({_id : id}, {$push : {life : "cowbell"}});

There are a gazillion more, and we keep adding them. Anyway, it’s a bit of a pain in PHP (and Perl) because “$var” means “replace $var with the value of the variable $var, converted to a string.” As there is probably no $var variable, it’s null, and null converted to a string is “”. So, instead of the string “$var”, you get “”. Of course, you can prevent this by saying “\$var” or ‘$var’ (single quotes), but people have requested being able to use a different character, instead of $.

I considered choosing one for people, but then I decided it was better to let people choose their own. So, if you want to use the character with the ASCII value of 0, go nuts (anyone who maintains your code certainly will).

To choose an alternative, add a line to your php.ini file:

mongo.cmd = "+"

…replacing “+” with whatever symbol you’d like to use. If you do not have access to your php.ini file or you feel like being a jerk (I think doing this makes your code pretty impenetrable), you can change it anywhere in your code with ini_set:

// use : for an update (instead of $inc)
ini_set("mongo.cmd", ":");
$collection->update(array("_id" => $id), array(":inc" => array("counter" => 1)));

// use > (instead of the $cmd collection)
ini_set("mongo.cmd", ">");
$cmdCollection = $db->selectCollection(">cmd");
$cmdCollection->findOne(array("getlasterror" => 1));

// use x for db refs (this is a particularly bad idea, use a character that 
// won't occur as the first character of a key name)
ini_set("mongo.cmd", "x");
$collection->insert(array("ref" => array("xref" => $ns, "xid" => $id)));

My recommendation is to decide on an alternative character and stick with it.

Some technical details:

  • $ will still work. Even if you choose another character, you can still use $s, too.
  • Only the first character will be replaced by $. So, if you chose “~” as your substitute character, “~~~” would be sent to the database as “$~~”.
  • I made this work in all of the places I could think of where you’d need a $ (key names, database references, and collection names). If you think of any others, please let me know so I can add them.
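
The first-character rule can be sketched in plain C like this. To be clear, this is not the driver's code: normalize_key is a hypothetical helper that only illustrates the behavior described in the list above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the driver's code. Returns a malloc'd copy of
 * key in which a leading cmd_char has been swapped for '$'. Caller frees. */
static char *normalize_key(const char *key, char cmd_char) {
    size_t len;
    char *out;

    assert(key != NULL);
    len = strlen(key) + 1; /* include the trailing NUL */
    out = malloc(len);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, key, len);

    /* only the first character is considered; never overwrite the NUL */
    if (out[0] != '\0' && out[0] == cmd_char) {
        out[0] = '$';
    }
    return out;
}
```

With “~” as the substitute character, “~~~” comes back as “$~~”, and a key that already starts with $ (like “$inc”) comes back unchanged, which is why $ keeps working.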

PHP Extension Wiki

I started a wiki on this site (http://www.kchodorow.com/php) to write down all the stuff I learn about writing PHP extensions. If anyone else has experience with them, feel free to add or edit articles.

Some basics: a PHP extension is written in C. In fact, PHP itself is written in C, so there’s a lot of good source code to look at out there. There’s an excellent introduction to writing PHP extensions at Zend DevZone. However, it doesn’t go into a lot of the specifics, which is why I started the wiki. I had to figure out how to do a ton of stuff on my own, mostly by digging through the PHP source code and other extensions’ source code. No one should have to look through 500 undocumented C files to figure out how to create a PHP class in C. (However, if you like digging through source code, it’s all available to view on the web. Extensions are under pecl and PHP source is available under php-src.)

I feel like I have a pretty good handle on how to do almost anything with PHP in C, so if anyone has any questions or suggestions for an article, feel free to ask and I’ll try to write a page on it.

Upcoming pages I’m planning on writing:
– Throwing exceptions
– How to extend/implement other classes
– Using HashTable