FOSDEM: Some Pictures

This picture was taken by outerthought at FOSDEM. People look fairly interested 🙂 There was a guy on the other side of the room who was asleep the whole time, but he was old and I tried not to look at him. You can see I’m all super-professional in my XKCD “I’m not slacking off, my code’s compiling” teeshirt.

Given I’m in Belgium, I sneaked a few Magrittes into my slideshow:


They just seemed to call out “consistency” and “transactions” for me.

Andrew and I actually tried to visit the Magritte museum today, but we couldn’t find it. We walked all around, circled the block… I figure it must be in his old house and it was closed, because it was indistinguishable from every other townhouse on the street. Annoying.

FOSDEM

I gave a talk at FOSDEM (the Free and Open source Software Developers’ European Meeting) this morning: “Introduction to MongoDB”. It went pretty well, I think. Slides are up at scribd.com and it was recorded, so the video for it should be somewhere soon (I’ll update when I find out where).

The trip across the Atlantic was interesting. It was so bumpy that one of the stewardesses serving drinks fell over and the captain announced “All flight attendants, take your seats with your seat belts fastened!” Takeoff had been delayed to fix something and, in addition to the regular dropping a couple dozen feet in altitude, the engine kept making funny noises, and I was pretty sure we were going to go down. I considered putting my shoes on, but I decided that the last thing I wanted while floating on a raft in the North Atlantic was wet shoes. The plane pulled through, though, and eventually we got to Belgium. So, no excellent story (or fiery death) for me.

One of the cool things about being here was that I got to meet chx, a Drupal developer. He’s helping integrate MongoDB and Drupal 7. We have been wanting to send him some schwag but he’s on an extended visit to relatives who live on a hill that the postman can’t climb. However, I knew he was going to be here, so I carried a mug all the way to Belgium and got to give it to him. Mongo devs: neither snow nor sleet nor gloom of 3000 miles of ocean keep these swift couriers from delivering mugs. Woot.

My talk was at 10am (4am New York time… ugh). Andrew and I went to the conference’s cafeteria beforehand so I could get some coffee. It was… interesting. I have a theory on how Belgians make coffee: they brew a pot of coffee, and then let it sit on a burner until only a cup is left in the pot. Then they serve you that cup. Now, I am grateful, because I managed to drink it (with the help of a chocolate croissant) and it kept me upright for my talk, but I am glad I live in a country where people like their coffee watery.

Andrew and I are at what I think is the Belgian equivalent of a diner, where we’re having some coffee and beer. I feel like a total philistine, but I can’t actually tell the difference between Belgian beer and a decent American beer. Obviously more data points are necessary; I’ll be looking into it further tonight.

Giving talks is fun, but stressful. I feel like my whole body is relaxing now. I’m looking forward to sleeping at least 12 hours tonight.

Mongo Mailbag: Master/Slave Configuration

Trying something new: each week, I’ll take an interesting question from the MongoDB mailing list and answer it in more depth.  Some of the replies on the list are a bit short, given that the developers are trying to, you know, develop (as well as answer over a thousand questions a month).  So, I’m going to grab some interesting ones and flesh things out a bit more.

Hi all,

Assume I have a Mongo master and 2 mongo slaves.  Using PHP, how do I do it so that writes goes to the master while reads are spread across the slaves (+maybe the master)?

1) 1 connect to all 3 nodes in one go, PHP/Mongo handles all the rest
2) 1 connect to the master for writes. Another connection to connect to all slave nodes and read from them.

Thanks all and sorry for the noobiness!

-Mr. Google

Basics first: what is master/slave?

One database server (the “master”) is in charge and can do anything.  A bunch of other database servers keep copies of all the data that’s been written to the master and can optionally be queried (these are the “slaves”).  Slaves cannot be written to directly, they are just copies of the master database.  Setting up a master and slaves allows you to scale reads nicely because you can just keep adding slaves to increase your read capacity.  Slaves also make great backup machines. If your master explodes, you’ll have a copy of your data safe and sound on the slave.

A handy-dandy comparison chart between master database servers and slave database servers:

               Master                              Slave
# of servers   1                                   any number
permissions    read/write                          read
used for       queries, inserts, updates, removes  queries

So, how do you set up Mongo in a master/slave configuration?  Assuming you’ve downloaded MongoDB from mongodb.org, you can start a master and slave by cutting and pasting the following lines into your shell:

$ mkdir -p ~/dbs/master ~/dbs/slave
$ ./mongod --master --dbpath ~/dbs/master >> ~/dbs/master.log &
$ ./mongod --slave --port 27018 --dbpath ~/dbs/slave --source localhost:27017 >> ~/dbs/slave.log &

(I’m assuming you’re running *NIX.  The commands for Windows are similar, but I don’t want to encourage that sort of thing).

What are these lines doing?

  1. First, we’re making directories to keep the database in (~/dbs/master and ~/dbs/slave).
  2. Now we start the master, specifying that it should put its files in the ~/dbs/master directory and its log in the ~/dbs/master.log file.  So, now we have a master running on localhost:27017.
  3. Next, we start the slave. It needs to listen on a different port than the master since they’re on the same machine, so we’ll choose 27018. It will store its files in ~/dbs/slave and its logs in ~/dbs/slave.log.  The most important part is letting it know who’s boss: the --source localhost:27017 option lets it know that the master it should be reading from is at localhost:27017.

There are tons of possible master/slave configurations. Some examples:

  • You could have a dozen slave boxes where you want to distribute the reads evenly across them all.
  • You might have one wimpy little slave machine that you don’t want any reads to go to, you just use it for backup.
  • You might have the most powerful server in the world as your master machine and you want it to handle both reads and writes… unless you’re getting more than 1,000 requests per second, in which case you want some of them to spill over to your slaves.

In short, Mongo can’t automatically configure your application to take advantage of your master-slave setup. Sorry.  You’ll have to do this yourself. (Edit: the Python driver actually does handle case 1 for you, see Mike’s comment.)

However, it’s not too complicated, especially for what MG wants to do.  MG is using 3 servers: a master and two slaves, so we need three connections: one to the master and one to each slave.  Assuming he’s got the master at master.example.com and the slaves at slave1.example.com and slave2.example.com, he can create the connections with:

$master = new Mongo("master.example.com:27017");
$slave1 = new Mongo("slave1.example.com:27017");
$slave2 = new Mongo("slave2.example.com:27017");

This next bit is a little nasty and it would be cool if someone made a framework to do it (hint hint).  What we want to do is abstract the master-slave logic into a separate layer, so the application talks to the master slave logic which talks to the driver.  I’m lazy, though, so I’ll just extend the MongoCollection class and add some master-slave logic.  Then, if a person creates a MongoMSCollection from their $master connection, they can add their slaves and use the collection as though it were a normal MongoCollection.  Meanwhile, MongoMSCollection will evenly distribute reads amongst the slaves.

class MongoMSCollection extends MongoCollection {
    public $currentSlave = -1;
    public $slaves = array();
    public $numSlaves = 0;

    // call this once to initialize the slaves
    public function addSlaves($slaves) {
        // extract the namespace for this collection: db name and collection name
        $db = $this->db->__toString();
        $c = $this->getName();

        // create an array of MongoCollections from the slave connections
        $this->slaves = array();
        foreach ($slaves as $slave) {
            $this->slaves[] = $slave->$db->$c;
        }

        $this->numSlaves = count($this->slaves);
    }

    public function find($query = array(), $fields = array()) {
        // no slaves have been added yet: fall back to the master
        if ($this->numSlaves == 0) {
            return parent::find($query, $fields);
        }

        // get the next slave in the array
        $this->currentSlave = ($this->currentSlave+1) % $this->numSlaves;

        // use a slave connection to do the query, passing the arguments through
        return $this->slaves[$this->currentSlave]->find($query, $fields);
    }
}

To use this class, we instantiate it with the master database and then add an array of slaves to it:

$master = new Mongo("master.example.com:27017");
$slaves = array(new Mongo("slave1.example.com:27017"), new Mongo("slave2.example.com:27017"));

$c = new MongoMSCollection($master->foo, "bar");
$c->addSlaves($slaves);

Now we can use $c like a normal MongoCollection.  MongoMSCollection::find will alternate between the two slaves and all of the other operations (inserts, updates, and removes) will be done on the master.  If MG wants to have the master handle reads, too, he can just add it to the $slaves array (which might be better named the $reader array, now):

$slaves = array($master, new Mongo("slave1.example.com:27017"), new Mongo("slave2.example.com:27017"));

Alternatively, he could change the logic in the MongoMSCollection::find method.
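
For what it’s worth, here is a rough sketch of what different selection logic could look like, using the “spill over to the slaves when the master is busy” case from the list of configurations above. It is written in JavaScript rather than PHP just to keep it short and driver-free. Everything in it is hypothetical: makeReaderPicker, pickReader, and the limit parameter are made-up names, strings stand in for real connections, and measuring the master’s requests per second is left to the caller.

```javascript
// Hypothetical reader-selection policy: prefer the master until it is
// handling more than `limit` requests per second, then rotate over the slaves.
function makeReaderPicker(master, slaves, limit) {
    var next = -1;  // index of the slave used most recently

    return function pickReader(masterRequestsPerSecond) {
        // quiet master (or no slaves at all): the master handles the read
        if (masterRequestsPerSecond <= limit || slaves.length === 0) {
            return master;
        }
        // same round-robin step as MongoMSCollection::find
        next = (next + 1) % slaves.length;
        return slaves[next];
    };
}

// Strings stand in for real connections here.
var pick = makeReaderPicker("master", ["slave1", "slave2"], 1000);
var quiet = pick(200);   // under the limit
var busy1 = pick(1500);  // over the limit: first slave
var busy2 = pick(1500);  // over the limit: second slave
var busy3 = pick(1500);  // over the limit: wraps around to the first slave
```

The same idea would drop into the PHP class by choosing between parent::find and a slave’s find inside MongoMSCollection::find.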

Edit: as of version 1.4.0, slaveOkay is not necessary for reading from slaves. slaveOkay should be used if you are using replica sets, not --master and --slave. Thus, the next section doesn’t really apply anymore to normal master/slave.

The only tricky thing about Mongo’s implementation of master/slave is that, by default, a slave isn’t even readable, it’s just a way of doing backup for the master database.  If you actually want to read off of a slave, you have to set a flag on your query, called “slaveOkay”.  Instead of saying:

$cursor = $slave->foo->bar->find();

we have:

$cursor = $slave->foo->bar->find()->slaveOkay();

Or, because this is a pain in the ass to set for every query (and almost impossible to do for findOnes unless you know the internals) you can set a static variable on MongoCursor that will hold for all of your queries:

MongoCursor::$slaveOkay = true;

And now you will be allowed to query your slave normally, without calling slaveOkay() on each cursor.

Washington DC

I’m giving a talk at DC PHP in… oh… 40 minutes.  I started the day in New York City, and hopefully will make it back there tonight.  Long day.

This morning, when I got to Penn Station, I tried to use the ticket machine’s barcode scanner.  It usually scans my confirmation page and prints out the tickets.  Nothing.  I put down all my stuff and tried a bunch of variations, but the barcode scanner continued to ignore me.  Fine.  I took out my credit card and retrieved my tickets that way.  I picked up my coat and… it was all wet.  Ew.  I wiped my coat off on my jeans, figuring jeans would dry out faster.  As I was walking over to the trains, I had an unpleasant thought.  Maybe that wasn’t water… I sniffed my coat.  Ugh.  Nope, not water.  Now, not only does my coat smell like hobo piss, I brilliantly wiped it on my jeans so now I smell like hobo piss.  Hopefully I’m talking in a big, well-ventilated room.

So far, Washington gets an A+.  The train station was great (although possibly Penn Station is such a cesspool that my expectations have been lowered, but it was really nice) and the information booth gave me good directions to the Metro (which I promptly ignored, but that was hardly their fault).  The Metro was really nice (padded seats!), fast, well-marked, and made sense.

I was walking along K Street and saw the Washington Monument, which I took a picture of like a total tourist.  Then I realized that there was a white house under it, possibly the white house?  I was actually kind of excited to see it in real life, so I moved in to get a better picture.  Once I was closer, I realized it was just a white house, not the White House.  I took a picture of it anyway.  It was probably something.

Edit: my talk went great and the people were super nice.  Here are the pictures I took of DC.

MongoDB PHP Driver 1.0.3 Release

Version 1.0.3 was released today.  Everyone should upgrade because there were some weird bugs in 1.0.2, caused by a half-complete feature that was added in that release and has since been removed.  Unfortunately, because I’ve had to bump up the release date, the big feature that was scheduled for 1.0.3, asynchronous queries, has been pushed to 1.0.4.  Sorry guys.  However, I’m working hard on the asynchronous stuff and I’ll get 1.0.4 out the door ASAP.

The only API change in this release is the addition of client side cursor timeouts.  For example, to create a cursor that will wait 2.5 seconds for queries to complete:

$cursor = $collection->find()->timeout(2500);

Time is specified in milliseconds.  If the query takes longer than the specified timeout, a MongoCursorTimeoutException will be thrown.  Timeouts do not affect MongoDB itself: your query will still be running on the server.  The timeout is merely a client-side convenience.

Also, array serialisation is significantly faster in this version (only “normal” array serialisation, not associative array serialisation).

Upcoming Talks

Want to learn more about MongoDB?  Here are the places I’ll be speaking in the next month or so:

If your event desperately needs a NoSQL talk, feel free to contact me at kristina at mongodb dot org.

(Woohoo! I’m going to Belgium! …Not that Long Island isn’t exciting, but…)

Mongo Just Pawn in Game of Life

This is in response to this nifty blog post on storing a chess board in MySQL and this snarky Tweet about NoSQL DBs (because I’m never snarky).

On the one hand, I can’t believe I’m doing this. What database can’t store a chessboard? On the other hand, it’s fun, and once I thought of the title, I really had to write the post. Let the pointless data storage begin!

Okay, so first we need a representation for a chess piece. I’m tempted to just use the UTF-8 symbol and position, but it would be nice to have a human-readable string to query on. So, we’ll use something like:

{
    "name" : "black king",
    "symbol" : "♚",
    "pos" : [8, "e"] 
}

Ha! Can your relational database query for a subfield of a subfield of type integer or string? (Actually, I have no idea, maybe it can.) Anyway, moving right along…

So, MongoDB can just run JavaScript, so I’ll write a JavaScript file that does everything we need. Here’s the code to create the basic chess board. “db” is a global variable that is the database you’re connected to. It defaults to “test”, so we’ll start by switching it to the “chess” database. If it doesn’t exist yet, it’ll be created when we put something in it. Then we’ll actually populate it:

// use the "chess" database, creates it if it doesn't exist
db = db.getSisterDB("chess");
// make sure the db is empty (in case we run this multiple times)
db.dropDatabase();

// map indexes to chess board locations
column_map = {0 : "a", 1 : "b", 2 : "c", 3 : "d", 4 : "e", 5 : "f", 6 : "g", 7 : "h"};
 
// starting at 1a
color_char = {"black" : "█", "white" : " "};
color = "black";
for (i=1; i<=8; i++) {
    for(j=0; j<8; j++) {
        db.board.insert({x : i, y : column_map[j], color : color_char[color]})
 
        /* 
         * switch the color of the square... it's always the opposite
         * of the previous one, unless we're at the end of a row
         */
        if (j != 7) {
            color = color == "white" ? "black" : "white";
        }
    }
}
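
As a sanity check on the toggling trick, here is a small sketch that computes each square’s color straight from its coordinates and compares the result with what the loop above produces. It runs without a database. The square_color function and the toggled and computed arrays are hypothetical names, not part of chess.js.

```javascript
// Hypothetical alternative to the toggling loop: compute a square's color
// directly from its coordinates (row 1-8, column index 0-7).
function square_color(row, column_index) {
    return (row + column_index) % 2 == 1 ? "black" : "white";
}

// collect the colors the same way the loop above does, starting at 1a
var toggled = [];
var color = "black";
for (var i = 1; i <= 8; i++) {
    for (var j = 0; j < 8; j++) {
        toggled.push(color);
        if (j != 7) {
            color = color == "white" ? "black" : "white";
        }
    }
}

// collect the colors using the coordinate rule instead
var computed = [];
for (var i = 1; i <= 8; i++) {
    for (var j = 0; j < 8; j++) {
        computed.push(square_color(i, j));
    }
}
```

Both arrays should come out identical, since not switching at the end of a row is exactly what makes each row start on the opposite color from the one below it.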

Okay, now let’s iterate through the pieces, create their objects, and add them to the board:

// create unique ids from symbols
function get_name(symbol, column) {
    switch (symbol) {
    case '♖':
    case '♜':
        return " rook " + (column < 4 ? "left" : "right");
    case '♘':
    case '♞':
        return " knight " + (column < 4 ? "left" : "right");
    case '♗':
    case '♝':
        return " bishop " + (column < 4 ? "left" : "right");
    case '♕':
    case '♛':
        return " queen";
    case '♔':
    case '♚':
        return " king";
    case '♙':
    case '♟':
        return " pawn " + column;
    }
}


// go through the 2D array of pieces, create the objs, and insert them
function add_pieces(color, color_str) {
    for (row=0; row<color.length; row++) {
        chess_row = row + (color_str == "white" ? 1 : 7);

        for (column=0; column < color[row].length; column++) {
            chess_column = column_map[column];

            db.board.update({x : chess_row, y : chess_column}, {$set : {piece : {
                name : color_str + get_name(color[row][column], column),
                symbol : color[row][column],
                pos : [chess_row, chess_column]
            }}});
        }
    }
}

add_pieces([['♖','♘','♗','♕','♔','♗','♘','♖'], ['♙','♙','♙','♙','♙','♙','♙','♙']], "white");
add_pieces([['♟','♟','♟','♟','♟','♟','♟','♟'], ['♜','♞','♝','♛','♚','♝','♞','♜']], "black");

Phew! The hard part is done. Let’s print out this sucker!

// sort by x from 8-1 and y from a-h
cursor = db.board.find().sort({x:-1, y:1});


count = 0;
board = "";
while(cursor.hasNext()) {
    square = cursor.next();
    if (square.piece) {
        board += square.piece.symbol;
    }
    else {
        board += square.color;
    }

    count++;
    if (count % 8 == 0) {
        board += "\n";
    }
}
print(board);

And we get:

♜♞♝♛♚♝♞♜
♟♟♟♟♟♟♟♟
 █ █ █ █
█ █ █ █ 
 █ █ █ █
█ █ █ █ 
♙♙♙♙♙♙♙♙
♖♘♗♕♔♗♘♖

Very snazzy. Now we can query by symbol, human readable name, or board position. Also, it’ll only take two updates to move a piece. (I attached chess.js, if you don’t want to copy/paste it yourself.)
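
Here is a rough sketch of what those queries and the two-update move might look like. The query documents are plain objects that could be handed to db.board.find in the shell. The move_ops helper is hypothetical (it is not in chess.js) and only builds the two updates, so the snippet runs without a database. It assumes the destination square is empty or holds a piece being captured.

```javascript
// Plain query documents; in the mongo shell each could be passed to
// db.board.find(...) or db.board.findOne(...).
var by_symbol = {"piece.symbol" : "♚"};
var by_name = {"piece.name" : "black king"};
var by_square = {x : 8, y : "e"};

// Hypothetical helper: builds the two updates needed to move the piece
// described by `piece` to row `to_x`, column `to_y`.
function move_ops(piece, to_x, to_y) {
    var moved = {name : piece.name, symbol : piece.symbol, pos : [to_x, to_y]};
    return [
        // update 1: empty the square the piece is leaving
        {query : {x : piece.pos[0], y : piece.pos[1]}, update : {$unset : {piece : 1}}},
        // update 2: put the piece, with its new position, on the destination
        {query : {x : to_x, y : to_y}, update : {$set : {piece : moved}}}
    ];
}

// the white king's pawn, in the form add_pieces stores it
var pawn = {name : "white pawn 4", symbol : "♙", pos : [2, "e"]};
var ops = move_ops(pawn, 4, "e");

// In the shell, where db is defined, the move could then be applied with:
// ops.forEach(function(op) { db.board.update(op.query, op.update); });
```

Because the second update overwrites whatever piece document is on the destination square, a capture needs no extra step in this sketch.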

Public Speaking: The Prequel

Me, presenting at SF JUG

There’s a video that everyone seems to have seen of me (seriously, when I went to Brazil everyone mentioned it) presenting MongoDB to the San Francisco Java User Group.  Unfortunately, I think it’s the worst presentation I’ve ever given, partly because of the lead-up and partly because of inexperience.

I looked up directions and gave myself an extra 15 minutes to get to the talk.  It looked like I had to take a bus so I walked down to the bus stop, but all the buses that went by had a different naming scheme than what I had to take.  I asked the driver on the next bus that went by and he pointed downwards and drove off, which I realised a second later probably meant that I wanted the subway.  Whoops.

I went downstairs and saw ticket machines, so I went over and bought tickets.  Then I went over to the gate, which didn’t seem to take tickets.

“Excuse me,” I said to the guard, “how do I get in?”

“Oh, that’s a BART ticket, this is the Muni system, it takes quarters.”

So I had to wait in line at the ticket/change machines again, because I didn’t have eight quarters on me.

I finally made it to a platform and the stupid Muni came.  I had to go two stops.  At the second stop I got out, went upstairs, exited the station and… had no idea where I was.  I had somehow gotten off at the wrong station.  I started to freak out.  However, San Francisco is a city, and as a city, it has cabs.  I gave up on public transportation and hailed a cab.  The driver drove me all of five blocks to the building where the Java User Group was meeting.  I handed him the rest of the money in my wallet and ran in.

By the time I got there, I was at least five minutes late.  However, there were thirty people waiting to sign in, so I relaxed a bit as I waited in line.

The organiser looked relieved when he saw me and pulled me and the other presenter aside.  We were supposed to each talk for an hour and hold all questions until the end.  I was up first and started in. With about two slides to go, I casually checked my cellphone to see if I was on track with the time.  I wasn’t.

I had been talking for 25 minutes.

I must have looked like I had suddenly been hit by a bus.  In the video, you can see me suddenly run my hand through my hair about 16 times (I didn’t even realise I did that when I was nervous).  Then I made those two slides last as long as I damn well could, which was about 10 minutes.

When people were done asking questions, I skulked to the back of the room where I found a seat on the edge.  After a few minutes, I couldn’t take it anymore, so I went to the bathroom where I could freak out in peace.  When I got back, I sat down and tried not to think about my talk.  And, after about ten minutes, the other speaker wrapped up his talk.  It seemed like he had just started, but I assumed trauma had made time go all squiggly.  I checked my cellphone, and indeed, he had only talked for a half-hour, too!  I had died a thousand deaths for nothing.

Afterwards, I talked to a bunch of cool people who were working on interesting projects, which was definitely the highlight.  Once all the people who wanted to talk had gone, I packed up and left.  There was a bus that could take me right to my hotel that I was right on time to catch.  I walked towards the stop and, as I approached, I heard the bus turn the corner behind me.  I was ahead of it, but it gained speed on the straightaway.  It was a long block, and it got a couple-hundred yard lead before it stopped to pick up passengers at the end of the block.  I went into full speed, sprinting down the street with my damn brick of a laptop.  The bus was stuck at the stop, as the traffic light ahead of it was red.  Hurrah!  I covered the last fifty yards, touched the back of the bus, and… the light turned green and the bus drove off.

I.  Was.  So.  Pissed.

I decided that I’d walk back to the hotel, since I now vaguely knew where I was.  It was a half-hour walk, but I was done with San Francisco public transportation.

When I got home, I realised it was just as well that the bus had driven off, because I didn’t even have a dollar left in my wallet after paying for the BART ticket, Muni quarters, and taxi ride.

NoSQL Trolls

I have a Twitter feed for the term “nosql” and every day I get tweets like:

“What moron came up with #nosql?  you’re all fired!”

“nosql is making all the same mistakes people made 40 years ago… relational dbs won!”

“yeah, use nosql… if you don’t mind losing all your data”

(these are based on real tweets, but aren’t actually verbatim.  They’re all pretty much the same.)  I hope I meet someone who says this to me someday, though, so I can say: “Boy, what a good point!  If only Google and Yahoo and LinkedIn and Twitter and the thousands of other high-traffic websites had listened to you.  Obviously you know what’s going on better than they do, this NoSQL thing is just a bunch of idiots spinning their wheels.”

Then, as they reeled, rendered helpless by my cunning sarcasm, I’d continue in a slightly different vein: “You freakin’ moron!  Relational databases failed miserably for huge websites, so alternative databases popped up to fill that need.  And, so long as we’re making a new database, we figure computer languages and database administration have changed a bit since 1974, so we might as well make dbs easier to use.  You’re welcome!”

Then I’d punch them in the gibblies until they saw my point.

I might need a vacation.

Public Speaking

I talked at LILUG (Long Island Linux User Group) about MongoDB on Tuesday, which was really fun.

Not that it started out all that well. One of LILUG’s officers picked me up at the station. We had… an imperfect fusion of souls. He told me, on the way to the pub, that he was sick of GNU, sick of Linux, and sick of being an officer in LILUG. When we got to the pub, he made a joke about Chinese people having slanty eyes. Ugh.

Anyway, we got there, and met a couple guys standing at the bar. The older one (~60) asked if he could buy me a drink, to which I awkwardly acquiesced. I’m never sure what the protocol is, if there’s anything implied by accepting a drink. The other guys wandered off towards a table, but the old guy made no move.

“Um, are you… um… a member of LILUG?” I asked.

“No,” he said, staring at me. Whoops.

“Um, I have to go,” I said, and went over to the table where the LILUG guys had gone.

“Yeah, he’s just some random guy who was at the bar,” one of the members said, laughing. Har de har.

I ordered nachos, which turned out to be a freakin mountain that could have served six people. They were seriously piled almost a foot high on the plate. They were delicious, and a bunch of really nice LILUG members showed up.

Once we were done eating, we went over to Stony Brook, which is where I was actually giving my talk. I’ve had a cold for the last week and I was a bit nervous about my voice giving out, but it held up and people really seemed to enjoy it.

I like it when people ask lots of questions and participate, and I had a brainwave before I left the office on how to encourage it. When I started my presentation, I told people to feel free to ask questions. “And the first person who asks a question,” I said, rooting around in my bag, “gets this fabulous Mongo mug!” I told them, unearthing it and holding it up. A college student’s hand shot up. “What was Mongo named after?” he asked. And we were off to the races!

Afterwards, everyone went out for one more drink. “To the downfall of SQL!” someone called, and everyone cheered and toasted to it.

P.S. Just to be clear, I didn’t actually advocate the downfall of SQL, in fact, I specifically mentioned relational databases are needed in some cases. It was cute, though 🙂

P.P.S. Slides are on slideshare.