I’m going to Brazil!

I have been invited to speak at Latinoware 2009!  It’s October 22-24th and I’m giving a talk on non-relational databases in general, MongoDB in particular.  I’ll be giving it in English, but it’s going to be simulcast in Spanish and Portuguese.

I’ve never been to South America, and I’m so excited.  The conference is right near these famous waterfalls that look freakin gorgeous.  It looks like we might be able to check out the rainforest a bit, too (it’s right nearby).  So.  Cool.

I made the travel reservations today and it’s going to take me 15 hours to get there, 19 hours to get back.  The way back has a seven-hour stopover, followed by a 10-hour flight.  Ugh… not looking forward to that part of it.  At least it’s at the end of the trip so we can get home and collapse.

Anyway, I am really looking forward to seeing Brazil.  I’m going to try to learn some Portuguese before I go, but I dunno… I hope people speak some English.  I’ve never been very good at picking up (non-programming) languages.

OSCON

I’m at OSCON, which is really fun (and exhausting).

Last night, when my company’s venture capitalist found out I had MongoDB stickers, he asked me to put them on the tables scattered around the conference area, where everyone sat to talk and surf the web.

For some reason, doing so made me feel like a total tool, but this morning, I sucked it up and put a little pile on each table.  As I was walking away from one table, a guy sitting there loudly said, “I bet she doesn’t even know what ‘Mongo’ is from.”  Jerk.  What geek isn’t a Mel Brooks fan?  Even if I hadn’t seen Blazing Saddles, I work on the freakin project.

When I put the last pile down on the table at which I was eating breakfast, the guy sitting next to me immediately picked one up and stuck it on his laptop.

“Do you use MongoDB?” I asked, surprised at the chances, but gratified.

“No, I’ve never heard of it, but it’s a lot nicer than the HP logo.”

So, good job Jason!  (The graphic designer for MongoDB.)  I mean, not only with the spiffy business cards and stickers, but he actually made our Confluence wiki pretty, which is no mean feat.

Soccer: No Slimy GirlS Allowed!

Now, I love playing soccer, which I haven’t gotten to do much since high school. Recently, someone created a pickup soccer game meetup for IT professionals (http://www.meetup.com/soccer-nyc/).  What with it being geeks+soccer, this sounded awesome, so I requested to join.  Denied!

Your request to join NYC Turquoise IT – Soccer For IT Professionals has been declined

The person who declined your request, Xxxxxxxxxx, said:
—————————————————————-
sorry not for girls
—————————————————————-

and he added the “Male only” bullet point to the membership requirements.

WTF? It’s a freakin pickup game! Are they afraid women will get hurt? Sit around buffing our nails? What?

Edit: I just told my coworker Mike about it and he linked me to http://groups.google.com/group/nyc-pickup-soccer, which is a group he started and thus, he assures me, is coed.  Everyone (interested in soccer) should join this group!

How do you make the web reliable?

Andrew and I were talking this morning about how we can count on the New York Times to be accurate, but an article linked to by Reddit is often horribly biased.  We began thinking about how to hold the web accountable, and we came up with a nifty idea.  Make it a wiki/competition/social network.  Here’s our plan:

You download a Firefox extension that puts a little “+/- comments” in the corner of your screen.  If you’re on a page you like, you click the “+”.  If you don’t like the page, you click the “-”.  If you have more to say, click “comments” and type in a comment about the page, and anyone visiting the site after you will be able to see it.  Clicking “comments” will show you all the comments other people have made about this page, too.

Let’s take an example.  Recently, Microsoft put up a page comparing IE8 to other browsers.  It was… a bit biased.  But there’s no comments section!  So, using my handy-dandy extension, I click on “-” because I think it’s a dumb page.  Then I click on “comments” and type “biased much?” (clever, eh?).  Then I see that someone else made a truly intelligent comment and linked to an impartial comparison.  I click on the “+” next to their name, upvoting their comment.  Oh, also, I see that the page has been rated -923 by users overall.
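Here's a rough sketch of the bookkeeping behind that example, in Python. Nothing has been built yet, so every name here (`Page`, `vote`, `add_comment`, `get_comments`, the example URL) is made up for illustration, and a real version would live on a server rather than in a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Rating and comments for one URL (hypothetical structure)."""
    score: int = 0
    comments: list = field(default_factory=list)  # (user, text) pairs


pages = {}  # maps URL -> Page


def vote(url, delta):
    """Apply a "+" (delta=1) or "-" (delta=-1) click; return the new rating."""
    if delta not in (1, -1):
        raise ValueError("delta must be 1 or -1")
    page = pages.setdefault(url, Page())
    page.score += delta
    return page.score


def add_comment(url, user, text):
    """Attach a comment to a page, even if the site has no comments section."""
    pages.setdefault(url, Page()).comments.append((user, text))


def get_comments(url):
    """Return a copy of every comment left on this URL so far."""
    page = pages.get(url)
    return list(page.comments) if page else []


# The walkthrough above: downvote the page, then leave a comment.
url = "http://example.com/browser-comparison"  # placeholder URL
rating = vote(url, -1)
add_comment(url, "me", "biased much?")
```

Keying everything by URL is what lets the extension work on pages that have no comments section of their own.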

Now, I’m totally riffing off of Stack Overflow here, which had the brilliant idea of attaching karma to users. When I upvote a user’s comment, they get 10 karma points.  When I downvote, they lose a karma point, and so do I.  More karma gives you more privileges.  
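The karma rules above are simple enough to sketch in a few lines of Python. This is only an illustration of the rules as described, not a real implementation; the names `karma`, `upvote`, and `downvote` are invented, and the privileges part is left out because it hasn't been specified.

```python
from collections import defaultdict

UPVOTE_REWARD = 10   # points the comment's author gains on an upvote
DOWNVOTE_COST = 1    # points both the author and the voter lose on a downvote

karma = defaultdict(int)  # maps user name -> karma points


def upvote(voter, author):
    """An upvote gives the comment's author 10 karma; the voter is unaffected."""
    karma[author] += UPVOTE_REWARD


def downvote(voter, author):
    """A downvote costs the author one point, and costs the voter one too."""
    karma[author] -= DOWNVOTE_COST
    karma[voter] -= DOWNVOTE_COST


upvote("me", "alice")    # alice made the intelligent comment
downvote("me", "bob")    # bob did not
```

Charging the voter for a downvote is the part that keeps people from downvoting everything in sight.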

I’m really psyched about this.  It seems like it would be easy to develop the basic idea (basically a user system and a blog system) and there are a zillion features we can add later. Benefits for users:

  • Ability to comment on web sites with no comments section (cool ones I’ve thought of are Twitter pages, ftp:// (downloads), and pay sites’ logins (yes, I’m a jerk))
  • The satisfaction of gaining karma to gain more power
  • Access to other visitors to the site, whom you’d never normally be able to interact with

So, I’m going to try to implement it.

CouchDB vs. MongoDB Benchmark

Edit (9/1/10): this benchmark is old, silly, and should probably be ignored in favor of more recent and representative ones. I don’t want to take it down for historical purposes, but seriously people, it was never a good benchmark, it’s over a year old at this point, and both databases have changed a lot.

Edit (12/6/09): this is the #1 Google result for “mongodb benchmark”, so I figure I’ll do some community service: if you’re interested in benchmarks, you might want to look at the 3rd party ones listed on the mongodb.org website.


Felix Geisendörfer did a benchmark in PHP that was super-easy for me to port into MongoDB. You can see his post on his blog.

And now… comparing his results for CouchDB with mine for MongoDB (I did the graph in Open Office, which is why the quality sucks):

As you can see, MongoDB does, uh, slightly better.  Here are the numbers:

# of Inserts | Couch Total Time (sec) | Couch Time/Doc (ms) | Mongo Total Time (sec) | Mongo Time/Doc (ms)
1            | .0015                  | 1.46                | .0005                  | .5
2            | .0015                  | .75                 | .0004                  | .2096
3            | .0017                  | .56                 | .0005                  | .1604
4            | .0017                  | .44                 | .0005                  | .1190
5            | .0018                  | .36                 | .0005                  | .1060
6            | .0019                  | .32                 | .0006                  | .0931
7            | .0021                  | .3                  | .0006                  | .0847
8            | .0022                  | .27                 | .0007                  | .0789
9            | .0023                  | .25                 | .0007                  | .0734
10           | .0025                  | .25                 | .0007                  | .0721
50           | .0072                  | .14                 | .0024                  | .0476
100          | .0136                  | .14                 | .0044                  | .0442
500          | .0687                  | .14                 | .0253                  | .0505
1000         | .1361                  | .14                 | .0372                  | .0372
2500         | .4686                  | .19                 | .0278                  | .0111
5000         | .9165                  | .18                 | .0488                  | .0098
7500         | 1.5116                 | .2                  | .0835                  | .0111
10000        | 2.3111                 | .23                 | .1065                  | .0107
25000        | 6.8684                 | .27                 | .2711                  | .0108
50000        | 15.8227                | .32                 | .5430                  | .0109
100000       | 35.3071                | .35                 | 1.7697                 | .0177
250000       | 104.0009               | .42                 | 6.4533                 | .0258
500000       | 230.6021               | .46                 | 11.7684                | .0235
750000       | 352.7959               | .47                 | 17.0473                | .0227
1000000      | 487.3284               | .49                 | 18.4376                | .0184

Please let me know if I made any mistakes; all the values were hand-copied.
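Since the numbers were copied by hand, one quick sanity check is to recompute the per-document time from the total time and see whether it matches the table. Here's a small Python sketch that does that for two rows and also works out how many times longer CouchDB took; the function name `ms_per_doc` is just something made up for this example.

```python
def ms_per_doc(total_seconds, inserts):
    """Convert a total run time in seconds into milliseconds per document."""
    return total_seconds / inserts * 1000


# (inserts, Couch total sec, Mongo total sec), taken from the table above
rows = [
    (10000, 2.3111, 0.1065),
    (1000000, 487.3284, 18.4376),
]

for inserts, couch_s, mongo_s in rows:
    couch_ms = ms_per_doc(couch_s, inserts)
    mongo_ms = ms_per_doc(mongo_s, inserts)
    ratio = couch_s / mongo_s  # how many times longer CouchDB took
    print(f"{inserts}: couch {couch_ms:.2f} ms/doc, "
          f"mongo {mongo_ms:.4f} ms/doc, ratio {ratio:.1f}x")
```

If a recomputed value disagrees with the table by more than rounding, that row is probably a copying mistake.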

I ran these tests using the PHP driver on Ubuntu 9.04 on my MacBook Pro.  You can see the test script I forked on GitHub.

A little analysis: Both DBs start with some overhead, but by 1000 inserts CouchDB seems to be chugging along nicely.  MongoDB takes slightly longer to hit its groove, hitting its peak around 10000.  They both slow a little near the end, as MongoDB starts spending most of its time allocating files and, although I know almost nothing about CouchDB’s structure, I’d guess it’s doing something similar.

Because we are the Mutha Flippin Win

I’ve been keeping records of funny and nice things people have said about MongoDB. A lot of them can be found at http://www.mongodb.org/display/DOCS/User+Feedback, but there are some good ones that weren’t quite… “right” for the official page. So, slightly off-color, biased, or weird quotes ahoy:

Mailing list:

“Guys at Redmond should get a long course from you about what is the software development and support :-)”
-kunthar

Twitter:

@mully It was a well orchestrated effort for @mongodb world domination. I was the shock, @jnunemaker was the awe.
-pengwynn

the @mongodb guys are great! They figure out segfaults, fix bugs and push new builds in practically real time!
-brondsem

Mongo looks like the Mutha Flippin Win. I need to carve some time out to play. http://www.mongodb.org/display/DOCS/Home
-thejohnny

IRC:

“limit=-10 is my hero”
-claymation

Blogs:

“It actually has a kind of quirky, yet lovable syntax for defining criteria.”
-http://railstips.org/2009/6/3/what-if-a-key-value-store-mated-with-a-relational-database-system

Revisionist history

I’ve switched out the first ~10 cartoons, and replaced them with one-shot gags. So there are a bunch of new comics to read, starting with the first one.

I’ve always preferred doing The Far Side-/New Yorker-style cartoons, but when I started this site, I thought I should do an ongoing story with characters and plot and all. It turns out that I like killing my characters too much for that to really work out. I hope you enjoy their brief existences.

PHP Extension Wiki

I started a wiki on this site (http://www.kchodorow.com/php) to write down all the stuff I learn about writing PHP extensions. If anyone else has experience with them, feel free to add or edit articles.

Some basics: a PHP extension is written in C. In fact, PHP itself is written in C, so there’s a lot of good source code to look at out there. There’s an excellent introduction to writing PHP extensions at Zend DevZone. However, it doesn’t go into a lot of the specifics, which is why I started the wiki. I had to figure out how to do a ton of stuff on my own, mostly by digging through the PHP source code and other extensions’ source code. No one should have to look through 500 undocumented C files to figure out how to create a PHP class in C. (However, if you like digging through source code, it’s all available to view on the web. Extensions are under pecl and PHP source is available under php-src.)

I feel like I have a pretty good handle on how to do almost anything with PHP in C, so if anyone has any questions or suggestions for an article, feel free to ask and I’ll try to write a page on it.

Upcoming pages I’m planning on writing:
– Throwing exceptions
– How to extend/implement other classes
– Using HashTable

From Russia with Bugs

My Windows partition got a virus. Ugh. Somehow I’d managed to avoid this until now: I’ve never gotten a virus on any of my computers before. Anyway, being the geek I am, I decided to get it off my computer without any of this fancy-shmancy (expensive) anti-virus software. My journey:

1. I tried to edit my registry. I got the message: “Your administrator has turned off this function.” Basically, the virus, which had obviously installed itself with whatever permissions my user has, had made it impossible for me to open the registry editor. I’m cool with the registry only being editable by an admin, but then Windows should freakin prompt you for the admin password. And if the virus installed itself when I was logged in, it shouldn’t be able to do admin things!

2. I tried to delete all the virus’s files using Windows Explorer. The virus had deleted “Folder Options” from the menu, so I couldn’t show (and then delete) hidden files and folders. Why the fsck would Windows let any program remove functionality willy nilly from Explorer?!

3. They could remove the “Show hidden files and folders” option from the GUI, but I’m a Linux programmer. I can just use the command line. Well, the Windows command line is awful. Or hard to use. Or both. I couldn’t make any headway with it. I couldn’t even make it do the equivalent of “ls -a”.

4. The virus put executables in system32, c:, and other folders that should be only writable by the administrator.

Anyway, that’s why I haven’t put up any new cartoons this week. I’m not sure what to do now. A combination of the virus and Windows itself has totally flummoxed me. I’d like to just get rid of Windows, but I need it for Photoshop.

Get on the bus, Gus

I booted into OS X today and tried running the unit tests for my PHP driver. It chugged away for a while, then gave me “bus error”. Great. Gotta love C error messages. I narrowed it down to when I insert 253 or more elements into MongoDB. Now, that number was suspiciously close to 256, but I couldn’t think why that would be. I tracked it to the encoding code, then to the _id class code, then to the _id generation code. Surprisingly, it was doing its “bus error” thing before any memory access; it was just reading a file. I suddenly realized that I wasn’t closing the file after reading it, fixed it, and it ran perfectly.
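The driver itself is C and the actual code isn't shown here, so this is only a Python sketch of the same pattern: read a few bytes from a file every time an id is generated, and close the file every time. A plausible explanation for the magic number (an assumption, not something confirmed above) is a per-process limit of 256 open files, with a few already taken by stdin, stdout, and stderr. The names `read_seed` and `make_ids` and the temp file are hypothetical stand-ins.

```python
import os
import tempfile


def read_seed(path, nbytes=4):
    """Read a few bytes from path, closing the file before returning."""
    # The with-block closes the file even if read() raises, so repeated
    # calls never pile up open file descriptors.
    with open(path, "rb") as f:
        return f.read(nbytes)


def make_ids(path, count):
    """Generate `count` seeds, opening and closing the file once per id."""
    return [read_seed(path) for _ in range(count)]


# Stand-in for whatever file the id generator reads (hypothetical).
fd, seed_path = tempfile.mkstemp()
os.write(fd, b"seed-bytes")
os.close(fd)

ids = make_ids(seed_path, 1000)  # well past the 253 mark
os.remove(seed_path)
```

The leaky version is the same code with a bare `open()` and no `close()`, which works fine right up until the process runs out of file descriptors.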