A Neat C Preprocessor Trick

I’ve been looking at Clang and they define lexer tokens in a way that I thought was clever.

The challenge is: how do you keep a single list of language tokens but use them as both an enum and a list of strings?

Clang defines C token types in a file, TokenKinds.def, with all of the names of the different C language tokens (pretend C only has four tokens for now):

#ifndef TOK
#define TOK(X)
#endif

/* ...one TOK(name) line per token kind... */

#undef TOK

If you just #include this file without defining TOK first, the fallback definition makes TOK(X) expand to “” (nothing), so the whole thing becomes an empty file.

However! When they want a declaration of all possible tokens that could be used, they make an enum from this list like this:

enum TokenKind {
#define TOK(X) X,
#include "clang/Basic/TokenKinds.def"
  NUM_TOKENS
};

Because TOK is defined when TokenKinds.def is included, the preprocessor will spit out something like:

enum TokenKind {
  /* ...one enumerator per TOK(name) entry... */
  NUM_TOKENS
};

This has the nice property that you can check whether a token kind is valid by making sure that it is less than NUM_TOKENS. But if we’re going to put the tokens into that enum, wouldn’t it be clearer just to put them there, instead of in a separate file? Maybe, but doing it this way gives them a nice way to get a string representation of the kinds, too. In another file, they do:

const char* const TokNames[] = {
#define TOK(X) #X,
#include "clang/Basic/TokenKinds.def"
};

“#X” is the preprocessor’s stringizing operator: it replaces X with the macro argument and surrounds it in quotes, so that turns into:

const char* const TokNames[] = {
  /* ..."name", for each TOK(name) entry... */
};

Now if they have a token, they can say TokNames[token.kind] to get the string name of that token. It lets them use the token types efficiently, print them out nicely for debugging, and not have to maintain multiple lists of tokens.
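
To see the whole trick in one place, here is a minimal single-file sketch. It is not Clang's actual code: the four token names are just illustrative, the TOKEN_LIST macro is a stand-in I made up so the example doesn't need a separate .def file, and token_name is a hypothetical helper. The idea is the same, though: one list, expanded twice with two different definitions of TOK.

```c
#include <assert.h>

/* Hypothetical stand-in for TokenKinds.def: four illustrative token names.
   A list macro is used instead of a separate file so the sketch is
   self-contained. */
#define TOKEN_LIST        \
    TOK(comment)          \
    TOK(identifier)       \
    TOK(numeric_constant) \
    TOK(string_literal)

/* First expansion: each TOK(X) becomes an enumerator named X. */
enum TokenKind {
#define TOK(X) X,
    TOKEN_LIST
#undef TOK
    NUM_TOKENS /* one past the last real token kind */
};

/* Second expansion: each TOK(X) becomes the string literal "X". */
static const char *const TokNames[] = {
#define TOK(X) #X,
    TOKEN_LIST
#undef TOK
};

/* Both tables come from the same list, so their sizes must agree (C11). */
static_assert(sizeof(TokNames) / sizeof(TokNames[0]) == NUM_TOKENS,
              "TokNames and TokenKind are out of sync");

/* Hypothetical helper: bounds-check a kind before indexing the name table. */
static const char *token_name(int kind) {
    if (kind < 0 || kind >= NUM_TOKENS) {
        return "<invalid>";
    }
    return TokNames[kind];
}
```

With this, token_name(identifier) gives back "identifier", and anything at or past NUM_TOKENS gets the "<invalid>" placeholder. Adding a fifth token means adding one TOK line to the list; the enum and the name table both pick it up.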
