A Neat C Preprocessor Trick

I’ve been looking at Clang and they define lexer tokens in a way that I thought was clever.

The challenge is: how do you keep a single list of language tokens but use them as both an enum and a list of strings?

Clang defines C token types in a file, TokenKinds.def, with all of the names of the different C language tokens (pretend C only has four tokens for now):

#ifndef TOK
#define TOK(X)
#endif

TOK(comment)
TOK(identifier)
TOK(string_literal)
TOK(char_constant)

#undef TOK

If you just #include this file, TOK is not defined yet, so the file defines TOK(X) to expand to nothing. Every TOK(...) line disappears and the whole thing is effectively an empty file.

However! When they want a declaration of all possible tokens that could be used, they make an enum out of this list like this:

enum TokenKind {
#define TOK(X) X,
#include "clang/Basic/TokenKinds.def"
    NUM_TOKENS
};

Because TOK is defined when TokenKinds.def is included, the preprocessor will spit out something like:

enum TokenKind {
    comment,
    identifier,
    string_literal,
    char_constant,
    NUM_TOKENS
};

This has the nice property that you can check whether a token kind is valid by making sure that it is less than NUM_TOKENS. But if we’re going to put the tokens into that enum, wouldn’t it be clearer just to put them there, instead of in a separate file? Maybe, but doing it this way gives them a nice way to get a string representation of the kinds, too. In another file, they do:

const char* const TokNames[] = {
#define TOK(X) #X,
#include "clang/Basic/TokenKinds.def"
    0
};

“#X” is the preprocessor’s stringification operator: it replaces X with the text of the macro argument and surrounds it in quotes, so that turns into:

const char* const TokNames[] = {
    "comment",
    "identifier",
    "string_literal",
    "char_constant",
    0
};
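As a small aside on the # operator, here is a minimal sketch of how it behaves on its own. The names STR, XSTR, ANSWER and the helper functions are made up for this illustration and are not from Clang. The point worth noticing is that # quotes the argument exactly as written, without expanding it first, which is why TOK(comment) becomes "comment".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper macros, not taken from Clang. */
#define STR(X) #X        /* quote the argument exactly as spelled */
#define XSTR(X) STR(X)   /* expand the argument first, then quote it */
#define ANSWER 42

/* A plain identifier is simply wrapped in quotes. */
const char *stringified_name(void) { return STR(string_literal); }

/* The argument of # is not macro-expanded, so this is the macro's name. */
const char *stringified_macro(void) { return STR(ANSWER); }

/* Going through one extra macro level expands ANSWER before quoting. */
const char *stringified_value(void) { return XSTR(ANSWER); }

/* Runtime self-check of the three cases above. */
void check_stringify(void)
{
    assert(strcmp(stringified_name(), "string_literal") == 0);
    assert(strcmp(stringified_macro(), "ANSWER") == 0);
    assert(strcmp(stringified_value(), "42") == 0);
}
```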

Now if they have a token, they can say TokNames[token.kind] to get the string name of that token. It lets them use the token types efficiently, print them out nicely for debugging, and not have to maintain multiple lists of tokens.
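To try the whole idea in a single file, here is a rough sketch. It is a variation, not what Clang does: instead of a separate TokenKinds.def that gets included twice, the list lives in a macro, TOKEN_LIST, that takes the per-token macro as a parameter. TOKEN_LIST, TOK_ENUM, TOK_NAME, token_name, token_kind_from_name and check_token_tables are all names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for TokenKinds.def (hypothetical): the single list of tokens,
   parameterized by the macro that should be applied to each entry. */
#define TOKEN_LIST(TOK) \
    TOK(comment)        \
    TOK(identifier)     \
    TOK(string_literal) \
    TOK(char_constant)

/* First expansion: each entry becomes an enumerator. */
#define TOK_ENUM(X) X,
enum TokenKind {
    TOKEN_LIST(TOK_ENUM)
    NUM_TOKENS
};
#undef TOK_ENUM

/* Second expansion: each entry becomes a string, in the same order. */
#define TOK_NAME(X) #X,
static const char *const TokNames[] = {
    TOKEN_LIST(TOK_NAME)
    0
};
#undef TOK_NAME

/* Look up a name, using NUM_TOKENS as the validity bound. */
const char *token_name(int kind)
{
    if (kind < 0 || kind >= NUM_TOKENS)
        return "<invalid>";
    return TokNames[kind];
}

/* Reverse lookup by linear search; returns NUM_TOKENS if nothing matches. */
enum TokenKind token_kind_from_name(const char *name)
{
    for (int i = 0; i < NUM_TOKENS; i++) {
        if (strcmp(TokNames[i], name) == 0)
            return (enum TokenKind)i;
    }
    return NUM_TOKENS;
}

/* Sanity check: one name per enumerator, plus the trailing 0. */
void check_token_tables(void)
{
    assert(sizeof TokNames / sizeof TokNames[0] == (size_t)NUM_TOKENS + 1);
    assert(TokNames[NUM_TOKENS] == 0);
}
```

With this in place, token_name(identifier) gives "identifier", and adding a fifth token means touching only TOKEN_LIST. The separate .def file that Clang uses has the advantage that the list can be included from many source files without living in a header macro.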

4 thoughts on “A Neat C Preprocessor Trick”

David Schneider says:

August 27, 2012 at 1:39 am

Using Boost’s preprocessor library you can have it even more flexible. No separate file needed: http://pastebin.com/TKG9W9TL
PS. example compiles with boost 1.44

1. Boostfag says:
  
  August 27, 2012 at 2:49 am
  
  Are you aware of ugly fuck you just linked ? Good luck boosting your shit.
  FYI:
  http://harmful.cat-v.org/software/c++/linus
  
  1. kristina1 says:
    
    August 27, 2012 at 6:54 am
    
    Dude, don’t be a dick.
    
2. kristina1 says:
  
  August 27, 2012 at 7:04 am
  
  Cool! I like learning different ways of implementing this kind of thing, it’s a neat mental exercise.
  