I’ve been looking at Clang and they define lexer tokens in a way that I thought was clever.
The challenge is: how do you keep a single list of language tokens but use them as both an enum and a list of strings?
Clang defines C token types in a file, TokenKinds.def, with all of the names of the different C language tokens (pretend C only has four tokens for now):
#ifndef TOK
#define TOK(X)
#endif

TOK(comment)
TOK(identifier)
TOK(string_literal)
TOK(char_constant)

#undef TOK
If you just #include this file without defining TOK first, the #ifndef block defines TOK(X) as “” (nothing), so every entry expands to nothing and the whole thing becomes an empty file.
However! When they want a declaration of all possible tokens that could be used, they make an enum of this list like this:
enum TokenKind {
#define TOK(X) X,
#include "clang/Basic/TokenKinds.def"
  NUM_TOKENS
};
Because TOK is defined when TokenKinds.def is included, the preprocessor will spit out something like:
enum TokenKind { comment, identifier, string_literal, char_constant, NUM_TOKENS };
This has the nice property that you can check if a token kind is valid by making sure that it is less than NUM_TOKENS, since NUM_TOKENS always comes last in the enum.
But if we’re going to put the tokens into that enum, wouldn’t it be clearer just to put them there, instead of in a separate file? Maybe, but doing it this way gives them a nice way to get a string representation of the kinds, too. In another file, they do:
const char* const TokNames[] = {
#define TOK(X) #X,
#include "clang/Basic/TokenKinds.def"
  0
};
“#X” is the preprocessor’s stringizing operator: it replaces X with the argument and surrounds it in quotes, so that turns into:
const char* const TokNames[] = { "comment", "identifier", "string_literal", "char_constant", 0 };
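To see the # operator on its own, here is a minimal sketch. STR, FORTY_TWO and check_stringize are hypothetical names made up for this illustration, not anything from Clang. It shows that # uses the argument exactly as spelled, without expanding it first, which is why each token name comes out as its own identifier.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: # turns the argument's spelling into a string literal. */
#define STR(X) #X

/* An ordinary macro, used to show that # does not expand its operand. */
#define FORTY_TWO 42

/* Checks that stringizing keeps the spelling of the argument. */
static void check_stringize(void) {
    /* A plain identifier becomes its own name in quotes. */
    assert(strcmp(STR(string_literal), "string_literal") == 0);
    /* Even a macro name is stringized as written, not as its expansion. */
    assert(strcmp(STR(FORTY_TWO), "FORTY_TWO") == 0);
}
```

Calling check_stringize() passes both assertions; STR(FORTY_TWO) gives "FORTY_TWO", not "42".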
Now if they have a token, they can say TokNames[token.kind] to get the string name of that token. It lets them use the token types efficiently, print them out nicely for debugging, and not have to maintain multiple lists of tokens.
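Here is a rough single-file sketch of the whole trick. It is not Clang's actual code: to keep everything in one file, it swaps the separate TokenKinds.def for a list macro, and TOKEN_LIST, TOK_ENUM, TOK_NAME, token_kind_is_valid, token_name and token_kind_from_name are all names I made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for TokenKinds.def (assumption: a list macro instead of a file). */
#define TOKEN_LIST(TOK) \
    TOK(comment)        \
    TOK(identifier)     \
    TOK(string_literal) \
    TOK(char_constant)

/* First expansion: each entry becomes an enumerator. */
#define TOK_ENUM(X) X,
enum TokenKind { TOKEN_LIST(TOK_ENUM) NUM_TOKENS };

/* Second expansion: each entry becomes its own name as a string. */
#define TOK_NAME(X) #X,
static const char *const TokNames[] = { TOKEN_LIST(TOK_NAME) 0 };

/* A kind is valid if it lies in the range [0, NUM_TOKENS). */
static int token_kind_is_valid(int kind) {
    return kind >= 0 && kind < NUM_TOKENS;
}

/* Printable name for a kind, e.g. for debug output. */
static const char *token_name(enum TokenKind kind) {
    assert(token_kind_is_valid((int)kind));
    return TokNames[kind];
}

/* Reverse lookup that walks the table up to the 0 sentinel.
   Returns NUM_TOKENS when the name is unknown. */
static int token_kind_from_name(const char *name) {
    for (int i = 0; TokNames[i] != 0; ++i) {
        if (strcmp(TokNames[i], name) == 0) {
            return i;
        }
    }
    return NUM_TOKENS;
}
```

With this in place, token_name(identifier) returns "identifier", and adding a fifth token means touching only TOKEN_LIST. The separate .def file does the same job, with the advantage that many source files can include it with different definitions of TOK.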
Using Boost’s preprocessor library you can make it even more flexible. No separate file needed: http://pastebin.com/TKG9W9TL
PS. The example compiles with Boost 1.44.
Are you aware of ugly fuck you just linked ? Good luck boosting your shit.
FYI:
http://harmful.cat-v.org/software/c++/linus
Dude, don’t be a dick.
Cool! I like learning different ways of implementing this kind of thing, it’s a neat mental exercise.