Yacc is a parser generator, specifically a tool to generate LALR parsers. Essentially, a parser groups tokens (like the ones generated by Lex) into logical structures. It was an important piece of software, both for its quality and for the needs of development at that time. In fact, it became part of the POSIX standard: essentially any respectable OS needed to have a tool like that.
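To make "grouping tokens into logical structures" concrete, here is a minimal sketch in Python. It is a hand-written loop, not the table-driven LALR parser that Yacc would generate, and the token kinds `INT` and `PLUS`, the function name `parse_sum`, and the tuple-based tree shape are all assumptions made for illustration.

```python
def parse_sum(tokens):
    """Parse INT ('+' INT)* into a left-nested tree of ('add', left, right) tuples.

    `tokens` is a list of (kind, lexeme) pairs, as a lexer might produce.
    The kinds INT and PLUS are hypothetical names chosen for this sketch.
    """
    pos = 0

    def expect(kind):
        # Consume the next token if it has the expected kind, otherwise fail.
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {kind}, found {found}")
        lexeme = tokens[pos][1]
        pos += 1
        return lexeme

    # The first operand is mandatory.
    tree = ("int", int(expect("INT")))
    # Each following '+ INT' pair wraps the tree built so far.
    while pos < len(tokens):
        expect("PLUS")
        tree = ("add", tree, ("int", int(expect("INT"))))
    return tree
```

For the flat token list of `1 + 2 + 3`, the function returns a nested structure in which the first addition is the left operand of the second. That nesting is the kind of logical structure a parser adds on top of a lexer's output.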
The Beginnings

We briefly look at the history of these tools. Both have an interesting history, although Yacc's story looks like a soap opera.

Lex is a lexer generator, that is to say a tool to generate lexical analyzers. In case you do not know what a lexer is, these are the basics. A lexer accepts normal text as input and organizes it into corresponding tokens. All you need to have is a grammar that describes the rules to follow. So, for instance, you can tell Lex that any grouping of digits (i.e., one or more digit characters) should be classified as an INT. A standalone lexer has few uses, because it basically just organizes sets of characters into categories. It can be used for things like recognizing the elements of a programming language to perform syntax highlighting. You can read more about the basics of parsing in our article: A Guide to Parsing: Algorithms and Terminology. Mike Lesk and Eric Schmidt (who later became CEO of Google) developed it as proprietary software in 1975 in C.
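As a rough illustration of what a lexer does, the following Python sketch classifies runs of digits as INT, as in the example above. It is not how Lex works internally and it does not use Lex's specification syntax. Apart from INT, every token category, along with the names `TOKEN_SPEC`, `MASTER`, and `tokenize`, is an assumption made for this example.

```python
import re

# Hypothetical token categories; only INT comes from the article's example.
TOKEN_SPEC = [
    ("INT", r"[0-9]+"),                  # one or more digits
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),   # identifiers
    ("OP", r"[+\-*/=]"),                 # single-character operators
    ("SKIP", r"\s+"),                    # whitespace, discarded below
]

# Combine all rules into one pattern with a named group per category.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (category, lexeme) pairs for the given text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("x = 42 + 7")` yields the pairs `("ID", "x")`, `("OP", "=")`, `("INT", "42")`, `("OP", "+")`, `("INT", "7")`. The input has been sorted into categories, but nothing yet says how the pieces relate to each other. That is why a standalone lexer has few uses.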
![lexer generator algorithm lexer generator algorithm](https://miro.medium.com/max/765/1*bUzlbeWx5iw18dthZvSlRg.png)
For some people these are still the first tools they think about when talking about parsing. So, why should you avoid them? Well, we found a few reasons based on our experience developing parsers for our clients. For example, we had to work with existing lexers in flex and found it difficult to add modern features, like Unicode support, or to make the lexer re-entrant (i.e., usable in many threads). With Bison, our clients had trouble organizing large codebases, and we found it difficult to improve the efficiency of a parser without rewriting a large part of the grammar. The short version is that there are tools that are more flexible and productive, like ANTLR.

The first part of this article explains the history of these two tools, while the second one analyzes their flaws. If you do not care about their history, you can go to the second part using this handy table of contents.
![lexer generator algorithm lexer generator algorithm](https://cs.nyu.edu/~gottlieb/courses/2000s/2007-08-fall/compilers/lectures/diagrams/nfa-34.png)
Lex and Yacc were the first popular and efficient lexer and parser generators; flex and Bison were the first widespread open-source versions compatible with the original software. Each of these tools has more than 30 years of history, which is an achievement in itself.