I just read the first few chapters of Bjarne's book the "The Design and Evolution of C++", it is highly interesting. In his book, Bjarne mentioned that used yacc for the first Cfront was the correct choice at that time, but in retrospect it was a bad choice, and it should had been down in recursive descent. But He did not give much explanation, except that at that time there was not even a yacc grammar for C. I am wondering if there are other reasons behind his argument. I am studying Compiler, and I learned generally LL(k) grammar is much more difficult to design than a LALR(1) grammar, also semantic checking is somewhat more difficult in LL(k). So last quarter when I implemented a C++ parser for a project, I used yacc++ and C++, the only extra Lexical analysis is to return a class_name token by consulting class_name_tables at a couple of locations. Since I was convinced by Aho's book that LL(1) is less powerful and harder to implement, I would appreciate if someone (including the creator ) could shed some light on this issue. Dongxiao Yue
More options Apr 12 1994, 9:37 am
Newsgroups: comp.lang.c++
From: y...@tuna.micro.umn.edu (Dongxiao Yue)
Date: Tue, 12 Apr 1994 16:37:51 GMT
Local: Tues, Apr 12 1994 9:37 am
Subject: Why Stroustrup do not like YACC grammar for C++ ?
I just read the first few chapters of Bjarne's book the "The Design and Evolution of C++", it is highly interesting. In his book, Bjarne mentioned that used yacc for the first Cfront was the correct choice at that time, but in retrospect it was a bad choice, and it should had been down in recursive descent. But He did not give much explanation, except that at that time there was not even a yacc grammar for C.
I am wondering if there are other reasons behind his argument. I am studying Compiler, and I learned generally LL(k) grammar is much more difficult to design than a LALR(1) grammar, also semantic checking is somewhat more difficult in LL(k). So last quarter when I implemented a C++ parser for a project, I used yacc++ and C++, the only extra Lexical analysis is to return a class_name token by consulting class_name_tables at a couple of locations.
Since I was convinced by Aho's book that LL(1) is less powerful and harder to implement, I would appreciate if someone (including the creator ) could shed some light on this issue.
... [stuff about Stroustrup saying he should not have used YACC]
>I am wondering if there are other reasons behind his argument. I am >studying Compiler, and I learned generally LL(k) grammar is much more >difficult to design than a LALR(1) grammar, also semantic checking is >somewhat more difficult in LL(k).
LL(k) can be harder to design, but try wading through
stmt_list: stmt_list stmt | stmt
in LR style, or
stmt_list: (stmt)+
in LL style.
It's much more intuitive what's going on in the LL grammar. Besides, there are tools (like the Purdue Compiler Construction ToolSet) that will parse LL(k) with infinite lookahead, if you want it.
>Since I was convinced by Aho's book that LL(1) is less powerful and >harder to implement, I would appreciate if someone (including the creator ) >could shed some light on this issue.
LL(k) with infinite lookahead is theoretically *more* powerful than LALR. Somebody got a PhD off that theory (can't quite remember who).
Add to that the fact that in LALR, you can't have inherited attributes, only synthesized, and you've got some good arguments for using LL(k).
I've written a Pascal compiler (for a class) in LL(1) by hand, and also with tools, and I also tried YACC. I found the LL approach much more feasible, and hell of a lot easier. -- -------------------------------- Paul Fagerburg (p...@bert.cs.byu.edu)
"We are Microsoft. OS/2 is irrelevant. Unix is irrelevant. Open systems are futile. Prepare to be assimilated."
In @news.cis.umn.edu> y...@tuna.micro.umn.edu (Dongxiao Yue) writes:
>In his book, Bjarne mentioned that used yacc for the first >Cfront was the correct choice at that time,
I read the book again, the above statement is inaccurate, very sorry. quote: "For most people, it(yacc) would have been the right choice. In retrospect, for me(Stroustrup) and C++ it was a bad mistake. "
In p...@bert.cs.byu.edu (Paul L Fagerburg) writes: >y...@tuna.micro.umn.edu (Dongxiao Yue) writes: >LL(k) can be harder to design, but try wading through >stmt_list: stmt_list stmt > | stmt >in LR style, or >stmt_list: (stmt)+ >in LL style. The thing I am afraid of is sometimes left factoring and elimination of left recursion will leave the grammar intractable. Given the expressions as example, I really don't like this E ---> T E' E' ---> |+T E' in LR we have E ---> E+T And I don't think the LR grammar for stmt_list is too bad. >It's much more intuitive what's going on in the LL grammar. Besides, I just think the opposite. >there are tools (like the Purdue Compiler Construction ToolSet) that >will parse LL(k) with infinite lookahead, if you want it. Infinte look ahead, does that mean it continues to look ahead until there is a resolution? >Add to that the fact that in LALR, you can't have inherited attributes, >only synthesized, and you've got some good arguments for using LL(k). That's not a problem, and it is easy to get any attributes if we rely on ourselves a little more. for example var_decl: type decl_list{$2->add_type_to_all($1);} decl_list: decl_list decl{$$=$1->add_next($2);} |decl {$$=new symbol($1);} ; assign decl_list a linked list of symbols. I am wondering if there is public domain LL(k) parser for C++.
More options Apr 13 1994, 2:07 pm
Newsgroups: comp.lang.c++
From: y...@tuna.micro.umn.edu (Dongxiao Yue)
Date: Wed, 13 Apr 1994 21:07:38 GMT
Local: Wed, Apr 13 1994 2:07 pm
Subject: Re: Why Stroustrup do not like YACC grammar for C++ ?
In @bert.cs.byu.edu> p...@bert.cs.byu.edu (Paul L Fagerburg) writes:
>y...@tuna.micro.umn.edu (Dongxiao Yue) writes: >LL(k) can be harder to design, but try wading through >stmt_list: stmt_list stmt > | stmt >in LR style, or >stmt_list: (stmt)+ >in LL style.
The thing I am afraid of is sometimes left factoring and elimination of left recursion will leave the grammar intractable.
Given the expressions as example, I really don't like this
E ---> T E' E' ---> |+T E'
in LR we have E ---> E+T
And I don't think the LR grammar for stmt_list is too bad.
>It's much more intuitive what's going on in the LL grammar. Besides,
I just think the opposite.
>there are tools (like the Purdue Compiler Construction ToolSet) that >will parse LL(k) with infinite lookahead, if you want it.
Infinte look ahead, does that mean it continues to look ahead until there is a resolution?
>Add to that the fact that in LALR, you can't have inherited attributes, >only synthesized, and you've got some good arguments for using LL(k).
That's not a problem, and it is easy to get any attributes if we rely on ourselves a little more.
for example
var_decl: type decl_list{$2->add_type_to_all($1);} decl_list: decl_list decl{$$=$1->add_next($2);} |decl {$$=new symbol($1);} ;
assign decl_list a linked list of symbols.
I am wondering if there is public domain LL(k) parser for C++.
y...@tuna.micro.umn.edu (Dongxiao Yue) writes: > The thing I am afraid of is sometimes left factoring and elimination of > left recursion will leave the grammar intractable.
> Given the expressions as example, I really don't like this > E ---> T E' > E' ---> > |+T E' > in LR > we have E ---> E+T
In LL(k), we usually do E -> T + E T -> F * T
etc. I can't remember the whole thing, but I have a parser that works fine on all expressions. Plus, no messy %left %right %unary stuff and shift/reduce conflicts.
> Infinte look ahead, does that mean it continues to look ahead until > there is a resolution?
It means that the parser will read to the end of the file if it has to to disambiguate productions. If you're not afraid of LL(k>1) then it's simple. If you want LL(1), then you get the E -> T E' and all that crap.
> I am wondering if there is public domain LL(k) parser for C++.
The Purdue Compiler Construction ToolSet (mentioned in last post, availabe via anon ftp from marvin.ecn.purdue.edu) will produce C or C++ output; you decide. -- -------------------------------- Paul Fagerburg (p...@bert.cs.byu.edu) "If I was being executed by injection, I'd clean up my cell real neat. Then, when they came to get me, I'd say, `Injection? I thought you said inspection.' They'd probably feel bad, and maybe I could get out of it."
In @bert.cs.byu.edu> p...@bert.cs.byu.edu (Paul L Fagerburg) writes:
>y...@tuna.micro.umn.edu (Dongxiao Yue) writes: >> The thing I am afraid of is sometimes left factoring and elimination of >> left recursion will leave the grammar intractable.
>> Given the expressions as example, I really don't like this >> E ---> T E' >> E' ---> >> |+T E' >> in LR >> we have E ---> E+T >In LL(k), we usually do >E -> T + E >T -> F * T
Sorry to repeat something that is in text book. But E-> T + E alone will not work since E can also be just T, to make it LL(1), we must left factor. so my we need the grmmar in my original post. Maybe we can live with E --> T | T+ E if LL(k) k>1, but this will be not efficient compared to LALR(1). efficient performance is vital for a compiler, I am sick of sitting there waiting for my program to get compiled.
>etc. I can't remember the whole thing, but I have a parser that works >fine on all expressions. Plus, no messy %left %right %unary stuff and >shift/reduce conflicts.
No, the E-->E+T grammar does not need to consider associativity, only the E--> E+E, E->E*E ... stuff do.
>> Infinte look ahead, does that mean it continues to look ahead until >> there is a resolution? >It means that the parser will read to the end of the file if it has to >to disambiguate productions. If you're not afraid of LL(k>1) then it's >simple. If you want LL(1), then you get the E -> T E' and all that crap. >> I am wondering if there is public domain LL(k) parser for C++. >The Purdue Compiler Construction ToolSet (mentioned in last post, >availabe via anon ftp from marvin.ecn.purdue.edu) will produce C or C++ >output; you decide.