I'd be interested in learning more about how you implemented this. I recommended somet...

JabavuAdams · on April 3, 2009

I'm kind of working on something similar.

One thing that struck me about C++ is that although template parsing, etc., is pretty hard, you can get a long way by segmenting the source into delimited regions. This might also speed up your semantic analysis by letting you write code that operates on a known kind of region.

For instance, after pre-processing, it should be easy to find nested "{", "[", "(", "\"", "'", and "<" regions. Segmenting the source this way first might make further processing easier.
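
A rough sketch of that segmentation step, in Python for brevity. Everything here is hypothetical (the `find_regions` name, the span format), and it assumes comments are already stripped. It deliberately skips "<", since an angle bracket can also be a comparison or shift operator and can't be paired by character matching alone.

```python
# Hypothetical sketch: one pass that records delimited regions as spans.
OPENERS = {"{": "}", "[": "]", "(": ")"}
QUOTES = {'"', "'"}

def find_regions(src):
    """Return sorted (start, end, opener) spans; end is exclusive."""
    regions = []
    stack = []  # (opener, start index) of brackets still open
    i = 0
    while i < len(src):
        ch = src[i]
        if ch in QUOTES:
            # Scan to the matching quote, skipping backslash escapes,
            # so brackets inside literals are never counted.
            j = i + 1
            while j < len(src) and src[j] != ch:
                j += 2 if src[j] == "\\" else 1
            regions.append((i, min(j + 1, len(src)), ch))
            i = j + 1
            continue
        if ch in OPENERS:
            stack.append((ch, i))
        elif stack and ch == OPENERS[stack[-1][0]]:
            opener, start = stack.pop()
            regions.append((start, i + 1, opener))
        # Stray or mismatched closers are simply ignored in this sketch.
        i += 1
    return sorted(regions)
```

Each span tells later passes what kind of region they are looking at, so a pass that only cares about "{" blocks can skip everything else.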

It seems that the standard parsing methods we learn in CS are optimized for single-pass speed, but not for maintainability or development efficiency. Computers have gotten fast, so using multiple passes of simpler operators seems like a promising approach.

I see the process of parsing as some kind of folding, where you start with a linear sequence of chars and gradually fold it over and over into a "lumpy" tree.
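
To make the folding picture concrete, here is a toy version in Python with two passes. The pass names and the list-of-lists tree are just my assumptions for illustration, not a real parser design: the first pass folds quoted literals into single atoms, and the second folds bracketed runs into nested lists.

```python
# Hypothetical sketch: two simple passes that "fold" a char sequence into a tree.
PAIRS = {"(": ")", "[": "]", "{": "}"}

def fold_quotes(chars):
    """Pass 1: fold each quoted literal into a single string atom."""
    out, i = [], 0
    while i < len(chars):
        ch = chars[i]
        if ch in "\"'":
            j = i + 1
            while j < len(chars) and chars[j] != ch:
                j += 2 if chars[j] == "\\" else 1  # skip escaped chars
            out.append("".join(chars[i:j + 1]))
            i = j + 1
        else:
            out.append(ch)
            i += 1
    return out

def fold_brackets(items):
    """Pass 2: fold bracketed runs into nested lists."""
    root = []
    stack = [root]   # lists currently being filled, innermost last
    closers = []     # closing bracket expected for each open list
    for item in items:
        if item in PAIRS:
            node = [item]
            stack[-1].append(node)
            stack.append(node)
            closers.append(PAIRS[item])
        elif closers and item == closers[-1]:
            stack[-1].append(item)
            stack.pop()
            closers.pop()
        else:
            stack[-1].append(item)
    return root

def fold(src):
    return fold_brackets(fold_quotes(list(src)))
```

Because quotes are folded first, the bracket pass never sees a ")" that lives inside a string, which is the kind of simplification multiple passes buy you.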

jng · on April 13, 2009

My actual scheme is similar in some ways to what you describe. It's more complex, because it involves many different types of structures.

Also, my approach is more explicit: I don't "restart a parse" in a general way; instead, I contemplate each possible change explicitly. This is prohibitive to do by hand for a full grammar, but I'm not doing full C/C++/C# analysis, so it's doable. It also makes the dynamic parser fully incremental.