It's because the author doesn't want a whole new language, but rather a better w...

pcwalton · on Feb 7, 2016

Even if all you want is C semantics with a different syntax, though, I think it's better to emit LLVM IR. LLVM IR is close to C semantics (though aliasing information has to be supplied explicitly), but you have direct control over debug info, which is important to avoid a huge regression in the debugging experience. As an added bonus, you eliminate the necessity of serializing and deserializing your IR during your compilation pipeline for no reason (which is effectively what compiling and reparsing C is doing).

dang · on Feb 7, 2016

You may be right, but learning curve is an issue. If you're already familiar with C, writing a sexpr C generator is super easy. I can see why someone would balk at learning a new conceptual model and toolset, and just take the path of least resistance, especially if they already know the kind of C program they'd like to write. This would have been even more true in 2010 of course.

So what's the best way to tackle the learning curve of what you're suggesting? If I know zero about LLVM and I want to make something like the OP, what should I do?

mpweiher · on Feb 7, 2016

Another good option is Joe Armstrong's approach: look at the LLVM IR generated by clang, for example, and then emit that yourself.

My first attempt at using LLVM was using the C++ API. It was...a struggle. Using this approach (IR snippets), I made more progress in a day than I had in months using the API.
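To make the snippet approach concrete, here is a minimal sketch in Python. The IR below is hand-written and simplified (clang's real output carries extra attributes and metadata), and `BINOP_TEMPLATE` and `emit_binop` are hypothetical names made up for illustration, not part of any LLVM tooling.

```python
# Hand-written, simplified IR for a two-argument i32 function.
# The template and its placeholders are hypothetical; a real workflow would
# start from whatever IR your compiler actually printed for a small example.
BINOP_TEMPLATE = """\
define i32 @{name}(i32 %a, i32 %b) {{
entry:
  %result = {opcode} i32 %a, %b
  ret i32 %result
}}
"""

# Integer opcodes this sketch is willing to substitute into the template.
ALLOWED_OPCODES = {"add", "sub", "mul"}


def emit_binop(name, opcode):
    """Fill the IR snippet with a function name and an integer opcode."""
    if opcode not in ALLOWED_OPCODES:
        raise ValueError(f"unsupported opcode: {opcode}")
    return BINOP_TEMPLATE.format(name=name, opcode=opcode)


if __name__ == "__main__":
    print(emit_binop("add2", "add"), end="")
```

The printed text can be saved to a `.ll` file and handed to the LLVM tools; the point is only that the generator never touches the C++ API.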

sklogic · on Feb 7, 2016

Also worth mentioning is an invaluable learning tool: the 'cpp' backend in LLVM. It emits idiomatic C++ code that generates any given IR module using the LLVM API.

mpweiher · on Feb 10, 2016

Yup, that's what I used in my first attempt. Didn't work for me. (Actually, that was a later part of my first attempt; I think I initially tried the straight API. Good luck with that.)

sklogic · on Feb 11, 2016

What exactly did not work? Have you filed a bug?

dang · on Feb 8, 2016

That's clever.

pcwalton · on Feb 7, 2016

LLVM has a great tutorial (Kaleidoscope): http://llvm.org/docs/tutorial/

It walks you through basic expression generation, control flow, memory, etc. for a simple language. The learning curve isn't zero, to be sure, but I think the time saved by being able to work with IR as a tree instead of as a flat series of bytes makes it easily worth it.

nickpsecurity · on Feb 8, 2016

"you eliminate the necessity of serializing and deserializing your IR during your compilation pipeline for no reason (which is effectively what compiling and reparsing C is doing)."

I did debugging before it got to C and just ensured C generation would do exactly what I wanted. I could read and debug the C itself as a check against problems in that. Yet, serializing and deserializing my IR just didn't happen: it was just LISP or BASIC expressions depending on which version we're talking about. Just trees.

Your debugging regression claim is correct, as I addressed in my main comment. Fortunately, my development style and choice of libraries compensated for that nicely. It would've been quite painful if I had to deal with arbitrary FOSS or proprietary stuff out of necessity. I'd be working at both abstraction levels for sure or coding miracles into my tooling haha.

marktangotango · on Feb 7, 2016

I don't doubt your position has merit, and there are many options for generating code other than LLVM, but is anything really quicker to implement than fprintf? I have implemented compile-to-X myself (note I prefer to call it 'source-to-source translation' rather than 'compile'), and for a certain class of project (personal or rarely used by others, where compilation speed isn't an issue) fprintf (or whatever) gets you a lot of bang for the development-hour buck.

pcwalton · on Feb 7, 2016

In order to actually generate valid C, you have to do a lot of work to figure out what you're supposed to fprintf; you have to get the operator precedence right, you have to do scoping right, debugging the serialization code is annoying, etc. A high-level API like LLVM IR, by contrast, lets you interact with the IR as a tree instead of an output stream, which is usually easier because your AST is already a tree.
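As a rough illustration of the operator-precedence point, here is a sketch in Python of what a text emitter has to track when printing an expression tree as C. The tuple-based tree shape and the `emit_c` helper are assumptions made up for this example, and only the four left-associative arithmetic operators are covered.

```python
# Hypothetical mini-AST: a leaf is a str or int, a node is (op, left, right).
# Higher number = binds tighter, mirroring C for these four operators.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def emit_c(node, parent_prec=0, right_side=False):
    """Return a C expression string, adding parentheses only where needed."""
    if not isinstance(node, tuple):
        return str(node)
    op, left, right = node
    prec = PRECEDENCE[op]
    text = f"{emit_c(left, prec)} {op} {emit_c(right, prec, right_side=True)}"
    # Parenthesize when the parent binds tighter, or binds equally tight and
    # this node is the right operand (these operators are left-associative).
    if prec < parent_prec or (prec == parent_prec and right_side):
        return f"({text})"
    return text


if __name__ == "__main__":
    print(emit_c(("*", ("+", "a", "b"), "c")))
    print(emit_c(("-", "a", ("-", "b", "c"))))
```

Even this small case needs precedence and associativity bookkeeping that the tree itself never required, and a full C emitter would also have to handle unary operators, casts, declarations and scoping.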

sklogic · on Feb 7, 2016

There is a plain text form of LLVM IR. It is a bit easier to printf than full-featured C.
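A minimal sketch of that idea in Python: walking an expression tree and printing textual IR one instruction per line. The tree shape, the `emit_ir` helper and the `%t0`, `%t1` temporary naming are assumptions for illustration; everything is typed as i32, and the sketch assumes no parameter is itself named `t0`, `t1`, and so on.

```python
# Hypothetical mini-AST: int constants, str parameter names, (op, left, right).
OPCODES = {"+": "add", "-": "sub", "*": "mul"}


def emit_ir(name, params, body):
    """Return textual LLVM IR for an i32 function that computes `body`."""
    lines = []

    def walk(node):
        # Leaves: integer constants print as-is, parameters become %name.
        if isinstance(node, int):
            return str(node)
        if isinstance(node, str):
            return f"%{node}"
        op, left, right = node
        lhs, rhs = walk(left), walk(right)
        # Named temporaries avoid LLVM's strict numbering of unnamed values.
        tmp = f"%t{len(lines)}"
        lines.append(f"  {tmp} = {OPCODES[op]} i32 {lhs}, {rhs}")
        return tmp

    result = walk(body)
    args = ", ".join(f"i32 %{p}" for p in params)
    header = f"define i32 @{name}({args}) {{"
    return "\n".join([header, "entry:", *lines, f"  ret i32 {result}", "}"]) + "\n"


if __name__ == "__main__":
    print(emit_ir("f", ["a", "b"], ("+", ("*", "a", "b"), 3)), end="")
```

Because every instruction names its operands explicitly, there is no precedence or parenthesization to get right: the nesting of the tree turns into the order of the printed lines.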