If anything, I think the failure to hit L5 self-driving after billions of dollars and millions of man-hours invested is probably reflective of how automatic C to Rust translation will go. We'll cruise 90% of the way, but the last 10% will prove insurmountable with current technology.
Think about the number of C programs in the wild that rely on compiler-specific or libc-specific or platform-specific behavior, or even undefined behavior plus the dumb luck of a certain brittle combination of {compiler version} ∩ {libc version} ∩ {linker version} ∩ {build flags} emitting workable machine code. There's a huge chunk of C software where there's not enough context within the source itself (or even source plus build scripts) to understand the behavior. It's not even clear that this is a solvable problem in the abstract.
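To make that concrete, here is a minimal sketch of the simplest case I can think of: signed overflow. The C function in the comment and the Rust names `next_wrapping` and `next_checked` are made up for illustration. In C the overflow is undefined behavior, so the source alone does not say which of the two Rust versions the original author was relying on.

```rust
// Hypothetical C original:
//     int next(int x) { return x + 1; }
// Signed overflow is undefined behavior in C, so a translator has to
// guess which behavior the surrounding program actually depends on.

/// Candidate 1: assume the program relied on two's-complement wraparound.
pub fn next_wrapping(x: i32) -> i32 {
    x.wrapping_add(1)
}

/// Candidate 2: assume overflow was never meant to happen and surface it
/// to the caller instead of silently wrapping.
pub fn next_checked(x: i32) -> Option<i32> {
    x.checked_add(1)
}
```

Both are defensible translations, and they disagree exactly on the input where the C program had no defined meaning.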
None of that is to say that DARPA shouldn't fund this. Research isn't always about finding an industrial strength end product; the knowledge and expertise gained along the way is important too.
This is the exact formulation of the argument before computers beat humans at chess, or drew pictures, or represented color correctly, or... Self-driving cars will be solved. There is at least one general-purpose computer that can solve it already (a human brain), so a purpose-built computer can also be made to solve it.
In 10 (or 2 or 50 or X) years when Chevy, Ford, and others are rolling out cheap self driving this argument stops working. The important thing is that this argument stops working with no change in how hard C to Rust conversion is.
We really should be looking at the specifics of both problems. What makes computer language translation hard? Why is driving hard? One needs to be correct while inferring intent and possibly reformulating code to meet new restrictions. The other needs to be able to make snap judgments and in realtime avoid hitting things even if it just means stopping to prefer safety over motion. One problem can be solved piecewise without significant regard to time and the other solved in realtime as it happens without producing unsafe output.
These problems really aren't analogous.
I think you picked self driving cars just because it is a big and only partially solved problem. One could just as easily pick a big solved problem or a big unstarted problem and formulate equally bad arguments.
I am not saying this problem is easy, just that it seems solvable with sufficient effort.
I'd put money on the solutions to said problems looking largely the same though - big ass machine learning models.
My prediction is that a tool like copilot (but specialized to this domain) will do the bulk of source code conversions, with a really smart human coming behind to validate.
With you, except for the conclusion "[ the tool ] will do the bulk of source code conversions, with a really smart human coming behind to validate".
The director orders the use of the tool when the dev team got downsized (and the two most-seniors left for greener pastures just after that). Validation is in the "extensive" tests anyway, we have those, right, so the new intern shall have a look, make it all work (fudge the tests where possible and remove the persistently failing ones as they've probably been always broken). The salesman said it comes from the DOA or DOD or something. If the spooks can do it so can we.
> This is the exact formulation of the argument before computers beat humans at chess, or drew pictures, or represented color correctly, or...
Which are things that took 20 or 50 years longer than expected in some cases.
> I think you picked self driving cars just because it is a big and only partially solved problem. One could just as easily pick a big solved problem or a big unstarted problem and formulate equally bad arguments.
But C to Rust translation is a big and only partially solved problem.
Ok, but if 90% of small projects can use it as a direct, painless bridge, that can be a huge win.
Even if it only "handles 90% well" of the transition for any given project, this is still interesting. Unlike cars on the road, most code transition projects out there don't need to be 100% fine to provide some useful value.
Even if every project can only be 90% done, that’s a huge win. Best would be if it could just wrap the C equivalent code into an unsafe block which would be automatically triaged for human review.
Just getting something vaguely Rust shaped which can compile is the first step in overcoming the inertia to leave the program in its current language.
c2rust exists today, and pretty much satisfies this. I've used it to convert a few legacy math libraries to unsafe Rust, and then been able to do the unsafe->safe refactor in the relative comfort of the full Rust toolset (analyser + IDE + tests).
There is real utility in slowly fleshing out the number of transforms in a tool like c2rust that can recognise high-level constructs in C code and produce idiomatic safe equivalents in Rust.
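For anyone who hasn't done this kind of refactor, here is a rough sketch of what the unsafe->safe step looks like. This is hand-written for illustration, not actual c2rust output, and the names `sum_raw` and `sum_safe` are my own.

```rust
/// Roughly the shape a mechanical translation gives for a C function like
///     double sum(const double *xs, size_t n);
/// It compiles, but the caller still carries the whole safety burden.
///
/// # Safety
/// `xs` must point to at least `n` valid, initialized `f64` values.
pub unsafe fn sum_raw(xs: *const f64, n: usize) -> f64 {
    let mut total = 0.0;
    let mut i = 0;
    while i < n {
        // Raw pointer arithmetic, exactly as in the C original.
        total += unsafe { *xs.add(i) };
        i += 1;
    }
    total
}

/// The safe refactor: a slice carries its own length, so the
/// pointer-plus-count contract is enforced by the type system.
pub fn sum_safe(xs: &[f64]) -> f64 {
    xs.iter().sum()
}
```

The first version is the "vaguely Rust shaped" starting point; getting from there to the second is the part a human (or a smarter transform) still has to do.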
"Real" (large) C/C++ programs get much of their complexity from the fact that they consist of hundreds of "sources" (both compiled and libraries) that sometimes, or even often, share global state and at best use a form of "opportunistic sharing". Global variables are (deliberately, and justifiably so) hard in Rust, but (too) trivial in C/C++; cross-references, pointer chains, and multi-references likewise. And once you enter threading, it becomes even harder to output "good" Rust code: you'd have to prove func() is called from threaded code and should in Rust best take Arc<> or some such instead of a pointer.
It'll be great for "pure" functions. For the grimy parts of the world, funcs taking pointer args and returning pointers, for things that access and modify global data without locks, for threaded code with implicit (and undocumented) locking, the tool would add most value. If it can. Even only by saying "this code looks grimy. here's why. A bit of FFI will also be thrown in because it links against 100 libraries. I suggest changes along those lines ... use one of the 2000000 hint flags to pick-your-evil".
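A small sketch of the global-state case, to show what "good" output would have to look like. The C snippet in the comment and the names `record` and `run_workers` are hypothetical; the point is only that the translator must first work out that the counter is shared across threads before it can pick `Arc<Mutex<_>>` over a plain integer.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical C original, called from several threads with no lock:
//     static unsigned long hits;
//     void record(void) { hits++; }

/// The shared counter becomes an explicit argument instead of a global.
pub fn record(hits: &Mutex<u64>) {
    *hits.lock().unwrap() += 1;
}

/// Spawns `n_threads` workers that each call `record` `per_thread` times
/// and returns the final count.
pub fn run_workers(n_threads: usize, per_thread: u64) -> u64 {
    let hits = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            // Each thread gets its own reference-counted handle.
            let hits = Arc::clone(&hits);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    record(&hits);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *hits.lock().unwrap();
    total
}
```

Nothing in the C source says which threads call `record`, so this rewrite needs whole-program knowledge that a file-by-file translation does not have.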
In addition to the other replies, this is a one-time project. After everything (or almost everything) has been translated, you're done, you won't be running into new edge cases.
> You can attach about a hundred asterisks to that.
Not in San Francisco. There are about 300 Waymo cars safely driving in one of the most difficult urban environments around (think steep hills, fog, construction, crazy traffic, crazy drivers, crazier pedestrians). Five years ago this was "someday" science fiction. Frankly I trust them much more than human drivers and envision a future utopia where human drivers are banned from urban centers.
To get back on topic, I don't think automatic programming language translation is nearly as hard, especially since we have a deterministic model of the machines it runs on. I can see a possible approach where AI systems take the assembler code of a C++ program, then translate that into Rust, or anything else. Can they get 100% accuracy and bit-for-bit compatibility on output? I would not bet against it.
Opinions about automated driving systems vary. Just from my own experience doing business all around San Francisco I have seen at least a half dozen instances of Waymo vehicles making unsafe maneuvers. Responders have told me and local government officials that Waymo vehicles frequently fail to acknowledge emergency situations or respond to driving instructions. Driving is a social exercise which requires understanding of a number of abstractions.
Isn't 100% accuracy (relatively) easy? c2rust already does that, or at least comes close, as far as I know.
Getting identical outputs on safe executions, catching any unsafe behavior (at translation-time or run-time), and producing efficient, maintainable code all at once is a million times harder.
You can attach about a hundred asterisks to that.