"Explicit is better than implicit". Therefore, I agree with you that explicit closure list, with the ability to copy and reference captured variables, is actually what C++ does right, not wrong.
The explicit capture list is only necessary in C++ because memory ownership and lifetimes are managed by the programmer in C++. Compare that to a garbage-collected language like Scheme or C#: when the implementation can figure out where memory needs to be freed and ensures you can't use-after-free, it frees the programmer from thinking about ownership (but not necessarily lifetimes: you can still wind up with memory leaks in GC'd languages if you're not careful to let go of references you no longer need). As mentioned elsewhere in this thread, Rust also offers the same level of explicit control without capture lists (though I'm on the fence about which way I prefer).
My point is that in languages with automatic memory management, explicit capture lists don't make much sense because the programmer is not tasked with managing memory and can safely capture references all the time. There's no need to ask oneself, "Do I own this pointed-to memory? Do I need to worry about it being freed before this closure? Should I make a copy?", etc. This is because, in a sense, the garbage collector itself owns the memory, but checks to make sure nothing else can use it anymore before it frees it.
You only talk about the memory management part, and I guess most language designers think the same. What you and they fail to account for is that an explicit capture list can reduce logic bugs.
For one, if I were allowed to explicitly capture the counter variable by copy, the surprising behavior mentioned above would never occur. In languages with mutability, the ability to make some part immutable is a virtue.
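A minimal sketch of that point in Python, assuming the surprise in question is the usual loop-counter one. Python has no capture list, so the `i=i` default argument below is only an idiom standing in for "capture by copy"; the function names are made up for illustration.

```python
def capture_by_reference():
    callbacks = []
    for i in range(3):
        # The lambda closes over the variable `i` itself,
        # not the value it held when the lambda was created.
        callbacks.append(lambda: i)
    return [f() for f in callbacks]


def capture_by_copy():
    callbacks = []
    for i in range(3):
        # The default argument is evaluated once, at lambda creation,
        # so each lambda keeps its own copy of the counter.
        callbacks.append(lambda i=i: i)
    return [f() for f in callbacks]


print(capture_by_reference())  # every callback sees the final value of i
print(capture_by_copy())       # each callback sees its own value
```

The first call returns `[2, 2, 2]` and the second `[0, 1, 2]`, which is the difference an explicit by-copy capture would make visible at the point where the closure is written.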
For two, in languages without explicit variable declaration, which variable is defined where quickly becomes murky when you have implicit capture. I have so many frustrations where the inner `i` variable clashes with the outer `i` in Python. Yes, I could just use a different name, but naming is hard, and with a new scope I should be able to reuse the name. That is almost the whole point of opening a new scope!
For three, in JavaScript, where closures are everywhere due to the number of callbacks, the reference graph is just impossible to analyze. A closure may close over another closure which closes over an object with a reference to the original closure. An explicit capture list makes the programmer think, and eases the job of anyone who tries to spot memory leaks from the source code. (But I guess that is just not the JavaScript style, as they are so fond of never letting the programmers know about their mistakes. At least in C++ we trade that for speed. I don't know what JavaScript trades that for.)
> You only talk about the memory management part, and I guess most language designers think the same. What you and they fail to account for is that an explicit capture list can reduce logic bugs.
I suppose, as a language designer, I tend to think that the more I do automatically, the more I ease the programmer's burden. However, as you point out, that's not always true. That said, my point wasn't (isn't?) that explicit capture is only a good idea sans automatic memory management (it may well be -- you've certainly given me some food for thought here), but rather that it's only necessary in that case, and I think that point still stands.
> For one, if I were allowed to explicitly capture the counter variable by copy, the surprising behavior mentioned above would never occur.
That's a failure of language design and I don't think the proper solution is to force explicit capture on closure creation (also note that you need more than just explicit capture because to prevent such an error, you need the ability to specify that the "captured" variable ought to be copied rather than actually captured). I think the proper solution to that problem is the one that the C# team went with: limit the scope of iteration control variables to the iterated block. This is typically what programmers used to block-structured languages would expect, anyway, unless the variable were clearly declared outside the scope of the iteration.
> In languages with mutability, the ability to make some part immutable is a virtue.
That's an orthogonal issue, and can be done in many other (and more general) ways.
> For two, in languages without explicit variable declaration, which variable is defined where quickly becomes murky when you have implicit capture. I have so many frustrations where the inner `i` variable clashes with the outer `i` in Python. Yes, I could just use a different name, but naming is hard, and with a new scope I should be able to reuse the name. That is almost the whole point of opening a new scope!
You're right: that is the point of opening a new scope! That sounds like a flaw in Python's design and could be remedied by making variable definition syntax different from assignment syntax. Consider Lua with its `local` syntax, C and kin with their type annotations, the Lisps with their completely separate forms for variable definition and assignment, and so on. There's also the Tcl strategy of "it's a definition unless the name was imported into this scope with `global` or `upvar`, in which case it's an assignment".
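To make the Python complaint concrete, here is a small sketch of the two cases I believe are meant; the function names are hypothetical. Python's actual rule is close to the Tcl one: an assignment defines a local unless the name was declared `global` or `nonlocal`, and a `for` loop does not open a scope at all.

```python
def loop_clobbers_outer():
    i = 100
    for i in range(3):   # no new scope: this rebinds the same `i`
        pass
    return i


def inner_assignment_shadows():
    count = 0

    def bump():
        count = 1        # plain assignment defines a new local `count`
        return count

    bump()
    return count         # the outer `count` is untouched


def inner_assignment_rebinds():
    count = 0

    def bump():
        nonlocal count   # explicit marker: assign to the enclosing variable
        count = 1

    bump()
    return count


print(loop_clobbers_outer(), inner_assignment_shadows(), inner_assignment_rebinds())
```

Because definition and assignment share one syntax, nothing at the assignment site says which of these behaviours you are getting; you have to go looking for a `nonlocal` line or an enclosing loop.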
> For three, in JavaScript, where closures are everywhere due to the number of callbacks, the reference graph is just impossible to analyze. A closure may close over another closure which closes over an object with a reference to the original closure. An explicit capture list makes the programmer think, and eases the job of anyone who tries to spot memory leaks from the source code. (But I guess that is just not the JavaScript style, as they are so fond of never letting the programmers know about their mistakes. At least in C++ we trade that for speed. I don't know what JavaScript trades that for.)
JavaScript is a shitty language to begin with, and fixing it wouldn't be as simple as fixing C# or Python... You make a good point here, but I still think that better tooling for data-flow analysis is a more attractive choice than a compulsory explicit capture list. On the flip side, an optional capture list could be a good compromise.
> On the flip side, an optional capture list could be a good compromise.
That is exactly what I am thinking about. Or, rather, what C++ has done right: you can let the compiler infer what to capture, with `[=]` or `[&]`, or you can explicitly list the variables to capture.
> you need the ability to specify that the "captured" variable ought to be copied rather than actually captured
Yes, that is what I am talking about, and again, what C++ has done right. Most other languages give you no choice whether the capture is by copy or by reference.
> Most other languages give you no choice whether the capture is by copy or by reference.
That's because in languages that have traditionally had GC (i.e., languages in the Lisp tradition or in the ML tradition), the distinction didn't matter. Those languages did not "suffer" from a value/reference dichotomy (e.g., in Scheme, you're literally capturing the variable rather than a copy or reference to the value stored within -- under the hood, that variable might always store a reference for convenience, or it might store a value for performance, but it doesn't matter as it's strictly an implementation detail).
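A rough illustration of "capturing the variable itself", written in Python rather than Scheme purely for convenience (Python closures behave the same way on this point). `make_counter` and its inner functions are made-up names.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count   # rebinding the captured variable, not a copy of it
        count += 1

    def current():
        return count     # reads the same variable that increment() updates

    return increment, current


increment, current = make_counter()
increment()
increment()
print(current())
```

Both closures share one `count`, so `current()` returns 2 after two increments. Whether the implementation stores that variable in a heap cell or somewhere else never shows up in the program's behaviour.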
I'm glad that the C++ committee didn't just dump closures into the language without considering this sort of interaction with other aspects of the language. Without the capture lists, closures in C++ have the potential to really suck. That the explicit capture lists even exist is evidence that they've carefully considered how the new features are going to play with existing characteristics of C++. Kudos to them for that!
That is almost true, but there's one exception in those GC'ed languages due to the dichotomy of value types and reference types. The confusing behavior on capturing the iteration variable is one example.
Ah, yes! You're correct. I spend most of my GC'd time in languages that don't have such a value vs. reference dichotomy, and I'd completely forgotten about it.
Spores seem like an interesting solution. The language designer in me has a distaste for it, though :p
For case 1 (capture of mutable references), an explicit copy operator might be better (as in, "I want whatever value this variable is bound to, rather than the storage location") (or even vice versa, where value is the default and there's an operator for location). In a way, spores accomplish this by forcing you to do the copy manually -- but then programmers have to always remember to use the extra syntax, and they need to do it for every captured variable. I'm not quite happy with even this solution, and it may be possible to come up with something even better. Concurrency is always a can o' worms :)
For case 2 (capture of implicit "this"), I'd argue that if (a) the compiler is smart enough to know that "helper" is implicitly "this.helper" and (b) that "this" will be captured by the closure, then (c) the compiler is also smart enough to create an implicit binding for "helper" and capture that instead. This would lead to less-surprising behavior, and intentional capture of "this" could still be done via explicit access. Another option is to, rather than treating "this" as being in an enclosing scope, treat it as though it were an implicit argument to the method (albeit a covariant one). This avoids capture altogether.
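Here is a hand-written sketch of the rewrite I'm asking the compiler to do for case 2. It is in Python, where `self` is explicit, so it only approximates the implicit-"this" situation; `Owner`, `Helper`, and `owner_survives` are hypothetical names, and the liveness check relies on `weakref` plus CPython promptly freeing unreferenced objects.

```python
import gc
import weakref


class Helper:
    def combobulate(self, x):
        return x * 2


class Owner:
    def __init__(self):
        self.helper = Helper()

    def closure_over_self(self):
        # Captures `self`: the whole Owner stays reachable from the lambda.
        return lambda x: self.helper.combobulate(x)

    def closure_over_helper(self):
        # The binding the compiler could create implicitly:
        # only `helper` is captured, not the enclosing object.
        helper = self.helper
        return lambda x: helper.combobulate(x)


def owner_survives(make_closure):
    owner = Owner()
    ref = weakref.ref(owner)
    f = make_closure(owner)
    del owner            # drop our own reference to the Owner
    gc.collect()
    assert f(21) == 42   # the closure works either way
    return ref() is not None


print(owner_survives(Owner.closure_over_self))
print(owner_survives(Owner.closure_over_helper))
```

In the first case the closure keeps the `Owner` alive; in the second it doesn't, yet both closures compute the same thing. That's the less-surprising behavior I'd want by default.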
Agreed on the copy operator, and not only for spores: I've wanted it more than once in other languages too.
Not sure how the "this" binding should work, though. When calling a method you need to (a) dispatch on the runtime type and (b) provide the instance to the method.
The compiler would essentially emit the same code that it would in the case of the spore, but it would be automatic. You still get to dispatch on the runtime type, because the binding is created after the method invocation, but before the scope of the lambda to be closed.
I think when a programmer writes "foo.combobulate()", the vast majority of the time, they intend to capture "foo". If they didn't and were being clever, I don't think it's unreasonable for the compiler to expect them to be explicit and write "this.foo.combobulate()" instead. In the former case, the compiler creates the implicit binding to capture; in the latter, it does nothing implicit and just closes over "this".
I'm certain that the compiler has enough information to do this, and that it's in accordance with the principle of least surprise ;)
"Explicit is better than implicit". Therefore, I agree with you that explicit closure list, with the ability to copy and reference captured variables, is actually what C++ does right, not wrong.