> So one could say golang avoided gc for those 100 objects it didn't need in the first place.
That's a reduction of the number of allocations, not of GC. In a generational copying GC like that of Java, the GC runs a minor collection when you've allocated enough bytes in the nursery to fill it up; it doesn't matter whether those bytes came from 1 object or from 101 objects. From reading Go's documentation [1] it's unclear whether its GC collection cycle is triggered based on the number of objects or whether it's triggered based on memory usage, but I assume the latter as that's how most GCs work.
The pressure on the GC during the mark phase is based on the number of total pointers. You may be able to reduce the total number of pointers by packing pointer-free structs together, but I would be surprised if this helps mark performance that much in practice.
The main way to really reduce GC pressure in a fully GC'd system, short of improving the GC itself, is to use pooling. Which both Java and Go programmers do regularly.
Golang case: 1 object, no child nodes. Nothing to do. Code that does not need to run is the fastest.
JVM case: visit array object and its 100 child nodes. JVM's GC still needs to visit all nodes of the graph. What actually takes time in GC is all those L1/L2/TLB misses and extra page faults caused by following the object graph. If every object happens to sit on a different cache line, the L1 data cache overflows after just 512 references on Intel Sandy Bridge, Ivy Bridge, Haswell, etc. (32 KB of L1 data cache divided by 64-byte lines; sooner in reality). Those extra loads from L2 are not free.
So, in this case Golang needed to visit 1 struct (object). JVM needed to visit 101 objects.
I wasn't talking about object lifetimes at all. I was talking about memory layout, and pointing out that in golang you can have the objects laid out sequentially in memory without any pointer indirection (references).
Thus generational GC doesn't have anything to do with this. Gen GC is just something nice to have when a limited call graph generates a lot of temporary objects, i.e. garbage.
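To make the layout point concrete, here is a minimal sketch. The type names (`Point`, `Inline`, `Boxed`) and helper functions are made up for illustration; they are not taken from any library. `Inline` stores 100 structs by value in one contiguous block with no pointers inside it, while `Boxed` stores 100 pointers to separately allocated structs, which is roughly the shape a Java array of objects has.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Point is a hypothetical pointer-free struct, used only for illustration.
type Point struct {
	X, Y int64
}

// Inline holds 100 Points by value: one block of memory, no interior pointers.
type Inline struct {
	Points [100]Point
}

// Boxed holds 100 pointers: every Point is a separate object that a
// tracing collector has to reach by following a reference.
type Boxed struct {
	Points [100]*Point
}

// NewBoxed allocates the array plus 100 individual Points.
func NewBoxed() *Boxed {
	b := &Boxed{}
	for i := range b.Points {
		b.Points[i] = &Point{X: int64(i), Y: int64(i)}
	}
	return b
}

// InlineContiguous reports whether consecutive elements of in.Points
// are exactly one Point-size apart in memory.
func InlineContiguous(in *Inline) bool {
	size := unsafe.Sizeof(Point{})
	for i := 1; i < len(in.Points); i++ {
		prev := uintptr(unsafe.Pointer(&in.Points[i-1]))
		cur := uintptr(unsafe.Pointer(&in.Points[i]))
		if cur-prev != size {
			return false
		}
	}
	return true
}

func main() {
	in := &Inline{}
	b := NewBoxed()
	fmt.Println("inline struct size in bytes:", unsafe.Sizeof(*in))
	fmt.Println("boxed struct size in bytes (pointers only):", unsafe.Sizeof(*b))
	fmt.Println("inline elements contiguous:", InlineContiguous(in))
}
```

Whether this layout difference shows up in measured mark times is exactly what is disputed in this thread; the sketch only shows the difference in object count and indirection.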
I mentioned this in the second paragraph. Like I said, I'd be surprised if that helps mark times that much in practice. For minor collections in a generational GC you're typically doing a Cheney scan, so it's very unlikely to matter as you're copying the whole live region of the semispace anyhow. For major collections on tenured objects, in theory it could help, but again I'm skeptical that it will affect mark performance that much, because compaction does an excellent job of mitigating the cache effects. (It's impossible to accurately measure this stuff right now, as the fact that Go's GC is much more immature than the HotSpot GC will skew the numbers.)
Here's an explanation from Russ Cox (in the form of an SO answer) of how Go and Java differ in terms of object allocations and control of memory layout: http://stackoverflow.com/a/22214673/1567738
I'm sure there's more info in the golang-dev group (https://groups.google.com/forum/#!forum/golang-dev) related to the GC specifics, but it's a moving target and may change substantially in the next few versions.
> The main way to really reduce GC pressure in a fully GC'd system, short of improving the GC itself, is to use pooling.
It occurs to me that I don't understand how pooling helps a non-generational GC like golang's. In a generational collector, the contents of the pool would be promoted to regions of GC memory that are copied less. Go's GC isn't generational, so what is going on?
If you use a pool, you can explicitly return memory to it without going through the GC. This causes mark/sweep cycles to occur less often. (Of course, using pools opens you up to use-after-free and memory leaks, albeit without the type- and memory-safety consequences of use-after-free in C/C++.)
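A minimal sketch of that idea, assuming a fixed-size free list built on a buffered channel. `BufferPool` and its methods are hypothetical names invented for this example, not a standard API. Buffers handed back with `Put` are reused by later `Get` calls instead of becoming garbage, so fewer fresh allocations count toward the next collection.

```go
package main

import "fmt"

// BufferPool is a hypothetical fixed-capacity free list of byte buffers.
type BufferPool struct {
	free chan []byte
	size int
}

// NewBufferPool creates a pool that keeps at most capacity idle buffers,
// each of length size.
func NewBufferPool(capacity, size int) *BufferPool {
	return &BufferPool{free: make(chan []byte, capacity), size: size}
}

// Get reuses an idle buffer when one is available and allocates otherwise.
func (p *BufferPool) Get() []byte {
	select {
	case b := <-p.free:
		return b
	default:
		return make([]byte, p.size)
	}
}

// Put hands a buffer back to the pool. If the pool is already full the
// buffer is dropped and left for the GC. The caller must not use buf
// after Put: that is the pool's version of use-after-free.
func (p *BufferPool) Put(buf []byte) {
	if cap(buf) < p.size {
		return // too small to reuse; let the GC reclaim it
	}
	select {
	case p.free <- buf[:p.size]:
	default:
	}
}

func main() {
	pool := NewBufferPool(4, 1024)

	a := pool.Get()
	a[0] = 42
	pool.Put(a)

	// b reuses a's backing array, including its stale contents.
	b := pool.Get()
	fmt.Println("same backing array:", &a[0] == &b[0])
	fmt.Println("stale byte still visible:", b[0])
}
```

Note that the reused buffer still holds the old contents: the program stays memory-safe, but forgetting that a buffer was returned gives you logic bugs rather than crashes.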
> If you use a pool, you can explicitly return memory to it without going through the GC. This causes mark/sweep cycles to occur less often.
In a non-generational collector, why less often? Is it that the GC "sees" less garbage, and this figures into the frequency of mark/sweep cycles?
EDIT: Okay, I just got it. If you assume it's a bump allocator, it's easiest to picture. So Go does have another big advantage with regard to reducing GC pressure: one can stack-allocate structs and arrays. (Most often one would be passing slices of those arrays, and stack allocation is slightly less flexible, of course.)
Right. See the documentation here [1]. Typically GCs run on allocation when they see a high ratio of the live set from the previous run to total memory in use.
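For reference, the knob documented at [1] is `debug.SetGCPercent`, which sets the target percentage of freshly allocated data relative to the live data left after the previous collection and returns the previous setting. A small sketch of using it follows; `WithGCPercent` is a hypothetical helper written for this example, and 200 is an arbitrary value.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// WithGCPercent is a hypothetical helper: it runs fn with the GC target
// percentage set to pct, restores the previous setting afterwards, and
// returns that previous setting.
func WithGCPercent(pct int, fn func()) (previous int) {
	previous = debug.SetGCPercent(pct)
	defer debug.SetGCPercent(previous)
	fn()
	return previous
}

func main() {
	var sink [][]byte
	prev := WithGCPercent(200, func() {
		// With a higher percentage the heap may grow further before the
		// next collection starts, so collections happen less often
		// for the same allocation rate.
		for i := 0; i < 1000; i++ {
			sink = append(sink, make([]byte, 1024))
		}
	})
	fmt.Println("previous GC percent:", prev)
	fmt.Println("buffers allocated:", len(sink))
}
```

Raising the percentage trades memory for fewer collections; it does not change how much work each mark phase does per live pointer.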
[1]: http://golang.org/pkg/runtime/debug/#SetGCPercent