Reviewing Microsoft's Automatic Insertion of Telemetry into C++ Binaries (infoq.com)
112 points by osopanda on June 9, 2016 | 33 comments


People commenting here apparently haven't read the comment @cremno linked to:

> Lots of things inside Windows emit ETW events, which is Windows equivalent of DTrace (basically the entire OS, and .NET), it's super useful for debugging performance related events. It's not "Telemetry" like Google Analytics, it's for _you_ to debug your own programs. The easiest way to view the output of them is via WPA, you can watch some videos about it at https://msdn.microsoft.com/en-us/library/windows/hardware/hh...

https://news.ycombinator.com/item?id=11652077


>> Lots of things inside Windows emit ETW events, which is Windows equivalent of DTrace

It'd be less wrong to say "like Windows equivalent of syslog".

ETW is nowhere near as sophisticated as DTrace. ETW is more or less a simple logging subsystem, it's not actually modifying any code in memory to dynamically (that's what that D is for) insert instrumentation.

So if something is not compiled to emit ETW events, ETW can't trace it at all.

DTrace can trace anything that runs on the system. No need for binaries to emit any events.


> it's not actually modifying any code in memory to dynamically (that's what that D is for) insert instrumentation.

First of all, it is impossible to instrument a random binary on the fly without private symbols. Even otherwise, DTrace requires instrumentation to be inserted in user code. Without it, it will only get data from built in probes in the OS libraries/kernel/etc. (Same as ETW)

>So if something is not compiled to emit ETW events, ETW can't trace it at all.

This is false. I don't know if you've ever used ETW but I use it regularly to run traces on binaries that don't have any ETW specific instrumentation.


> First of all, it is impossible to instrument a random binary on the fly without private symbols.

That's not true. You can use DTrace instrumentation on any binaries. It's more effort, of course, to figure out what address is what. You can work backwards from system calls to get some reference point about target process data structures. And of course trace obvious things such as function entry points, particular instructions, etc.

Also other dynamic binary tracing frameworks can do tracing without symbols.


>> It's more effort, of course, to figure out what address is what.

Um.. at that point you're pretty much making your own symbols. So I don't know what your point is.


Is it a phonebook when someone finds out and scribbles down phone numbers for a few people?


I don't think you get it. Getting the address of the function call and capturing the entire call stack of a running process is trivial. It has been possible to do this for decades on all platforms where the ABI is known. Knowing what address belongs to which function, knowing the type of the parameters, return values, etc, requires a ton of effort. But I could be wrong. Can you point to any common usecase of dtrace where people use it when no debugging symbols and no providers are present in the user code and everything has to be reverse engineered? I'm not going to bother with inlined functions and the like - which would be unfair for any tool to automatically know about.
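
To make the "making your own symbols" point concrete, here is a minimal sketch of what address-to-name resolution looks like when the symbol table is built by hand. The addresses and the `sub_*` names are hypothetical, standing in for whatever reverse engineering would recover; this does not show how DTrace or ETW resolve symbols.

```python
import bisect

# Hypothetical function start addresses, recovered by hand from a stripped binary.
SYMBOLS = [
    (0x401000, "sub_parse_args"),
    (0x401200, "sub_read_file"),
    (0x401800, "sub_main_loop"),
]
STARTS = [addr for addr, _ in SYMBOLS]


def symbolize(address):
    """Map an address to 'name+offset' using the nearest preceding function start."""
    i = bisect.bisect_right(STARTS, address) - 1
    if i < 0:
        return hex(address)  # below the first known function: leave it raw
    start, name = SYMBOLS[i]
    return f"{name}+{address - start:#x}"


# A captured call stack is just a list of raw addresses until it is symbolized.
stack = [0x401234, 0x401810, 0x400f00]
frames = [symbolize(a) for a in stack]
```

Capturing the raw `stack` is the easy part. The `SYMBOLS` table is where the effort goes, and without function sizes the lookup can still attribute an address to the wrong function.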


> I use it regularly to run traces on binaries that don't have any ETW specific instrumentation.

Then those binaries are making use of other binaries which do emit ETW events -- including, possibly, binaries that ship with Windows itself. ETW is basically a pub-sub system. If there isn't code that explicitly sends an event, then there's no event. (And if there aren't currently any listeners subscribed, then the cost of emitting the event is pretty small.)
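
A toy model of the pub-sub behaviour described here, as a rough sketch only. It is not the ETW API: `Provider`, `Session`, and their methods are hypothetical names chosen for illustration.

```python
class Session:
    """A listener that collects events from the providers it has enabled."""

    def __init__(self):
        self.events = []

    def enable(self, provider):
        provider.sessions.append(self)


class Provider:
    """An event source; code must call write() explicitly for an event to exist."""

    def __init__(self, name):
        self.name = name
        self.sessions = []  # listeners currently subscribed

    def enabled(self):
        # Cheap check that lets callers skip work when nobody is listening.
        return bool(self.sessions)

    def write(self, event, **fields):
        if not self.enabled():
            return False  # no listener: the event is dropped at little cost
        for session in self.sessions:
            session.events.append((self.name, event, fields))
        return True


def instrumented(provider):
    provider.write("Start")
    result = sum(range(10))
    provider.write("Stop", result=result)
    return result


def uninstrumented():
    return sum(range(10))  # same work, but no write() calls, so no events


provider = Provider("Demo-Provider")
instrumented(provider)   # nobody is listening yet: nothing is recorded

session = Session()
session.enable(provider)
instrumented(provider)   # two events reach the session
uninstrumented()         # nothing is emitted, listener or not
```

In this model the session only ever sees what some code path chose to write, which is the claim being made about uninstrumented binaries that still show up in traces.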


I know what ETW is and how it works. I don't know what your point is, sorry.


Sometimes conversations are less like arrows and more like trees -- few points, lots of branches.


I think his point was: you really should read about dtrace, esp. how to instrument user programs (without instrumentation).


Err, if you already know the memory map of the functions inside a process image, then understanding the call stacks is trivial for any debugging or performance tool - on any platform. This does not require any instrumentation. And if you're reverse engineering, then dtrace is the wrong tool anyway. So, can _you_ point to any common use case of dtrace where the probes aren't already present in the kernel/libs/user code?


Bruce Dawson's UIforETW makes it easy to record ETW traces.

https://randomascii.wordpress.com/2015/04/14/uiforetw-window...


Typical Microsoft. They were (or maybe are still) in the telemetry-all-the-things craze so what can be more logical than adding an event called "telemetry" to every binary produced with their compiler. In the words of Raymond Chen, "I bet somebody got a bonus for that feature." But as MS's Carrol is quoted in the fine article:

"We haven’t actually gone through this full exercise with any customers to date though, and we are so far relying on our established approaches to investigate and address potential problems instead."

Even if that ETW event isn't "telemetry" itself, it could, of course, be used by some hypothetical telemetry code built on the ETW infrastructure. Which is not bad in itself. It's just an event, independent of who uses it.

The main question is, of course, is there such code ("real" telemetry using ETW) in Windows, or is ETW used only for debugging, that is, only by developers.

https://msdn.microsoft.com/en-us/library/windows/desktop/aa3...

My guess is: ETW is already used for some real telemetry: its model seems to allow this.

Technical details of what the functions do:

https://www.reddit.com/r/cpp/comments/4hoyzr/msvc_mutex_is_s...

And the explanation of the ETW by xon_xoff:

https://www.reddit.com/r/cpp/comments/4hoyzr/msvc_mutex_is_s...

"ETW is a general mechanism to log any kind of event, not just performance events, and is used throughout Windows for more than just profiling. Furthermore, it supports both multiple simultaneous consumers and storage in .etl files for later processing. Any program with sufficient privilege can enable tracing of specific event types throughout the system, and user intervention is not required to do so. An example is an automatically generated file called ExplorerStartupLog.etl in the AppData\Local\Microsoft\Windows\Explorer folder. These files being generated locally doesn't mean they can't be transmitted later, and some problem reporting tools use ETW+ETL files to efficiently capture telemetry for upload."


I think it is not true in this case. The thing that does telemetry in Windows is called SQM AFAIK. I think it might use ETW sometimes, but this is only one thing ETW can be used for.


I'm quite sure Windows uses ETW in various places: searching for etl files, I find 136 on my C: disk, and all are generated by the system, and not due to my debugging actions. E.g.

    C:\ProgramData\Microsoft\Diagnosis\ETLLogs\AutoLogger\AutoLogger-Diagtrack-Listener.etl
    C:\Users\myname\AppData\Local\Microsoft\Windows\Explorer\ExplorerStartupLog.etl
    C:\Users\myname\AppData\Local\Microsoft\Windows\Explorer\ExplorerStartupLog_RunOnce.etl
    C:\Users\myname\AppData\Local\Microsoft\Windows\SkyDrive\logs\app\SkyDriveApp-2015-08-30.2157.4952-1.etl
    C:\Users\myname\AppData\Local\Packages\microsoft.windowscommunicationsapps_8wekyb3d8bbwe\LocalState\LiveComm.etl
    C:\Users\myname\AppData\Local\Packages\microsoft.windowscommunicationsapps_8wekyb3d8bbwe\LocalState\LiveCommLast.etl
    C:\Users\myname\AppData\Local\Packages\microsoft.windowscommunicationsapps_8wekyb3d8bbwe\LocalState\Microsoft.WindowsLive.Mail.etl
    C:\Users\myname\AppData\Local\Packages\microsoft.windowscommunicationsapps_8wekyb3d8bbwe\LocalState\Microsoft.WindowsLive.People.etl
    C:\Users\myname\AppData\Local\Packages\Microsoft.WindowsReadingList_8wekyb3d8bbwe\LocalState\StashApp.etl
    C:\Users\myname\AppData\Local\Packages\Microsoft.WindowsReadingList_8wekyb3d8bbwe\LocalState\StashShare.etl
    C:\Windows\Inf\netcfgx.4.etl
    C:\Windows\Logs\PBR\RjvTrace_Configure.etl
    C:\Windows\Logs\SystemRestore\restore.0.etl
    C:\Windows\System32\wdi\{86432a0b-3c7d-4ddf-a89c-172faa90485d}\{fef283f0-94cd-4a98-a8e7-4fd5e3eca941}\snapshot.etl
    C:\Windows\System32\wdi\LogFiles\BootCKCL.etl
    C:\Windows\System32\wdi\LogFiles\SecondaryLogonCKCL.etl
    C:\Windows\System32\wdi\LogFiles\ShutdownCKCL.etl
    C:\Windows\System32\wfp\wfpdiag.etl

If we tracked everything that their various telemetry mechanisms upload, I'm quite sure .etl files would be somewhere in there. Or that their number rises with the installation of the KB updates that mention telemetry (everything related to "prepare your system for Windows 10," for example). Or that they really planned to use .etl files for these purposes too but just haven't yet. But .etl files are widely used; they are simply effective logs provided at the system level.

Everybody can check for himself too.
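
For anyone who wants to repeat the search, a minimal sketch, assuming Python is installed and that the account running it can read the directories of interest. `find_etl_files` is a hypothetical helper name; unreadable directories are silently skipped, which is the default behaviour of `os.walk`.

```python
import os


def find_etl_files(root):
    """Yield paths of files ending in .etl (any letter case) under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".etl"):
                yield os.path.join(dirpath, name)


# Example use on Windows:
#     for path in find_etl_files("C:\\"):
#         print(path)
```

Run without administrator rights, the count will likely be lower than what an elevated search finds, since protected system directories are skipped.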


The point is that the particular ETW logs in question in this case are unlikely to be in the list.


Personally I don't see that they invade privacy more than they can with their other information gathering methods. I do, however, believe that the names weren't given randomly but really related to some telemetry they planned, even if unfinished.


Foremost, we don't know what it does and we don't know what it will do or if there are any associated security vulnerabilities.

Writing to a log might introduce performance problems as it is a serialized resource.


1) I want to control what my binaries look like.

2) The article itself essentially makes the same point - the problem is unneeded debug statements are being inserted into release builds. The article mentions that Microsoft promises to fix this.

3) Sure, this sounds harmless, and one shouldn't attribute to bad intentions what can be explained by incompetence. BUT, big but, in Internet security we've seen many situations where chains of "unnecessary but harmless" stuff, incompetently inserted, resulted in horrible things (Shellshock was "it's standard and harmless to get this functionality by shelling out, what could possibly go wrong", etc.).


They could probably get away with keeping it but turning it into an off by default feature. Renaming the linker object file into telemetry.obj (or something entirely different) to enable it. It seems as though it wasn't really used though so I guess with the public outcry they just decided to axe it and be done with it.



Who cares if it's old news. It's important nonetheless. Damn news addicts. :)


you know what makes the news overwhelming to digest? choosing to drink from the mouth of the firehose, and depending on repeating stories to see them, instead of a clean historical record to pore over.

there would be a lot less news if people went back and read what they missed, instead of waiting for it to get brought up again.


Old to you, new to me... I'm one of the 10,000

https://xkcd.com/1053/


Something from 1 month ago isn't "old" in the HN context. Heck we often discuss stuff years after it happened.


It's new news that they are now removing it.

But even while they changed their minds, with a display of lack of judgment like this, who will trust their compilers anymore?


> who will trust their compilers anymore?

Me.

Microsoft has some real issues with regards to privacy, security, and performance. An ETW event on main entry/exit is none of the above.

It's worth yanking from Microsoft's POV because "Telemetry" is a big bad scary word, but from my POV it's a pointless distraction from actual privacy, security, and performance concerns.

C'est la vie...


No, it's also old news that they're removing it, considering they announced that a whole three days after people started complaining.


>new news

The linked comment was posted 29 days ago. That's not really new but then again some people might not be aware of it or the whole thing.



I can't be the only one who finds it a bit (actually very) scary that with just a little wire brushing, things like this are uncovered. When I also read about things like MS's secretive constant-calling-home and Intel's secretive IME (the "ultimate rootkit") I get very worried about the real state of personal privacy nowadays.


So, it has come to this: https://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thomp... (Ken Thompson "Reflections on Trusting Trust") in real life...



