If I need to publish something and have it remain completely anonymous, I convert it to ASCII text. That is one of the few formats I understand well enough to be CERTAIN that it is free of metadata.
Now all I have to worry about is how I can anonymously publish it, text style analysis, and how to include diagrams without resorting to ASCII drawings.
It's not foolproof, but that does require that you have released a sufficiently large sample of LaTeX source under your real name (or something that can be traced to your real name) for comparison. Additionally, if you really want to release something typeset like that, but you're worried about fingerprinting, my guess is that it's not that hard to deliberately change your personal style by using a minimal set of packages and when necessary restricting yourself to the most popular packages. Even if your new personal style still ends up unique, it wouldn't match the fingerprint of anything you've released under your own name.
Yeah well that's not me... I've been on BBSes since 1988 and got my first proper internet shell account in 1991 when, incidentally, I was 13.
The reason I made the comment is because it rubbed me the wrong way to cite 13-year-old girl talk as prototypical gibberish. Are 13-year-old girls generally excitable and potentially annoying? Yes. Do they talk in gibberish as a matter of course? No, that's a thoughtless sexist trope.
And I'll place a blind bet that they text faster than you.
I wonder if there is a tool to obfuscate text to prevent analysis. One way I could think of is to use any online translation tool to translate to a foreign language and then back to English, and then fix the grammar slightly. The structure of the text should hopefully be different enough that no tool or human could recognize it as your style. Is there an easier way to do this?
One of the methods they mentioned is machine translation, but they found that it wasn't terribly useful. It's a really neat talk and I highly recommend it. They also wrote some software to anonymize texts (Anonymouth) and their stylometry software (JStylo) is also freely available:
There is some free software available[0] to do stylometry analysis. And some software which purports to assist in anonymizing writings[1]. I've not really played around with either, so I can't speak to their ease of use and/or effectiveness. But it's at least somewhere to start.
Yeah, some work has been done in this area, but I only know about it in passing via a mention in one of Jacob Appelbaum's talks. This seems like it might be a decent place to start: https://www.youtube.com/watch?v=-b0Ta9h62_E
Reminds me of the concept of a "Google Translate fixed point", i.e. you translate back and forth between English and a foreign language until the translation stops changing.
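For what it's worth, the fixed-point idea is easy to sketch. The snippet below is only an illustration: `round_trip_fixed_point` is a name I made up, and `to_foreign` / `to_english` are hypothetical stand-ins for whatever translation service you would actually call (here they are toy string replacements, not real translators). The round cap is there because a round trip could in principle cycle without ever settling.

```python
from typing import Callable, Tuple


def round_trip_fixed_point(
    text: str,
    to_foreign: Callable[[str], str],
    to_english: Callable[[str], str],
    max_rounds: int = 20,
) -> Tuple[str, bool]:
    """Translate back and forth until the English text stops changing.

    Returns (final_text, converged). Both translator callables are
    assumptions supplied by the caller; nothing here talks to a real service.
    """
    for _ in range(max_rounds):
        result = to_english(to_foreign(text))
        if result == text:
            return text, True   # another round trip changes nothing
        text = result
    return text, False          # gave up: no fixed point within max_rounds


# Toy stand-ins for a translator (hypothetical, for illustration only).
def toy_to_foreign(s: str) -> str:
    return s.replace("big", "grand").replace("large", "grand")


def toy_to_english(s: str) -> str:
    return s.replace("grand", "large")


final_text, converged = round_trip_fixed_point(
    "a big large house", toy_to_foreign, toy_to_english
)
```

With the toy translators the text settles after one change, since both "big" and "large" collapse to the same foreign word and come back as "large".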
Don't be too sure about an ASCII text file either; it may still contain a bit of metadata like the BOM (although then technically it's not pure ASCII but rather UTF, it still means you have to check the encoding used by your favourite text editor).
In the case of UTF-8, the BOM isn't information at all: it's always the same 3 bytes (an endianness mark is meaningless in a byte-based encoding). And since it is added by the default Windows plain text editor, its fingerprinting usefulness is rather limited.
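If you'd rather check than trust your editor, something like this rough sketch would do. The helper names `detect_bom` and `non_ascii_offsets` are mine, not from any tool mentioned in this thread; the BOM constants come from Python's standard `codecs` module. Note that the UTF-32 marks are tested before the UTF-16 ones, because the UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.

```python
import codecs
from typing import List, Optional

# Order matters: UTF-32 LE (FF FE 00 00) starts with the UTF-16 LE mark (FF FE).
BOMS = [
    ("UTF-8", codecs.BOM_UTF8),
    ("UTF-32 LE", codecs.BOM_UTF32_LE),
    ("UTF-32 BE", codecs.BOM_UTF32_BE),
    ("UTF-16 LE", codecs.BOM_UTF16_LE),
    ("UTF-16 BE", codecs.BOM_UTF16_BE),
]


def detect_bom(data: bytes) -> Optional[str]:
    """Return the name of the byte order mark at the start of data, if any."""
    for name, bom in BOMS:
        if data.startswith(bom):
            return name
    return None


def non_ascii_offsets(data: bytes) -> List[int]:
    """Return the offsets of every byte outside the 7-bit ASCII range."""
    return [i for i, b in enumerate(data) if b > 0x7F]


sample = codecs.BOM_UTF8 + b"hello"
bom_name = detect_bom(sample)
offsets = non_ascii_offsets(sample)
```

For a real file you would read it in binary mode, e.g. `open(path, "rb").read()`, and treat any non-empty result from `non_ascii_offsets` as a sign that the file is not pure ASCII.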
I think you'd be okay with any format, such as HTML, that you can eyeball in a plain text editor; you just need to eschew binary formats. So diagrams in SVG in preference to PNG?
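One caveat on SVG: being readable in a text editor doesn't mean it's clean, since the format has a `<metadata>` element that drawing programs may fill in. Here is a minimal sketch, assuming you only care about that one element and that the file uses the standard SVG namespace; `strip_svg_metadata` is a name I invented, and this is no substitute for actually reading the file. As a side effect, the default `xml.etree.ElementTree` parser also discards XML comments.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def strip_svg_metadata(svg_text: str) -> str:
    """Return the SVG with every <metadata> element removed."""
    # Keep the SVG namespace as the default one when serializing.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)
    metadata_tag = f"{{{SVG_NS}}}metadata"
    # Snapshot the element list first so removals don't disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == metadata_tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


example = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    "<metadata>author: me</metadata>"
    '<rect width="1" height="1"/>'
    "</svg>"
)
cleaned = strip_svg_metadata(example)
```

You'd still want to eyeball the result for editor-specific attributes, embedded images and the like, which this sketch does not touch.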