
I don't think Discord makes any claims that the audio is P2P encrypted. There are legitimate reasons why Discord might be dropping malformed packets that do not indicate they are spying on you (though they may be doing that too).

1) to improve audio quality.

2) to help prevent RCE attacks on the destination client.

3) re-encoding at lower bitrates for low bandwidth clients.

I don't really see the issue here unless Discord claimed they do not decrypt the audio.



3) is most certainly at play here, as Discord allows clients to set their preferred bitrate (RX&TX), which would not be possible in multi-party calls without re-encoding.


Could they not just drop the quality of the whole call down to the lowest bandwidth allowed by a user? I feel that would reduce the computational burden on Discord's end while allowing the lowest client-to-client latency.


Keep in mind that a major use case for Discord is open voice chats (e.g., for gaming groups), not just organized person-to-person calls. Having the quality for a whole chat drop just because someone joined from a mobile phone would be a really disappointing user experience.


This is absolutely something that Discord does, though. I've had friends drop the bitrate slider as low as possible in Discord just to make the whole channel sound awful.


That's a channel-specific option though. You can set per-channel bitrate, but it's not something Discord does automatically to accommodate lower throughput clients.


They could, but if you have 4 people in a call and 3 can receive high bandwidth audio, lowering it just for the 1 person with low bandwidth is the best user experience.

Otherwise people with good networks who have their call quality dragged down will just think Discord's voice chat is bad.
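To make the trade-off in this sub-thread concrete, here is a minimal sketch comparing the two policies: dropping the whole call to the lowest participant's bitrate versus capping each listener separately. The participant names, the caps, and the 96 kbps source bitrate are made up for illustration, and nothing here reflects how Discord actually allocates bitrate.

```python
def lowest_common_bitrate(caps_kbps):
    """Whole-call policy: everyone receives the most constrained participant's bitrate."""
    floor = min(caps_kbps.values())
    return {name: floor for name in caps_kbps}


def per_client_bitrate(caps_kbps, source_kbps):
    """Per-listener policy: each listener gets the source bitrate, capped at their own limit."""
    return {name: min(cap, source_kbps) for name, cap in caps_kbps.items()}


# Hypothetical 4-person call: three good connections, one constrained one.
caps = {"alice": 128, "bob": 128, "carol": 128, "dave": 32}

print(lowest_common_bitrate(caps))
print(per_client_bitrate(caps, source_kbps=96))
```

Under the whole-call policy all four listeners end up at 32 kbps; under the per-listener policy only the constrained one does, at the cost of the server having to produce (or select) more than one version of the stream.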


Wouldn’t it be possible to send a stream-cipher-encrypted packet of audio to a hub server where the codec has a “progressive” decoding mode? If I’m not mistaken, Opus can already do something like this. That way a client could set their desired bitrate and the server would truncate the packets before passing them on, with no need to re-encode them. This only works with a stream cipher though.

Maybe an audio engineer or cryptographer could chime in?
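A rough sketch of why the stream cipher matters for this idea: with a keystream XOR, a truncated ciphertext decrypts to the matching prefix of the plaintext, so a relay could shorten packets without holding the key. The keystream below is a toy construction (SHA-256 in counter mode) used only for illustration, the "base layer / enhancement layer" packet layout is hypothetical, and the sketch ignores authentication: a tag computed over the full packet would no longer verify after truncation.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a counter. Not a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing data with the keystream (same operation both ways)."""
    ks = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


key = b"k" * 32
nonce = b"n" * 12

# Hypothetical layered packet: the first 10 bytes are decodable on their own.
packet = b"BASE-LAYER|ENHANCEMENT-LAYER"
ciphertext = xor_stream(key, nonce, packet)

# The relay never sees the key; it just drops the tail of the ciphertext.
truncated = ciphertext[:10]

# The receiver decrypts the shortened packet and recovers the plaintext prefix.
recovered = xor_stream(key, nonce, truncated)
assert recovered == packet[:10]
```

A block cipher in a chained mode would not allow this, since cutting the ciphertext at an arbitrary byte breaks decryption of the final block.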


It looks like Ogg Vorbis has theoretical support in the spec for something called "Bitrate Peeling"[1], but there is no functional implementation for this yet, and there's been an open bounty on it since 2004.[2]

This is a really neat idea though. Truncating the packet to change the bitrate per client without re-encoding.

[1]: https://en.wikipedia.org/wiki/Bitrate_peeling

[2]: https://wiki.xiph.org/index.php?title=Bounties&diff=196&oldi...


Peeling has been a goal for many audio/video codecs in the past. Nobody's been able to make it work acceptably, though -- either the low-bitrate version sounds awful, or the high-bitrate version increases in size to the point that it might as well just have a low-bitrate version alongside it.


So in that case you could peel off whichever stream you don’t need, but you waste some upload bandwidth over simply negotiating a bitrate in advance.


The channel bitrate is just a client hint as far as I know (at least for now).

Bots are able to send whatever bitrate they want to the channel, and other clients receive it as is.

The server simply relays Opus data without re-encoding it.


> Discord allows clients to set their preferred bitrate (RX&TX)

Where? All I see is setting bitrates on channels, but not on the client/app as a whole.



That's from 2017; it is entirely possible that was true in 2017 but not now.


Discord specifically stated that they will not implement e2ee so they can continue to spy on their users, because “Think of the children!”

https://www.reddit.com/r/discordapp/comments/8nzb5d/why_is_d...


In their Terms and Agreements it's clearly stated that all data created is owned by Discord and can be used for commercial purposes.


3) costs tens of megahertz of one CPU core per re-encode, clearly out of the question with today's norm being ~4 GHz 8-core CPUs. That, or they are spyi^^^ recording everything for 'metrics/analytics'.
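Taking the figures in this comment at face value, a quick back-of-the-envelope calculation shows how many simultaneous re-encodes one such machine could sustain. The 30 MHz per re-encode value is an assumption standing in for "tens of megahertz", and the function name is made up for this sketch.

```python
def concurrent_reencodes(cores: int, core_mhz: int, mhz_per_reencode: int) -> int:
    """Rough upper bound on simultaneous re-encodes, ignoring all other overhead."""
    return (cores * core_mhz) // mhz_per_reencode


# 8 cores at ~4 GHz each, with an assumed 30 MHz per re-encoded stream.
print(concurrent_reencodes(cores=8, core_mhz=4000, mhz_per_reencode=30))  # 1066
```

On those numbers a single server handles on the order of a thousand streams, which is the point the comment is making: CPU cost alone does not rule out re-encoding.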


CPU time is irrelevant for re-encoding due to bandwidth reasons though. If a client requests 64kbps voice, sending it packets at 128kbps and letting it re-encode once it has them is pointless.


4) Maybe they are doing remuxing when multiple clients are on the same chat? Would definitely lower bandwidth.


ToS certainly allows it.



