I am in the process of abandoning Cloudfront, because they have a serious bug when serving video files. They serve HTTP 206 (Partial Content, the response to a Range GET) with an HTTP/1.0 status line, but 206 didn't exist in HTTP 1.0. Chrome and Firefox treat this as "uncacheable", thus media assets bypass the local media cache.
Depending on the nature of the content, Cloudfront is not usable for video, particularly the kind where people seek around a bunch, like instructional videos and tutorials. Also, people with slow connections expect the video to buffer while paused. This doesn't happen if you serve the videos from cloudfront.
If that is the only problem, flipping one bit in every response seems like a really simple solution. Why hasn't Cloudfront fixed it yet? Do they know about this?
Yes, they know about it. Below is the response from Amazon. Their logic is that since it is broken in an old version of Squid, it is fine for it to be broken on Cloudfront.
While we are aware of the issue with range request HTTP/1.0 206 responses and Chrome, we cannot provide an ETA for a fix. Since this issue is specific to range requests, an immediate workaround is to disable range requests on your origin server if this is possible for your use case.
It is also worth mentioning that multiple web proxy and cache application vendors have been using HTTP/1.0 as a de facto standard for many years, so you will probably sporadically get similar reports from your end users using Chrome, but not other browsers such as Firefox or Safari. For example, here is a discussion between a Chrome developer on the mailing list for the popular Squid web cache about a similar report:
http://www.squid-cache.org/mail-archive/squid-dev/201204/011...
I am not saying that always returning HTTP/1.0 will stick around forever, but it is fairly common in real world situations today.
Why would Cloudfront cause the player to stop buffering the video? Does it happen with an HTTP Download distribution, or the RTMP streaming distribution? I've used both CloudFront and Akamai for serving video, and I don't remember running into those issues. I have only tested it with normal HTTP Download distributions, however.
Cloudfront serves an invalid HTTP response, thus Chrome/Firefox refuse to cache the data. The theory is, it is better to not risk caching bad data. Chrome will still play the video served from Cloudfront, but it won't write any of the data to the media cache. Thus when you pause and buffer, it will only buffer a few seconds of video in the current playback window. Since it never writes to the cache, buffering must stop.
To see this in action, play a video with chrome://media-internals/ open.
A video served from S3 will get saved to the media cache, and the full video can buffer when paused. A video served from cloudfront will only buffer a few seconds of video.
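For anyone who wants to check this outside the browser, here is a minimal sketch that sends a Range request and reports which HTTP version the 206 came back with. The function names and the 100-byte range are my own choices for illustration, and it only inspects the status line; it says nothing about what a particular browser will do with the response.

```python
import http.client
from urllib.parse import urlsplit


def describe_range_response(status: int, version: int) -> str:
    """Classify a reply to a Range request.

    `version` follows http.client's convention:
    10 means HTTP/1.0 and 11 means HTTP/1.1.
    """
    if status != 206:
        return "no partial content"
    if version < 11:
        return "206 over HTTP/1.0 (the problem case)"
    return "206 over HTTP/1.1"


def probe(url: str) -> str:
    """Request the first 100 bytes of `url` and classify the reply."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"Range": "bytes=0-99"})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return describe_range_response(resp.status, resp.version)
    finally:
        conn.close()
```

Call `probe()` once with the S3 URL of a video and once with the Cloudfront URL of the same file, then compare the two strings.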
Update:
It is probably just easier to see for yourself:
Here is a video served from S3. Pause the video in chrome, and you will see the whole video gets buffered:
Oh, is that why my Coursera videos suddenly won't buffer for more than a few seconds? I figured it was an optimization, like with YouTube, so they don't waste bandwidth loading videos for people who have the tabs open but won't watch them.
I don't think there is one. You can't even compile a custom Chromium because it doesn't have h264 codecs. I even spent a day reading the Chrome source code, hoping there was some combination of headers that could trick Chrome into using the media cache, but I didn't find anything. I probably should have just spent the day finding a better CDN.
Thanks for the info, that's very informative. Can you recommend any CDNs that work well with online video? It would be great to find one that is as easy to configure as cloudfront that doesn't have this issue.
I'm not a big fan of Akamai's control panel -- setting up new distributions is way too complicated (albeit more flexible), and configuration changes were taking almost a day to propagate.
I haven't had the time to find another vendor, thus can't recommend one.
Amazon promised a fix, but then backtracked. Their explanation was something along the lines of "an old version of Squid has the same problem, thus it is okay". I cannot comprehend the logic system they employ to think that is a good reason to live with the bug.
Chrome on iOS is probably using Apple's AVPlayer, which makes sense because there is probably no other way to access the dedicated decoding hardware. Apple is not afraid to cache an HTTP 1.0 206 response.
I need a CDN that supports Range Requests correctly. I want to use S3 as the origin. I need an edge location in Australia. Also, there can be no 301/302 HTTP redirects, thus I have eliminated Google as a potential offering. My current spend is around $1000 a month, so it is a tiny account.
Don't have an edge location in Australia currently, but our Hong Kong POP could work with reasonable latency - we can test that. Let's discuss offline.
You're doing something horribly wrong. I work for a live streaming company and we make extensive use of Varnish. It can probably solve the problem you're describing.
I'm not doing a damn thing wrong other than using Cloudfront. The problem is on their end, not mine. Thinking Varnish could solve this problem is utterly confused. Do you know what a CDN does? CDNs have servers located around the world so files are loaded quickly and with low latency.
Furthermore, your profile suggests you work for a pump and dump penny stock company (basically a scam). If your employer is paying you in something other than cash, you need to walk away asap.
Sounds like troll bait, recall the recent article about PG's modding algorithms. Life's too short.
There's lots of video CDN "solutions", and it's almost always cheapest (even after labor support) to DIY with bare metal at very large scale. If it were me, I would eval video CDN shops using tsung test cases wired up as nagios checks. Gotta make sure their stuff stays working.
A payment gateway once mistakenly deployed API changes to production without notice. Trust no one.
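As a rough illustration of the "wire it up as a Nagios check" idea, here is a sketch of a plugin-style script that keeps watching a CDN for the Range problem discussed in this thread. It is plain Python rather than a tsung scenario, the script name and the `HOST PATH` arguments are hypothetical, and the choice of WARNING versus CRITICAL is my own assumption. The exit codes 0/1/2/3 are the standard Nagios plugin return codes.

```python
import http.client

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(status, version):
    """Map a Range response to a (return code, message) pair."""
    if status != 206:
        return CRITICAL, f"CRITICAL - expected 206, got {status}"
    if version < 11:  # http.client reports HTTP/1.0 as 10
        return WARNING, "WARNING - 206 served as HTTP/1.0"
    return OK, "OK - 206 served as HTTP/1.1"


def fetch(host, path):
    """Ask for the first 100 bytes and return (status, version)."""
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        conn.request("GET", path, headers={"Range": "bytes=0-99"})
        resp = conn.getresponse()
        resp.read()
        return resp.status, resp.version
    finally:
        conn.close()


def main(argv):
    """Run the check; argv is [script, HOST, PATH]."""
    if len(argv) != 3:
        print("UNKNOWN - usage: check_range.py HOST PATH")
        return UNKNOWN
    try:
        status, version = fetch(argv[1], argv[2])
    except (OSError, http.client.HTTPException) as exc:
        print(f"CRITICAL - request failed: {exc}")
        return CRITICAL
    code, message = evaluate(status, version)
    print(message)
    return code
```

To use it as an actual check, end the script with `sys.exit(main(sys.argv))` (after `import sys`) so the monitoring system sees the return code.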
"and it's almost always cheapest (even after labor support) to DIY with bare metal at very large scale"
If you simply need to deliver files or live streams, without needing to provide complex functionality at the edge (various kinds of protection, geo blocking, or pay-per-minute), and your traffic patterns are predictable - it's often cheaper to build your own solution. Once you start thinking about backbone and colo redundancy, deploy in different countries with contract commits - things get expensive very quickly.
The beauty of using a massive third party delivery service isn't performance, it's elasticity. Just like with the web apps (frequently hosted on DIY systems) that go down as soon as the link goes up on HN - being able to absorb traffic spikes without failing (and without forcing you to commit to a higher tier for a year) can be very valuable.
I'm entirely aware of the financial situation. How is calling my employer a pump and dump scam NOT spiteful when I've worked on this project from the beginning?
Edit: Also, I don't appreciate you posting that. It's completely off-topic. Keep it classy.
At this point, the article has left hackernews. I am writing to you as a fellow hacker looking to help you out.
You are involved in a stock fraud. The company you work for is a sham.
If you live in the US, then you have a plausible defense that you have no understanding of the underlying business. In this case, you likely can't afford the lawyer to present this case.
If you don't live in the US, then be careful. Imagine, ten years down the road, you are a successful engineer, and want to take your family to Disney world. Unfortunately, there is an outstanding bench warrant for your arrest, and rather than a nice family vacation that your wife wanted, you end up in a US prison.
Have you (or anyone else) had a chance to A/B test caches? That is, set up production network API requests to be duplicated/filtered and sent to test environment(s). Credit to Netflix.
Set up enough identical boxes, one each for Squid, Nginx, Varnish, Traffic Server, etc., and evaluate each with basically the same traffic and however much tweaking it takes.
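A very rough sketch of what that comparison could look like: replay the same list of request paths against each candidate box and summarize the latencies. The backend names and URLs are placeholders, and sequential replay from a single client is a big simplification of real production traffic.

```python
import math
import statistics
import time
import urllib.request


def summarize(latencies):
    """Return median and nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(latencies)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {"median": statistics.median(ordered), "p95": ordered[p95_index]}


def replay(base_url, paths):
    """Fetch every path from one backend and record seconds per request."""
    latencies = []
    for path in paths:
        start = time.perf_counter()
        with urllib.request.urlopen(base_url + path, timeout=10) as resp:
            resp.read()
        latencies.append(time.perf_counter() - start)
    return latencies


def compare(backends, paths):
    """backends maps a label to a base URL, e.g. {"varnish": "http://10.0.0.2"}."""
    return {name: summarize(replay(url, paths)) for name, url in backends.items()}
```

The `paths` list would come from your own access logs so every box sees the same requests in the same order.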
Let's keep in mind that individual box performance will not directly translate into your cluster performance or global network performance. A box can be fine-tuned to serve a file at lightning speed, but once you connect a bunch of them together, and start delivering lots of different files to millions of people - different factors come into play: distributing files, replacing files when updated, content churn, etc.
Simple caching works for images, but doesn't work for large video files, for example (look at latest financials from public CDNs - they are all bleeding cash).
It's really not as simple as testing a box to see which setup works best.