
... while it's not really useful for the NES, which is so old that emulating it does not strain even the crudest modern processor, I'd be excited to see this technique applied to newer consoles for lightweight mobile processors.


Emulating accurately takes much more power than you'd think: http://arstechnica.com/gaming/2011/08/accuracy-takes-power-o...


Oh, I know, but that's usually really tiny edge-case things. NES emulation was "good-enough" speed and accuracy-wise a decade ago. These days even an older-model smartphone can emulate NES games solidly well.


You're right that emulating the NES "good enough" for most games is no big feat in terms of power, but you are wrong to say that's because the platform is "old". There are old systems that are much trickier to emulate, simply because software takes advantage of every corner case in the hardware design, using cycle-perfect timing to exploit unexpected behavior in the chipset. The C64 is the perfect example: it still has a pretty vibrant demoscene, coming up with new tricks that break emulators every year. To reach a reasonable level of accuracy there, you need to emulate everything per cycle, complete with a bunch of analog hardware (emulating the SID sound chip alone is quite a heavy task) and registers bound to pins that are simply left floating.


Unless you get into speedrunning, where accuracy matters a lot and some emulators are "banned" because their accuracy is lacking. "Banned" here only means that most communities will not recognize speedruns performed on those emulators.


I think it might actually work better for more modern hardware. Fewer handcrafted ASM tricks, much more regular (compiler-generated) machine code. And of course no self-modifying code, which would be extremely difficult to recompile correctly.

Modern hardware (GPU, sound cards,...) is also very similar to what you find on a PC so it would be more straightforward to port all this code. No messing around with the framebuffer mid-scanline to create a cool effect, no quirky special purpose hardware for very specific tasks.


This is so wrong on multiple levels.

First, self-modifying code is still very much present on modern consoles, at least on the current generation (PS3/X360/Wii/WiiU). Loading code from external media is basically the same problem as self-modifying code (statically recompiling it is, in the general case, equivalent to solving the halting problem).

Second, modern hardware might be similar, but game console SDKs expose a lot more features to developers than PC drivers do through DX/GL. The example I always give is fetch shaders on the WiiU: a kind of shader supported by AMD R600 GPUs but completely abstracted away by DX/GL.

Third, maybe there are no more mid-scanline framebuffer tricks, but you have a ton of other problems with the framebuffer: while a PC assumes separate CPU/GPU memory, on modern consoles the framebuffer (and some textures) are often stored in memory that is shared and synchronized between the CPU and GPU. This is incredibly hard to emulate because a full GPU->CPU framebuffer transfer induces a lot of latency (several hundred microseconds last time I checked). IGPs and APUs make this problem a bit more manageable, but we're still missing graphics API support for shared framebuffers and shared textures.

Some things are better than on older consoles, but other things are a lot worse. JIT-ing shader bytecode is another problem that I don't think has been tackled yet (except maybe for Xbox emulation, which is still in its infancy and targets a console using a very old GPU with no compute shaders or the like).


The other interesting thing about older hardware is that each cart could embed special hardware that the NES could take advantage of. To play those games, that extra hardware has to be emulated as well.

So far as I know, this is unheard of with current-gen consoles.

The most recent example I can think of is for a handheld console: the Pokéwalker that was bundled with the newer Pokémon games for the Nintendo DS, which I believe has the IR hardware embedded in the cart itself.

So in addition to worrying about rather interesting use of the stock hardware, you also have to consider interesting use of _secondary_ hardware.

---

The latest batch of consoles [Xbox One, PS4] look to be x86 PCs with high-bandwidth memory; if that's the case, I'm hoping PC ports become more common, and perhaps we'll even see a virtualization-based approach to running next-gen games on standard PC hardware.


I don't know how common add-ons were in the NES era, but they were very common for the SNES (which was similar in design to the NES).

Games stopped embedding hardware when they went to discs. There is no way to put a parallel processor into a DVD.


Well, modern consoles _could_ still be extensible; but their hardware is already so general purpose that there's not much point.

The best you could do with current-gen technology is bundle a dongle with the game, where the user plugs in some kind of co-processor over USB.

So far I haven't really seen anything like that -- the only USB dongles I've seen bundled with games are for titles like Rock Band, and they're just RF receivers.

Aside from bandwidth concerns and the poor sales of previous attempts (e.g. Sega's whole 32X/CD add-on saga), there's nothing preventing a disc-based game from shipping with an external co-processor.


I hinted at the end what kind of technique I think might actually be useful:

  For example, one such technique is to identify a section of code, make some
  assumptions based on heuristics which allow for highly optimized native code
  generation, and then detect if those assumptions are broken. If the assumptions
  are broken, the generated native code is tossed, and emulation takes over.
  However, if the assumptions are upheld, the recompiled block of code will
  execute with blazing fast native speed.


You probably know it, but that approach is used everywhere, for example in JavaScript and Ruby, where it is typically impossible to prove much about your program (in some dark corner of the program, someone might redefine the function that appears to add one to a number, but only when the program is run on Thursdays).

I also have a minor, minor nitpick on the article: I think you should point out that those INY instructions are, in general, insufficient to increment 16-bit pointers. You have to check for wraparound and increment the high byte when the low byte wraps to zero. Somebody must have checked (or hoped) that this didn't happen with these tables (developing with the long, safe form and replacing it with the short form before release is tricky, as shortening the code moves entry points).


While not completely static, some modern emulators do use dynamic recompilation (essentially JIT) instead, which gives you more information to work with and lets you generate more optimized code. You can always fall back to interpreted code as the author does in this article, too.


Personally, I was very interested in this experiment after seeing this article yesterday: http://www.tested.com/tech/gaming/456272-straightforward-gui...



