My student worker, Alex, borrowed a digital oscilloscope and photoresistor from a coworker of mine and we sat down at my workstation to collect some data in an area that's often discussed (vociferously!) but rarely actually tested: latency. Most latency testing is unscientific voodoo ("I can *feel* it") that also suffers from confused terminology (see: the fighting game community's complaints about "lag" and how it makes them drop their combos). In this case, we're specifically examining input latency; that is, the difference in time between pressing a button on the controller and the action taking effect on the screen.
Here's a picture of our test bench, which consisted of a button from my trusty Happ-modded Mad Catz SE wired into the aforementioned oscilloscope:
The input from the button is compared against the voltage running through the photoresistor attached to a battery (and a momentary switch to keep the resistor from just draining the battery):
The photoresistor gets placed against my computer monitor while the button is used to make things happen in the emulators. As the brightness changes underneath the photoresistor, the resistance also changes changes, and the oscilloscope displays the difference in time between the voltage drop in the button and the change in voltage from the resistor/battery circuit, which looks something like this:
We only had the equipment for a day, so I couldn't test as much as I would have liked, but I tried to be as consistent as possible. To that end, we sampled 5 data points for each variable and did all of the testing on the same machine. All SNES comparisons used the Super Famicom Controller Test ROM, while the arcade comparisons used Espgaluda from Cave (in hindsight, probably not the best choice, but it's what I had on-hand). I also didn't have a good way of getting a baseline latency--I'm using a modern, crappy Dell LCD setup rather than a CRT and Windows 7 64-bit, which I chose out of both convenience and the assumption that it would be similar to a typical user's setup--so I was forced to provide data for *total system latency* rather than being able to isolate the latency caused by the individual variables. In attempting to get some sort of baseline, we held up the tester to the built-in gamepad testing applet in Windows, which gave results hovering around 75 ms, which is obviously not accurate, since some of our emulators performed better than that... With that in mind, these results should only be considered relative and not absolute.
Note: my full system info is Intel i7 Sandy Bridge with AMD R7 200 series with all of the GPU control panel crap turned off except for Eyefinity.
Anyway, here are the graphs that illustrate some of the more interesting comparisons:
First off, Aero compositing is bad news for both latency and variance. The increased variance is a real kick in the pants because it makes your performance less predictable. If you want consistent behavior and generally improved latency, stick with a "classic" non-Aero theme. Interestingly, disabling Aero did not seem to help with Higan.
Overall, this graph shows us that exclusive fullscreen is significantly better than windowed for latency, which is expected based on our Aero compositing findings. You'll notice there's no benefit to fullscreen in Higan (it's worse, in fact) because it's not *exclusive* fullscreen. Instead, it's what's known as "windowed" or "borderless" fullscreen. You can also see that ZSNES in exclusive fullscreen is extremely fast; faster than my supposed baseline of 75 ms :O
Higan had the highest latency figures here, even after correcting for the shaders--which I'll talk about more in a sec--with RetroArch about a frame lower (this includes data from both the snes9x and bsnes-compatible cores, which were not significantly different [87 ms vs 92 ms, which is within the variance of USB polling rates]). This also combines both windowed and fullscreen, which hurt ZSNES and ZMZ, the clear winners in exclusive fullscreen mode from our previous graph. Note: when ZSNES and ZMZ went into exclusive fullscreen, they broke Eyefinity, which other testing suggested adds up to ~8ms (or a half-frame) of latency, so keep that in mind when looking at their results.
This one was a dagger in my heart, but I'm posting it here anyway because of SCIENCE. I had always assumed that shaders would never increase latency because, in a worst-case scenario, they would just reduce the framerate (i.e., if the shader takes >16 ms to render). This is obviously not the case, as cgwg's crt-geom increases latency considerably in both Higan and RetroArch, as does crt-lottes. Crt-hyllian, on the other hand, has almost no effect on latency. To explore whether it's just heavy-duty math that causes the latency and whether it's exacerbated by multiple shader passes, I also tested Hyllian's xBR-lvl4-multipass in RetroArch. Shockingly, this one produced lower latency than no shader at all, which I find highly dubious.
I kept this one in here because there's been a contentious debate as to which of these platforms provides the best experience for emulating arcade games. However, there are some serious caveats to keep in mind before drawing too strong of a conclusion: 1.) this used a different test ROM from the SNES emus, 2.) the test ROM I used was selected out of convenience and actually had a lot of potentially confounding noise in the form of enemy bullets passing through my test area, 3.) GroovyMAME and RetroArch are really at their best running in KMS via Linux rather than Win64, so they would likely have more pronounced benefits vs mainline MAME if I could have measured that, and 4.) in initial testing the day before I ran these measurements, mainline MAME performed incredibly badly, with GroovyMAME close behind, which suggests that there may be some other variance involved.
That all said, these data indicate that RetroArch is approximately 1.5 frames slower than GroovyMAME, while the difference between mainline MAME and GroovyMAME is within the variance of USB polling rates. However, in light of the counfounders, I think the strongest conclusion we can draw reliably from the arcade comparison is that RetroArch isn't any *better* in Win64 (i.e., a null finding), so users should go with whichever platform has the features that best suit their needs rather than worrying about slim-to-nonexistent latency differences.
Conclusions
While the testing was not 100% reliable due to multiple confounders in several areas, we can see some trends emerge that can inform our discussions about latency in emulation. Windowed is definitely worse than fullscreen, and enabling Aero compositing is worse than without while also increasing variance and unpredictability. Shaders can actually cause excess latency, sometimes severely so. ZSNES, which has become a bit of a punching bag among SNES emulation scenesters, has outrageously low latency in fullscreen, so if you can stomach the terrible accuracy, there's actually some justification for using it now other than OMGSNOW!1! Alcaro's ZMZ also performed very well and can utilize more accurate emulation cores, so it can be a means to leverage some of ZSNES' latency benefits without being stuck with its poor accuracy.
In the future, I would like repeat these tests with a CRT monitor, which would have a predictable baseline of near 0 ms. I would also like to test latency in other environments, namely Linux+KMS. Finally, it would be very useful to have some comparative figures for original SNES hardware (both via CRT and upscaled via XRGB-Mini) and for RetroArch running via console.
Here is a link to download the raw data in Excel format, in case anyone would like to look at the numbers in more detail and/or perform other comparisons that I didn't think of.
EDIT: I think some people are drawing more conclusions from these data than is really appropriate; specifically, some folks are trying to draw direct comparison between the emulators/frontends tested. These data are simply not extensive enough for that. Furthermore, it's important to keep in mind that I didn't test the quality of sync, which could heavily affect the results. Namely, ZSNES and ZMZ both suffer from frequent audio crackling and frame stutters, which indicate issues with vsync, while RetroArch has none of either. I didn't test RA with vsync disabled (i.e., blocking on audio with video tearing), which could have an effect, and in general gameplay, users need to decide whether improvements in sync are worth minor (potential) increases in relative latency.