What do you think of FPGAs?

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-07 (#241507)

Dwedit covered the main points I think, but I want to elaborate just a little:

There's also a third option, i.e. dedicated ASIC, that should be included here. For NES/Famicom clones these are sometimes referred to as NOAC (Nintendo on a Chip), but in general this is a single purpose non-reprogrammable chip.

ASIC

Hardware compatible (can use original cartridges, peripherals)
No latency
Expensive to set up manufacture $$$$$$ (only larger companies can design these)
Very cheap unit cost when mass produced $ (appropriate for disposable toys)
Can't be modified or upgraded

FPGA

Hardware compatible
No latency
Development tools are less expensive $$$ (accessible to hobbyist)
Unit cost of FPGA is moderate $$
Can receive firmware updates

Software Emulator

Not hardware compatible, all aspects of hardware have to be simulated
Latency depends on qualities of the hardware and software
Development tools are free
Costs nothing to duplicate, easily runs on computers you already have, or cheap ones (you supply your own hardware)
Very easy to update

You may notice that I didn't list accuracy here, and that's because none of these systems are inherently an accurate or inaccurate reproduction of the original system. In theory all of these could have some perfect recreation, in practice there are a few factors that weigh strongly:

1. FPGAs became viable to hobbyists for this purpose only in the past few years. This means they can take advantage of years and years worth of existing documented emulation knowledge. Emulators and ASIC clones have been around much longer, and because of that many older less accurate ones are still available. Going for an FPGA system means you're getting a relatively "new" solution. For the other two, you may have to be more careful to find what you want. There are only a couple of viable FPGA products, but there's like a thousand NES emulators out there.

2. Platforms that can be updated have a lot better shelf life, and will ultimately get closer to that goal. Emulators can receive new versions as often as their author wishes, for the most part. FPGAs are reprogrammed every time they turn on, so they too can have a software upgrade. ASICs on the other hand, can't change. They don't get upgraded, they get thrown out, and setting up a new design for a new production run is massively expensive, so it will happen not very often at all. So, despite having some similar hardware properties as an FPGA, the ASIC's long term development is stifled by how difficult it is to upgrade.

3. The latency issue is something many emulators struggle with. Sometimes there is latency inherent in the PC hardware they're connected to, or the operating system, sometimes it comes from the software solution of the emulator. There exists very good, very low latency emulator solutions, but this is something that is difficult to engineer and not guaranteed. On the other hand, with an FPGA or ASIC solution, you have effectively no latency by default, much harder for anything to go wrong in that respect.

4. In general the current FPGA systems are for-profit. Emulator software generally is not. There are exceptions, but this is the norm. Even with something like the Retron 5, the emulator component was created from free emulator software and not written by Hyperkin. How much time a free emulator author wants to volunteer to support their software is a very individual thing, but overall this commercial aspect does have an impact on how much and how quickly support can be provided. ASICs in contrast are always for-profit and can't be supported anyway, only replaced.

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-07 (#241509)

gauauu wrote:

I've beat Mike Tyson's Punch Out on real hardware with a CRT. Never managed on an emulator ever.

That game is kinda the poster child for latency. The time you have to react to what you see in that game is so miniscule any delay at all drastically cuts into your ability to respond. (I too consider it unplayable except on a CRT.)

Most other games give you more time to react than Punch Out does, though, so the effect there is much more slight. Definitely can matter for e.g. high level speedrun play, but for some games it's barely worth noticing. Depends on what you're playing, in most cases a few frames less of lag will feel a little better but probably won't be critical. (None of the Punch Out sequels rely on such short reaction times, and still play fine with a little bit of lag, IMO.)

3-5 frames of latency seem to be pretty common. In some setups emulators can definitely have less than that, but there's a bunch of issues here, many of which aren't under the emulators control (operating system, video drivers, monitor, etc.), and it's not the easiest thing for the average user to directly measure and diagnose.

Though at the same time if your monitor/TV has its own latency, even an FPGA isn't going to fix that. It for sure won't have any added latency at the source, at least, but the TV hardware is something you have to deal with separately.

Re: What do you think of FPGAs?
by Dwedit on 2019-08-07 (#241511)

Punch-Out's dirty secret is 2 frames of input lag baked into the game itself, plus at least 3 frames of additional before dodges and punches. Only Pressing Start to start a fight, and starting to dodge with Down have the lowest possible lag of 2 frames.

Fortunately, this means you can backdate input 5 frames into the past, and eliminate 5 frames of input lag from a modern system, and it will look about the same.

Re: What do you think of FPGAs?
by 93143 on 2019-08-07 (#241515)

I find that at least on my laptop, Super Mario World and Super Mario Kart are a fair bit harder to play because of the lag. Precision jumping is much harder in SMW, and I get pilot-induced oscillation in SMK. F-Zero isn't as bad for some reason, but it's still not great, and more importantly it trains me to compensate for the lag, which means I actually get worse at the real thing.

It's not just Punch-Out.

...

It seems to me (who has roughly a Wikipedia-level understanding of this matter) that perfect accuracy would be much easier to achieve in an FPGA. Given a decap of all the chips, you simply have to reproduce the circuit design, and all the quirks and edge cases will automatically be there. In a software emulator you have to model all that stuff explicitly, and some of it may be quite expensive in terms of CPU power. The SA-1, for instance, may never be perfectly emulated in software, but an FPGA could handle it fine.

Which brings up the matter of FPGA special chips. SD2SNES has proven the concept. If a homebrewer had written a game that used an expansion chip and wanted to produce a cartridge version, he would have basically three options: (a) destroy a bunch of irreplaceable originals to harvest their chips, (b) do an ASIC production run, or (c) use an FPGA.

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-07 (#241517)

Dwedit wrote:

...you can backdate input 5 frames into the past, and eliminate 5 frames of input lag from a modern system, and it will look about the same.

That's an interesting work-around for Punch Out at least. I've heard of this kind of thing for network play to deal with the extra inherent latency there too, i.e. rewind and re-emulate a few frames when you find out input changed in the past.

93143 wrote:

I find that at least on my laptop... it's not just Punch-Out.

Punch Out is just the worst example I can think of. It affects everything, and it's a relative measure. If your setup has worse lag, then it affects everything more.

93143 wrote:

It seems to me that perfect accuracy would be much easier to achieve in an FPGA. Given a decap of all the chips, you simply have to reproduce the circuit design, and all the quirks and edge cases will automatically be there. In a software emulator you have to model all that stuff explicitly, and some of it may be quite expensive in terms of CPU power. The SA-1, for instance, may never be perfectly emulated in software, but an FPGA could handle it fine.

I don't really agree. An FPGA is not a transistor-for-transistor recreation. FPGA software is still written in a higher level language, and the way it implements things is different than the same stuff would be in an ASIC.

In theory you could have a decap, and in theory you could have a perfect mechanical analysis of the die photos that you could automatically translate into an FPGA program... in practice this is not done. Even projects like Visual 2A03 took a lot of careful analysis and understanding to correct the mechanical errors and get running correctly. But, even with some perfect netlist description, such a thing would probably be unfeasible to implement in an FPGA directly. It needs to be simplified into higher level constructs to be translated into something higher level that more efficiently fits the FPGA's capabilities. (Size and power and cost of the specific FPGA become relevant here.)

So ultimately, the way things are done, if you understand something enough to make a "perfect" FPGA recreation, the same thing could be done in an emulator. There's no free lunch here. Where the FPGA excels is at having several simple devices running in parallel that have timing that correlates, but these aren't an insurmountable problem for software emulation.

Cases of "it would be too much CPU to emulate this accurately" are pretty rare, IMO, at least until you get to systems that are out of scope for FPGA anyway. Usually stuff like that falls under analogue effect categories that an FPGA can't be a solution to either. Like obviously Visual2A03 is a very slow approach, but the FPGA consoles aren't doing what Visual2A03 does, they are higher level programs based on understanding of the logic, not some automated conversion of the decap.

I'd vaguely agree that it's probably easier to implement a more accurate FPGA than an emulator, but this is a very vague comparison. How much easier? I have no idea, but yes at least the digital logic and timing components of an FPGA are more similar to what the original hardware did internally, at least. I think whatever difference there is here is greatly outweighed by the developer's individual skill. (On the other side, I'd suggest that all the extra hardware tools and domain knowledge required makes it harder to develop for FPGA, just in general, even if maybe the "accuracy" goal might be vaguely easier?)

93143 wrote:

If a homebrewer had written a game that used an expansion chip and wanted to produce a cartridge version, he would have basically three options: (a) destroy a bunch of irreplaceable originals to harvest their chips, (b) do an ASIC production run, or (c) use an FPGA.

Well, there's another missing option here which is CPLD, which is kind of like a cheaper one-time-programmable FPGA. Those are pretty good for replacing a single mapper/expansion chip, and we've seen them used a lot already for this purpose. FPGA's specific advantage is that it can be rewritten with a new program, which you don't need for that purpose. It has a disadvantage in that it needs to be reprogrammed every time it powers on, which makes it less appropriate for a cartridge. FPGA is great for something like PowerPak that needs to simulate a hundred different mappers, or for an expensive console like the NT that needs a way to do firmware updates, and can afford to do some "setup" of the FPGA during boot.

Re: What do you think of FPGAs?
by 93143 on 2019-08-07 (#241519)

Like I said, my understanding is pretty much Wikipedia-level. I've seen what an FPGA can do, and I've read descriptions of what they are, but I haven't engaged with the concept at a technical level.

It seems I wasn't entirely wrong, so that's something...

rainwarrior wrote:

Cases of "it would be too much CPU to emulate this accurately" are pretty rare, IMO, at least until you get to systems that are out of scope for FPGA anyway.

I did mention an application that's very demanding in a CPU power sense, which is the SA-1. It's not demanding in itself, but the way it interacts with the S-CPU is extremely fine-grained and difficult to emulate in software.

Quote:

Well, there's another missing option here which is CPLD, which is kind of like a cheaper one-time-programmable FPGA.

I knew I was forgetting something... Are there CPLDs that could manage an SA-1 or Super FX? Combined with an MSU1?

Are they programmable with the same tools? Could you port an FPGA design to a CPLD with minimal work? Or would you have to start from scratch if you wanted to (say) release a GSU2 game on cartridge without either shelling out for FPGAs (and making sure the game doesn't try to use the chip before it's ready) or going Aztec on a pile of Yoshi's Island and Doom carts?

Re: What do you think of FPGAs?
by Bregalad on 2019-08-07 (#241522)

With FPGAs il will be hard to model quicks like OAM decay, colour emphasis, audio DAC nonlinearity, and other analogue-related effects. Sure those can be somehow simulable, but it'll end up looking like a hack and an unelegant solution no matter whatr you do. For example to simulate OAM decay you'd need a lot of timers and write random data to OAM after some time not reading it.

Re: What do you think of FPGAs?
by TmEE on 2019-08-08 (#241523)

They're no more harder or different to do than in software and nobody forbids you to bring out the signals for actual processing in analog domain either but if that's worth it is another question. You could probably do a fairly good job on sound related things that way.

Main thing for me is that you can do all these things in power efficient manner. Some of the very accurate emulators require you to have a machine that consumes hundreds of W of power for optimal experience, while the same logic on FPGA would not even reach 10, often you don't even need a heatsink on the chip.

In general I'm a big fan of the FPGAs but not so much because you can reimplement old things but you can also make completely new ones too ~

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-08 (#241524)

93143 wrote:

It's not demanding in itself, but the way it interacts with the S-CPU is extremely fine-grained and difficult to emulate in software.

Yes, this is the kind of thing that I would say might be, in a very vague way, easier in FPGA, but in the bigger picture I think it can be done just as well in software, even if it's a "harder" thing to do in software. (In a lot of ways doing a hard software task might still be easier than doing an easy FPGA task, if you know what I mean.)

93143 wrote:

Are there CPLDs that could manage an SA-1 or Super FX? Combined with an MSU1?

Asking whether there's a specific CPLD on the market that could implement a replacement for a specific ASIC is something that would require a bit of research and engineering to answer, but probably trivially the answer is yes on some level. CPLDs tend to be less powerful and less versatile than FPGA, but there's ranges of capabilities in available devices, and you can use more than one. I think this is more a question about cost than anything else?

93143 wrote:

Are they programmable with the same tools? Could you port an FPGA design to a CPLD with minimal work? Or would you have to start from scratch if you wanted to (say) release a GSU2 game on cartridge without either shelling out for FPGAs (and making sure the game doesn't try to use the chip before it's ready) or going Aztec on a pile of Yoshi's Island and Doom carts?

CPLDs and FPGAs have a lot of overlap. You can more or less write a verilog program and put it on either type of device. There are CPLDs and FPGAs available with various cost/size/features, and there are a bunch of nitpicky details (e.g. how low level logic actually gets implemented is different, even if they provide the same high level function), but they are somewhat interchangeable.

The main thing is just that FPGAs are more expensive and often more powerful, and normally have volatile program storage, so they are reprogrammable (big advantage if you need it, esp. for prototyping) but also require more supporting hardware to deliver that program every time they power on.

CPLDs on the other hand have non-volatile programs. They can start functioning immediately from power-on, and don't require the kind of extra support components an FPGA does. They're more or less a customizable replacement for an ASIC. However, reprogramming them is usually more difficult.

Re: What do you think of FPGAs?
by koitsu on 2019-08-08 (#241526)

What do I think of FPGAs? I think they're neat
Do you think they are important in any meaningful way? Yes
Do you think they can "preserve" history as some people or magazines put it? In many/most regards, yes
Are they important or just optional/non-essential tools for homebrew developers? Both (but skill set and skill level matters tremendously for the latter)
Is this going to be a temporary trend that will fade away? Probably not, that is, until hardware folks create some alternate technology (e.g. CPLD) that provides similar functionality

Re: What do you think of FPGAs?
by Oziphantom on 2019-08-08 (#241528)

gauauu wrote:

Oziphantom wrote:

most of the arguments come down to cargo cult garbage.

Oh but that just code running on a CPU, where as this is almost like real hardware its a pure truer source of emulation... if it runs on a computer or an FPGA or is physical bits of silicon arranged on a die, if the output is the same it doesn't matter, none is better or more pure than the rest.

I've beat Mike Tyson's Punch Out on real hardware with a CRT. Never managed on an emulator ever. Maybe that's just anecdotal, but it convinced me that in some cases, there IS a difference in emulation. (and I mostly DO play on emulators these days, so I'm not saying that as a hardware purist)

An I could have an FPGA that has 20 frames latency, and an emulator that has none. Being FPGA or Emulation makes no difference its the quality of the emulation rather than how it is built. Run a Raspberry Pi in a bare metal emulator and use the composite output, there is a custom version of VICE that does to get true 50hz displays. It's more accurate than the C64 ASIC from the DTV

93143 wrote:

Like I said, my understanding is pretty much Wikipedia-level. I've seen what an FPGA can do, and I've read descriptions of what they are, but I haven't engaged with the concept at a technical level.

https://www.youtube.com/watch?v=gUsHwi4M4xE is a good introduction to them.

there is software you use when developing FPGA/CPLD programs that emulates the logic on the PC, so run the FPGA implementation in software on your pc, where are your gods now? The problem is it runs in a pure theoretical "logic" rather than emulating exactly how the chip does it with the chips latency issues, so you get the case works fine in the "simulator", fails on the actual chip

FPGA doesn't lets you emulate the transistor level, they have broad blocks that get combined in odd ways to achieve this or that. For example a 6502 holds its register values, stack, pc and flags "on the line", if you stop the CPU or slow it down to much, then the 6502 looses the PC and register values, that is why it has a min hz spec. To emulate this you need to have traces inside the chip that are the exact same length and width.. this can cause compatibility issues between the NMOS and HMOS versions of the 6502 as they have a different process and hence different size of the "lines" and don't last the same amount of time.

Re: What do you think of FPGAs?
by tepples on 2019-08-08 (#241530)

Oziphantom wrote:

An I could have an FPGA that has 20 frames latency, and an emulator that has none.

Let me try to express in a more rigorous way what I believe was intended:
The obvious way to architect an emulator produces more latency than the obvious way to architect an FPGA. This goes double when considering the latency to and from authentic cartridges.

Oziphantom wrote:

if you stop the CPU or slow it down to much, then the 6502 looses the PC and register values, that is why it has a min hz spec.

But it's still straightforward for an FPGA to simulate "6502 including stable unofficial opcodes, run in spec" or "6502 without decimal mode including stable unofficial opcodes, run in spec" or the like.

Re: What do you think of FPGAs?
by supercat on 2019-08-08 (#241531)

rainwarrior wrote:

Well, there's another missing option here which is CPLD, which is kind of like a cheaper one-time-programmable FPGA.

There are three general approaches that can be used to make any form of programmable logic devices programmable:

1. Use one-time-programmable fuses or antifuses which are selectively destroyed in the process of programming (an antifuse is an insulator which can be induced to permanently fail shorted).

2. Use an addressable storage mechanism near each switching element to enable or disable it.

3. Use a latch near each switching element to enable or disable it.

Approach #1 used to be common for FPGAs even after it was largely abandoned for logic-array devices like PLDs; I wouldn't be surprised if they're still used for some specialized applications. Approach #2 is common in CPLDs (which have very regular configuration layouts), and approach #3 is more common in FPGAs. It's generally cheaper to have the configuration for an FPGA stored in an external chip than have to have the entire chip include all the layers necessary for non-volatile storage, but CPLDs typically include non-volatile storage elements nestled among the switching circuitry.

Re: What do you think of FPGAs?
by 93143 on 2019-08-08 (#241533)

Oziphantom wrote:

An I could have an FPGA that has 20 frames latency

Seems to me you'd pretty much have to do that deliberately, unless you were incredibly bad at FPGAs.

Quote:

and an emulator that has none.

Good luck emulating a fully accurate Super Accelerator System with sub-pixel latency (ie: "none"), or even a few scanlines of latency, on any modern system. A Raspberry Pi isn't powerful enough, and a full-scale PC isn't real-time enough (I'm not sure it's powerful enough either). Just using a framebuffer adds a whole frame of latency by itself. You'd basically have to completely take over a powerful PC at the hardware level, bypassing the OS and drivers, and probably write parts of the emulator in shader language, to get anywhere near what an FPGA could do if properly programmed.

Quote:

so run the FPGA implementation in software on your pc, where are your gods now?

I get the feeling you might be fighting a straw man.

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-08 (#241534)

93143 wrote:

This is kind of a bizarre statement. What is "sub-pixel latency" supposed to mean?

If you want 2 simulated chips to correlate with cycle-accurate timing, that's entirely doable on PCs and RPis with emulators. This is not a performance problem, and I don't know why you think it must be. Many emulators are already doing this kind of thing just fine.

Otherwise I have no idea what the relevance is, like you want the video output (i.e. light emitted from your screen) to change at some level smaller than a pixel? What are you even talking about?

Even outputting video scanline by scanline has very little relevance at the FPGA or PC emulator level, that's a video device problem, kind of a different domain. Sure, an FPGA could be used to generate a composite video signal, but so could a suitable generator attached to a PC-- but it wouldn't make any difference unless you were connected to a CRT. Otherwise your monitor will process and display the video signal with its own process that has very little to do with that. Incidentally, you could probably do scanline-by-scanline output on many PCs' built-in video hardware in VGA mode while connected to a CRT, if you really wanted to go down this road. (Shader language would not help with this in any way, IMO.)

But even if you did this, this concept of zero latency vs 1 frame of latency is almost meaningless. The main place it matters is if you want to use an original light gun device that needs to sense light in a very immediate way, much faster than human perception. Like I said in my summary though, emulators don't work with original hardware anyway, all aspects of that have to be simulated, and there are plenty of emulators that simulate a light gun in a very accurate way. (There are even ways to go the other way and use modern pointing devices to connect with the original light gun hardware.)

5 frames of lag on video output is something to care about. <=1 frame of lag? Hardly significant. Scanline by scanline only matters if you're trying to interface with some original hardware that can see light changes that fast (not relevant when you're simulating the hardware). Pixel by pixel, I don't think many things are meaningfully sensitive to this. Sub-pixel? No, that's not a thing.

So the light gun thing is an important point for FPGA clones. That was in my list of features in my earlier post: if you want to interoperate with original hardware (carts, peripherals, etc.) then these are capable of that. If you want to replace a system but otherwise keep all the same hardware connected, this is the solution.

If you're simulating the hardware (ROM file instead of cartridge, USB joystick instead of original gamepad, mouse instead of lightgun, window on a desktop instead of a dedicated television, etc.), the simulation can certainly be as accurate as any FPGA clone could, and plenty of emulators do accomplish this.

Latency on a scale relevant to human interaction is definitely a persistent problem on PCs, but not unsolvable. It's generally down to operating systems, video hardware interfaces, monitors, and the parts of the software that interact with those. Much of what affects that can't be solved at the level of the emulator's software (e.g. video drivers, processing in the monitor itself, operating system settings, USB drivers, etc.) and unfortunately needs work from the user to address. (Fullscreen mode instead of windowed can often help. A 120Hz monitor may address the issue too. This is a whole system configuration topic of its own though.)

CPU power also doesn't really help here. Very little of the latency problem has to do with CPU speed, generally it's all upstream/downstream from the emulator itself.

...but all of that is unrelated to accuracy of emulation. Latency is its own separate issue. If you don't have a CRT and original peripherals to hook up to your FPGA system then it's going to have the same monitor latency issue anyway, at the very least.

The FPGA clones I've seen are good stuff, but you shouldn't conflate output (or input) latency with emulation timing accuracy. You can have one and not the other, and that goes both ways. There are low latency emulator setups, but FPGA systems will have that by default. Cycle accuracy, on the other hand, good emulators already have. FPGAs have no monopoly on that.

The main advantage of emulators is that they're free and run on a computer you already have, which is something FPGA clones can't compete with. The converse is also true, your PC is never going to run an NES cartridge, it needs ROM dumps. That's the main tradeoff. Accuracy is not an issue, a good emulator is just as good as an FPGA for this.

Latency is an issue, and an FPGA clone might be worth its cost to just not have to try and figure out your PC latency problems. It's solvable, though. Many people do manage to get good low-latency setups using emulators.

Re: What do you think of FPGAs?
by Dwedit on 2019-08-08 (#241535)

Emulators are able to use tricks involving savestates (RunAhead) to skip the game's internal lag frames, and show frames from the future, and thus reduce input lag. When combined with a lag-free CRT, you even beat the original hardware. See this post on twitter for a slow-motion video: https://twitter.com/TylerLoch/status/980278954786467842 (youtube link: https://www.youtube.com/watch?v=_qys9sdzJKI )

==Internal Lag==
Super Mario Bros 1 is known to have one frame of internal lag. Here's the rough frame timing of SMB1, assuming Mario is at the bottom of the screen.

Timing of SMB1:

- Vblank -
Copy to OAM
Read Joypad (line 11 within vblank)
- Frame Start -
Run game logic reacting to input (around line 32 within frame)
Generate sprite data (line 76 within frame)
Display Raster reaches location of Mario's Sprite (Y = 177)
- Vblank -
Copy to OAM
Read Joypad
...

Now you want to jump, you hit A.
The minimum possible lag for this game happens when you press the button right before Line 11 on vblank time. Any later, and your input won't be seen until the next frame.
While it is rendering the frame, it is reacting to your joypad input, and generating the sprites that it will display on the next frame.
Eventually the screen raster reaches reaches the point where Mario is, and Mario is drawn there.
Next frame:
Now it's actually displaying the graphics that resulted from the game's state. The screen raster reaches Y=177 and Mario's new state appears on the screen.

So it took about 1.7 frames (best case) for Mario to appear on the TV, and 2.7 frames (worst case) if you missed the deadline for sampling the joypad.

==Emulator on PC==
You can do tricks with savestates (RunAhead) to outright remove such input lag.

Example of RunAhead 1:

Run a frame (discarding audio and video)
Save State
Run a frame (presenting audio and video)
Load State

You have just displayed a future frame, which removes 1 frame of input lag. If you have a CRT plugged in, you have beaten the original hardware at latency.

For different values of RunAhead (such as 2) it looks like this:
Run a frame (discarding audio and video)
Save State
Run a frame (discarding audio and video)
Run a frame (presenting audio and video)
Load State

It's RunAhead 2 because you ran one frame, then ran two more frames, displaying the last one.

==My Own Testing==
Meanwhile, Super Mario World on the SNES is known to have 2 frames of internal lag.

I had done some latency testing on my laptop, which has about 1 frame of display lag on the screen itself. Since you need to wait for vblank before presenting frames, it's more like 2 frames of lag.
Using RetroArch with RunAhead 2, and Hard GPU sync, I got a total input lag of 3 frames. Really nice for modern hardware.
Meanwhile, on a true SNES and true CRT, you have the 2 frames of internal lag.

The RunAhead feature did not originate with RetroArch. Gens 11 re-recording had it first, and that inspired me to add it into RetroArch.
Anyone who wants to reduce input lag in their emulator will need to run in Exclusive Fullscreen mode in Windows, and have a feature similar to RetroArch's Hard GPU sync.
RunAhead is just a few lines of code, it just requires that your emulator has rock-solid savestates, and good enough performance to run multiple frames at once.

Re: What do you think of FPGAs?
by Oziphantom on 2019-08-09 (#241538)

93143 wrote:

Super Accelerator System

This is an SA-1 cart? Adding a 10mhz 65816 is not going to be a problem at all, even if we write the code to run at the minimal step and get the code to emulate various bus levels I don't see that a current cpu would have any issue. Is there somewhere that documents the issues faced? From my limited knowledge of the SA-1 it runs its clock in a strict multiple of the base clock? We got the Amiga emulated, and we have the PS2 mostly emulated, the SA-1 is trivial in comparison. I think VICE even supports detecting the Shift-lock key vs Shift key on a C64 ( they are wired to the exact same pin on the mobo ) trick.

Re: What do you think of FPGAs?
by Rahsennor on 2019-08-09 (#241539)

I know nothing about FPGAs, but on the topic of the discussion/flame war I do know this: getting low latency on a 'modern' PC is a stone bitch. Your keyboard alone probably has more latency than a NES+CRT end-to-end, nevermind the rest of your computer.

It's possible, sure, but you've really got to do your research... and since manufacturers (intentionally) don't publish hard specs on their hardware, that's going to involve either owning a logic analyzer or knowing someone who does.

Re: What do you think of FPGAs?
by calima on 2019-08-09 (#241541)

pcsx2 is a big pile of hacks that still has glitches in almost every game, and doesn't run full speed in software. GPU renderers are faster but bring their own issues. Then there's the "only 32-bit" part. /siderant

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-09 (#241544)

PS2 is wayyyy out of scope for the current generation of FPGAs anyway, no?

Re: What do you think of FPGAs?
by Oziphantom on 2019-08-09 (#241550)

A single FPGA sure, no chance, but I don't see why you have to just use 1.

Re: What do you think of FPGAs?
by Oziphantom on 2019-08-09 (#241552)

Rahsennor wrote:

This isn't a Emulation vs FPGA argument, this is "custom designed thing to do task X" vs "giant general purpose machine that multitasks and runs lots of different software" argument.

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-09 (#241560)

Oziphantom wrote:

A single FPGA sure, no chance, but I don't see why you have to just use 1.

I don't believe that's feasible either. The FPGAs have to be able to handle not just the complexity but the speed as well, and combining multiple FPGAs has diminishing returns, especially dealing with all the connections between them.

For a rough comparison, here's 3 generations of just the CPU:

SNES: 22K transistors, 4MHz CPU
PS1: 1M transistors, 34MHz CPU
PS2: 13M transistors, 300MHz CPU

SNES is currently proven to be commercially viable to reproduce in an FPGA. PS1... not yet. I've heard of a FPGA PS1 project, but not a complete one. PS2 isn't even on the table. (Please correct me if I'm wrong.)

I'm sure at a high enough price there are suitable FPGAs to reproduce a PS2, but I'm also pretty sure they're so expensive ($$$$?) it would be ridiculous to try to use them for this purpose.

General purpose CPUs and software emulators, on the other hand, are already overcoming these problems at a much more reasonable price. There's been like a 20 year lag between a system viable as a PC emulator, and one viable as an FPGA clone.

Re: What do you think of FPGAs?
by tepples on 2019-08-09 (#241561)

You don't have to put the whole console in one FPGA. If a single chip from the original console fills an FPGA, a reproduction can in theory use an FPGA for just that chip. For example, one could implement a Super NES in one 65816, a few RAMs, and six FPGAs: memory (incl. DMA and input) controller, audio CPU, audio DSP, sprite compositor, background reader and priority/color math compositor, and scaler. These would correspond to the S-CPU, S-SMP, S-DSP, the two S-PPUs, and S-ENC/S-RGB. How many chips make up a PlayStation console, model SCPH-100x?

Re: What do you think of FPGAs?
by psycopathicteen on 2019-08-09 (#241564)

rainwarrior wrote:

Oziphantom wrote:

A single FPGA sure, no chance, but I don't see why you have to just use 1.

A 65816 has 22k transistors? I thought the 6502 had 3500 transistors. If that's correct then how did it jump so much in transistor count?

Re: What do you think of FPGAs?
by Garth on 2019-08-09 (#241570)

psycopathicteen wrote:

A 65816 has 22k transistors? I thought the 6502 had 3500 transistors. If that's correct then how did it jump so much in transistor count?

According to https://en.wikipedia.org/wiki/Transistor_count, WDC's 65c02 (ie, CMOS) has 11,500 transistors, and the 65816 does have 22,000. My guess is that the CMOS uses totem poles all over the place to speed up operation (rather than passive pull-ups which have a longer RC time constant), and then of course there are all the added instructions and addressing modes, and additionally for the '816, more and wider registers (remember the bank registers, 16-bit stack pointer, 16-bit direct-page register, and A, X, and Y that can optionally be 16-bit), the 16-bit ALU, and the 24-bit mux'ed address bus).

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-09 (#241571)

tepples wrote:

You don't have to put the whole console in one FPGA.

I don't think anyone said you had to, but at the same time it's not necessarily helpful to split them up vs. choosing a bigger FPGA, either. The Super NT and Mega SG are both single-FPGA systems, and there's probably a good reason for that.

...and I still doubt there is currently any reasonably priced FPGA or combination of FPGAs that you could use to recreate a PS2 CPU. (Completely ignoring the additional mountain of engineering work required.)

As per calima's complaint, PS2 emulation is still struggling a bit / infancy, but there isn't an FPGA version waiting in the wings to solve all those problems. Developing a good FPGA clone takes as much research as developing an emulator. Switching paradigms doesn't do 20 years of platform research automatically. Give it time, and there'll be a better PS2 emulator. A good PS2 hardware clone is probably many years away yet, or maybe it will never come. The only reason the current FPGA systems exist is that we have arrived in some perfect storm of cheap enough hardware and deeeeeep public console research knowledge bases to borrow from.

Re: What do you think of FPGAs?
by Oziphantom on 2019-08-09 (#241574)

for mass production having 1 less chip to place, 1 less chip to program is a saver

Sure you would prefer a single chip design, but adding more is possible just adds more complexity and manufacturing costs

For something like a PS2 though I would go with a Zynq UltraScale+ as it has 2/4 ARM cores + FPGA logic, they are over $300 though... not a practical thing at this point as you can get a real PS2 for $50. With the PS2 I would argue that the "sub pixel" accurate press the button on the pad and there is 0 lag case doesn't really exist, PS2 games that run at 50fps are rare with most coming it at 12~25 so using a multi core cpu to emulate it would be "good enough", although the PS2, GC, Xbox are probably the final frontier for such a product as they are the last pre LCD and designed for Composite CRTs. The key technology innovation I would see for these emulators is getting a good way to program multiple cores and keep them lock-stepped.

Re: What do you think of FPGAs?
by 93143 on 2019-08-10 (#241579)

rainwarrior wrote:

What is "sub-pixel latency" supposed to mean?

I was responding to the claim, regarding lag, that an emulator could have "none". In the context of a SNES emulator, I interpreted that as meaning zero (or, to be generous, less than two) extra master clocks between input and output (not necessarily perceptible result, as that's partly on the display device) as compared with real hardware. I would probably be satisfied with less timing precision than that, personally, but that was the claim.

Quote:

If you want 2 simulated chips to correlate with cycle-accurate timing, that's entirely doable on PCs and RPis with emulators.

I know an emulator can be very accurate. I also know that multi-frame latency isn't inherent to software emulation in principle. But in practice, achieving both high accuracy and low latency at the same time is very hard, in particular for a specific console/chip combo that byuu has been complaining about quite recently. As you say, most of the latency is not due to the emulator, but is imposed by the computing environment that runs it. And while it may be different for NES, you can't run an accurate SNES emulator on a system that offers easier low-level access - you need a PC.

Quote:

This is not a performance problem, and I don't know why you think it must be. Many emulators are already doing this kind of thing just fine.

It is absolutely a performance problem. Why do you think higan takes so much CPU power? It's not the individual chips; it's all the syncing. You cannot run higan anywhere near full speed on a RPi.

Which brings up another way in which very low latency could potentially be expensive. If you're simulating exactly what the console does in real time, you have to sync all the chips every cycle. Unless things have changed quite a bit since last time I checked, the only reason higan runs at full speed even on a modern high-powered PC is that it's smart about only syncing when it has to. This means it can't generate half-dots every 93 ns on the tick; it's asynchronous and the output only comes together properly because the results are buffered.

(Correct me if I'm wrong about this.)

Quote:

that's a video device problem

The video device problem is a separate problem, and it applies equally to FPGAs or any other method of making new hardware pretend to be an old console. With a CRT it's not a problem. With an HDTV, the problem is actually worse for original hardware than for clones capable of using HDMI. I'd rather leave the display technology argument aside because it's a whole other discussion.

Quote:

Incidentally, you could probably do scanline-by-scanline output on many PCs' built-in video hardware in VGA mode while connected to a CRT, if you really wanted to go down this road. (Shader language would not help with this in any way, IMO.)

Interesting. But could you do pixel-by-pixel?

I mentioned shader language because I know that GPUs can be used for massively parallel general computing, and it occurred to me that it might be possible to leverage this in an emulator if the goal was a combination of very high accuracy and very low latency. I was basically handwaving at that point because I don't have any expertise with GPU programming.

Quote:

But even if you did this, this concept of zero latency vs 1 frame of latency is almost meaningless.

One frame is absolutely perceptible. Ever play Mario Golf? Even on a CRT, the shot control timing is noticeably late. Also, I've worked with digital music creation tools, and an ASIO buffer size of 20 ms (only a little more than a frame) is unacceptably long for live playing. It's on the same order as the amount of time it takes for a piano key to fully depress after being struck firmly.

According to that keyboard latency page linked earlier, humans can detect as little as 2 ms (0.12 frames) of lag, and perceived lag does make you worse at what you're doing.

...

Dwedit wrote:

Emulators are able to use tricks involving savestates (RunAhead) to skip the game's internal lag frames, and show frames from the future, and thus reduce input lag.

Interesting trick.

Quote:

If you have a CRT plugged in, you have beaten the original hardware at latency.

Not if you're using a framebuffer you haven't (well, the typical full-frame buffer anyway). Also, there are other factors besides the monitor that induce latency on a PC, even if you do figure out how to do direct line-by-line output.

What is hard GPU sync?

Quote:

good enough performance to run multiple frames at once.

That's kinda the catch, isn't it?

Besides, once you try to compensate for more lag than the original game had, you no longer have the necessary input data to emulate the future frames regardless of how fast you can render them, and run-ahead is no longer a perfect display of what the real system would show. If it takes five frames for your controller input to make it through the USB driver to the emulator to the graphics card to the monitor to the actual screen, you need to handle some of that some other way because run-ahead will give you glitches.

All of this game-specific hacking doesn't really damage the case that an FPGA is a "purer" way to get high accuracy at low latency (if anyone were to attempt to make such a case)...

...

Oziphantom wrote:

93143 wrote:

Super Accelerator System

From the Mesen-S thread:

byuu wrote:

The thing that hurts libco with the SA1 is that both the SNES CPU and SA1 can simultaneously access BWRAM and IRAM, which are of course volatile, and ROM can be dynamically remapped. So in effect, for perfect synchronization you would have to synchronize to the other component every time ROM, BWRAM, IRAM, and I/O registers were accessed, which is almost every cycle.

And again:

byuu wrote:

The design of the SA1 is ingenius and evil: the CPU cannot be stalled because the SNES CPU has no concept of external wait states (/DTACK on the Genesis, for instance.) So instead, the SA1 detects when the SA1 CPU tries to access ROM, BWRAM, or IRAM while the SNES CPU is accessing it, and will insert wait states into the SA1 CPU.

Three years ago in a different thread:

byuu wrote:

SA-1 memory conflict stalling is going to be the thing that totally destroys us. We're chasing our tails over a bit of SFX timing issues, but the SA1 is probably running 30% faster than it should.

Now, byuu is sometimes a bit hyperbolic about SNES emulation issues, but that doesn't sound trivial to me.

Oziphantom wrote:

Rahsennor wrote:

getting low latency on a 'modern' PC is a stone bitch.

This isn't a Emulation vs FPGA argument, this is "custom designed thing to do task X" vs "giant general purpose machine that multitasks and runs lots of different software" argument.

That's the entirety of the argument. An FPGA is not a unique philosophical primitive. If you think about it, it's really just a computer with an unusual architecture, programmed in an unusual language. Running a simulated console on an FPGA is software emulation.

And for a variety of practical reasons, it's much better suited to certain low-latency parallel applications than C++ on a Windows PC.

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-10 (#241594)

93143 wrote:

rainwarrior wrote:

Incidentally, you could probably do scanline-by-scanline output on many PCs' built-in video hardware in VGA mode while connected to a CRT, if you really wanted to go down this road.

Interesting. But could you do pixel-by-pixel?

Maybe, but this is a bit ludicrous? This is orders of magnitude above all thresholds of human reaction or even perception (0.3μs?), and also above the fidelity of relevant peripherals (i.e. light gun). I completely don't understand the goal of synching video output that finely. There is no plausible purpose.

I mentioned VGA output as a curiosity, just because a lot of PCs happen to still have it, and you could probably run your computer in FreeDOS to get pretty direct access to it with essentially no intervening OS or drivers. It's kind of a bad suggestion though, since you now have a ton of other problems to solve (esp. peripherals like sound device). An RPi is probably infinitely more practical to think about.

93143 wrote:

rainwarrior wrote:

But even if you did this, this concept of zero latency vs 1 frame of latency is almost meaningless.

One frame is absolutely perceptible. ... Also, I've worked with digital music creation tools, and an ASIO buffer size of 20 ms (only a little more than a frame) is unacceptably long for live playing. It's on the same order as the amount of time it takes for a piano key to fully depress after being struck firmly.

According to that keyboard latency page linked earlier, humans can detect as little as 2 ms (0.12 frames) of lag, and perceived lag does make you worse at what you're doing.

I said "almost meaningless", not "imperceptible". There is no hard threshold where there is too much lag, everything depends on purpose.

1. 2ms of lag is not the same problem as 100ms of lag. They should not be equivocated.

2. Musical instrument applications are a different purpose with different latency constraints. The games we're talking about are universally synching to a frame, and gameplay is a reactive feedback loop. A piano is not this.

Under fairly ideal conditions, you have an average of ~7ms (half a frame) of input lag before your input is read by the game. Then there's a whole frame before it begins to display ~15ms, then there's the time until the part of the screen you want to see the change on displays, maybe another ~7ms average. On top of this basic ~30ms, you then have ~100-300ms of human reaction time completing the feedback loop.

An extra 2ms on 130ms is a very small effect. Certainly measurable by scientific methods, but still very small. 15ms makes a difference, but how much? Not very much compared to 100ms, which can easily happen on an unluckily configured PC/peripheral/monitor combo. That scale of latency is worth addressing. Less than 1 frame? I really don't care. Magnitude is relevant.

Playing a musical instrument through ASIO is a different act entirely, and most of the problem you're describing is granularity rather than latency. Having a rhythmic input quantized to the size of the buffer jitters it very noticeably. If you simply added a few ms of latency without being quantized to the buffer, the effect of latency alone is much more subtle on the scales we're talking about. 15ms quantization is pretty severe, 15ms much less so, at least in a musical context. 15ms is the same amount of latency you get by playing along with a drummer who is 5 metres away. 15ms is not even a single wavelength of a moderately low tone.

Playing music is not about making split second feedback reactions against visual stimulus, it's about doing something in precise regular rhythm and timing. Again there is a problem of conflating two separate but correlated problems, because software used with ASIO doesn't usually have much ability to deal with granularity as a separate issue.

93143 wrote:

(SA-1 stuff) Now, byuu is sometimes a bit hyperbolic about SNES emulation issues, but that doesn't sound trivial to me.

Well, okay there's a lot going on there, I will grant that this synchronization is probably an easier problem to solve in the FPGA domain, at least when ignoring all other issues. I think I said as much in an earlier post already.

I would still argue that it's not an insurmountable problem and eventually a better solution could be found, but that begs the question whether it's worth solving. Another quote from just below one of the quotes you took:

byuu wrote:

So essentially, yes, bsnes' SA1 core is cycle accurate. But Snes9X's SNES CPU and SA1 cores are both opcode-based, and to my knowledge it does not break any games.

Essentially we're talking about passing a hardware test ROM. Yes, it's a measurable difference in that scientific context. Does it make any difference to the visual our audio output of any of the relevant games?

Byuu is going down a line of research, testing every minute difference that can be accounted for to see if it makes a difference. Usually the first way of implementing anything like this is going to be deliberately inefficient. The goal is to get a test running, not make it efficient. The initial point is finding the magnitude of its effect.

So the question of whether it could be more efficient is a bit moot. If this never demonstrates an effect that we can see in a game we want to run... how much engineering effort do you want to put into making this feature we don't need faster or more efficient? There's a call to action missing here.

I'm not saying that byuu is slouching at all. What I'm saying is that this particular thing probably could be solved more efficiently than byuu's current solution (there's always a faster way). The real question is whether there is a valuable point in doing so vs. how much work it would take. I'm sure there's been effort already to address it, but ultimately how much it affects things is very relevant when choosing what to spend time on. If it made Yoshi explode, it'd get solved faster.

If the only difference is "I can write a test that determines this is not a SNES", well there's a bunch of ways an FPGA system will fail too if you deliberately write new software to expose such a thing (and some of those things are easier to emulate). There are barriers here that are simply not practical to cross on either side of this problem. There's always another irrelevant test you can write.

Re: What do you think of FPGAs?
by tepples on 2019-08-10 (#241595)

I'm agreeing with what you say about 2 ms latency vs. 100 ms. To put it in context, 2 ms is about what you get with the scaler in a Hi-Def NES or Analogue Nt mini, as the circular buffer in Hi-Def NES block RAM holds that many scanlines' worth of pixels. I'm also agreeing with what you say about byuu getting it right first and fast later.

But one caveat related to bidirectional synchronization between S-CPU and SA1:

rainwarrior wrote:

I would still argue that it's not an insurmountable problem and eventually a better solution could be found, but that begs the question whether it's worth solving. Another quote from just below one of the quotes you took:

byuu wrote:

So essentially, yes, bsnes' SA1 core is cycle accurate. But Snes9X's SNES CPU and SA1 cores are both opcode-based, and to my knowledge it does not break any games.

It depends on whether "the relevant games" is a closed class (licensed games only) or an open one (headroom for compatibility with homebrew yet to be released). Without any sort of warning from the emulator that a game is entering into dangerous territory sync-wise, a homebrew dev may easily end up accidentally relying on some behavior difference between hardware and the primary emulator. It'd be like the NES from when NESticle-only homebrews and ROM hacks were common or Super NES now when Snes9x/ZSNES-only Super Mario World hacks are still common. This is why BGB, a Game Boy emulator, includes a bunch of warnings that the user can turn on to break into the debugger whenever the CPU performs a read or write that doesn't have the expected effect.

Attachment:

bgb_exceptions.png [ 5.81 KiB | Viewed 2228 times ]

I imagine what may be going through at least one reader's mind: "Then make one emulator that's real-time and another that's cycle accurate. Test a homebrew game under development for technical compliance in the accurate emulator, and test for play balance in the fast one." But relying solely on automated tests for technical compliance, with play balance testing being completely separate, has its own drawbacks. One is test coverage: extensive play balance testing may cover more obscure code paths that the programmers accidentally failed to test for technical compliance.

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-10 (#241599)

The topic is FPGA vs Emulator, and the question was whether it can run accurately on an RPi. Demanding that developer tools also have to run at full framerate on an underpowered RPi is a bit out of hand, IMO. That's way outside the scope of what FPGA are even for. It's an inappropriate comparison.

If you want to test for hardware compliance in the way you're describing, neither an FPGA or an Emulator is a sufficient test, and for this purpose FPGAs are very much a worse tool for most aspects of the work.

Emulators can have integrated debuggers, breakpoints, watch for various conditions, scripting, etc. This is just not going to happen on an FPGA clone. That's a great thing about emulators like byuu's: you can throw extra CPU around to test edge cases that would be completely impractical to verify on hardware.

In terms of accuracy FPGA clones are at best no better than the most accurate emulators, all other factors ignored. As a software development tool they are worse.

In terms of efficiency for playing games accurately, they might be able to function with lower power consumption but that's a little bit apples to oranges considering all the other hardware involved. Again, the primary purpose of the FPGA clone is that it's compatible with your cartridges and other original hardware.

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-10 (#241602)

tepples wrote:

Then make one emulator that's real-time and another that's cycle accurate." ... But relying solely on automated tests for technical compliance, with play balance testing being completely separate, has its own drawbacks. One is test coverage: extensive play balance testing may cover more obscure code paths that the programmers accidentally failed to test for technical compliance.

This is not solved by FPGA clones, and you're taking the case in the discussion way out of proportion.

For broad testing the best coverage is by having more coverage. Having something that runs in emulators will get a lot more people to play it. A compromised emulator in 1000 hands is a better than an accurate one in 3.

Broad testing is most likely to catch logical errors in programming. Bugs due to subtle SA-1 timing differences would be extraordinarily unlikely to come up in that kind of testing only and not in the short term. Yeah, in some absolute sense it "could" happen. A million monkeys typing shakespeare and what not. In the real world it's just not very significant. Proportion is important. FPGA clones are great to get testing on, but I think the idea that they will usefully catch some specific class of bug that emulators can't is way off base.

This is also ignoring the ways in which it's easier for an emulator to be accurate than an FPGA. It is only in some particular issues like correlating timing that it seems to be easier... and even then often it isn't, really. Often the accurate thing is still very "real-time" viable, and it almost feels a little bit propagandist to start throwing around terms like that to describe the issue. Accuracy and CPU load aren't a 1:1 thing, and many things that are accurate are perfectly efficient to implement, they just take the right knowledge. A lot of things that make various emulators high-CPU have nothing to do with accuracy. That term just doesn't apply very cleanly to anything in the real world except extremes like Visual2A03. Even when talking about higan it's application is pretty muddy.

Re: What do you think of FPGAs?
by 93143 on 2019-08-10 (#241608)

rainwarrior wrote:

93143 wrote:

But could you do pixel-by-pixel?

Maybe, but this is a bit ludicrous?

Of course, but the claim I was responding to was that an emulator could have no latency.

There are two issues here. One is the literal no-latency case, which, while not obviously impossible, is certainly absurdly difficult to pull off and not really necessary even if you could. The other is the matter of how low you can reasonably get the latency, and whether it matters, vs. what you can do with an FPGA.

Quote:

2ms of lag is not the same problem as 100ms of lag. They should not be equivocated.

You mean equated, I hope, or something along those lines. Equivocation is a different word entirely, and should not be used lightly in a debate...

The point is that latency much smaller than a frame is in fact perceptible, and may even affect performance at certain tasks. It seems reasonable to extrapolate that the difference between 1.7 and 2.7 frames of end-to-end lag is not irrelevant even in a practical sense. You will feel it, and it will affect gameplay.

Quote:

On top of this basic ~30ms, you then have ~100-300ms of human reaction time completing the feedback loop.

It's not just about a feedback loop. It's about timing your actions precisely, which requires much finer control than that. The PIO I've experienced in emulated Super Mario Kart is an egregious worst case; there are a lot of ways the gamefeel can get worse before it reaches that level.

The human brain compensates for its own signal acquisition and processing lag. You can't just add it to the lag from the computer, because it doesn't do the same thing to the experience.

Quote:

Playing a musical instrument through ASIO is a different act entirely, and most of the problem you're describing is granularity rather than latency.

It's not just buffer size that can cause problems. I've programmed a digital piano, and playing the key attack noise before the main tone starts really does substantially affect the feel, even if the attack is muted. The immediacy is gone. It's not as bad as jitter, but it's still relevant. 20 ms is not fast when you're talking about feedback from a tactile input device. Being a few metres away from the drummer isn't as big a deal because it's just the sound, the lag is consistent, and nothing else in the band needs to sync that precisely with the beat to sound good to the audience, but if you were the drummer and you experienced that level of lag it would absolutely throw you off.

...it occurs to me that being nearly 20 feet away from the drummer might actually be a problem in some cases. Live performances do suffer when there's enough space separating the performers, and the problem compounds quickly when they aren't experienced at lag compensation...

Quote:

Playing music is not about making split second feedback reactions against visual stimulus

Neither is videogaming, for the most part. Games like Punch-Out are the exception, not the rule. A game like Super Mario Bros. or Mario Golf or Guitar Hero is about performing relatively simple tasks with much finer precision than you could ever manage by just reacting with no plan.

Quote:

Essentially we're talking about passing a hardware test ROM.

There's another side to the argument tepples is making. If anybody ever writes an SA-1 game that depends on it working like real hardware beyond the level of the Snes9X hack, a solution will have to be found or that game won't work properly in emulators that don't go the extra mile.

...

Attempting a summary of what I feel are some of the relevant points:

PC:
- not hardware compatible
- latency depends on system configuration and emulator system interface design (and a publicly-distributed emulator can't rely on certain types of crazy hardware tricks)
- non-expert users may additionally suffer effects such as screen tearing, judder, long audio buffer times, and OS-induced timing glitches
- high power draw
- PC user experience, nothing like a console
- can be very accurate, but may require expensive hardware and creative programming to guarantee full speed

FPGA:
- hardware compatible
- no latency (quoting you here)
- lower power draw
- much more console-like user experience
- can be very (perfectly?) accurate at no additional cost to anyone other than the programmer

Raspberry Pi:
- not hardware compatible
- potentially low latency
- lower power draw
- potentially somewhat console-like user experience once it's set up (no carts though)
- lower accuracy due to limited computational resources

It's not so much "purity" as it is the overall fidelity of the experience. Being able to use real cartridges in a standalone device with minimal latency is a lot more like actually using a NES or SNES, as compared with spending hundreds of watts on laggy software emulation on a giant multipurpose PC that requires you to go to extreme measures to play your games without breaking the law. I'd go so far as to say that if PC emulation was all we had, the true experience of playing a classic console would ultimately be lost, regardless of the accuracy of the emulation.

Which is why the display technology issue bugs me so much. How many kids nowadays have tried Super Mario Bros. 3 in an emulator and gotten the impression that that was how the game actually felt back in the day? How many have tried it on a NES Classic on an HDTV and thought that? You'd think you could solve the problem by using a real NES - but plugged into an HDTV it's not going to be any better, and in fact it's likely to be worse.

Real pre-3D console games put modern gaming to shame with how immediate the gamefeel was. Something has been lost.

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-10 (#241609)

93143 wrote:

Of course, but the claim I was responding to was that an emulator could have no latency.

You said this twice now, but nobody claimed this.

The closest was just dwedit who suggested a game's built-in latency could be taken advantage of with a rewind-and-replay emulator for lower overall latency, and even linked a video demonstration. It's quite factual and demonstrable, but it can do objectionable things to games that don't have built-in delays or have more quickly divergent simulations. Not a complete general solution, IMO, but an interesting solution for some games. (It has some great applications in netplay though.)

93143 wrote:

You mean equated, I hope, or something along those lines. Equivocation is a different word entirely, and should not be used lightly in a debate...

I meant only that I felt that disparate ranges of latency were being conflated in this discussion, and I felt the need to disambiguate them. Wasn't trying to call you a liar, if that's how you took it.

93143 wrote:

The human brain compensates for its own signal acquisition and processing lag. You can't just add it to the lag from the computer, because it doesn't do the same thing to the experience.

I completely disagree with this. The human response is absolutely part of the interaction feedback loop, and very much affects how strong the effect of additional lag is.

Similarly if you wanted to use an FPGA clone with a modern TV rather than a CRT, which is actually one of their big selling points, the amount of lag that TV has is completely relevant to the discussion. If you don't have a low latency TV, the better latency of an FPGA clone won't have nearly as much impact. That's something you need to know about and account for if low latency is a reason that attracts you to an FPGA clone product.

And that's kinda the whole point I was trying to make in a nutshell, in an absolute sense could could say that device A has 5ms more of latency than B, or similarly device A gets one particular aspect of timing better than device B, but OP is asking for opinions, and it's important to know how much these objective comparisons matter. The total amount of latency in the loop is relevant.

Similarly, I think the idea that the Super NT has more accurate SA-1 timing is meaningless for the machine's purpose. It won't make any difference for any game you play, and as a machine it is not built to be a developer tool or a perfect substitute. It's a very compatible machine, and that's it. If you really want to do something that depends on that kind of thing, this product is not good enough for that. Get a SNES and build yourself an SA-1 dev board.

The "but if some homebrewer needs it in the future" is an argument people bring up all the time for various pet emulator features and I don't buy it here. There are infinite homebrew possibilities like this, and almost none of them are going to ever happen in a real release. You and tepples aren't even proposing something someone could do with it, only that it's possible, and probably by accdient? That's not compelling.

The moment someone actually does release something that needs it that people want to play, emulators will adapt. Same deal with the Super NT: it's already had many firmware updates addressing compatibility issues. So has every FPGA clone system. They get fixed when incompatibilities are found, just like emulators do. FPGA clones can't be "perfectly?" accurate any more or less than emulators can.

Re: What do you think of FPGAs?
by psycopathicteen on 2019-08-10 (#241610)

Garth wrote:

psycopathicteen wrote:

A 65816 has 22k transistors? I thought the 6502 had 3500 transistors. If that's correct then how did it jump so much in transistor count?

Wow, I never knew CMOS inherently took more transistors than NMOS.

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-10 (#241612)

93143 wrote:

What is hard GPU sync?

I assume it refers to something like this:
https://docs.microsoft.com/en-us/windows/uwp/gaming/reduce-latency-with-dxgi-1-3-swap-chains

There was a point maybe during or after the DirectX 9 era where the default way of presenting the backbuffer introduced extra latency in favour of higher GPU throughput/parallelism. This kinda crept in slowly through driver interfaces, and it's kinda why in some emulators there's superstition about Direct3D versus DirectX interfaces, sometimes a driver responded differently to either. (Part of the eternal arms race to get that FPS number higher.)

Thankfully they exposed this mechanism more directly and explicitly in DXGI 1.3 (circa 2013?) with "swap chains" which lets you take back that trade. Other APIs like Vulkan have similar capabilities. Video drivers may still override it at their discretion, but at this point I think it does work in most cases.

Re: What do you think of FPGAs?
by Dwedit on 2019-08-10 (#241615)

"What is Hard GPU Sync?" Good question. As far as I know, it's OpenGL only, I'll go peek at what it does.

Okay, here we go... It's this: glFenceSync combined with glClientWaitSync

Re: What do you think of FPGAs?
by Hojo_Norem on 2019-08-11 (#241636)

All this talk of lag with emulators vs FPGA vs original HW fail to take into account one thing (unless I've missed it):

Not all inputs on (most) display devices are treated equally... Unless you game exclusively on monitors.

Ignoring emulators for a moment, lets just compare PLD based re-implementations to original hardware.

Pretty much all original hardware before the Dreamcast era output a video signal that a large percentage of non CRT televisions will treat as interlaced... Add lag for de-interlacing. Next, the television will have to scale the SD image up to the resolution of its panel... Add more lag. Now, because the input is coming from a SD analoge source some TVs won't let you disable all picture processing and that's more lag again.

Add a device like a RetroTink X2 then you can reduce the lag down to under one frame on some displays. Add something like a OSSC and you're nearly golden, expect for the edge cases which gaming has a number of. A game wants to switch between 240p/480i? Enjoy your blank screen for a few seconds as everything re-syncs. You game switches resolution every time you go in and out of the menu? See what I'm getting at?

Now with a PLD re-implementation you can side step nearly all this. At the very least you can build in a scandoubler and you instantly remove one of the larger sources of lag in the chain. Many of the commercial FPGA based re-implementations output upto at least 720p, meaning that most TVs end up using simpler and quicker upscaling mechanics and those that can do 1080p can sidestep the TV's scaler all together. With the right tweaks to the video metadata, most TVs will treat the input like it coming from a computer and disable nearly all its built in lag inducing picture and motion 'enhancing' features.

Re: What do you think of FPGAs?
by supercat on 2019-08-11 (#241637)

rainwarrior wrote:

93143 wrote:

This is kind of a bizarre statement. What is "sub-pixel latency" supposed to mean?

If an Atari 2600 is feeding video to a black and white television set, the time between a store to one of the color registers and a corresponding change in the intensity of signal output by the beam could be less than 280ns (the approximate time required to show one pixel). Typical color television sets of the 1970s and early 1980s would add about a pixel worth of latency in their color decode circuitry. If a game were to include a loop like:

Code:

lp: ; Total loop time 76 cycles==1 line
    lda INP0 ; 3 cycles
    sta COLUBK ' 3 cycles
    ... another 10 repetitions of that, for 66 cycles total
    lda 0 ; Waste 3 cycles
    lda INTIM ; 4 cycles
    bmi lp ; Assume 3 cycles

then pressing the button would cause the beam's output intensity to change between about 3-6us later if the beam would be in the displayed part of the frame at that time.

In one particular weird corner case on the Atari 2600, a pixel can be split between multiple colors, though I don't think this could be very well exploited because the exact effects seem to vary with conditions like temperature. If the screen is programmed to switch color registers at the midpoint of each scan line, the switch isn't synchronized with the same pixel boundaries as everything else. If memory serves, the pixel where the split occurs would be shown as about 1/4 the old color, about 1/4 the logical AND of the old and new colors, and about 1/2 the new color, though as noted this is probably not consistent enough to actually be useful. While I wouldn't fault any hardware recreation that omits this quirk (it was certainly not intentional), a recreation that is trying to match the original hardware as precisely as possible should include it.

Re: What do you think of FPGAs?
by supercat on 2019-08-11 (#241641)

psycopathicteen wrote:

Wow, I never knew CMOS inherently took more transistors than NMOS.

A CMOS gate that produces an inverted function of any combination of AND and OR gates with separate inputs will generally require two transistors (one each P-type and N-type) per input. An NMOS gate will require one N-type transistor per input plus one passive pull-up. As process technologies get smaller, passive pull-ups don't shrink nearly as nicely as active transistors, because they need to be able to dissipate heat. There are a few situations where the passive-pull-up approach can be "win", however.

In NMOS, a 64-input NOR gate would be no problem operating quickly because it would have 64 N-type pull-up transistors wired in parallel with one passive pull-up opposing them. The passive pull-up might need to be larger than for a two-input NOR gate, but not outrageously so. In CMOS, the low-side network would be the same as for NMOS, but the high-side network would need to have 64 P-type transistors in *series*. That might be workable for something like an interrupt-detect circuit which needs to be able to respond rapidly to a rising edge on any of 64 inputs, but wouldn't need to react quickly to a falling edge. In general, however, a 64-input NOR gate would need to be implemented with sixteen four-input NOR gates feeding four four-input NAND gates which in turn feed a 4-input NOR gate. Although that would involve three levels of logic rather than one, the time required for a signal to propagate through a sequence of N transistors in series increases with time proportional to the square of N as N gets large (and for these purposes, anything over 10 would definitely qualify as "large"). If a NOR gate were implemented in one level of CMOS logic, the time required to react to a falling edge on the input might be about 30 times the time required for a 10-input gate to do likewise. Using three levels of logic would require a little more circuitry, but would likely be an order of magnitude faster.

Re: What do you think of FPGAs?
by rainwarrior on 2019-08-11 (#241645)

supercat wrote:

rainwarrior wrote:

This is kind of a bizarre statement. What is "sub-pixel latency" supposed to mean?

In one particular weird corner case on the Atari 2600, a pixel can be split between multiple colors...

Sub-pixel accuracy is very easily accomplished by emulators, that wasn't my question. There are lots of sub-pixel effects like that emulated already for many systems. My question was what the purpose of "sub-pixel latency", i.e. why would you ever need the video signal to physically display that fast? 93143 subsequently explained that they thought someone had said emulators can have "zero latency".

Also, as tepples pointed out, despite Analogue's claim of "zero latency" for their FPGA products, only the original NT (no longer available) had a composite output. The rest have HDMI output only with a very small amount of latency, small enough that I would say is fair to call zero in a causal sense, but 93143 clearly would not. Not really a fair comparison if we're talking about FPGA clones that are currently available.

Re: What do you think of FPGAs?
by 93143 on 2019-08-11 (#241668)

rainwarrior wrote:

93143 wrote:

Of course, but the claim I was responding to was that an emulator could have no latency.

You said this twice now, but nobody claimed this.

Right here:

Oziphantom wrote:

An I could have an FPGA that has 20 frames latency, and an emulator that has none.

That is the post I was replying to. Obviously it's somewhat rhetorical, but I'm not good at letting things by just because of that.

Quote:

93143 wrote:

The human brain compensates for its own signal acquisition and processing lag. You can't just add it to the lag from the computer, because it doesn't do the same thing to the experience.

I completely disagree with this.

Humans perceive our own reaction time as a lag between sensing something and being able to do something about it - a delay between stimulus and responsive action. We do not perceive any delay between doing something and seeing the thing being done. When pushing a button on a controller, the lag between action and perception is automatically eliminated from our consciousness, so we can accurately perceive simultaneity of action and consequence - you feel and hear the button being pushed exactly at the moment you push it. But the lag between that button being pushed and Mario actually jumping is of course not handled that way, so it registers as a delay between action and consequence. Which it is.

These two types of latency have totally different perceptual effects.

Quote:

The human response is absolutely part of the interaction feedback loop

Again, not every game can be reduced to a series of quicktime events. The ability to precisely control a videogame requires the ability to schedule fine motor activities with a speed and precision far beyond human reaction time. And a perceived action-response delay adversely affects the play experience well before it reaches the level of preventing timely reactions to unexpected stimuli.

A long running jump in a Super Mario Bros. game might have a window of only a few frames for a successful maneuver. You can do it because you see Mario approaching the hole and realize that you'll have to jump when you get there. Imagine trying to make a jump like that without knowing you had to jump at all before the input window arrived...

Or take Mario Golf. The gameplay has virtually nothing to do with reaction time at all. You push the button, the cursor starts moving left on the swing meter, you push the button again exactly as it reaches the end, it starts moving right, you push the button again exactly as it returns to the start. Nice Shot! Your inputs basically have to be frame-perfect, and all of it is scheduled ahead of time by watching the cursor move and figuring out exactly when you should move your thumb based on where you predict the cursor will be. Added lag is very noticeable and makes the game harder. In fact, even with a real N64 on a CRT, there's enough lag to be bothersome.

If there's any reaction time involved in Mario Golf, it's in noticing that you botched the power side of the swing and hurriedly compensating with the accuracy side and/or adding spin with the Control Stick. You have the entire traverse of the swing meter to notice the problem and make that decision, and then do a frame-perfect input at the end of it.

Quote:

the amount of lag that TV has is completely relevant to the discussion. ... The total amount of latency in the loop is relevant.

Yes, but it's also important to separate out the causes so they can be thought about and dealt with separately. A computer monitor is actually better for lag than an HDTV, so why does a NES Classic have less lag than Nintendulator? Why is a real NES worse than either of them when using that same HDTV, and is there something wrong with the NES that makes it an inaccurate emulation of itself?

As far as I can tell the OP was asking a philosophical question, not a practical one. He wants to know if FPGAs are a Good Thing in general, not whether he should go buy a particular FPGA console to use with a particular TV.

Re: What do you think of FPGAs?
by ShiningSun on 2019-08-12 (#241690)

My my, it seems like quite a few people got their input in, and I've seen a ton of good and interesting answers so thanks for filling those in! I'll address some of the doubts respecting the original objective of the thread.

Yes, it is correct that I was looking for input regarding the use of FPGAs on devices such as the Nt, AVS or Mega SG and I wanted to know how much impact they had on the market, on people and ultimately how important they are as a whole on the retro-gaming ecosystem in comparison to established platforms like emulators for example.

I am not looking or finding a way to get myself an FPGA device like those as I feel options like emulators and technical docs are ultimately what keeps or preserves older videogames and consoles along with the availability of them being possible to be played on a lot of devices. However, I cannot deny the seemingly important stance that FPGAs devices are taking nowadays and how popular they are getting as more and more they keep getting mentioned on the Internet. So I wanted to know if people felt that these devices ultimately contribute to something or will be just the kind of thing that appears for awhile and then disappears.

Lastly, and this is a bit offtopic, I'd like to agree with rainwarrior's point that high performance on emulators can be achieved with PCs and devices such as the Raspberry Pi. 93143 seems to have countered with the idea that Higan is a very slow emulator and not appropriate for hardware like those, however, there is a very good reason it is like that and is explained very well here http://mgba.io/2017/04/30/emulation-accuracy/ (This is Endrift, the author of the most accurate and one of the fastest GBA emulators to date)

I believe that if emulators are properly optimized, modified and customized to on performance/speed profiles while trying to maintain accuracy then it is entirely possible to do so. The author of this article goes about it very well in-depth. Ultimately, I feel like that as long as the user is able to enjoy their emulated game on the device they choose then they'll be happy and everyone else. Of course not everyone agrees with this point of view but I just wanted to let it be known.

Again, I welcome input for topics like these as I find them rather interesting.

Re: What do you think of FPGAs?
by 93143 on 2019-08-12 (#241735)

byuu has said, IIRC, that higan could be about twice as fast if it were optimized for speed. He has also said (IIRC) that the SA-1 cuts the speed in half...

We'll see where Mesen-S ends up. Currently it's looking like it's tracking the former estimate pretty well, considering what it doesn't bother with.

My perspective is a little unique. I'm not primarily interested in emulating popular games at the moment, because I have a real SNES and a nice CRT TV (and a Super Everdrive, if it comes to that). But I'm developing a game that uses the Super FX as well as a lot of mid-scanline raster shenanigans, and even before I had any Super FX code running, the basic display engine didn't work on anything before higan accuracy core v095. (I also have an SA-1 game planned; at my current rate of progress I will never get to it, but who knows...)

I am a perfectionist. I won't be satisfied until it's possible to obtain a clone console and TV that will work properly with the Super Scope without hacks. I see no reason why the signal chain should need more than a couple of scanlines of latency end-to-end, and a CRT emulation running on the TV should be easily capable of handling a 240p/480i switch just as fast and seamlessly as a real CRT. This may very well never happen*, and most people won't care. That's fine. I do care, and that's why I plan to keep a real SNES and CRT around...

*Especially the Super Scope bit. It sounds like it relies on the raster spot being very bright, and I'm not sure it's healthy for any current flatscreen technology to try to do that.