Topic: ASIO bug: driver failure when using multiple HDSPe AES PCIe cards
I'm an audio programmer here at Naughty Dog in the US. We're in the process of upgrading our DAWs, and we're switching to dual HDSPe AES PCIe cards under Windows 10 as our primary digital AES I/O interface. We've been trying to get this working since last year, and we've almost got all our issues ironed out, but we are still having a serious problem with random RME ASIO driver failures. We reported this issue to Synthax back in October of last year but they were not able to help; they said they've never heard of this issue and can't reproduce it. So, I thought I would try posting here to see if RME can directly help with the problem. Recently, we believe we've been able to devise a 100% reproduction case, so hopefully RME engineers can now fix this.
Our configuration is as follows: A Windows 10 PC with dual HDSPe AES PCIe cards and an nVidia 1060 display adapter, 64GB of RAM, an Intel Xeon E5-2690 CPU, a 1TB SSD system drive, and a 6TB RAID using the motherboard's Intel RST driver. We have used Pro Tools, Nuendo, and Reaper on this configuration. The first RME card is the master system clock; the second one is clocked by the first using the external BNC connector (we've tried the internal sync cable, no difference). We normally run at 96kHz, though the bug also happens at 48kHz. We have both cards in multichannel mode, with WDM enabled for all I/O. Windows Speaker devices are enabled for outputs 9-16 on the second card; these are looped back to the inputs via TotalMix FX, so that we can monitor and capture any OS audio from any app (web browser, sound library, etc.) directly into the DAW. We have other lower numbered inputs used for various other input sources, all clocked against the first RME card (though the bug happens even if not inputting anything, i.e. with all other inputs physically disconnected). The main outputs are the first card's outputs 1-8, which are sent to our speaker monitor controller.
The bug is that after some random amount of time, for no apparent reason - even if just left idle - the ASIO outputs for the first card's I/O simply stop working. This is without warning; there is no error dialog or system event that indicates anything is wrong. The first card's 16 outputs simply stop producing audio. This is also confirmed by looking at the meters in TotalMix - they are dead. This failure happens no matter what ASIO software you are using. We've also run soak tests and noticed that this can happen even if no ASIO software is running: i.e., boot the computer, let it sit idle for a week, then try to run an ASIO app, and the first 16 outputs will not work right from the start. The second card's ASIO I/O continues to work. When this failure happens, it is always the first logical card that fails. This is true even if the cards are swapped in the RME preferences; in that case, the other card will now be the first card, and it will lose its ASIO I/O.
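For anyone who wants to automate a soak test around this failure signature, here is a minimal sketch in Python. It only encodes the pattern described above (first logical card's 16 outputs silent while the second card's 16 outputs still carry signal). The function name, the threshold, and the idea of feeding it a list of 32 per-output peak levels are assumptions for illustration; how those peaks are actually measured (for example, by playing a test tone on every output and reading levels back) is left to whatever tooling is available.

```python
from typing import Sequence

# Each HDSPe AES card exposes 16 outputs, as described in the post.
OUTPUTS_PER_CARD = 16


def classify_outputs(peaks: Sequence[float], silence_threshold: float = 1e-6) -> str:
    """Classify 32 per-output peak levels (hypothetical measurement input).

    peaks[0:16]  -> first logical card, peaks[16:32] -> second logical card.
    Assumes a test signal was being played on every output when measured.
    """
    if len(peaks) != 2 * OUTPUTS_PER_CARD:
        raise ValueError("expected peak levels for exactly 32 outputs")

    first_active = any(abs(p) > silence_threshold for p in peaks[:OUTPUTS_PER_CARD])
    second_active = any(abs(p) > silence_threshold for p in peaks[OUTPUTS_PER_CARD:])

    if first_active and second_active:
        return "ok"
    if not first_active and second_active:
        # The signature reported here: first card dead, second card still working.
        return "first_card_dead"
    if not first_active and not second_active:
        # Nothing measured anywhere: the test signal may not have been playing.
        return "inconclusive"
    # Second card silent but first active: not the pattern described in the post.
    return "unexpected"
```

A soak-test loop could call this periodically and log a timestamp the first time it returns "first_card_dead", which would give a precise time of failure to correlate with other system activity.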
We've tried swapping cards, swapping parts, and changing drivers (this has happened on every RME driver since October, including the latest 4.18 drivers); nothing makes a difference. The frequency of the bug is somewhat random: it can take almost a week to happen, or it can happen multiple times in one day. When we first encountered this, we were working around it by rebooting the PC, which resets the driver and fixes the problem. The more we started using the PCs, the more frequently this would happen, until it got so frequent that rebooting is killing our workflow.
Some interesting points to note are that we have a few PCs that don't need 32 channels of I/O and are only using one RME AES card. Those PCs are rock solid - they are not affected by this bug. Those PCs are identical in every way to the ones that are failing: same motherboard, CPU, drives, and video card - the only difference is they have 1 RME card instead of 2. So the bug seems unique to the scenario of using 2 RME cards.
We discovered an inelegant workaround that often saves us from rebooting: if, when this bug happens, you exit all audio applications, and then run a program called ASIOSigGen (you can find this on the internet), you can use the app to force both cards to change sample rates. If you do this, then change the sample rate back again, it seems to reset the ASIO driver such that it works again. For example: if you are working at 96kHz and the bug happens, close all the DAW applications, run ASIOSigGen, pick 48kHz, wait a few seconds, then pick 96kHz, wait a few seconds, exit the app, restart the DAW software, and now it's working again.
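The toggle sequence above could be scripted roughly as follows. This is only a sketch of the ordering and timing: `set_rate` is a hypothetical callable standing in for whatever actually changes the sample rate on both cards (we do it by hand in ASIOSigGen), and the 3-second settle time is a guess at "a few seconds", not a measured value. All DAW applications still need to be closed before running it and restarted afterwards.

```python
import time
from typing import Callable, List


def toggle_sample_rate(
    set_rate: Callable[[int], None],
    working_rate: int = 96000,
    temp_rate: int = 48000,
    settle_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Switch to a temporary sample rate, wait, then switch back.

    set_rate is a hypothetical hook that applies a rate to both cards.
    Returns the list of rates applied, in order.
    """
    if temp_rate == working_rate:
        raise ValueError("temp_rate must differ from working_rate")

    applied: List[int] = []
    for rate in (temp_rate, working_rate):
        set_rate(rate)          # e.g. 48000 first, then back to 96000
        applied.append(rate)
        sleep(settle_seconds)   # give the driver a few seconds to settle
    return applied
```

Passing the sleep function in as a parameter keeps the sequence easy to dry-run without waiting, for example `toggle_sample_rate(print, settle_seconds=0)` just prints 48000 and then 96000.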
Although this is an improvement over rebooting, this is still not ideal. However, it gets worse: we've recently started experimenting with Reaper, and we've found a normal workflow pattern in Reaper that causes the RME ASIO driver to fail 100% of the time (at least, for us). That is good news, I guess - hopefully that will help RME engineers identify the bug and fix it. It's bad news for our sound designers using Reaper because this failure is now much more frequent.
Here are the steps to reproduce the 2-card ASIO bug 100% of the time in Reaper:
1. Restart the computer
2. Load up a blank session in Reaper (latest version)
3. Select a track in Reaper
4. Go to “Actions” -> “Show actions list”
5. Type inside the “filter” line: “move tracks to subproject”
6. Select “Track: Move tracks to subproject” in the actions list and press “Run” at the bottom of the window
We really want to get this fixed because it is impeding our ability to finish upgrading our DAW hardware; none of the sound designers here want to use these new PCs until this bug is fixed. We've got 1 user here who's been suffering with this bug since last October and another one who's been dealing with it since April. People are starting to complain that maybe we should start looking at other audio hardware and I don't want to do that if I can avoid it because the RME has a unique feature set that really works well for us. I need to roll out 6 more of these machines, so any help RME can give in fixing this would be appreciated.
As I mentioned, I'm an audio programmer with access to developer tools, and I am willing to gather whatever diagnostics or forensics would help track the problem down.
Any help would be greatly appreciated. Thanks!
- Jonathan