No, Waedi just expresses himself a little strangely sometimes.
Thunderbolt 4 is of course not legacy, but that doesn't really help. Implementing TB4 is too complex for what a recording interface actually has to do. There are also compatibility problems with TB1 and TB2.
More seriously, Intel no longer produces TB1-3 chips. The manufacturers will use what is still available, and after that, Thunderbolt on recording interfaces will probably be a thing of the past. At least that is the case from RME's point of view. That is probably what Waedi meant, but he wrote it a bit too generally.
I have already answered your questions about performance, but you seem to have missed the point.
Let me put it another way. A recording interface is not an accelerator for audio processing. It's not like a graphics card where you stuff a bigger model into the computer and suddenly a miracle happens and everything is faster. It doesn't work like that. Even with a graphics card, the CPU has to be powerful enough to deliver data to the card quickly enough, otherwise the CPU is the bottleneck.
When it comes to audio processing, neither Windows nor macOS is a real-time operating system in which audio processes can really be prioritised accordingly. To ensure data integrity, low-level routines have absolute priority over everything else, including the prioritisation of processes by the process scheduler.
Low-level routines means drivers. How long a driver runs on a CPU core is entirely up to the driver itself; only the driver can release the core. There are programming conventions for the maximum time this should take, but some drivers are simply badly written, or the company wants to shine in benchmarks. Such "bad drivers" cause the so-called DPC latencies: they can block a CPU core for too long. If important audio processes are scheduled to run on that core, the result can be audio dropouts.
It is not only important that your system has a fast CPU. The system has to be excellent in many more respects if you want to run performance-hungry VST/VSTi plugins with small ASIO buffers at the same time.
It's like Formula 1: you don't win by only looking for the most powerful engine; everything needs to perform.
So, again: a recording interface is not a recording or audio accelerator. A system with bad drivers or a bad BIOS/mainboard chipset will run into problems at low buffer sizes sooner than a good system.
The more channels the PC has to process and the higher the sample rate, the more stress you put on the system.
Every channel has to be transferred constantly by your system over PCIe or USB, whether the channel is in use or not (there are only a few exceptions to this rule of thumb).
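To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes every channel is transferred as a 32-bit (4-byte) sample whether it is in use or not; the container size and the channel counts in the loop are assumptions picked only for illustration, not figures from any RME data sheet.

```python
def stream_bytes_per_second(channels: int, sample_rate_hz: int,
                            bytes_per_sample: int = 4) -> int:
    """Raw audio payload per second for one direction (record or playback).

    bytes_per_sample=4 is an assumed 32-bit container, not a documented value.
    """
    return channels * sample_rate_hz * bytes_per_sample


if __name__ == "__main__":
    # Hypothetical channel counts, chosen only to show how the load scales.
    for channels in (8, 32, 64):
        for rate in (48_000, 96_000, 192_000):
            mb = stream_bytes_per_second(channels, rate) / 1_000_000
            print(f"{channels:3d} ch @ {rate // 1000:3d} kHz: "
                  f"{mb:6.2f} MB/s per direction")
```

The point is only that the load grows linearly with both the channel count and the sample rate, and that it has to be delivered on time for every single buffer.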
You want to compare UCX with UFX III.
The UCX is an older interface with slightly slower converters, but differences in converter latency are tiny compared to the latency added by larger ASIO buffers. Check the RME manual, chapter 39.2, Latency and Monitoring.
If you compare this between the UCX and the UFX III you will see that there is not much difference.
Now compare the UCX USB driver with the UFX III MADIface driver: the MADIface driver allows an ASIO buffer size of 32 samples, while the USB driver has a minimum of 48 samples. But here too, this is not a big difference. And to be honest, nobody would really run constantly at a buffer size of only 32 samples. It's nice to see that it is possible, but for real work you use 64 or 128 (when working with VSTi). If you are only recording, better use large ASIO buffer sizes for safety. For mixing and mastering it depends on the project; if it is not too big, you may work well with a buffer of 128, 256 or 512 samples at single speed.
My projects are not that big. I run at 64 samples nearly all the time for normal operation, and this on a 10-year-old system with an E5-1680v4. But all components are carefully selected and the mainboard is a good server mainboard. With this system everything works smoothly. But overly CPU-hungry VST/VSTi can bring it down, so I avoid such tools.
If I were doing something critical I would definitely increase the buffer size, and only keep it at lower values when playing through a virtual amp. But even there, with an ASIO buffer size of 128 samples I stay under 10 ms RTL, which makes smooth playing possible.
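For anyone who wants to check the buffer arithmetic, here is a minimal Python sketch. It uses a simplified model that I am assuming here: one input buffer plus one output buffer, with an optional lump sum (`extra_ms`, a hypothetical parameter) for converters and driver safety offsets. The real RTL reported by your driver will be somewhat higher than the buffers-only figure.

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Time that one ASIO buffer represents, in milliseconds."""
    return buffer_samples / sample_rate_hz * 1000.0


def rough_round_trip_ms(buffer_samples: int, sample_rate_hz: int,
                        extra_ms: float = 0.0) -> float:
    """Simplified estimate: one input buffer plus one output buffer.

    extra_ms is an assumed lump sum for converter latency and driver
    safety offsets; it is not taken from any manual.
    """
    return 2 * buffer_latency_ms(buffer_samples, sample_rate_hz) + extra_ms


if __name__ == "__main__":
    rate = 48_000  # single speed
    for size in (32, 48, 64, 128, 256, 512):
        one_way = buffer_latency_ms(size, rate)
        both = rough_round_trip_ms(size, rate)
        print(f"{size:4d} samples: {one_way:6.2f} ms per buffer, "
              f"about {both:6.2f} ms in + out")
```

At 48 kHz the step from 32 to 48 samples is only about a third of a millisecond per buffer, and 128 samples comes to roughly 5.3 ms for the in and out buffers together before converter and driver overhead, which fits with staying under 10 ms RTL.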
Regarding my system, I have a synthetic benchmark; it performs with the UFX, UFX+ and UFX III as well as with a RayDAT PCIe card.
https://www.tonstudio-forum.de/blog/Ent … cks-de-en/
But every system behaves differently ... and no workload is really the same.
BR Ramses - UFX III, 12Mic, XTC, ADI-2 Pro FS R BE, RayDAT, X10SRi-F, E5-1680v4, Win10Pro22H2, Cub14