Perhaps this is part of the fundamental misconception.
Your channel strip is just the hardware input channel up to its fader. The hardware output section is in no way part of the channel strip. It's more comparable to the subgroup section of a console (even though the signal doesn't go to the master mix, but only to the individual output). There is in fact no compelling direct logical connection between a hardware input and the corresponding output, definitely not as part of a virtual channel strip.
The output is in no way part of the channel strip? I'm not really following...
With an INLINE console, you have multiple inputs but you are only ever listening to one input. When you're recording, you are listening to the mic preamp input; when you are mixing, you are listening to the playback.
Now, is a virtual submix mixer/matrix the exact same as a console? No, because digital allows all sorts of different features and topologies that wouldn't be feasible/practical/economically viable in an analog console.
That doesn't mean their core function should be seen as so different that someone who is using them has to feel as if there is no correlation in function and session organization.
To avoid imbalance when switching e.g. an input between stereo and mono, you'd have to automatically apply the same to the output, which would make no sense at all. You'd also have to open the EQ or settings panel at the output if you open it at the input to preserve visual balance. To top it off, the EQ panel is wider than the settings panel, and the software playback channels don't even have the EQ panel.
If I had to choose between a UI thing making sense, and keeping everything symmetrical... I would choose keeping everything symmetrical... That could be accomplished many ways...
Why open sideways for 2-5 controls?

If you don't want to do a vertical link where controls are shown for the entire symmetrical column (even though it could have uses in certain situations)... blank space.

I don't have EQ/FX with the AoX-D... So it's not really important to me, but you could flip from a large fader to a small fader and pack an EQ within the format of the module without opening sideways.
When it comes to minimized view... yes, I would probably just do this for entire columns... because if I'm needing to minimize, it's to gain screen real estate... and keeping things symmetrical takes priority, even if that inevitably means minimizing things I might not need/want to... It doesn't matter, because the VUs are still visible, and if you added a horizontal fader line to the minimized VU view that users could still grab onto, it would be even less of a problem.
And if on a multichannel interface, you have to scroll sideways to access channels beyond whatever your screen width can show, would the hardware output section have to move along? But then, if there are control room channels, the hardware output section isn't even as wide as the input section... Problems, problems. There's no way this would work out the way you imagine it should.
Yes. Don't understand what you're saying in your second sentence.
Use channel names and colours...
Looking at pictures of users' setups, and no offense to them, they've tweaked their setup and memorized it... It's a total mess even with colours.
And here's your Marian Beast. An entirely different thing. Indeed it is one channel strip from top to bottom, there is nothing like the software playback and hardware output sections in Totalmix. You choose whether the channel is using the input signal or the playback somewhere at the top. Absolutely no comparison, and I'm not saying there's anything wrong with the Marian mixer. It's just different. You can route signal to any physical output, but apparently you don't get another full set of EQ, dynamics, etc. at each individual output (or pair), like in Totalmix.

Yes, and I'm in talks with the developer about their exclusion of an Input/Playback/Output submixer view and the importance of it. The Beast Mixer, the way it is set up, is sort of a completely separate virtual space you have to route to.
What I meant to point out as similarities between the Beast and the Lynx Mixer/Ncontrol is a.) they are static mixers, but more importantly... b.) how they both handle dual mono/stereo channels in stereo blocks.
And Kubrak above also pointed out what, at this point, is perhaps the most important part of my contention.
Just... What about OP's idea of key modifiers that would temporarily link two mono faders/panners? I guess that is what OP is mainly after. I would call it Instant Fader Group. OP is used to having all faders presented as mono, but being able to use them as stereo if desired.
Look at the video that OP linked in #16 (https://imgur.com/vZBmdtF). My rough description of how it could work in TM is in #82.
So, what about adding to mono faders a feature to operate them temporarily as stereo ones? Just fader and panning. And maybe other parameters, like e.g. gain...
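To make the "Instant Fader Group" idea a bit more concrete, here is a minimal Python sketch of how a modifier key could temporarily link a mono fader to its odd/even neighbour. Everything in it is hypothetical: `MonoFader`, `move_fader`, the `link_modifier` flag and the -65 dB to +6 dB range are assumptions made for illustration, not anything taken from TotalMix or any other mixer. Panning and gain are left out to keep it short.

```python
from dataclasses import dataclass

# Assumed fader range for this sketch only; not taken from any real mixer.
MIN_DB = -65.0
MAX_DB = 6.0


@dataclass
class MonoFader:
    """Hypothetical mono channel fader."""
    name: str
    level_db: float = 0.0


def move_fader(faders, index, delta_db, link_modifier=False):
    """Move one mono fader by delta_db.

    While the (hypothetical) modifier key is held, the neighbouring
    fader of the same odd/even pair moves by the same relative amount,
    so any level offset between the two is preserved until one of
    them hits the end of the range and gets clamped.
    """
    targets = [index]
    if link_modifier:
        partner = index ^ 1  # pairs are (0,1), (2,3), (4,5), ...
        if partner < len(faders):
            targets.append(partner)
    for i in targets:
        new_level = faders[i].level_db + delta_db
        faders[i].level_db = max(MIN_DB, min(MAX_DB, new_level))


# Usage: channels stay mono, the link only exists while the key is held.
channels = [MonoFader("AN 1"), MonoFader("AN 2", level_db=-3.0)]
move_fader(channels, 0, -6.0, link_modifier=True)  # both move down 6 dB
move_fader(channels, 1, 1.0)                        # only AN 2 moves
```

The point of the sketch is that nothing about the channel layout changes: the faders remain individual mono faders, and the stereo behaviour is just a temporary property of the gesture.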
I'd prefer that all channels be "stereo" channels with dual faders, that can be unlinked to be dual mono... Instead of condensing/disappearing multiple channels into a single fader....
Guys, it's so confusing and messy.
Ncontrol is a better comparison than Beast

But, let's get to the core of this whole discussion. I'm not really sure what anyone here is talking about, or if anyone here actually does modern hybrid recording/production in a commercial setting with TMFX, or if everyone here is a home user?
Here is a first draft, I am in the process of reconfiguring my studio into 32 channels...

https://imgur.com/iEQPhqS <- click for bigger image
I don't really understand how people can say there is no correlation between Playback and Outputs, or Inputs and Outputs, or a mix of both even when it comes to headphone mixes from the DAW and Inputs of what is being recorded...
It's late here, long day... apologies for the scattered reply.... But.... What in the world are you guys talking about? I sincerely feel like I've fallen into a parallel inverted universe...