I have done it. It didn't support forwarding et al, but it certainly mattered *a lot*; it was cool to see the awesome speed increase. Of course the way I did it was a huge hack, so it is definitely not fit for release. However, DOS3, on which I am working (well, not right now, but it's a project of mine), will feature decent display routines.
By the way, while the speed is unbearable at 3.5MHz, at 7MHz (which I use) it is generally just acceptable. Disk access is definitely not the limiting factor; the reason is that the BDOS changes the stack and does an interslot call for *every* character output to the screen. In DOS2, a string output is *exactly* the same as a loop outputting separate characters to the console (string output isn't even implemented in the kernel!). On top of that, it also does other stuff which could be optimized as well.
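To make the per-character cost concrete, here is a toy model in Python. It is only a sketch and not actual BDOS code: the class and method names are made up for illustration, and the counters simply stand in for the stack change and interslot call described above.

```python
class ToyBdos:
    """Toy model of per-character console output (all names hypothetical)."""

    def __init__(self):
        self.stack_switches = 0
        self.interslot_calls = 0
        self.screen = []

    def console_output(self, ch):
        # Every single character pays the fixed overhead:
        # one stack change and one interslot call.
        self.stack_switches += 1
        self.interslot_calls += 1
        self.screen.append(ch)

    def string_output(self, text):
        # Modelled as described in the post: nothing more than a loop
        # over the single-character routine, so there is no saving.
        for ch in text:
            self.console_output(ch)


bdos = ToyBdos()
bdos.string_output("HELLO")
print(bdos.interslot_calls, bdos.stack_switches)
```

In this model the overhead grows linearly with the string length, which is why a real block-level string routine would help so much.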
I doubt it is command.com which is not optimized. Optimizing the whole thing by patching it is not exactly easy. The best way would probably be to extend MSXDOS2.SYS; however, you would need to write a fully functional character display routine using the current BIOS system variables. That is actually not *really* hard; the tough part is implementing the DOS2 functionality on top of it, the redirection stuff for example. In other words, it is quite a hard job, and it actually has to be done at the DOS2 kernel level.
In DOS3 I will use buffers in RAM from which I update the screen (I need them anyway since it's multitasking, so there will be more than one console at the same time), and on the MSX2+ it will use a VDP command for scrolling in screen 0. For string outputs I will set the VRAM position only once and then output all characters to the console one after another. Also I will make a more direct, faster interface between the several parts of the OS (without requiring interslot calls for every character hehehe), and lots, lots of block-level functions.
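The buffered approach can be sketched roughly like this, again as a Python toy and not the DOS3 design itself. `FakeVdp`, `Console`, the 80x24 size and the name table base of 0 are all assumptions made for the example. The only hardware behaviour it relies on is that the VDP auto-increments its VRAM address after each data write, so a whole string needs just one address setup.

```python
class FakeVdp:
    """Stand-in for the VDP: counts how often the write address is set."""

    def __init__(self, size):
        self.vram = bytearray(size)
        self.address = 0
        self.address_setups = 0

    def set_write_address(self, address):
        self.address = address
        self.address_setups += 1

    def write_byte(self, value):
        self.vram[self.address] = value
        self.address += 1  # models the VDP's address auto-increment


class Console:
    """One console with its own RAM buffer (hypothetical layout)."""

    def __init__(self, width=80, height=24):
        self.width = width
        self.buffer = bytearray(b" " * (width * height))
        self.dirty = None  # (start, end) range not yet copied to VRAM

    def write_string(self, column, row, text):
        start = row * self.width + column
        data = text.encode("ascii")
        end = start + len(data)
        if end > len(self.buffer):
            raise ValueError("string does not fit in the console buffer")
        self.buffer[start:end] = data
        if self.dirty is None:
            self.dirty = (start, end)
        else:
            self.dirty = (min(self.dirty[0], start), max(self.dirty[1], end))

    def flush(self, vdp, name_table_base=0):
        # Set the VRAM position once, then stream the whole dirty range.
        if self.dirty is None:
            return
        start, end = self.dirty
        vdp.set_write_address(name_table_base + start)
        for value in self.buffer[start:end]:
            vdp.write_byte(value)
        self.dirty = None


vdp = FakeVdp(80 * 24)
console = Console()
console.write_string(2, 1, "HELLO")
console.flush(vdp)
print(vdp.address_setups)
```

Because each console only touches its own RAM buffer, a background console can keep writing without going near the VDP; only the visible one needs to be flushed.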
~Grauw