The second problem was performance - all the backend services were doing screen management over a network, which was already slow.  Add to this that the APIs were exceedingly primitive.  Like I said, I could only add or remove one row at a time from the table I was responsible for, and with network latency each command took about a quarter second to execute.  Not a problem when you're monitoring 5 or 10 objects, deadly when you're managing a couple hundred (it took on the order of 5 minutes to update the table if you were managing hundreds of reports).  Under realistic scenarios the whole thing just fell down hard.  We suggested to the Boeing team that they add calls to perform bulk updates (or at least a call to delete all the rows in the table), but those suggestions were dismissed as being "too hard".  
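To put rough numbers on the row-at-a-time problem, here is a minimal sketch of the latency arithmetic. Only the quarter second per command comes from the description above. The function names, the two-commands-per-row assumption, and the idea of a single bulk call are hypothetical. The real API and the real number of commands per refresh aren't stated, so the commands per row are left as a parameter.

```python
# Rough latency model for row-at-a-time versus bulk table updates.
# Only the ~0.25 s per command figure comes from the post; everything
# else (names, commands per row, the bulk call) is an assumption.

SECONDS_PER_COMMAND = 0.25  # per-command network latency quoted in the post


def per_row_seconds(rows, commands_per_row=1):
    """Time to refresh a table when every row needs its own command(s)."""
    return rows * commands_per_row * SECONDS_PER_COMMAND


def bulk_seconds(commands=1):
    """Time to refresh the same table if a (hypothetical) bulk call existed."""
    return commands * SECONDS_PER_COMMAND


if __name__ == "__main__":
    for rows in (10, 200):
        # Assumption: each row is removed and re-added, so two commands per row.
        slow = per_row_seconds(rows, commands_per_row=2)
        print(f"{rows} rows: {slow} s row-at-a-time vs {bulk_seconds()} s bulk")
```

The point is that the row-at-a-time cost grows linearly with the number of rows, while a bulk call would pay the network latency once regardless of table size.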
How this design got past review and was approved (especially since it couldn't satisfy some key functional requirements) is a mystery to me; my guess is that it was never really reviewed.  Someone at Boeing should have kicked that design back with "are you f___ing kidding me" stamped on every page in red ink, but nobody did.  
At the time I told myself it was because this wasn't something in a critical path so they weren't putting their best team on it, but it was part of a larger modernization program and proved to be an accurate representation of the overall effort.  
A thousand years ago I started college as a computer science major (I switched out of that after a little over a year). One of the primary things my teachers drilled into us - other than you damn well better proof your work, anticipate every possible human error, no matter how small or unlikely, and provide a fix for it, and put comments in everywhere - was that you should make every bit of code as fast and efficient as possible. It's not a big deal if your bit takes a quarter-second (back then that would have been blazing fast) until you realize that it might be a module that's part of a larger program, one which calls your little slice of code 100 times a day. And if that code could be executed in a tenth of a second instead, that's a 60% time savings, multiplied by 100, and that adds up to real differences. An idealized view? Perhaps. But I mentioned that story to my nephew, who has a doctorate in computer science and teaches it, and he said that that's still the standard, and something he also tells his students.
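For anyone who wants to check the arithmetic, a quick sketch using the numbers from the story (a quarter second cut to a tenth of a second, called 100 times a day). The function name is made up for illustration.

```python
def savings(old_seconds, new_seconds, calls_per_day):
    """Return (fractional time saved per call, seconds saved per day)."""
    saved_per_call = old_seconds - new_seconds
    return saved_per_call / old_seconds, saved_per_call * calls_per_day


# Numbers taken from the story above: 0.25 s down to 0.10 s, 100 calls a day.
fraction, per_day = savings(0.25, 0.10, 100)
print(f"{fraction:.0%} less time per call, {per_day:.1f} s saved per day")
```

The 60% figure holds per call. How much the daily total matters depends entirely on how often the code is actually called.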
My mantra has been:
It doesn’t matter how fast your code is if it’s wrong.
It doesn’t matter how fast your code is if you can’t maintain it.
It doesn’t matter how fast your code is if it’s a malware vector.
It doesn’t matter how fast your code is if it falls over when someone sneezes in the next room.
Speed matters, but those things matter more.  Yes, if you can shave a couple of cycles out of each iteration of a tight loop that executes thousands of times, that’s worth doing.   Shaving a couple of cycles off a subroutine that executes exactly once at startup, however, isn’t worth the effort.  
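One way to make the tight-loop versus run-once distinction concrete is Amdahl's law: the overall speedup from optimizing one part of a program is limited by the fraction of total runtime that part accounts for. The sketch below is only an illustration. The runtime fractions (80% for a hot loop, 0.1% for a startup routine) are invented numbers, not measurements from any system mentioned here.

```python
def overall_speedup(fraction_of_runtime, local_speedup):
    """Amdahl's law: whole-program speedup when one part gets faster.

    fraction_of_runtime: share of total runtime spent in the optimized part (0..1)
    local_speedup: how many times faster that part becomes
    """
    untouched = 1.0 - fraction_of_runtime
    return 1.0 / (untouched + fraction_of_runtime / local_speedup)


# Assumed, illustrative fractions: a hot loop versus a run-once startup routine.
hot_loop = overall_speedup(fraction_of_runtime=0.80, local_speedup=2.0)
startup = overall_speedup(fraction_of_runtime=0.001, local_speedup=2.0)

print(f"doubling the speed of the hot loop:  {hot_loop:.2f}x overall")
print(f"doubling the speed of startup code:  {startup:.4f}x overall")
```

Under those assumed fractions, the same local improvement is worth a lot in the loop and is essentially invisible in the startup routine, which is why it pays to profile before optimizing.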
Different war story on the maintainability front.  We were making a sonar simulator for a Navy lab.  We handled modeling the noise spectra and the movement of objects in the simulation.  A second group built a massive DSP to take our noise spectra and turn them into the signals you’d get from a towed array.  A third contractor was responsible for a 3D graphical display.
At one point we were asked to look at the graphics code to see if we could speed it up (this was written in C on a Silicon Graphics system using (not-yet-open) GL).  A small program, only about 5000 lines or so, but they were all in main.  The original author believed that using actual subroutines was too inefficient, so he used gotos to branch all over the place - something like 15 gotos in all. It took my coworker a couple of weeks to puzzle out the flow of control.  
The code was so tightly coupled with itself we could not make any changes without breaking something.  We tried compiling with level 1 optimization - the compiler ate all the RAM, then ate all the swap space and the system panicked.  This code was literally unmaintainable.
We finally gave the lab two choices - let us rewrite the whole thing from the keel up, or buy faster hardware.
They wound up buying faster hardware.
And, as you know, with the complexity of programs, a seemingly minor module or path could somehow find its way into a very critical thread, so them not checking this stuff and putting that red stamp on it (love the imagery) really can't be excused.
Yup.  But again, the performance wasn’t the main problem, it was the absolute inability to satisfy a basic functional requirement that was so boggling.  How that got by anyone is an absolute travesty.