Thanks for your feedback, Roberto! I went over my articles on OoO Execution again to see how I could make the explanation more correct while keeping things simple.
I basically chose not to mention the ROB, as I don't think it is that important to know about for what I am trying to convey here.
I just talk about the Instruction Queue instead. Obviously the ROB will be there helping with the reordering, but I don't think it is all that important for getting across how the M1 can process more instructions.
I have added some sentences to clarify even more that threads and OoOE are of course complementary, although I thought that was sort of clear already. But I see that maybe some of my sentences on that were ambiguous.
Your remark on VLIW is a bit beyond my understanding. I have written a story now on VLIW, but I don't understand it well enough to see how this relates to the size of the ROB.
AnandTech was indeed an important source for writing this story. In fact it was reading the AnandTech articles that made me realize that while the information is great, it is far too difficult for the average developer.
Ironically, while I thought I wrote this story for developers, I have since discovered that a lot of market analyst types have enjoyed reading it. So there seems to be a much broader need/desire for user-friendly discussion of these kinds of geeky details than I had imagined when writing this.
I am not an expert on this, so I have benefitted a lot from those of you with more detailed knowledge correcting me. A previous comment, for example, allowed me to correct what I originally wrote on Unified Memory.
Of course, in some cases one has to sort of paper over the complexity and decide what the most important point to get across about this solution is.