Algorithms can be accelerated through architectural research
in one or more of the following four general ways:
(a) By improving the architecture of a single CPU (e.g., [A]),
(b) By changing the implementation technology (e.g., [B]),
(c) By changing the computational paradigm at the architectural level,
i.e., by switching from control flow to data flow (e.g., [C]), or
(d) By introducing more effective parallel processing into the control flow paradigm (e.g., [D]).
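
The difference between the control flow and data flow paradigms named above can be made concrete with a toy sketch. It is only an illustration under stated assumptions, not a model of any architecture in the cited works. The expression `(a + b) * (c - d)`, the `GRAPH` table, and the function names are all hypothetical. In the control flow version a fixed program order decides what runs next. In the data flow version a node fires as soon as all of its operands are available.

```python
import operator

# Hypothetical toy program computing (a + b) * (c - d).
# Each node maps a result name to (operation, operand names).
GRAPH = {
    "sum": (operator.add, ["a", "b"]),
    "diff": (operator.sub, ["c", "d"]),
    "prod": (operator.mul, ["sum", "diff"]),
}


def run_control_flow(inputs):
    """Execute nodes strictly in a fixed program order, one at a time."""
    values = dict(inputs)
    program_order = ["sum", "diff", "prod"]  # plays the role of a program counter
    for name in program_order:
        op, args = GRAPH[name]
        values[name] = op(*(values[a] for a in args))
    return values["prod"]


def run_data_flow(inputs):
    """Fire every node whose operands are available; there is no program order."""
    values = dict(inputs)
    pending = dict(GRAPH)
    firing_rounds = []
    while pending:
        # A node is ready once all of its operands have been produced.
        ready = [n for n, (_, args) in pending.items()
                 if all(a in values for a in args)]
        if not ready:
            raise ValueError("some operands can never be produced")
        # Nodes in the same round are independent and could run in parallel.
        results = {n: pending[n][0](*(values[a] for a in pending[n][1]))
                   for n in ready}
        values.update(results)
        for n in ready:
            del pending[n]
        firing_rounds.append(sorted(ready))
    return values["prod"], firing_rounds


if __name__ == "__main__":
    data = {"a": 2, "b": 3, "c": 7, "d": 4}
    print(run_control_flow(data))
    print(run_data_flow(data))
```

Both functions compute the same result. The data flow version also records which nodes fired together, which shows that `sum` and `diff` are independent and only `prod` has to wait for them.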


[A]
Milenkovic, A., & Milutinovic, V. (2000, August).
Cache injection: A novel technique for tolerating memory latency in bus-based SMPs.
In *European Conference on Parallel Processing* (pp. 558-566). Springer, Berlin, Heidelberg.

[B]
Milutinovic, V. (1996).
*Surviving the design of a 200MHz RISC microprocessor*.
IEEE Computer Society Press, Washington DC, USA.

[C]
Milutinovic, D., Milutinovic, V., & Soucek, B. (1987).
The honeycomb architecture.

[D]
Trobec, R., Vasiljević, R., Tomašević, M., Milutinović, V., Beivide, R., & Valero, M. (2016).
Interconnection networks in petascale computer systems: A survey.
*ACM Computing Surveys (CSUR)*, *49*(3), 1-24.