Simulation
Moderator/Panelists: James Hoe (Carnegie Mellon University), Joel Emer (Intel), Doug Burger (Microsoft and University of Texas at Austin)

Microprocessors are simulated at
every development stage to evaluate their functionality, performance, and
lately, power and temperature. Because
microprocessor complexity generally grows faster than microprocessor performance,
some believe the performance of accurate simulators has been effectively
slowing down relative to the simulated system. This panel will discuss (i) whether this problem really exists and, if so, (ii)
techniques to significantly improve simulation performance, specifically
through various simulator parallelization techniques.

Joel Emer’s position statement

Recalling that the purpose of
architecture research is to provide a sufficiently compelling case for an
idea to proceed toward design, I believe there are an increasing number of cases where our current methodologies are ineffective. That is, they are
incapable of providing compelling evidence of the merit of a design. Examples
include large multiprocessor systems and large caches (especially shared
caches). In each of these cases, simulation at an adequate level of fidelity (which I believe is quite high) results in simulation lengths measured in days and weeks. Such
simulation lengths are impractical for wide exploration of a design space. I
would further argue that the more radical a proposal is, the more insufficient our current approaches become, due to the need for more and longer benchmarks to be
convincing. At this time, the most promising
approach that I am aware of is to use FPGAs as a
platform for performance modeling. Such an approach has the appeal of running
simulations more rapidly than a pure software simulator and being more
flexible for design space exploration than a prototype. Unfortunately, it has
the disadvantage of turning a software programming exercise into a hardware
design process. Thus, to be practical, I believe we need to develop a more
systematic and easier-to-program approach to hardware design. Some of the
attributes of such an approach include: more modularity, well-defined simulation primitives, debugging aids, and a higher-level representation than traditional hardware design languages provide.
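To make one of these attributes concrete, the sketch below shows one way a well-defined simulation primitive might look: a fixed-latency port that decouples two timing-model modules so each reasons only in model cycles, independent of the software or FPGA host that runs it. The class and names are hypothetical illustrations, not any panelist's actual framework.

```python
# Hypothetical sketch of a "well-defined simulation primitive": a fixed-latency
# port carrying messages between timing-model modules. All names are
# illustrative assumptions, not taken from the panel.
from collections import deque


class LatencyPort:
    """Delivers messages between model modules after a fixed model-cycle latency."""

    def __init__(self, latency):
        self.latency = latency
        self.queue = deque()  # entries: (delivery_cycle, message)

    def send(self, cycle, message):
        # Schedule the message for delivery `latency` model cycles later.
        self.queue.append((cycle + self.latency, message))

    def receive(self, cycle):
        # Return every message whose delivery cycle has been reached.
        out = []
        while self.queue and self.queue[0][0] <= cycle:
            out.append(self.queue.popleft()[1])
        return out


# Example: a core model sends a request to a cache model over a 3-cycle port.
core_to_cache = LatencyPort(latency=3)
core_to_cache.send(cycle=0, message="load 0x100")
for cycle in range(5):
    for msg in core_to_cache.receive(cycle):
        print(f"cycle {cycle}: cache model handles '{msg}'")
```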
Doug Burger’s position statement

Our evaluation methodologies to date have relied on shared research into general-purpose computing, where the
community works together to innovate on a shared substrate. The success of infrastructures such as SimpleScalar, M5, SimOS, and
others results from this shared model that many researchers use to advance
this shared state of the art. The exponential growth in transistors will
continue for a few more generations at least.
However, the shift away from devoting the bulk of the transistors to
more powerful single cores, coupled with the ongoing changes in the computing
industry, is making traditional microarchitectural simulation increasingly
irrelevant. The on-chip real estate is
going toward SoC functions (such as the inclusion
of network interfaces and memory controllers), accelerators (such as graphics
units), and additional cores. My view is that advances in general-purpose computing, while still important, have slowed appreciably and that the era may be coming to an end. Our simulation infrastructures
and indeed our methodologies are not capable of keeping pace with the rapid changes in
workloads and system requirements, and in particular, are not ready for a
world in which systems optimize for a domain of workloads or even specific
workloads. As the community fragments,
we need higher-level system models that permit rapid, high-level estimation
of the capability of a specific innovation.
This model is quite different from what researchers currently use, and
is perhaps different from some of the shared infrastructure projects in
flight today.
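As a purely illustrative sketch of such rapid, high-level estimation (the formula, parameter names, and numbers below are assumptions for the example, not anything presented on the panel), a first-order Amdahl-style model can rank design options in seconds rather than days of detailed simulation.

```python
# Hypothetical first-order model of rapid, high-level estimation: the overall
# speedup from a domain-specific accelerator, given only coarse parameters.
# All names and values are illustrative assumptions, not measured data.

def estimated_speedup(offload_fraction, accel_speedup, offload_overhead=0.02):
    """Amdahl-style estimate: offload_fraction of the work runs accel_speedup
    times faster, plus a fixed offload overhead (fraction of original runtime)."""
    remaining = (1.0 - offload_fraction) + offload_fraction / accel_speedup
    return 1.0 / (remaining + offload_overhead)


# Sweep a small design space quickly instead of simulating each point in detail.
for frac in (0.5, 0.7, 0.9):
    for accel in (5, 10, 50):
        print(f"offload {frac:.0%}, {accel}x accelerator -> "
              f"~{estimated_speedup(frac, accel):.1f}x overall")
```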