Branch Prediction

Introduction

Many of the techniques used in modern processors to speed up execution are based on the fact that multiple instructions are executed simultaneously (i.e. Pipelining, Superscalar Execution). This can of course only be effective if a sufficient number of instructions can be executed at the same time. Especially, these instructions have to be known in advance.
This is not a big problem in a sequential program, where every instruction has exactly one successor. However, real-world processors have (and every program needs!) branches, and especially conditional branches, i.e. it is not known in advance if a branch will be taken or not, since that depends on some condition that has still to be computed.
A typical example for that is a code-fragment such as the following:

Address Instruction 1000 i=10 1001 <compute something> 1002 i=i-1 1003 if i>0 goto 1000

Here, a loop is executed ten times. Statically, it is not possible to say if the branch at address 1003 will be taken or not. However, observe that it is taken in almost all cases - except once every ten times it is encountered.
In real-world programs, loops often are executed even more often, and there are patterns to conditional branches that are much more complicated (e.g., subroutines). This leads to the development of

Branch Prediction

Branch Prediction is the term used to describe the art of guessing whether a conditional branch will be taken or not. Based on this decision, the processor knows which instructions to feed into its pipeline. However, if a prediction was wrong, a lot of work will have been wasted when that is found out (this is called the misprediction penalty).
On modern processors, the pipelines have become so long and so many executions can be executed in parallel that it is of crucial importance to have an excellent branch prediction, because only then can this massive amount of hardware result in the desired speedup.
Nowadays, branch prediction algorithms have an accuracy of more than 95%!

About the applet

We programmed an applet to interactively explore various algorithms that are used to predict the behaviour of branches. We modeled a rather simple processor that knows only a couple of instructions (Each Instruction is described as its opcode plus a comment):

NOB: No branch: really a NOP (no operation). Does nothing as far as our simulation is concerned.
J: Jump. Always taken.
BT(n): Mostly taken branch: Is usually taken, except every n-th time.
BNT(n): Mostly not taken branch: Is usually not taken, except every n-th time.
RBR(n%): Random branch: Is taken n percent of times.
BR(s): Custom branch: s is a string consisting only of T and N. The behaviour of this branch follows the string, where T means a branch is taken and N means it is not taken.

In addition, all instructions except NOB have a target address which is the instruction the program will jump to if the branch is taken. With these instructions, it is possible to model the behaviour of an arbitrary program fairly closely without having to explicitly program it into our simulator.

Using the applet

In principle, it should be fairly self-explanatory, but we give some hints on the usage in this section.
The applet is divided into four main sections.

On the upper left there is the pseudo-program in a table, giving, in left-to-right order, the address, the opcode of the instruction, its target address and the predictions of the three algorithms currently selected. The blue line is the one which can currently be edited in the fields immediately below the table. The red line is the one the program counter is currently at, i.e. the next instruction that will be executed.
On the lower left there is the control panel, allowing the user to execute the program step-by-step or continuously with a specified delay.
On the upper right we have the visualizations of the algorithms currently in use. These have built-in help texts where necessary.
On the lower right there are the statistics for the algorithms.
At the very bottom there are buttons for controlling the program: Load and save program (not available in the applet version), Clear Program (sets all instructions to NOB), Select Algorithms, Clear Statistics and Exit (not in the applet).

A J opcode can be entered by selecting BT(0). For further information on the status of a branch (e.g. which position of a custom branch will be executed next?), keep the mouse pointer on that instruction for some time, since this information is contained in the tool tip text.

Here it is

If you have a Java 1.3-conforming virtual machine in your browser, you should be able to use our applet right here:

Apparently, your browser cannot display applets.

Other mode of usage

The applet can also be started as an application, enabling additonal features such as loading and saving of programs. Also, it is somewhat easier to install a Java Runtime Environment (JRE) version 1.3 than installing a Java-Plugin conforming to that version of the Java Platform. To use it standalone, download either the binary-only version or the the binary plus source version and start it from the commandline using java -jar <filename>.

Legalese

The source code of this applet is covered under the GNU General Public License, or GPL. Informally, you may do whatever you want with it except distributing a program using our code without also covering that program under the same license. For more information, see the file COPYING in the jar-Archive. It also means that we make no warranties whatsoever.

This Applet was programmed by Hendrik Steller and Dirk Materlik for the computer architecture course in the summer of the year 2000 at the institute for computer science of the Free University of Berlin.

Last modified: Mon Oct 16 17:48:08 CEST 2000