BogoMips
BogoMips (from "bogus" and MIPS) is an unscientific measurement of CPU speed made by the Linux kernel when it boots to calibrate an internal busy-loop.[1] An oft-quoted definition of the term is "the number of million times per second a processor can do absolutely nothing".[2][3]
A BogoMips value can be used to check whether a processor falls within the range expected for similar processors: it reflects the processor's clock frequency and the effect of any CPU cache that is present. It is not suitable for comparing performance between different CPUs.[4][5]
History
In 1993, Lars Wirzenius posted a message[6] on comp.os.linux explaining why BogoMips was introduced in the Linux kernel:
- [...]
- MIPS is short for Millions of Instructions Per Second. It is a measure for the computation speed of a processor. Like most such measures, it is more often abused than used properly (it is very difficult to justly compare MIPS for different kinds of computers).
- BogoMips are Linus's own invention. The linux kernel version 0.99.11 (dated 11 July 1993) needed a timing loop (the time is too short and/or needs to be too exact for a non-busy-loop method of waiting), which must be calibrated to the processor speed of the machine. Hence, the kernel measures at boot time how fast a certain kind of busy loop runs on a computer. "Bogo" comes from "bogus", i.e, something which is a fake. Hence, the BogoMips value gives some indication of the processor speed, but it is way too unscientific to be called anything but BogoMips.
- The reasons (there are two) it is printed during boot-up is that a) it is slightly useful for debugging and for checking that the computer[’]s caches and turbo button work, and b) Linus loves to chuckle when he sees confused people on the news.
- [...]
Proper BogoMips ratings
As a very approximate guide, the BogoMips value can be estimated from the following table, with the clock given in MHz. The rating shown is typical for that CPU with the Linux version that was current and applicable at the time. The index is the ratio of "BogoMips per clock speed" for a given CPU to the same ratio for an Intel 386DX CPU, for comparison purposes.
System | Rating | Index |
---|---|---|
Intel 8088 | clock × 0.004 | 0.02 |
Intel/AMD 386SX | clock × 0.14 | 0.8 |
Intel/AMD 386DX | clock × 0.18 | 1 (definition) |
Motorola 68030 | clock × 0.25 | 1.4 |
Cyrix/IBM 486 | clock × 0.34 | 1.8 |
Intel Pentium | clock × 0.40 | 2.2 |
Intel 486 | clock × 0.50 | 2.8 |
AMD 5x86 | clock × 0.50 | 2.8 |
MIPS R4000/R4400 | clock × 0.50 | 2.8 |
ARM9 | clock × 0.50 | 2.8 |
Motorola 8081 | clock × 0.65 | 3.6 |
Motorola 68040 | clock × 0.67 | 3.7 |
PowerPC 603 | clock × 0.67 | 3.7 |
Intel StrongARM | clock × 0.66 | 3.7 |
NexGen Nx586 | clock × 0.75 | 4.2 |
PowerPC 601 | clock × 0.84 | 4.7 |
Alpha 21064/21064A | clock × 0.99 | 5.5 |
Alpha 21066/21066A | clock × 0.99 | 5.5 |
Alpha 21164/21164A | clock × 0.99 | 5.5 |
Intel Pentium Pro | clock × 0.99 | 5.5 |
Cyrix 5x86/6x86 | clock × 1.00 | 5.6 |
Intel Pentium II/III | clock × 1.00 | 5.6 |
AMD K7/Athlon | clock × 1.00 | 5.6 |
Intel Celeron | clock × 1.00 | 5.6 |
Intel Itanium | clock × 1.00 | 5.6 |
R4600 | clock × 1.00 | 5.6 |
Hitachi SH-4 | clock × 1.00 | 5.6 |
Raspberry Pi (Model B) | clock × 1.00 | 5.6 |
Intel Itanium 2 | clock × 1.49 | 8.3 |
Alpha 21264 | clock × 1.99 | 11.1 |
VIA Centaur | clock × 1.99 | 11.1 |
AMD K5/K6/K6-2/K6-III | clock × 2.00 | 11.1 |
AMD Duron/Athlon XP | clock × 2.00 | 11.1 |
AMD Sempron | clock × 2.00 | 11.1 |
UltraSparc II | clock × 2.00 | 11.1 |
Intel Pentium MMX | clock × 2.00 | 11.1 |
Intel Pentium 4 | clock × 2.00 | 11.1 |
Intel Pentium M | clock × 2.00 | 11.1 |
Intel Core Duo | clock × 2.00 | 11.1 |
Intel Core 2 Duo | clock × 2.00 | 11.1 |
Intel Atom N455 | clock × 2.00 | 11.1 |
Centaur C6-2 | clock × 2.00 | 11.1 |
PowerPC 604/604e/750 | clock × 2.00 | 11.1 |
Intel Pentium III Coppermine | clock × 2.00 | 11.1 |
Intel Pentium III Xeon | clock × 2.00 | 11.1 |
Motorola 68060 | clock × 2.01 | 11.2 |
Intel Xeon MP (32-bit) (hyper-threading) | clock × 3.97 | 22.1 |
IBM S390 | not enough data (yet) | |
ARM | not enough data (yet) | |
Source[7]
For a complete list, refer to the BogoMips mini-Howto.
In Linux kernel 2.2.14, the code that sets the CPU's caching state was moved from after the BogoMips calculation to before it. Although the BogoMips algorithm itself was not changed, from that kernel onward the BogoMips rating for then-current Pentium CPUs was twice what it had been before the change. The change in the reported value did not correspond to any change in real processor performance.
Computation of BogoMIPS
With kernel 2.6.x, BogoMIPS is implemented in the kernel source file /usr/src/linux/init/calibrate.c. It computes the Linux kernel timing parameter loops_per_jiffy (see jiffy). The explanation from the source code:
/*
 * A simple loop like
 *     while ( jiffies < start_jiffies+1)
 *         start = read_current_timer();
 * will not do. As we don't really know whether jiffy switch
 * happened first or timer_value was read first. And some asynchronous
 * event can happen between these two events introducing errors in lpj.
 *
 * So, we do
 * 1. pre_start <- When we are sure that jiffy switch hasn't happened
 * 2. check jiffy switch
 * 3. start <- timer value before or after jiffy switch
 * 4. post_start <- When we are sure that jiffy switch has happened
 *
 * Note, we don't know anything about order of 2 and 3.
 * Now, by looking at post_start and pre_start difference, we can
 * check whether any asynchronous event happened or not
 */
loops_per_jiffy is used to implement the udelay (delay in microseconds) and ndelay (delay in nanoseconds) functions. These functions are needed by some drivers to wait for hardware. Note that a busy-waiting technique is used, so the kernel is effectively blocked while executing ndelay/udelay. For the i386 architecture, delay_loop is implemented in /usr/src/linux/arch/i386/lib/delay.c as:
/* simple loop based delay: */
static void delay_loop(unsigned long loops)
{
    int d0;
    __asm__ __volatile__(
        "\tjmp 1f\n"
        ".align 16\n"
        "1:\tjmp 2f\n"
        ".align 16\n"
        "2:\tdecl %0\n\tjns 2b"
        :"=&a" (d0)
        :"0" (loops));
}
This is equivalent to the following assembler code:
; input: eax = loops
; output: eax = -1 for a starting count below 2^31 (the result, d0, is discarded)
        jmp start
.align 16
start:  jmp body
.align 16
body:   decl eax
        jns body
This can be rewritten as the following C pseudocode:
static void delay_loop(long loops)
{
    long d0 = loops;
    do {
        --d0;
    } while (d0 >= 0);
}
Full details about BogoMips, along with hundreds of reference entries, can be found in the (outdated) BogoMips mini-Howto.[4]
Timer-based delays
In 2012, ARM contributed a new udelay implementation allowing the system timer built into many ARMv7 CPUs to be used instead of a busy-wait loop.[8] Timer-based delays are more robust on systems that use frequency scaling to dynamically adjust the processor's speed at runtime, as loops_per_jiffy values may not necessarily scale linearly. Also, since the timer frequency is known in advance, no calibration is needed at boot time.
One side effect of this change is that the BogoMIPS value will reflect the timer frequency, not the CPU's core frequency. Typically the timer frequency is much lower than the processor's maximum frequency, and some users may be surprised to see an unusually low BogoMIPS value when comparing against systems that use traditional busy-wait loops.
References
- [1] Van Dorst, Wim (January 1996). "The Quintessential Linux Benchmark". Linux Journal. Retrieved 2008-08-22.
- [2] Raymond, Eric S.; Mackenzie, Geoff. Published on the Internet in the early 1990s; origin untraceable.
- [3] Raymond, Eric S. "Hackers Jargon File".
- [4] Van Dorst, Wim (2 March 2006). "BogoMips Mini-Howto" (V38 ed.). Retrieved 2008-08-22.
- [5] Blidung, Thomas. "Re: How many BogoMips should I expect from my 486DX2 running Linux?".
- [6] Wirzenius, Lars. "Re: printing & BogoMips".
- [7] Bekman, Stas. "What is a BogoMip?".
- [8] Deacon, Will. "ARM: 7452/1: delay: allow timer-based delay implementation to be selected".
External links
- BogoMips Mini-Howto, V38
- This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.
- Sources of classical standalone benchmark