68k instructions timings: Difference between revisions

From NeoGeo Development Wiki
Jump to navigation Jump to search
m (Formatting)
Line 429: Line 429:
calculation where indicated. Dynamic: register, static: immediate.
calculation where indicated. Dynamic: register, static: immediate.


instruction size dynamic static
instruction size           dynamic                 static
register  memory register  memory
                        register  memory       register  memory
BCHG byte   -   8(1/1) +   -   12(2/1) +
BCHG         byte         -       8(1/1) +       -       12(2/1) +
long 8(1/0) *    - 12(2/0) *    -
            long       8(1/0) *    -         12(2/0) *    -
BCLR byte   -   8(1/1) +   -   12(2/1) +
BCLR         byte         -       8(1/1) +       -       12(2/1) +
long 10(1/0) *    - 14(2/0) *    -
            long       10(1/0) *    -         14(2/0) *    -
BSET byte   -   8(1/1) +   -   12(2/1) +
BSET         byte         -       8(1/1) +       -       12(2/1) +
long 8(1/0) *    - 12(2/0) *    -
            long       8(1/0) *    -         12(2/0) *    -
BTST byte   -    4(1/0) +   -     8(2/0) +
BTST         byte         -    4(1/0) +       -       8(2/0) +
long 6(1/0)     - 10(2/0)      -
            long       6(1/0)      -         10(2/0)      -


+ add effective address calculation time
+ add effective address calculation time
* indicates maximum value
* indicates maximum value
</syntaxhighlight>
</syntaxhighlight>


=Conditional instructions=
=Conditional instructions=

Revision as of 16:39, 17 September 2017

Mirrored information from [oldwww.nvg.ntnu.no]

The number of bus read and write cycles are shown in parenthesis as (r/w). Any other cycles are internal.

In the following tables, the headings have the following meanings:

  • An : Address register operand
  • Dn : Data register operand
  • ea : Operand specified by an effective address
  • M : Memory effective address operand

To get the real execution time, multiply the total cycles count by 83.33ns (1/12MHz). An example is given in each section.

The vertical blank lasts exactly 40 lines * 384 pixels * 2 cycles per pixel = 30720 cycles (2.56ms).

See optimization.

Effective address operand calculation

This table lists the number of clock periods required to compute an instruction's effective address. It includes fetching of any extension words, the address computation, and fetching of the memory operand.

Syntax Adressing mode B,W L
Dn Data register direct 0(0/0) 0(0/0)
An Address register direct 0(0/0) 0(0/0)
(An) Address register indirect 4(1/0) 8(2/0)
(An)+ Address register indirect, post inc. 4(1/0) 8(2/0)
-(An) Address register indirect, pre dec. 6(1/0) 10(2/0)
d(An) Address register indirect, displacement 8(2/0) 12(3/0)
d(An,ix) Address register indirect, index 10(2/0) 14(3/0)
xxx.w Absolute short 8(2/0) 12(3/0)
xxx.l Absolute long 12(3/0) 16(4/0)
d(PC) PC with displacement 8(2/0) 12(3/0)
d(PC,ix) PC with index 10(2/0) 14(3/0)
#xxx Immediate 4(1/0) 8(2/0)

Notes:

  • Pre-dec is slower than post-inc
  • There are no write cycles involved in processing the effective address
  • The size of the index register (ix) does not affect execution time

Move instructions

These following two tables indicate the number of clock periods for the move instruction. This data includes instruction fetch, operand reads, and operand writes.

Byte and word

Example: move.b (a0)+,$10201D (Byte (An)+ to xxx.L) takes 20 cycles.

  Dn An (An) (An)+ -(An) d(An) d(An,ix) xxx.W xxx.L
Dn 4(1/0) 4(1/0) 8(1/1) 8(1/1) 8(1/1) 12(2/1) 14(2/1) 12(2/1) 16(3/1)
An 4(1/0) 4(1/0) 8(1/1) 8(1/1) 8(1/1) 12(2/1) 14(2/1) 12(2/1) 16(3/1)
(An) 8(2/0) 8(2/0) 12(2/1) 12(2/1) 12(2/1) 16(3/1) 18(3/1) 16(3/1) 20(4/1)
(An)+ 8(2/0) 8(2/0) 12(2/1) 12(2/1) 12(2/1) 16(3/1) 18(3/1) 16(3/1) 20(4/1)
-(An) 10(2/0) 10(2/0) 14(2/1) 14(2/1) 14(2/1) 18(3/1) 20(4/1) 18(3/1) 22(4/1)
d(An) 12(3/0) 12(3/0) 16(3/1) 16(3/1) 16(3/1) 20(4/1) 22(4/1) 20(4/1) 24(5/1)
d(An,ix) 14(3/0) 14(3/0) 18(3/1) 18(3/1) 18(3/1) 22(4/1) 24(4/1) 22(4/1) 26(5/1)
xxx.W 12(3/0) 12(3/0) 16(3/1) 16(3/1) 16(3/1) 20(4/1) 22(4/1) 20(4/1) 24(5/1)
xxx.L 16(4/0) 16(4/0) 20(4/1) 20(4/1) 20(4/1) 24(5/1) 26(5/1) 24(5/1) 28(6/1)
d(PC) 12(3/0) 12(3/0) 16(3/1) 16(3/1) 16(3/1) 20(4/1) 22(4/1) 20(4/1) 24(5/1)
d(PC,ix) 14(3/0) 14(3/0) 18(3/1) 18(3/1) 18(3/1) 22(4/1) 24(4/1) 22(4/1) 26(5/1)
#xxx 8(2/0) 8(2/0) 12(2/1) 12(2/1) 12(2/1) 16(3/1) 18(3/1) 16(3/1) 20(4/1)

The size of the index register (ix) does not affect execution time.

Long

Example: move.l $05012C,4(a1,d0) (Long xxx.L to d(An,ix)) takes 34 cycles.

  Dn An (An) (An)+ -(An) d(An) d(An,ix) xxx.W xxx.L
Dn 4(1/0) 4(1/0) 12(1/2) 12(1/2) 12(1/2) 16(2/2) 18(2/2) 16(2/2) 20(3/2)
An 4(1/0) 4(1/0) 12(1/2) 12(1/2) 12(1/2) 16(2/2) 18(2/2) 16(2/2) 20(3/2)
(An) 12(3/0) 12(3/0) 20(3/2) 20(3/2) 20(3/2) 24(4/2) 26(4/2) 24(4/2) 28(5/2)
(An)+ 12(3/0) 12(3/0) 20(3/2) 20(3/2) 20(3/2) 24(4/2) 26(4/2) 24(4/2) 28(5/2)
-(An) 14(3/0) 14(3/0) 22(3/2) 22(3/2) 22(3/2) 26(4/2) 28(4/2) 26(4/2) 30(5/2)
d(An) 16(4/0) 16(4/0) 24(4/2) 24(4/2) 24(4/2) 28(5/2) 30(5/2) 28(5/2) 32(6/2)
d(An,ix) 18(4/0) 18(4/0) 26(4/2) 26(4/2) 26(4/2) 30(5/2) 32(5/2) 30(5/2) 34(6/2)
xxx.W 16(4/0) 16(4/0) 24(4/2) 24(4/2) 24(4/2) 28(5/2) 30(5/2) 28(5/2) 32(6/2)
xxx.L 20(5/0) 20(5/0) 28(5/2) 28(5/2) 28(5/2) 32(6/2) 34(6/2) 32(6/2) 36(7/2)
d(PC) 16(4/0) 16(4/0) 24(4/2) 24(4/2) 24(4/2) 28(5/2) 30(5/2) 28(5/2) 32(5/2)
d(PC,ix) 18(4/0) 18(4/0) 26(4/2) 26(4/2) 26(4/2) 30(5/2) 32(5/2) 30(5/2) 34(6/2)
#xxx 12(3/0) 12(3/0) 20(3/2) 20(3/2) 20(3/2) 24(4/2) 26(4/2) 24(4/2) 28(5/2)

The size of the index register (ix) does not affect execution time.

Standard instructions

Example: add.w d3,a7 (Word ea Dn + An) takes 8 cycles.

The number of clock periods shown in this table indicates the time required to perform the operations, store the results and read the next instruction. The total number of clock periods must be added respectively to those of the effective address calculation where indicated (+).

  Size <ea>,An * <ea>,Dn Dn,<M>
ADD B,W 8(1/0)+ 4(1/0)+ 8(1/1)+
L 6(1/0)+** 6(1/0)+** 12(1/2)+
AND B,W - 4(1/0)+ 8(1/1)+
L - 6(1/0)+** 12(1/2)+
CMP B,W 6(1/0)+ 4(1/0)+ -
L 6(1/0)+ 6(1/0)+ -
DIVS - - 158(1/0)+ -
DIVU - - 140(1/0)+ -
EOR B,W - 4(1/0) *** 8(1/1) +
L - 8(1/0) *** 12(1/2) +
MULS - - 70(1/0)+* -
MULU - - 70(1/0)+* -
OR B,W - 4(1/0) +** 8(1/1) +
L - 6(1/0) +** 12(1/2) +
SUB B,W 8(1/0)+ 4(1/0)+ 8(1/1)+
L 6(1/0)+** 6(1/0)+** 12(1/2)+
notes:	+ Add effective address calculation time
	^ Word or long only
	* Indicates maximum value
       ** The base time of six clock periods is increased to eight		
	  if the effective address mode is register direct or 
	  immediate (effective address time should also be added)
      *** Only available effective address mode is data register direct
	  
	DIVS,DIVU - The divide algorithm used by the MC68000 provides less
		    than 10% difference between the best and the worst case
		    timings.
	MULS,MULU - The multiply algorithm requires 38+2n clocks where
		    n is defined as:
		MULU: n = the number of ones in the <ea>
		MULS: n = concatenate the <ea> with a zero as the LSB;
			  n is the resultant number of 10 or 01 patterns
			  in the 17-bit source; i.e., worst case happens
			  when the source is $5555

Immediate instructions

The number of clock periods periods shown in this table includes the time to fetch immediate operands, perform the operations, store the results and read the next operation. The total number of clock periods must be added respectively to those of the effective address calculation where indicated (+).

  Size #,Dn #,An #,M
ADDI B,W 8(2/0) - 12(2/1)+
L 16(3/0) - 20(3/2)+
ADDQ B,W 4(1/0) 8(1/0)* 12(1/2)+
L 8(1/0) 8(1/0) 12(1/2)+
ANDI B,W 8(2/0) - 12(1/2)+
L 16(3/0) - 20(3/1)+
CMPI B,W 8(2/0) - 8(2/0)+
L 14(3/0) - 12(3/1)+
EORI B,W 8(2/0) - 12(2/1)+
L 16(3/0) - 20(3/2)+
MOVEQ L 4(1/0) - -
ORI B,W 8(2/0) - 12(2/1)+
L 16(3/0) - 20(3/2)+
SUBI B,W 8(2/0) - 12(2/1)+
L 16(3/0) - 20(3/2)+
SUBQ B,W 4(1/0) 8(1/0)* 8(1/1)+
L 8(1/0) 8(1/0) 12(1/2)+
	+ Add effective address calculation time
	* word only

Single operand instructions

This table indicates the number of clock periods for the single operand
instructions. The number of clock periods and the number of read and write cycles
must be added respectively to those of the effective address calculation
where indicated.

instruction	size		register	 memory

CLR		byte,word	4(1/0)		 8(1/1) +
		  long		6(1/0)		12(1/2) +
NBCD		  byte		6(1/0)		 8(1/1) +
NEG		byte,word	4(1/0)		 8(1/1) +
		  long		6(1/0)		12(1/2) +
NEGX		byte,word	4(1/0)		 8(1/1) +
		  long		6(1/0)		12(1/2) +
NOT		byte,word	4(1/0)		 8(1/1) +
		  long		6(1/0)		12(1/2) +
Scc		byte,false	4(1/0)		 8(1/1) +
		byte,true	6(1/0)		 8(1/1) +
TAS #		  byte		4(1/0)		10(1/1) +
TST		byte,word	4(1/0)		 4(1/0) +
		  long		4(1/0)		 4(1/0) +

	+ add effective address calculation time
        # This instruction should never be used on the Amiga as its invisiable
          read/write cycle can disrupt system DMA.


Shift and rotate instructions

This table indicates the number of clock periods for the shift and rotate
instructions. The number of clock periods and the number of read and write
cycles must be added respectively to those of the effective address
calculation where indicated.

instruction	size		register	memory

ASR,ASL		byte,word	6+2n(1/0)	8(1/1) +
		  long		8+2n(1/0)	  -
LSR,LSL		byte,word	6+2n(1/0)	8(1/1) +
		  long		8+2n(1/0)	  -
ROR,ROL		byte,word	6+2n(1/0)	8(1/1) +
		  long		8+2n(1/0)	  -
ROXR,ROXL	byte,word	6+2n(1/0)	8(1/1) +
		  long		8+2n(1/0)	  -

	+ add effective address calculation time
	n is the shift or rotate count


Bit manipulation instructions

This table indicates the number of clock periods required for the bit
manipulation instructions. The number of clock periods and the number of read and 
write cycles must be added respectively to those of the effective address
calculation where indicated. Dynamic: register, static: immediate.

instruction  size            dynamic                 static
                        register   memory       register   memory	
BCHG         byte          -       8(1/1) +        -       12(2/1) +
             long        8(1/0) *    -          12(2/0) *     -
BCLR         byte          -       8(1/1) +        -       12(2/1) +
             long       10(1/0) *    -          14(2/0) *     -
BSET         byte          -       8(1/1) +        -       12(2/1) +
             long        8(1/0) *    -          12(2/0) *     -
BTST         byte          -  	   4(1/0) +        -        8(2/0) +
             long        6(1/0)      -          10(2/0)       -

	+ add effective address calculation time
	* indicates maximum value

Conditional instructions

Mnemonic Displacement Branch taken Not taken
Bcc byte 10(2/0) 8(1/0)
word 10(2/0) 12(1/0)
BRA byte 10(2/0)
word 10(2/0)
BSR byte 18(2/2)
word 18(2/2)
DBcc cc true 12(2/0)
cc false 10(2/0) 14(3/0)

JMP, JSR, LEA, PEA and MOVEM instructions

This Table indicates the number of clock periods required for the jump,
jump-to-subroutine, load effective address, push effective address and
move multiple registers instructions.

instr	size    (An)		(An)+		-(An)	 d(An)	
JMP     -	    8(2/0)	     -		      -    10(2/0)
JSR     -	   16(2/2)	     -		      -	   18(2/2)
LEA     -	    4(1/0)	     -		      -	    8(2/0)
PEA     -	   12(1/2)	     -		      -	   16(2/2)
MOVEM   word     12+4n       12+4n	      -      16+4n
M->R           (3+n/0)	   (3+n/0)	      -	   (4+n/0)
	    long     12+8n	     12+8n	      -	     16+8n
		      (3+2n/0)    (3+2n/0)	      -   (4+2n/0)
MOVEM	word	  8+4n	     -		     8+4n	 12+4n
R->M		     (2/n)	     -		    (2/n)	 (3/n)
	    long	  8+8n	     -		     8+8n	 12+8n
                (2/2n)	     -		   (2/2n)	(3/2n)

instr	size	d(An,ix)+   xxx.W      xxx.L      d(PC)      d(PC,ix)*
JMP	 -	 14(3/0)    10(2/0)    12(3/0)	  10(2/0)    14(3/0)
JSR	 -	 22(2/2)    18(2/2)    20(3/2)	  18(2/2)    22(2/2)
LEA	 -	 12(2/0)     8(2/0)    12(3/0)	   8(2/0)    12(2/0)
PEA	 -	 20(2/2)    16(2/2)    20(3/2)	  16(2/2)    20(2/2)
MOVEM	word	   18+4n      16+4n      20+4n	    16+4n      18+4n
M->R		 (4+n/0)    (4+n/0)    (5+n/0)	  (4+n/0)    (4+n/0)
	long	   18+8n      16+8n      20+8n	    16+8n      18+8n
		(4+2n/0)   (4+2n/0)   (5+2n/0)	 (4+2n/0)   (4+2n/0)
MOVEM	word	   14+4n      12+4n      16+4n	    -		-
R->M		   (3/n)      (3/n)      (4/n)	    -		-
        long   14+8n      12+8n      16+8n	    -		-
		  (3/2n)     (3/2n)     (4/2n)	    -		-

n is the number of registers to move
* is the size of the index register (ix) does not affect the instruction's
  execution time

Multi-precision instructions

This table indicates the number of clock periods for the multi-precision
instructions. The number of clock periods includes the time to fetch both
operands, perform the operations, store the results and read the next 
instructions.

instruction	size		op Dn,Dn	op M,M

ADDX		byte,word	4(1/0)		18(3/1)
		  long		8(1/0)		30(5/2)
CMPM		byte,word	  -		12(3/0)
		  long		  -		20(5/0)
SUBX		byte,word	4(1/0)		18(3/1)
		  long		8(1/0)		30(5/2)
ABCD		  byte		6(1/0)		18(3/1)
SBCD		  byte		6(1/0)		18(3/1)


Miscellaneous instructions

This table indicates the number of clock periods for the following 
miscellaneous instructions. The number of clock periods and plus the number
of read and write cycles must be added to those of the effective address
calculation where indicated.

instruction	size	register	memory

ANDI to CCR	byte	 20(3/0)	   -
ANDI to SR	word	 20(3/0)	   -
CHK		 -	 10(1/0) +	   -
EORI to CCR	byte	 20(3/0)	   -
EORI to SR	word	 20(3/0)	   -
ORI to CCR	byte	 20(3/0)	   -
ORI to SR	word	 20(3/0)	   -
MOVE from SR	 -	  6(1/0)	 8(1/1)+
MOVE to CCR	 -	 12(1/0)	12(1/0)+
MOVE to SR	 -	 12(1/0)	12(1/0)+
EXG		 -	  6(1/0)	   -
EXT		word	  4(1/0)	   -
		long	  4(1/0)	   -
LINK		 -	 16(2/2)	   -
MOVE from USP	 -	  4(1/0)	   -
MOVE to USP	 -	  4(1/0)	   -
NOP		 -	  4(1/0)	   -
RESET		 -	132(1/0)	   -
RTE		 -	 20(5/0)	   -
RTR		 -	 20(5/0)	   -
RTS		 -	 16(4/0)	   -
STOP		 -	  4(0/0)	   -
SWAP		 -	  4(1/0)	   -
TRAPV (No Trap)	 -	  4(1/0)	   -
UNLK		 -	 12(3/0)	   -

	+ add effective address calculation time


Move Peripheral instructions

instruction	size	register->memory	memory->register

MOVEP		word	16(2/2)			16(4/0)	
		long	24(2/4)			24(6/0)


Exception processing

This table indicates the number of clock periods for exception processing.
The number of clock periods includes the time for all stacking, the vector
fetch and the fetch of the first two instruction words of the handler routine.

	exception			periods

	address error			50(4/7)
	bus error			50(4/7)
	CHK instruction (trap taken)	44(5/3)+
	Divide by Zero			42(5/3)
	illegal instruction		34(4/3)
	interrupt			44(5/3)*
	privilege violation		34(4/3)
        _____
	RESET **			40(6/0)
	trace				34(4/3)
	TRAP instruction		38(4/3)
	TRAPV instruction (trap taken)	34(4/3)

	+ add effective address calculation time
	* the interrupt acknowledge cycle is assumed to take four
	  clock periods
                                       _____     ____
       ** indicates the time from when RESET and HALT are first
	  sampled as negated to when instruction execution starts