Optimization
Revision as of 14:49, 22 July 2016
The tips below are based on the 68k instruction timings.
VRAM access
Since
move.w *,xxx.L
is always slower than
move.w *,d(An)
try reserving an address register to hold VRAM_ADDR, then use offsets from it to reach VRAM_RW and VRAM_MOD. Be careful about VRAM timings, though!
lea VRAM_ADDR,a5
move.w #$0001,4(a5) ; VRAM_MOD
...
move.w #$1234,(a5) ; VRAM_ADDR
...
move.w #$5678,2(a5) ; VRAM_RW
General 68k tricks
Many of these tricks come from [Easy68k].
Addressing
(An)+ is faster than -(An), except as a MOVE destination, where both take the same time.
Because (An) is faster than x(An), access to the first element of a data structure is faster than to the others.
Don't assume that long operations are always slower than word-size ones. For instance, word address operations can be slower than long ones because of the time to sign-extend a word value.
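As a sketch of the data-structure point above, assume a hypothetical object whose most frequently read field is placed at offset 0. The OBJ_* names and the field layout are illustrative only, and a0 is assumed to already point at the object:

```asm
; Hypothetical object layout: the hottest field sits at offset 0.
OBJ_X      equ 0                ; read most often, reachable with (a0)
OBJ_Y      equ 2
OBJ_FLAGS  equ 4

        move.w  (a0),d0         ; OBJ_X: 8 cycles, no displacement word
        move.w  OBJ_Y(a0),d1    ; OBJ_Y: 12 cycles, one extension word
```

The (a0) form is written out explicitly because not every assembler turns a zero displacement into plain (An).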
Jump/call/return
lea return,a0
jmp routine
return:
To return, the routine just does jmp (a0) instead of rts. This ties up a0 but saves 8 cycles.
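Put together, the register-based call might look like the sketch below. The labels caller, return and routine and the placeholder instructions are hypothetical. The sketch assumes routine never touches a0 and is not itself called this way from a nested routine that also uses a0:

```asm
caller:
        lea     return,a0       ; a0 = address to resume at
        jmp     routine         ; no return address is pushed on the stack
return:
        addq.w  #1,d1           ; execution resumes here
        rts

routine:
        add.w   d0,d0           ; placeholder body, must leave a0 untouched
        jmp     (a0)            ; takes the place of rts
```

Since nothing is pushed on the stack, the routine cannot end with rts; every exit path has to go through jmp (a0).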
jsr subroutine    ->    jmp subroutine    ; Saves 24 cycles
rts
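A jsr followed by a jmp:
jsr sub ; 18/20
jmp next ; 10/12
can be replaced by pushing the jump target as the return address, so that the rts in sub goes straight to next:
pea next ; 16/20
jmp sub ; 10/12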
Comparisons
cmp.l #xxx,Dn takes 14 cycles. If the value being tested for is small enough to fit in a moveq (-128 to +127), it's shorter and faster to put the value in a temporary register:
moveq.l #xxx,d0
cmp.l d0,d1
If the value xxx is between 1 and 8, and you don't mind altering the data register, you can use subq #xxx,Dn instead of cmp (or addq for values between -8 and -1). The flags are set as they would be by the cmp, so you can follow it with a conditional branch in the same way. This works for word or longword comparisons.
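A minimal sketch of the subq comparison, assuming d0 holds a word value that is no longer needed afterwards. The labels check and is_five and the use of d1 as a result flag are hypothetical:

```asm
check:
        subq.w  #5,d0           ; d0 = d0 - 5, flags set as by cmpi.w #5,d0
        beq.s   is_five         ; taken when d0 was 5
        moveq   #0,d1           ; d1 = 0: not equal
        rts
is_five:
        moveq   #1,d1           ; d1 = 1: equal
        rts
```

On a data register, subq.w takes 4 cycles and one word of code, against 8 cycles and two words for cmpi.w. The price is that d0 no longer holds its original value.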
Loops/searches
Since a taken short branch is slower than an untaken one, try to avoid taking most branches. For instance, if you have a loop searching for a null, the simple way to search is:
-: tst.b (a0)+
bne.s -
It takes only a bit more space to unroll one or more iterations of the loop:
-: tst.b (a0)+
beq.s found
tst.b (a0)+
bne.s -
found:
Clear data register
clr.l Dn -> moveq.l #0,Dn ; Saves 2 cycles
Clear address register
There is no CLR or MOVEQ for address registers.
move.l #0,An -> sub.l An,An ; Saves 4 cycles
Clear upper half of data register
andi.l #$0000FFFF,Dn    ->    swap Dn     ; Saves 4 cycles
                              clr.w Dn
                              swap Dn
Set large constants
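To load a value of the form $00xx0000 into a data register, where xx is between $00 and $7F (moveq sign-extends, so a negative value would also fill the other word with $FFFF):
moveq.l #xx,Dn
swap Dn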
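As a worked instance, assuming the wanted constant is $00120000 (a value picked only for illustration) and that d0 is free:

```asm
        moveq   #$12,d0         ; d0 = $00000012 (4 cycles)
        swap    d0              ; d0 = $00120000 (4 cycles)
```

This takes 8 cycles and two words of code, against 12 cycles and three words for move.l #$00120000,d0.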
Shift/multiply data register
lsl.w #1,d0 -> add.w d0,d0 ; Saves 4 cycles
lsl.l #1,d0 -> add.l d0,d0 ; Saves 2 cycles
lsl.w #2,d0    ->    add.w d0,d0    ; Saves 2 cycles
                     add.w d0,d0
Add to address register
Useful when xxx is between -32768 and 32767.
adda.w #xxx,a0 -> lea xxx(a0),a0 ; Saves 4 cycles
Rotates
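A 32-bit rotate by 16 is just a swap, and a rotate by a count close to 16 can be built from a swap plus a short rotate in the other direction:
moveq.l #16,d0    ->    swap d1
ror.l d0,d1

moveq.l #15,d0    ->    swap d1
ror.l d0,d1             rol.l #1,d1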