<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.neogeodev.org//api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Anima</id>
	<title>NeoGeo Development Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.neogeodev.org//api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Anima"/>
	<link rel="alternate" type="text/html" href="https://wiki.neogeodev.org//index.php/Special:Contributions/Anima"/>
	<updated>2026-05-21T18:40:55Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.40.0</generator>
	<entry>
		<id>https://wiki.neogeodev.org//index.php?title=Optimization&amp;diff=5662</id>
		<title>Optimization</title>
		<link rel="alternate" type="text/html" href="https://wiki.neogeodev.org//index.php?title=Optimization&amp;diff=5662"/>
		<updated>2017-06-05T09:26:39Z</updated>

		<summary type="html">&lt;p&gt;Anima: &amp;quot;moveq.l&amp;quot; always sign-extends the byte value to a long.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;As seen on the [[68k instructions timings]].&lt;br /&gt;
&lt;br /&gt;
=VRAM access=&lt;br /&gt;
&lt;br /&gt;
Since &amp;lt;pre&amp;gt;move.w *,xxx.L&amp;lt;/pre&amp;gt;&lt;br /&gt;
is always slower than &amp;lt;pre&amp;gt;move.w *,d(An)&amp;lt;/pre&amp;gt;&lt;br /&gt;
try reserving an address register to hold the address of one VRAM register (the example below uses {{Reg|VRAM_RW}}) and use displacements to reach the others ({{Reg|VRAM_ADDR}} and {{Reg|VRAM_MOD}}). But be careful about [[VRAM]] timings!&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
    lea      VRAM_RW,a5&lt;br /&gt;
    move.w   #$0001,2(a5)    ; VRAM_MOD&lt;br /&gt;
    ...&lt;br /&gt;
    move.w   #$1234,-2(a5)   ; VRAM_ADDR&lt;br /&gt;
    ...&lt;br /&gt;
    move.w   #$5678,(a5)     ; VRAM_RW&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=General 68k tricks=&lt;br /&gt;
&lt;br /&gt;
Many tricks are from [http://www.easy68k.com/ Easy68k].&lt;br /&gt;
&lt;br /&gt;
==Addressing==&lt;br /&gt;
&lt;br /&gt;
* (An)+ is faster than -(An), except as the destination of a MOVE, where both take the same time.&lt;br /&gt;
* Because (An) is faster than d(An), access to the first element of a data structure is faster than to the others.&lt;br /&gt;
* Don&#039;t assume that long operations are always slower than word-size ones. For instance, word address operations can be slower than long ones because of the time to sign-extend a word value.&lt;br /&gt;
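&lt;br /&gt;
As a rough illustration of the structure-access point above, assuming a0 points to a structure made of word fields (the registers and offsets are only placeholders) and the standard 68000 timings:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   move.w  (a0),d0      ; first field, plain (An): 8 cycles&lt;br /&gt;
   move.w  2(a0),d1     ; any other field, d(An): 12 cycles&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Placing the most frequently accessed field at offset 0 is therefore usually worthwhile.&lt;br /&gt;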
&lt;br /&gt;
==CALL/RTS with JMP==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   lea   return, A0&lt;br /&gt;
   jmp   routine&lt;br /&gt;
return:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then to return, just jmp (A0). A0 needs to be preserved, but this saves 8 cycles compared to jsr/rts.&lt;br /&gt;
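&lt;br /&gt;
A minimal sketch of the complete round trip, assuming A0 is free for the whole call (routine and return are placeholder labels):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   lea   return, A0      ; A0 = address to come back to&lt;br /&gt;
   jmp   routine&lt;br /&gt;
return:&lt;br /&gt;
   ...                   ; execution resumes here&lt;br /&gt;
&lt;br /&gt;
routine:&lt;br /&gt;
   ...                   ; must not modify A0&lt;br /&gt;
   jmp   (A0)            ; return without touching the stack&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Such calls cannot be nested unless the inner caller saves A0 first or uses another address register.&lt;br /&gt;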
&lt;br /&gt;
==Replace JSR+RTS==&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   jsr subroutine  -&amp;gt;  jmp subroutine   ; Saves 24 cycles&lt;br /&gt;
   rts&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Replace JSR+JMP==&lt;br /&gt;
Instead of:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   jsr sub   ; 18/20&lt;br /&gt;
   jmp next  ; 10/12&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Do:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   pea next  ; 16/20&lt;br /&gt;
   jmp sub   ; 10/12&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Comparisons==&lt;br /&gt;
cmp.l #xxx,Dn takes 14 cycles. If the value being tested for is small enough to fit in a moveq (-128 to +127), it&#039;s shorter and faster to put the value in a temporary register:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
moveq.l #xxx,d0&lt;br /&gt;
cmp.l   d0,d1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If the value xxx is between -8 and 8, and you don&#039;t mind altering the data register, you can just use subq #xxx,Dn (or addq) instead of cmp. Then you can use a conditional branch just as you would after a cmp. This works for word or longword comparisons.&lt;br /&gt;
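&lt;br /&gt;
For instance, a sketch that tests d0 against 3, assuming d0 is not needed afterwards (is_three is a hypothetical label):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
subq.w  #3,d0        ; same flags as cmp.w #3,d0, but d0 is now d0-3&lt;br /&gt;
beq.s   is_three     ; taken if d0 was 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;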
&lt;br /&gt;
==Loops/searches==&lt;br /&gt;
Since a taken short branch is slower than an untaken one, try to avoid taking most branches. For instance, if you have a loop searching for a null, the simple way to search is:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-:&lt;br /&gt;
   tst.b (a0)+&lt;br /&gt;
   bne.s -&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It takes only a bit more space to unroll one or more iterations of the loop:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-:&lt;br /&gt;
   tst.b (a0)+&lt;br /&gt;
   beq.s found&lt;br /&gt;
   tst.b (a0)+&lt;br /&gt;
   bne.s -&lt;br /&gt;
found:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
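&lt;br /&gt;
Unrolling further follows the same pattern. A sketch with four tests per pass, with cycle counts assuming the standard 68000 timings (tst.b (An)+ 8 cycles, short Bcc 10 taken and 8 not taken):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-:&lt;br /&gt;
   tst.b (a0)+          ; 8&lt;br /&gt;
   beq.s found          ; 8 while not taken&lt;br /&gt;
   tst.b (a0)+          ; 8&lt;br /&gt;
   beq.s found          ; 8 while not taken&lt;br /&gt;
   tst.b (a0)+          ; 8&lt;br /&gt;
   beq.s found          ; 8 while not taken&lt;br /&gt;
   tst.b (a0)+          ; 8&lt;br /&gt;
   bne.s -              ; 10 when taken&lt;br /&gt;
found:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
A full pass over four non-null bytes then costs 66 cycles (16.5 per byte), against 18 per byte for the simple loop.&lt;br /&gt;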
&lt;br /&gt;
==Clear data register==&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
clr.l   Dn      -&amp;gt;   moveq.l  #0,Dn     ; Saves 2 cycles&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Clear address register==&lt;br /&gt;
There is no CLR or MOVEQ for address registers.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
move.l  #0,An   -&amp;gt;   sub.l    An,An     ; Saves 4 cycles&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Clear upper half of data register==&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
andi.l  #$0000FFFF,Dn  -&amp;gt;  swap   Dn    ; Saves 4 cycles&lt;br /&gt;
                           clr.w  Dn&lt;br /&gt;
                           swap   Dn&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Set large constants==&lt;br /&gt;
To move $00010000 ... $007F0000 values to a data register:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   moveq.l #X,Dn                       ; X = $01 ... $7F&lt;br /&gt;
   swap    Dn&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To move $FF80FFFF ... $FFFEFFFF values to a data register:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   moveq.l #X,Dn                       ; X = $FFFFFF80 ... $FFFFFFFE&lt;br /&gt;
   swap    Dn&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
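&lt;br /&gt;
Two worked examples with concrete values (the registers are arbitrary). Assuming the standard 68000 timings, each pair takes 8 cycles against 12 for move.l #xxx,Dn:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
   moveq.l #$12,d0      ; d0 = $00000012&lt;br /&gt;
   swap    d0           ; d0 = $00120000&lt;br /&gt;
&lt;br /&gt;
   moveq.l #-2,d1       ; d1 = $FFFFFFFE (sign-extended)&lt;br /&gt;
   swap    d1           ; d1 = $FFFEFFFF&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;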
&lt;br /&gt;
==Shift/multiply data register==&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
lsl.w   #1,d0   -&amp;gt;   add.w    d0,d0     ; Saves 4 cycles&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
lsl.l   #1,d0   -&amp;gt;   add.l    d0,d0     ; Saves 2 cycles&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
lsl.w   #2,d0   -&amp;gt;   add.w    d0,d0     ; Saves 2 cycles&lt;br /&gt;
                     add.w    d0,d0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Add to address register==&lt;br /&gt;
Useful when xxx is between -32768 and 32767.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
adda.w  #xxx,a0  -&amp;gt;   lea      xxx(a0),a0   ; Saves 4 cycles&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Rotates==&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
moveq.l #16,d0  -&amp;gt;   swap     d1&lt;br /&gt;
ror.l   d0,d1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
moveq.l #15,d0  -&amp;gt;   swap     d1&lt;br /&gt;
ror.l   d0,d1        rol.l    #1,d1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[Category:Code]]&lt;/div&gt;</summary>
		<author><name>Anima</name></author>
	</entry>
</feed>