Difference between revisions of "Rendering logic"

From NeoGeo Development Wiki
Revision as of 03:55, 5 January 2014

On the NeoGeo hardware, the term GPU (Graphics Processing Unit) may refer to a chip or a group of different chips used to generate the video signal.

See graphics pipeline for an overview of the interconnections between chips and cartridges.

Temporary notes

  • PCK2 rises with BNKB and CHBL
  • The first valid rendering cycle is 32mclk after CHBL low ?
  • Fix and sprite pixels are rendered at the same speed because sprite pixels are written in pairs
  • Tile pixel lines are rendered in halves:
  • For the fix (32mclk = 8 pixels corresponds to 6MHz pixel clock):
    • Full address is ...1**** (PCK2 pulse)
    • 2H1 is 0 for 2 pixels (columns 0 & 1), then 1 for 2 pixels (columns 2 & 3)
    • Full address is ...0**** (PCK2 pulse)
    • 2H1 is 0 for 2 pixels (columns 4 & 5), then 1 for 2 pixels (columns 6 & 7)
  • For sprites (32mclk = 16 pixels):
    • Full address is ...1***** (PCK1 pulse)
    • CA4 is 0 for 4 pixels (columns 0~3), then 1 for 4 pixels (columns 4~7)
    • Full address is ...0***** (PCK1 pulse)
    • CA4 is 0 for 4 pixels (columns 8~11), then 1 for 4 pixels (columns 12~15)
  • Because the fix layer is rendered in real time, the fix tile address is set before the sprite address (on a new line, PCK2 pulses before PCK1)
  • X position sent to B1 just before each PCK2 pulse (SP during 1mclk), for 20 sprites placed next to each other (X+16px each time):
    • Start of line: 0000,0808,1010,1838,2000,2808,3010,3838,40C0,48E8,50F0,58F8,60C0,68E8,70F0,78F8,8000,8808,9010,9838,0,0,0...
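The half-line rendering order in the notes above can be sketched as a small model. This is only an illustration of the listed states: the function names and the `addr_bit` parameter (the address bit written as 1 or 0 in "...1****" and "...0*****") are hypothetical, and the 24MHz master clock is taken from the Sprite parsing notes.

```python
# Illustrative model of the half-line rendering order described above.
# Function and parameter names are hypothetical, not hardware definitions.

def fix_columns(addr_bit: int, h2h1: int) -> tuple:
    """Columns output for one state of an 8-pixel fix tile line.

    addr_bit: the address bit written as 1 or 0 in "...1****" / "...0****".
    h2h1:     level of the external 2H1 signal.
    """
    half_start = 0 if addr_bit == 1 else 4   # bit set -> columns 0-3 come first
    first = half_start + 2 * h2h1            # 2H1 selects a pair of columns
    return (first, first + 1)


def sprite_columns(addr_bit: int, ca4: int) -> tuple:
    """Columns output for one state of a 16-pixel sprite tile line."""
    half_start = 0 if addr_bit == 1 else 8   # bit set -> columns 0-7 come first
    first = half_start + 4 * ca4             # CA4 selects a group of 4 columns
    return tuple(range(first, first + 4))


# Order of (addr_bit, 2H1 or CA4) states within one 32mclk period, as listed.
STATES = [(1, 0), (1, 1), (0, 0), (0, 1)]

fix_order = [fix_columns(a, s) for a, s in STATES]
sprite_order = [sprite_columns(a, s) for a, s in STATES]

# Pixel rates implied by "32mclk = 8 pixels" and "32mclk = 16 pixels".
MCLK_HZ = 24_000_000
fix_pixel_hz = MCLK_HZ * 8 // 32       # 6MHz pixel clock
sprite_pixel_hz = MCLK_HZ * 16 // 32   # 12MHz, i.e. 6MHz when written in pairs

print(fix_order)
print(sprite_order)
```

Under these assumptions, both layers advance through a tile line in four states per 32mclk, which is consistent with the note that fix and sprite pixels are rendered at the same speed.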

Video generation

See Display timing for the sync signal timing.

NEO-B1 is used for double-buffering scanlines. While one buffer is output to the TV, the other one is filled up. They're swapped on each new scanline. Each of the two line buffers is actually made of 2 buffers of even/odd pixels. They will be named (1 & 2), and (3 & 4).

  • The TMS0 signal from LSPC tells B1 how the pairs of buffers are used:
    • 0: Buffers 1&2 are output to the TV. Buffers 3&4 are written to.
    • 1: Buffers 1&2 are written to. Buffers 3&4 are output to the TV.
  • CSK1~4 signals are used to step to the next pixel (rising edge ?). They are periodic for video output and VRAM-dependent when filling up. Inactive during H-blank.
  • WSE1~4 signals indicate whether the pixel color from GAD/GBD needs to be written to the buffer (falling edge ?). They match CSK for video output (ignored ?) and depend on DOTA/DOTB (opaque pixel signal) when filling up.
  • SS1~2 signals are used to reset the pixel pointers on falling edge ? (probably wrong)
  • The rising edge of PCK2 latches the X position of the sprite (and something else in a byte ?)
  • 1H1 (6MHz / 2 pixels per byte = 3MHz) is used to clock in the pixels of FIXD into the pixel buffers (or directly to the output ?) if they're not 0000.
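The TMS0 buffer selection described above can be modelled as follows. This is a minimal sketch: the function names are hypothetical, and the notes do not say which TMS0 level goes with which scanline parity, so `tms0_for_line` is purely an assumption that TMS0 toggles once per scanline.

```python
# Minimal model of the NEO-B1 line buffer roles selected by TMS0.

def buffer_roles(tms0: int) -> dict:
    """Roles of the two buffer pairs for a given TMS0 level (from the notes)."""
    if tms0 == 0:
        return {"output": (1, 2), "write": (3, 4)}
    return {"output": (3, 4), "write": (1, 2)}


def tms0_for_line(line: int) -> int:
    # Assumption: TMS0 toggles every scanline. The mapping between level and
    # line parity is not given in the notes and is chosen arbitrarily here.
    return line & 1


for line in range(4):
    roles = buffer_roles(tms0_for_line(line))
    print(f"line {line}: output {roles['output']}, write {roles['write']}")
```

Each pair holds even and odd pixels separately; which buffer of a pair holds the even pixels is not stated in the notes, so the sketch does not model it.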

Sprite parsing

This is a draft. The following information should not be considered exact.

  • LSPC runs at 24MHz
  • Fast VRAM is 35ns
  • The reads always occur 1mclk (41.6ns) after address is set

Timing gpu1.png

  • FIXT: P23~16 are 0, P15~0 are S ROM address (+ external 2H1)
  • SPRT: P23~0 are C ROM address (+ external CA4)
  • LO: P23~16 are LO ROM data, P15~0 are LO address
  • FP: P19~16 is the fix tile palette, rest is 0
  • SP: P23~16 is the sprite tile palette, P15~8 is X position, P7~0 is ?
  • LSPC always starts filling up active sprite list A ($8600) at the start of each new frame
  • Read sequence (100p capacitor delay on AES too on PCKxB ?):
  • Timing diagram when the sprite list for the current line is already filled (no writes):
24M    |'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_
Addr   | 600 |  200  | 201 | 202 | 203 | 204 |  681  | 00E | 20E | 40E | 600 |  205  | 206 | 207 | 208 | 209 |  682  | 00F | 20F | 40F
PCK1   ______|'''|___________________________________________________________|'''|_____________________________________________________
PCK1B  '''''''|____|''''''''''''''''''''''''''''''''''''''''''''''''''''''''''|___|''''''''''''''''''''''''''''''''''''''''''''''''''''
LOAD   |'''''''|_______________________|'''''''|_______________________|'''''''|_______________________|'''''''|_______________________
12M    __|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|_
2Pixel       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |
Read       ?       !     !     !     !     !       !     !     !     !     ?       !     !     !     !     !       !     !     !     !
What      1      2      2     2     2     2      3      4     5     6     1      2      2     2     2     2      3      4     5     6...
  • 1: ?
  • 2: Read SCB3 to see if sprite is in next scanline (just increments), starts frame at sprite 1 ?
  • 3: Read sprite list to get sprite #
  • 4: Read SCB2 zoom values
  • 5: Read SCB3 Y/size/chain
  • 6: Read SCB4 X
  • Timing diagram when the sprite list is being filled:
24M    |'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_
Addr   | 600 |  20F  | 210 | 211 | 600 | 601 |  684  | 005 | 205 | 405 | 600 |  212  | 213 | 602 | 603 | 214 |  685  | 006 | 206 | 406
PCK1   ______|'''|___________________________________________________________|'''|_____________________________________________________
PCK1B  '''''''|____|''''''''''''''''''''''''''''''''''''''''''''''''''''''''''|___|''''''''''''''''''''''''''''''''''''''''''''''''''''
LOAD   |'''''''|_______________________|'''''''|_______________________|'''''''|_______________________|'''''''|_______________________
12M    __|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|_
2Pixel       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |
/WE    ''''''''''''''''''''''''''|___|'|___|'''''''''''''''''''''''''''''''''''''''''''''''|___|'|___|'''''''''''''''''''''''''''''''''
Read       ?       !     !     !                   !     !     !     !     ?       !     !                 !       !     !     !     !
  • R/W sequences: (2 write buffers ?)
  • 600 RRRWW... 600 RRWWR...
  • 600 WWRRW... 600 WRRWW... 600 RRWWR ... 600 RWWRR
  • Even lines: Write to list A, Read from list B (Start of display)
  • Odd lines: Write to list B, Read from list A
  • In 16clk, at most 2 sprites' SCB3 are checked to fill up the sprite list, and 1 sprite's attributes are read for output
  • 384px * 4clk/px = 1536clk/line
  • 1536clk / 16clk = 96 sprites max/line
  • This means that at least 2 sprites' SCB3 are checked every 16clk, and at most 4 writes to the sprite list can be done per 16clk ?
  • Slow VRAM is 100ns (10MHz) and is read at ?
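The timing arithmetic in the notes above can be checked with a few lines. This is only a restatement of the figures already given (24MHz, 384px, 4clk/px, 16clk per sprite slot); the constant names are made up for the example, and the line duration is derived from those figures rather than stated in the notes.

```python
# Sprite-per-line budget derived from the figures quoted in the notes.

MCLK_HZ = 24_000_000      # LSPC clock
PIXELS_PER_LINE = 384     # total pixel slots per scanline, as in the notes
MCLK_PER_PIXEL = 4        # 4clk/px
MCLK_PER_SPRITE = 16      # one sprite's attributes are read every 16clk

mclk_period_ns = 1e9 / MCLK_HZ                        # about 41.67ns per mclk
mclk_per_line = PIXELS_PER_LINE * MCLK_PER_PIXEL      # 1536clk/line
sprites_per_line = mclk_per_line // MCLK_PER_SPRITE   # 96 sprites max/line
line_duration_us = mclk_per_line / MCLK_HZ * 1e6      # derived: about 64us

print(f"mclk period: {mclk_period_ns:.2f} ns")
print(f"{mclk_per_line} clk/line, {sprites_per_line} sprites max/line")
print(f"line duration: {line_duration_us:.1f} us")
```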

P bus

The fix map is stored at 7000, 7001, 7002... in top-down, then left-to-right order (observed on the AES blue self-test passed screen).

When drawn: 7000, 7020, 7040, 7060... 74E0 (NEW LINE) 7001, 7021... 74E1 (NL) 7002, 7022... 74E2
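The draw order above can be reproduced with a short generator. This is a sketch under assumptions inferred from the listed addresses only: a stride of $20 between adjacent columns (7000 to 7020) and 40 columns per row (7000 up to 74E0). The names `FIX_MAP_BASE`, `COLUMN_STRIDE`, `COLUMNS` and `fix_draw_order` are hypothetical.

```python
# Fix map addresses in the order they appear on the P bus for one tile row.

FIX_MAP_BASE = 0x7000
COLUMN_STRIDE = 0x20   # inferred from 7000 -> 7020 -> 7040
COLUMNS = 40           # inferred from the last address of a row, 74E0


def fix_draw_order(row: int) -> list:
    """Addresses read left to right for the given tile row."""
    return [FIX_MAP_BASE + col * COLUMN_STRIDE + row for col in range(COLUMNS)]


for row in range(3):
    addrs = fix_draw_order(row)
    print(f"row {row}: {addrs[0]:04X}, {addrs[1]:04X} ... {addrs[-1]:04X}")
```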

5A5 5A5 5C5 15C5 17E5 17E5 1505 1505 1025 1025
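The P bus field layouts listed under Sprite parsing (FIXT, SPRT, LO, FP, SP) can be written as simple bit-field extractions. This is a hedged sketch: the function names and dictionary keys are invented for the example, the meaning of P7~0 in the SP state is unknown in the notes and is kept as a raw value, and the sample words are the 16-bit "Start of line" values from the Temporary notes (they do not include the palette byte).

```python
# Bit-field view of the 24-bit P bus states described in the notes.

def decode_fixt(p: int) -> dict:
    return {"s_rom_address": p & 0xFFFF}            # P23~16 are 0


def decode_sprt(p: int) -> dict:
    return {"c_rom_address": p & 0xFFFFFF}          # P23~0


def decode_lo(p: int) -> dict:
    return {"lo_data": (p >> 16) & 0xFF, "lo_address": p & 0xFFFF}


def decode_fp(p: int) -> dict:
    return {"fix_palette": (p >> 16) & 0xF}         # P19~16, rest is 0


def decode_sp(p: int) -> dict:
    return {
        "sprite_palette": (p >> 16) & 0xFF,         # P23~16
        "x_position": (p >> 8) & 0xFF,              # P15~8
        "unknown_low": p & 0xFF,                    # P7~0, meaning unknown
    }


# First five "Start of line" words from the Temporary notes.
samples = [0x0000, 0x0808, 0x1010, 0x1838, 0x2000]
x_fields = [decode_sp(word)["x_position"] for word in samples]
print([f"{x:02X}" for x in x_fields])
```

With these samples the X field advances by 8 per sprite while the sprites are 16px apart; the notes do not explain this ratio, so the sketch only exposes the raw field.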