Rendering logic: Difference between revisions

From NeoGeo Development Wiki
Revision as of 03:55, 5 January 2014

On the NeoGeo hardware, the term GPU (Graphics Processing Unit) may refer to a chip or a group of different chips used to generate the video signal, for example:

  • PRO-A0, PRO-B0 (early)
  • NEO-GRZ (CDZ, MV-1C ?)

See graphics pipeline for an overview of the interconnections between chips and cartridges.

Temporary notes

  • PCK2 rises with BNKB and CHBL
  • The first valid rendering cycle is 32mclk after CHBL low ?
  • Fix and sprite pixels are rendered at the same speed because sprite pixels are written in pairs
  • Tile pixel lines are rendered in halves:
  • For the fix layer (32mclk = 8 pixels, which corresponds to the 6MHz pixel clock):
    • Full address is ...1**** (PCK2 pulse)
    • 2H1 is 0 for 2 pixels (columns 0 & 1), then 1 for 2 pixels (columns 2 & 3)
    • Full address is ...0**** (PCK2 pulse)
    • 2H1 is 0 for 2 pixels (columns 4 & 5), then 1 for 2 pixels (columns 6 & 7)
  • For sprites (32mclk = 16 pixels):
    • Full address is ...1***** (PCK1 pulse)
    • CA4 is 0 for 4 pixels (columns 0~3), then 1 for 4 pixels (columns 4~7)
    • Full address is ...0***** (PCK1 pulse)
    • CA4 is 0 for 4 pixels (columns 8~11), then 1 for 4 pixels (columns 12~15)
  • As the fix layer is rendered in real time, the fix tile address is set before the sprite tile address (on a new line, PCK2 pulses before PCK1)
  • X position sent to B1, just before each PCK2 pulse (SP during 1mclk), for 20 sprites next to each other (X+16px each time):
    • Start of line: 0000,0808,1010,1838,2000,2808,3010,3838,40C0,48E8,50F0,58F8,60C0,68E8,70F0,78F8,8000,8808,9010,9838,0,0,0...
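
The half-line rendering described above can be sketched as a mapping from a pixel column to the address bit and the external select signal. This is only an illustration of the notes, not a description of the actual logic: the function names are hypothetical, and `h2_1` stands in for the 2H1 signal because a Python name cannot start with a digit.

```python
def fix_half_line(column):
    """Return (address_bit, h2_1) for a fix tile pixel column 0..7.

    address_bit is the varying bit shown in "...1****" / "...0****";
    h2_1 stands for the 2H1 signal, which toggles every 2 pixels.
    """
    if not 0 <= column <= 7:
        raise ValueError("fix tile columns are 0..7")
    address_bit = 1 if column < 4 else 0  # first PCK2 pulse, then second
    h2_1 = (column >> 1) & 1
    return address_bit, h2_1


def sprite_half_line(column):
    """Return (address_bit, ca4) for a sprite tile pixel column 0..15.

    address_bit is the varying bit shown in "...1*****" / "...0*****";
    ca4 toggles every 4 pixels.
    """
    if not 0 <= column <= 15:
        raise ValueError("sprite tile columns are 0..15")
    address_bit = 1 if column < 8 else 0  # first PCK1 pulse, then second
    ca4 = (column >> 2) & 1
    return address_bit, ca4


if __name__ == "__main__":
    for col in range(8):
        print("fix column", col, fix_half_line(col))
    for col in range(16):
        print("sprite column", col, sprite_half_line(col))
```

Both layers cover one tile line in 32mclk, which is why the sprite side handles twice as many pixels per half as the fix side.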

Video generation

See Display timing for the sync signal timing.

NEO-B1 is used for double-buffering scanlines. While one buffer is output to the TV, the other one is filled up. They are swapped on each new scanline. Each of the two line buffers is actually made of 2 buffers, one for even pixels and one for odd pixels. They will be named (1 & 2) and (3 & 4).

  • The TMS0 signal from LSPC tells B1 how the pairs of buffers are used:
    • 0: Buffers 1&2 are output to the TV. Buffers 3&4 are written to.
    • 1: Buffers 1&2 are written to. Buffers 3&4 are output to the TV.
  • CSK1~4 signals are used to step to the next pixel (rising edge ?). They are periodic during video output and VRAM-dependent when filling up. Inactive during H-blank.
  • WSE1~4 signals indicate whether the pixel color from GAD/GBD needs to be written to the buffer (falling edge ?). They match CSK during video output (ignored ?) and depend on DOTA/DOTB (opaque pixel signal) when filling up.
  • SS1~2 signals are used to reset the pixel pointers on falling edge ? (probably wrong)
  • The rising edge of PCK2 latches the X position of the sprite (and something else in a byte ?)
  • 1H1 (6MHz / 2 pixels per byte = 3MHz) is used to clock in the pixels of FIXD into the pixel buffers (or directly to the output ?) if they're not 0000.
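
The buffer swapping controlled by TMS0 can be sketched as follows. This is a minimal model under assumptions: the function names are hypothetical, TMS0 is assumed to toggle once per scanline (which follows from the buffers being swapped on each new scanline), and the starting value of 0 is arbitrary.

```python
def buffer_roles(tms0):
    """Return which buffer pair is displayed and which is written for a TMS0 level."""
    if tms0 == 0:
        return {"output": (1, 2), "write": (3, 4)}
    if tms0 == 1:
        return {"output": (3, 4), "write": (1, 2)}
    raise ValueError("TMS0 is a single bit")


def simulate(lines, tms0=0):
    """Yield (line, tms0, roles) for a number of scanlines.

    Assumption: TMS0 flips once per scanline; the initial level is arbitrary.
    """
    for line in range(lines):
        yield line, tms0, buffer_roles(tms0)
        tms0 ^= 1


if __name__ == "__main__":
    for line, level, roles in simulate(4):
        print(line, level, roles)
```

A line written into one pair while TMS0 has a given level is the one displayed after the next toggle, so each scanline is rendered one line ahead of its output.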

Sprite parsing

This is a draft. The following information shouldn't be considered as exact.

  • LSPC runs at 24MHz
  • Fast VRAM is 35ns
  • The reads always occur 1mclk (about 41.7ns) after the address is set
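
The timing figures above can be checked with a little arithmetic. In this sketch the 35ns figure is assumed to be the access time rating of the fast VRAM, and the constant and function names are made up for illustration.

```python
MCLK_HZ = 24_000_000          # LSPC master clock, from the notes above
FAST_VRAM_ACCESS_NS = 35.0    # assumed to be the access time rating


def mclk_to_ns(cycles):
    """Convert a number of master clock cycles to nanoseconds."""
    return cycles * 1e9 / MCLK_HZ


if __name__ == "__main__":
    period = mclk_to_ns(1)
    print("1 mclk = %.2f ns" % period)
    print("margin over VRAM access time = %.2f ns" % (period - FAST_VRAM_ACCESS_NS))
    print("32 mclk = %.2f ns" % mclk_to_ns(32))
```

Under that assumption, a read performed 1mclk after the address is set leaves a few nanoseconds of margin over the VRAM access time.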

  • FIXT: P23~16 are 0, P15~0 are S ROM address (+ external 2H1)
  • SPRT: P23~0 are C ROM address (+ external CA4)
  • LO: P23~16 are LO ROM data, P15~0 are LO address
  • FP: P19~16 is the fix tile palette, rest is 0
  • SP: P23~16 is the sprite tile palette, P15~8 is X position, P7~0 is ?
  • LSPC always starts filling up active sprite list A ($8600) each new frame
  • Read sequence (100p capacitor delay on AES too on PCKxB ?):
  • Timing diagram when the sprite list for the current line is already filled (no writes):
24M    |'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_
Addr   | 600 |  200  | 201 | 202 | 203 | 204 |  681  | 00E | 20E | 40E | 600 |  205  | 206 | 207 | 208 | 209 |  682  | 00F | 20F | 40F
PCK1   ______|'''|___________________________________________________________|'''|_____________________________________________________
PCK1B  '''''''|____|''''''''''''''''''''''''''''''''''''''''''''''''''''''''''|___|''''''''''''''''''''''''''''''''''''''''''''''''''''
LOAD   |'''''''|_______________________|'''''''|_______________________|'''''''|_______________________|'''''''|_______________________
12M    __|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|_
2Pixel       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |
Read       ?       !     !     !     !     !       !     !     !     !     ?       !     !     !     !     !       !     !     !     !
What      1      2      2     2     2     2      3      4     5     6     1      2      2     2     2     2      3      4     5     6...
  • 1: ?
  • 2: Read SCB3 to see if sprite is in next scanline (just increments), starts frame at sprite 1 ?
  • 3: Read sprite list to get sprite #
  • 4: Read SCB2 zoom values
  • 5: Read SCB3 Y/size/chain
  • 6: Read SCB4 X
  • Timing diagram when the sprite list is being filled:
24M    |'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_|'|_
Addr   | 600 |  20F  | 210 | 211 | 600 | 601 |  684  | 005 | 205 | 405 | 600 |  212  | 213 | 602 | 603 | 214 |  685  | 006 | 206 | 406
PCK1   ______|'''|___________________________________________________________|'''|_____________________________________________________
PCK1B  '''''''|____|''''''''''''''''''''''''''''''''''''''''''''''''''''''''''|___|''''''''''''''''''''''''''''''''''''''''''''''''''''
LOAD   |'''''''|_______________________|'''''''|_______________________|'''''''|_______________________|'''''''|_______________________
12M    __|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|___|'''|_
2Pixel       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |
/WE    ''''''''''''''''''''''''''|___|'|___|'''''''''''''''''''''''''''''''''''''''''''''''|___|'|___|'''''''''''''''''''''''''''''''''
Read       ?       !     !     !                   !     !     !     !     ?       !     !                 !       !     !     !     !
  • R/W sequences: (2 write buffers ?)
  • 600 RRRWW... 600 RRWWR...
  • 600 WWRRW... 600 WRRWW... 600 RRWWR ... 600 RWWRR
  • Even lines: Write to list A, Read from list B (Start of display)
  • Odd lines: Write to list B, Read from list A
  • In 16clk, the SCB3 entries of 2 sprites max. are checked to fill up the sprite list, and 1 sprite's attributes are read for output
  • 384px * 4clk/px = 1536clk/line
  • 1536clk / 16clk = 96 sprites max/line
  • This means that at least 2 sprites' SCB3 entries are checked every 16clk, and at most 4 writes to the sprite list can be done per 16clk ?
  • Slow VRAM is 100ns (10MHz) and is read at ?
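
The sprites-per-line figure above follows directly from the clock counts. A minimal sketch of that arithmetic, with hypothetical constant and function names:

```python
PIXELS_PER_LINE = 384       # total pixels per scanline, from the notes above
MCLK_PER_PIXEL = 4          # 24MHz master clock / 6MHz pixel clock
MCLK_PER_SPRITE_SLOT = 16   # one sprite's attributes are read every 16clk


def mclk_per_line(pixels=PIXELS_PER_LINE, mclk_per_pixel=MCLK_PER_PIXEL):
    """Master clock cycles available during one scanline."""
    return pixels * mclk_per_pixel


def max_sprites_per_line(slot=MCLK_PER_SPRITE_SLOT):
    """Number of 16clk sprite slots that fit in one scanline."""
    return mclk_per_line() // slot


if __name__ == "__main__":
    print(mclk_per_line(), "clk/line")
    print(max_sprites_per_line(), "sprites max/line")
```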

P bus

For fix map 7000,7001,7002... Top-down left-right (AES blue self-test passed screen).

When drawn: 7000, 7020, 7040, 7060... 74E0 (NEW LINE) 7001, 7021... 74E1 (NL) 7002, 7022... 74E2
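
The two orderings above are consistent with a column-major fix map. The sketch below assumes 0x20 entries per column and 40 drawn columns, both inferred from the addresses listed (0x74E0 = 0x7000 + 39 * 0x20); the function names are hypothetical.

```python
FIX_MAP_BASE = 0x7000
ENTRIES_PER_COLUMN = 0x20   # inferred: the address step between adjacent columns
DRAWN_COLUMNS = 40          # inferred: 0x74E0 is the last address of a drawn line


def fix_map_address(column, row):
    """VRAM address of the fix map entry at (column, row)."""
    return FIX_MAP_BASE + column * ENTRIES_PER_COLUMN + row


def draw_order(row):
    """Addresses read, left to right, while drawing one line of tiles."""
    return [fix_map_address(column, row) for column in range(DRAWN_COLUMNS)]


if __name__ == "__main__":
    for row in range(3):
        line = draw_order(row)
        print(" ".join("%04X" % a for a in line[:4]), "...", "%04X" % line[-1])
```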

5A5 5A5 5C5 15C5 17E5 17E5 1505 1505 1025 1025
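
The P bus states listed under Sprite parsing (FIXT, LO, FP, SP) can be unpacked with plain bit slicing. This is only a sketch of the field layout given in those notes: the function names are hypothetical, the low byte of SP is returned as-is since its meaning is unknown, and nothing here decodes the raw values quoted above.

```python
def decode_fixt(p):
    """FIXT: P15~0 are the S ROM address (P23~16 are 0)."""
    return {"s_rom_address": p & 0xFFFF}


def decode_lo(p):
    """LO: P23~16 are LO ROM data, P15~0 are the LO address."""
    return {"data": (p >> 16) & 0xFF, "address": p & 0xFFFF}


def decode_fp(p):
    """FP: P19~16 are the fix tile palette, the rest is 0."""
    return {"palette": (p >> 16) & 0xF}


def decode_sp(p):
    """SP: P23~16 are the sprite tile palette, P15~8 the X position."""
    return {
        "palette": (p >> 16) & 0xFF,
        "x": (p >> 8) & 0xFF,
        "low": p & 0xFF,  # meaning unknown in the notes
    }


if __name__ == "__main__":
    print(decode_sp(0x12AB34))
    print(decode_fp(0x050000))
```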