PPU rendering

From NESdev Wiki
Revision as of 20:49, 3 April 2011 by Drag (talk | contribs)

Work in progress, do not alter.

The PPU contains the following:

  • 2 16-bit shift registers - These contain the bitmap data for two tiles. Every 8 cycles, the bitmap data for the next tile is loaded into the upper 8 bits of each shift register. Meanwhile, the pixel to render is fetched from one of the lower 8 bits.
  • 2 8-bit shift registers - These contain the palette attributes for the lower 8 pixels of the 16-bit shift registers. These registers are fed by a latch which contains the palette attribute for the next tile. Every 8 cycles, the latch is loaded with the palette attribute for the next tile.

Every cycle, one bit is fetched from each of these 4 shift registers in order to create a pixel on screen. Exactly which bit is fetched depends on the fine X scroll, set by $2005 (this is how fine X scrolling is possible). Afterwards, each shift register is shifted by one bit, which moves the data for the next pixel into position.

Every 8 cycles/shifts, new data is loaded into these registers.
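
The register arrangement described above can be sketched in a few lines of Python. This is only an illustrative model, not a description of the actual circuitry: the class name BackgroundShifters, its method names, and the helper reverse_bits8 are all hypothetical. It also assumes that the registers shift toward bit 0 and that each bitmap byte is mirrored on load so that the leftmost pixel of a tile leaves the register first.

```python
def reverse_bits8(value):
    """Mirror an 8-bit value so that its bit 7 becomes bit 0."""
    result = 0
    for i in range(8):
        if value & (1 << i):
            result |= 1 << (7 - i)
    return result


class BackgroundShifters:
    """Hypothetical model of the four background shift registers."""

    def __init__(self):
        self.bitmap_a = 0    # 16-bit: bitmap plane A for two tiles
        self.bitmap_b = 0    # 16-bit: bitmap plane B for two tiles
        self.attr_lo = 0     # 8-bit: palette attribute, bit 0
        self.attr_hi = 0     # 8-bit: palette attribute, bit 1
        self.attr_latch = 0  # 2-bit palette attribute of the next tile

    def load_next_tile(self, plane_a, plane_b, attribute):
        """Every 8 cycles: the next tile goes into the upper 8 bits."""
        # Assumption: bytes are mirrored so the leftmost pixel shifts out first.
        self.bitmap_a = (self.bitmap_a & 0x00FF) | (reverse_bits8(plane_a) << 8)
        self.bitmap_b = (self.bitmap_b & 0x00FF) | (reverse_bits8(plane_b) << 8)
        self.attr_latch = attribute & 0b11

    def pixel(self, fine_x):
        """Pick one bit from each register; fine_x (0-7) selects which."""
        colour = ((self.bitmap_a >> fine_x) & 1) | (((self.bitmap_b >> fine_x) & 1) << 1)
        palette = ((self.attr_lo >> fine_x) & 1) | (((self.attr_hi >> fine_x) & 1) << 1)
        return palette, colour

    def shift(self):
        """Advance all four registers by one pixel."""
        self.bitmap_a >>= 1
        self.bitmap_b >>= 1
        # The attribute registers are refilled from the latch on every shift.
        self.attr_lo = (self.attr_lo >> 1) | ((self.attr_latch & 1) << 7)
        self.attr_hi = (self.attr_hi >> 1) | ((self.attr_latch >> 1) << 7)
```

In this model, a tile loaded with load_next_tile reaches the lower 8 bits after 8 calls to shift, at which point pixel(0) returns its leftmost pixel and larger fine_x values select pixels further to the right.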

NTSC PPU

The PPU renders 262 scanlines per frame. Each scanline lasts for 341 PPU clock cycles (113.667 CPU clock cycles; 1 CPU cycle = 3 PPU cycles), with each clock cycle producing one pixel.
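
As a quick check of the arithmetic, the figures above can be reproduced as follows. This is a minimal sketch that assumes every scanline lasts exactly 341 PPU cycles, as stated above; the constant and variable names are made up for illustration.

```python
from fractions import Fraction

SCANLINES_PER_FRAME = 262
PPU_CYCLES_PER_SCANLINE = 341
PPU_CYCLES_PER_CPU_CYCLE = 3

# Exact fractions avoid rounding the repeating decimal 113.666...
cpu_cycles_per_scanline = Fraction(PPU_CYCLES_PER_SCANLINE, PPU_CYCLES_PER_CPU_CYCLE)
ppu_cycles_per_frame = SCANLINES_PER_FRAME * PPU_CYCLES_PER_SCANLINE
cpu_cycles_per_frame = Fraction(ppu_cycles_per_frame, PPU_CYCLES_PER_CPU_CYCLE)

print(float(cpu_cycles_per_scanline))  # about 113.667
print(ppu_cycles_per_frame)            # 89342
print(float(cpu_cycles_per_frame))     # about 29780.667
```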

Scanline -1 or 261

This is a dummy scanline, whose sole purpose is to perform the sprite evaluation for the next scanline, and to fill the shift registers with the data for the first two tiles of the next scanline. Although no pixels are rendered for this scanline, the PPU still makes the same memory accesses it would for a regular scanline.

Scanlines 0-239

These are the visible scanlines, which contain the graphics to be displayed on the screen. During these scanlines, the PPU is busy fetching data, so the program should not try to access PPU memory during this time, unless rendering is turned off.

Cycles 1-256

The data for each tile is fetched during this phase. Each memory access takes 2 PPU cycles to complete, and 4 must be performed per tile:

  1. Nametable byte
  2. Attribute table byte
  3. Tile bitmap A
  4. Tile bitmap B (+8 bytes from tile bitmap A)

The data fetched from these accesses is placed into internal latches, and then fed to the appropriate shift registers when it's time to do so (every 8 cycles). Because the PPU can only fetch an attribute byte every 8 cycles, each run of 8 consecutive pixels is forced to have the same palette attribute.

Note: At the beginning of each scanline, the data for the first two tiles is already loaded into the shift registers (and ready to be rendered), so the first tile that gets fetched is Tile 3.
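
The fetch pattern for cycles 1-256 can be illustrated with a small lookup. This is a sketch based only on the description above (4 accesses of 2 cycles each per tile, starting with Tile 3); the names FETCH_ORDER, fetch_at and bitmap_addresses are hypothetical, and tiles are numbered from 1 to match the note above.

```python
FETCH_ORDER = (
    "nametable byte",
    "attribute table byte",
    "tile bitmap A",
    "tile bitmap B",
)


def fetch_at(cycle):
    """Return (tile_number, access_name) for a cycle in the range 1-256."""
    if not 1 <= cycle <= 256:
        raise ValueError("only cycles 1-256 are covered by this sketch")
    index = cycle - 1
    # Tiles 1 and 2 are already in the shift registers, so fetching starts at 3.
    tile_number = index // 8 + 3
    # Each access occupies 2 PPU cycles, giving 4 accesses per 8-cycle tile slot.
    access = FETCH_ORDER[(index % 8) // 2]
    return tile_number, access


def bitmap_addresses(bitmap_a_address):
    """Tile bitmap B sits 8 bytes after tile bitmap A."""
    return bitmap_a_address, bitmap_a_address + 8
```

For example, fetch_at(1) and fetch_at(2) both report the nametable byte for Tile 3, while fetch_at(9) moves on to the nametable byte for Tile 4.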

Cycles 257-320

Cycles 321-336

Cycles 337-340

Cycle 341

Scanline 240

The PPU just idles during this scanline. Even so, this scanline is not considered part of VBlank.

Scanlines 241-260

These occur during VBlank. The VBlank flag of the PPU is set at the start of scanline 241, and the VBlank NMI (if enabled) is triggered at the same point. During this time, the PPU makes no memory accesses, so PPU memory can be freely accessed by the program.
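
The scanline ranges on this page can be summarized in a small classifier. This is only a sketch of the frame layout as described here: the function names and phase labels are hypothetical, and treating scanline 240 as safe for PPU memory access is an assumption drawn from the statement that the PPU idles on that line.

```python
def scanline_phase(scanline):
    """Classify an NTSC scanline number (-1 is treated the same as 261)."""
    if scanline in (-1, 261):
        return "pre-render"   # dummy scanline
    if 0 <= scanline <= 239:
        return "visible"
    if scanline == 240:
        return "idle"         # PPU idles, not yet VBlank
    if 241 <= scanline <= 260:
        return "vblank"
    raise ValueError("scanline out of range for the NTSC PPU")


def ppu_memory_is_free(scanline, rendering_enabled):
    """True when the program can access PPU memory without conflicting with the PPU."""
    if not rendering_enabled:
        return True
    # Assumption: only the dummy and visible scanlines perform memory accesses.
    return scanline_phase(scanline) in ("idle", "vblank")
```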