User:Myask/Myapper thoughts: Difference between revisions


Revision as of 00:41, 30 June 2016

8x8 attributes, beginning of 8x1

User:Myask/MyaGrafx

8x1 attributes?

It's all in the fine Y, and in keeping track of it. The intensive solution is to watch the PPU bus and keep our own copy of PPUSCROLL, which requires knowing the latch and thus watching several of the registers. Pattern-fetch snooping can yield fine Y, but as the pattern fetch comes after the attribute fetch, it does not work for the first fetch of a scanline. The triple fetch comes after said 2 pre-fetched tiles. It is really a double fetch that normally duplicates the next fetch, but VBLANK falls between the last line's double fetch and the first fetch of pre-render, and is interrupted by whatever the program wants to access. One could count tiles, then:

if(this_fetch == last_fetch) reset(counter);
if(is_AT(this_fetch) && is_NT(last_fetch)) counter += 1;
if(counter == (32? 33? 34?)) //override the usual pattern, we're doing first prefetch
///...after which we can just let the pattern fetch give us the information

But is this any simpler than register-snooping? This segues nicely into a scanline counter interrupt, though for some reason I have the thought of a nametable-relative Y-based interrupt instead (which is just a different frame of reference on the same thing.) Resetting to "scanline 0" if no triple-fetch detected in 128 CPU clocks seems workable.
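As a rough behavioral sketch of the counting idea above (not hardware, and not a confirmed design), the following Python model feeds a stream of PPU read addresses to a counter. The class name `TileCounter`, the helper names, and the `prefetch_count` parameter are all hypothetical; the threshold is left as a parameter because the notes leave it open (32? 33? 34?).

```python
def is_nt(addr):
    """True for a nametable (tile index) fetch in $2000-$2FFF."""
    return 0x2000 <= addr < 0x3000 and (addr & 0x3FF) < 0x3C0


def is_at(addr):
    """True for an attribute table fetch in $2000-$2FFF."""
    return 0x2000 <= addr < 0x3000 and (addr & 0x3FF) >= 0x3C0


class TileCounter:
    """Counts NT-then-AT fetch pairs since the last repeated fetch."""

    def __init__(self, prefetch_count):
        # prefetch_count is an assumption; the notes do not settle on a value.
        self.prefetch_count = prefetch_count
        self.counter = 0
        self.last = None

    def clock(self, addr):
        """Feed one PPU read address; return True when the count is reached."""
        # A repeated address is treated as the double/triple fetch marker.
        if self.last is not None and addr == self.last:
            self.counter = 0
        hit = False
        # An AT fetch directly after an NT fetch marks one background tile.
        if self.last is not None and is_at(addr) and is_nt(self.last):
            self.counter += 1
            hit = self.counter == self.prefetch_count
        self.last = addr
        return hit
```

With `prefetch_count=3`, a repeated `$2000` fetch followed by three (NT, AT, PT, PT) groups reports a hit on the third attribute fetch.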

Flipping

BG tileflipping is pretty easy.

chr_a[2:0] = ppu_a[2:0] ^ {3{is_chr_access & vert_flip}};
ppu_d[7:0] = (is_chr_access) ? (horiz_flip ? chr_d[0:7] : chr_d[7:0]) : something_else;
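To make the two flips concrete, here is a small Python model of the same logic, offered as a sketch rather than a reference implementation. The function names are made up for illustration: vertical flip XORs the low three address bits (fine Y), and horizontal flip reverses the bit order of the pattern byte.

```python
def reverse_bits8(value):
    """Reverse the bit order of an 8-bit value (models chr_d[0:7])."""
    result = 0
    for i in range(8):
        if value & (1 << i):
            result |= 0x80 >> i
    return result


def flipped_chr_address(ppu_a, is_chr_access, vert_flip):
    """Invert the fine-Y bits (A2..A0) when flipping a pattern fetch vertically."""
    if is_chr_access and vert_flip:
        return ppu_a ^ 0b111
    return ppu_a


def flipped_chr_data(chr_d, horiz_flip):
    """Mirror the eight pixels of one pattern byte when flipping horizontally."""
    return reverse_bits8(chr_d) if horiz_flip else chr_d
```

Row 4 of a tile (`...100`) becomes row 3 (`...011`) under vertical flip, and a byte whose leftmost pixel is set comes back with its rightmost pixel set under horizontal flip.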

If one stored the attributes in the same byte, one could piggyback on that to set the flip bits. But what to use the other four bits for? Just do two tiles per byte? Allow MMC5-like extended tile index allowance? Allow swapping of two colors in the attribute?

wire invert_cur = ppu_a[3] ? invert_plane1 : invert_plane0; // A3 selects the bit plane of a pattern fetch
ppu_d[7:0] = (is_chr_access) ? ({8{invert_cur}} ^ (horiz_flip ? chr_d[0:7] : chr_d[7:0])) : something_else;
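The effect of inverting a bit plane on the resulting 2-bit color indices can be checked with a short Python sketch. The helper names here are hypothetical; the model simply combines two plane bytes into eight color indices, leftmost pixel first.

```python
def invert_plane(byte, invert):
    """XOR all eight bits of a plane byte when the invert flag is set."""
    return byte ^ (0xFF if invert else 0x00)


def pixel_colors(plane0, plane1):
    """Combine two bit-plane bytes into eight 2-bit color indices (bit 7 first)."""
    return [
        (((plane1 >> bit) & 1) << 1) | ((plane0 >> bit) & 1)
        for bit in range(7, -1, -1)
    ]
```

Under this model, inverting plane 0 swaps colors 0 with 1 and 2 with 3, inverting plane 1 swaps 0 with 2 and 1 with 3, and inverting both reverses the order of all four.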

Rotation would require orthogonal accesses, all but requiring in-chip CHR. Note: one should only do this if the fetches are following the (NT,AT,PT,PT) pattern. If not, one is fetching sprite patterns. While it is of mild interest that there are 64 sprites filling out the 32x32 count, OAM already deals with that better.

Though, one could use the garbage NT fetches (which are a double fetch, can leech off logic detecting the triple fetch!) to add extended per-sprite banking, though detecting which sprite is which is Another Problem, requiring either duplicating/spying OAM's tile entry, or requiring some specific pattern of sprite arrangement (useful for static images?). This also allows some odd bits, like changing sprite tile mid-sprite (vertically).

DMA theft

If you watch for a $4014 write, you can then watch the DMA happen, and thus have another destination to copy it to. This could alleviate the VBLANK crunch a bit (e.g. for the 8x1 mapper), as DMA copies much faster than the program can. Obviously, bringing it into the chip à la MMC5 means one could allow access to both buses, but that costs a lot of chip resources.
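A minimal behavioral sketch of the snooping idea, in Python: the class `DmaSnooper` and its `bus_cycle` interface are invented for illustration, and the model assumes the mapper can see every CPU bus cycle as an (address, data, read/write) triple. It captures the 256 reads from the source page that follow a $4014 write and ignores the interleaved $2004 writes.

```python
class DmaSnooper:
    """Mirrors OAM DMA traffic into a private 256-byte buffer."""

    def __init__(self):
        self.copy = bytearray(256)  # mapper-side destination (hypothetical)
        self.page = None            # source page written to $4014
        self.remaining = 0          # reads still expected from that page

    def bus_cycle(self, addr, data, is_write):
        """Observe one CPU bus cycle."""
        if is_write and addr == 0x4014:
            # The value written selects the source page $xx00-$xxFF.
            self.page = data
            self.remaining = 256
        elif self.remaining and not is_write and (addr >> 8) == self.page:
            # A DMA read from the source page: keep a copy of the byte.
            self.copy[addr & 0xFF] = data
            self.remaining -= 1
```

Once all 256 bytes are seen, later reads from the same page are no longer captured until the next $4014 write.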

Slideshow mirroring

PPUDOUT[0:7] = PPUDIN[0:7] & {8{~PPUA[11]&PPUA[9]}};// (relies on tile 0 being blank)
CIRAMA10 = PPUA[10];
//ab
//00

Of course, the slideshow effect could be done without any special hardware just by using a render-disable raster effect instead of a scroll.

Fill Mode

…is an expansion of the above:

wire is_fill_mode = (nt_mapping[PPU_A[11:10]] == FILL_MODE) && is_nt_fetch;
PPUDOUT[0:7] = is_fill_mode ? fill_chr[0:7] : chrdin[0:7];
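The same selection can be modeled in a few lines of Python, as a sketch under assumptions: the numeric value of `FILL_MODE` and the list form of `nt_mapping` are placeholders chosen for illustration, not an encoding taken from the notes.

```python
FILL_MODE = 3  # hypothetical encoding of the fill-mode setting


def ppu_data_out(ppu_a, nt_mapping, fill_chr, chr_din, is_nt_fetch):
    """Return the fill tile for nametable fetches from a fill-mode quadrant."""
    quadrant = (ppu_a >> 10) & 0b11  # PPU A11..A10 select the nametable
    is_fill_mode = nt_mapping[quadrant] == FILL_MODE and is_nt_fetch
    return fill_chr if is_fill_mode else chr_din
```

For example, with `nt_mapping = [0, 1, 2, FILL_MODE]`, a nametable fetch at `$2C05` returns `fill_chr`, while the same fetch at `$2005` passes `chr_din` through.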

Actually, now that I think about it, "CIRAM0, CIRAM1, disable, fill mode" isn't a bad set of 4. The catch, of course, is that adding cartside NT0-3 makes it 12 bits of mirroring to allow full control, which takes two writes. That leaves four bits of unused write…
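One reading of the arithmetic: eight possible sources (the set of 4 plus cartside NT0-3) need 3 bits per nametable slot, and 4 slots give 12 bits, which fits in two 8-bit writes with 4 bits to spare. The Python sketch below packs such a mapping; the source names, their order, and the bit layout are assumptions made purely for illustration.

```python
# Hypothetical source list; the order (and thus the encoding) is an assumption.
SOURCES = ["CIRAM0", "CIRAM1", "disable", "fill",
           "cartNT0", "cartNT1", "cartNT2", "cartNT3"]
SLOTS = 4
BITS_PER_SLOT = (len(SOURCES) - 1).bit_length()  # 3 bits for 8 choices
TOTAL_BITS = SLOTS * BITS_PER_SLOT               # 12 bits of mirroring
SPARE_BITS = 16 - TOTAL_BITS                     # left over from two byte writes


def pack_mapping(slots):
    """Pack four source names into (low byte, high byte) register writes."""
    word = 0
    for i, name in enumerate(slots):
        word |= SOURCES.index(name) << (i * BITS_PER_SLOT)
    return word & 0xFF, word >> 8


def unpack_mapping(lo, hi):
    """Recover the four source names from the two written bytes."""
    word = lo | (hi << 8)
    mask = (1 << BITS_PER_SLOT) - 1
    return [SOURCES[(word >> (i * BITS_PER_SLOT)) & mask] for i in range(SLOTS)]
```

In this layout the top four bits of the second write are never used by the mapping, matching the four spare bits mentioned above.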