Thursday, February 14, 2013

Designing a Large-scale Phosphor Filter/Shader

Update (5/9/2013): I ported my shader to the Cg shader language and made some improvements in the process (click here to download). It now looks decent at sane, non-4k scale factors and has some easily modified variables so users can make their own adjustments. See the bottom of the post for new screenshots.

Update (3/29/2013): Some other versions of the shader and more screenies at the bottom of the post.
Update (3/18/2013): I think the shader is pretty much done! You can download it here. Skip to the bottom of the post for images and some details.

With 4k TVs on the horizon and high-dpi computer monitors already here, I figured it was time to start thinking about using these screens to better recreate the look of CRT TVs.

cgwg's CRT shader is already fantastic, but it was designed with the assumption of relatively low resolution (i.e., the current HD/1080p paradigm at the upper end and 720p screens at the lower end, resulting in ~3-4x scale factors) and single-pass shader implementations, and it makes certain concessions with those limitations in mind.

At 10x scale (2240 vertical pixels for NTSC) and assuming multipass support, we can forgo some of those concessions, though. For example, you can draw the shadow mask right on the screen at 1 pixel width and each phosphor lens at 3 pixels tall by 9 pixels wide, which gives you a pattern like this (blurriness comes from the bilinear upscaling; click for normal size):
In this figure, the phosphors have a horizontal orientation rather than the vertical orientation seen in all of the phosphor images I've found online (like this one). I tried both ways and the vertical orientation didn't look right to me, insofar as it had vertical scanlines rather than horizontal ones. I'll show results both ways later on and you can judge for yourself.
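For anyone who'd rather generate the pattern than draw it by hand, here's a rough sketch of one way to build a tile like the one described above. The 10x10 cell size, the exact placement of the 1-pixel mask lines, and the function names are my assumptions for illustration, not measurements of the actual LUT; the blue-on-top, red-on-bottom ordering follows the orientation discussed later in the post.

```python
# Hypothetical cell layout: one 10x10 cell per source pixel at 10x scale.
# Rows 0-2 are blue, 3-5 green, 6-8 red; row 9 and column 9 are the mask.
CELL = 10      # assumed cell size in pixels
LENS_W = 9     # phosphor lens width
LENS_H = 3     # phosphor lens height
BLACK = (0.0, 0.0, 0.0)
STRIPES = [(0.0, 0.0, 1.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]  # top to bottom

def make_cell():
    """Build a single cell of horizontal phosphor lenses plus mask lines."""
    cell = []
    for y in range(CELL):
        row = []
        for x in range(CELL):
            if x >= LENS_W or y >= LENS_H * len(STRIPES):
                row.append(BLACK)                 # shadow mask pixel
            else:
                row.append(STRIPES[y // LENS_H])  # phosphor pixel
        cell.append(row)
    return cell

def tile_lut(cell, cols, rows):
    """Repeat the cell `cols` times across and `rows` times down."""
    h, w = len(cell), len(cell[0])
    return [[cell[y % h][x % w] for x in range(w * cols)]
            for y in range(h * rows)]

if __name__ == "__main__":
    lut = tile_lut(make_cell(), 256, 224)  # one cell per SNES pixel
    print(len(lut[0]), "x", len(lut))
```

Swapping the x and y roles in `make_cell` would give the vertical orientation instead.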

Update (3/18/2013): Turns out the vertical orientation is correct. I got a macro lens and took some pictures of a CRT up close (the letter 'e' followed by a heart icon from Zelda OoT):

I still think it looks better horizontally at small scales (i.e., 1080p and lower) but vertically at larger scales. Anyway, back to the original post.

With that LUT tiled up to cover the entire SNES image (you can download it here), we're ready to add in our image, scaled to 10x using nearest-neighbor, which looks like this (click thumbnails for full resolution):
Next, we need to invert the colors in the image, which looks like this:
We then need to subtract these pixel color values from our LUT matrix, which results in this:
We subtract because the phosphor LUT represents a full 100% brightness in all phosphors, which is blended by our eyes into flat white. White pixels in our image are inverted to black, which subtracts nothing from our LUT and leaves those parts white. A green pixel is inverted to magenta, which blanks out the red and blue phosphors, leaving just green, and so on. Likewise, brighter pixels are inverted to dim, and they subtract less brightness from our LUT phosphors, leaving them bright. At this point, we have our basic phosphor coloration and luminance values represented.
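The upscale, invert and subtract steps boil down to some very simple per-channel math. Here's a minimal sketch, assuming colors are stored as RGB tuples in the 0.0-1.0 range and that negative results clamp to zero (which is how a Photoshop-style subtract behaves, as far as I know). The function names are made up for illustration.

```python
def upscale_nearest(img, factor):
    """Nearest-neighbor upscale of a 2D list of RGB tuples."""
    return [[img[y // factor][x // factor]
             for x in range(len(img[0]) * factor)]
            for y in range(len(img) * factor)]

def invert(px):
    """Invert one RGB pixel."""
    return tuple(1.0 - c for c in px)

def subtract(lut_px, px):
    """Subtract px from the LUT pixel, clamping each channel at zero."""
    return tuple(max(l - c, 0.0) for l, c in zip(lut_px, px))

def apply_phosphors(img, lut):
    """Subtract the inverted image from a LUT of the same dimensions."""
    return [[subtract(lut[y][x], invert(img[y][x]))
             for x in range(len(img[0]))]
            for y in range(len(img))]

if __name__ == "__main__":
    green = (0.0, 1.0, 0.0)
    red_phosphor = (1.0, 0.0, 0.0)
    green_phosphor = (0.0, 1.0, 0.0)
    # A green source pixel leaves only the green phosphor lit.
    print(subtract(red_phosphor, invert(green)))
    print(subtract(green_phosphor, invert(green)))
```

Note that a half-brightness pixel leaves the matching phosphor at half brightness, which is where the luminance information comes from.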

Next, we'll make a new pass using the result of the previous pass as input (or in Photoshop, create a new layer), to which we'll apply a gaussian blur that will approximate the luminance-based color bleed into neighboring phosphors (no picture here, since it looks the same, just blurry; not very exciting). According to cgwg's comments in his existing CRT shader, this blur portion may be better suited to a 2-part process of blurring, once horizontally and once vertically, but I just did one all-over blur. You can use a blur intensity of 3 pixels to keep some sharpness or increase the value to 5 pixels to get a more bloomy, halated look. I'll show examples of both in a bit.
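For reference, here's a rough sketch of the 2-part version of the blur (one horizontal pass, then one vertical pass) on a single color channel. I'm treating the blur amount as the gaussian sigma and clamping at the image edges; both are assumptions, and Photoshop's "radius" setting may not map exactly onto sigma.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D gaussian weights covering about 3 sigma each side."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(channel, kernel):
    """Blur every row of a 2D list of floats, clamping at the edges."""
    radius = len(kernel) // 2
    width = len(channel[0])
    out = []
    for row in channel:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                sx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[sx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(channel):
    return [list(col) for col in zip(*channel)]

def gaussian_blur(channel, sigma):
    """Separable blur: horizontal pass, then vertical pass."""
    kernel = gaussian_kernel(sigma)
    horizontal = blur_rows(channel, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))
```

The two-pass approach gives the same result as a full 2D gaussian but needs far fewer samples per pixel, which matters a lot more in a shader than it does in Photoshop.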

Obviously, this image is still waaaay too dark, so we'll also need to crank up the brightness of our blurred layer to around 150% and combine it with the underlying raw image using the 'screen' method, which lightens everything a bit more. The math for a screen combine is f(a, b) = 1 - (1 - a)(1 - b), where 'a' is the base layer value and 'b' is the blend value.

That gives us this result:
At this point, it looks pretty good and we could probably call it quits, but the darkening from the shadow mask is still monkeying with our perceived colors a bit, so I played around with the levels to try and get something that looked closer to the original image. I found that moving the black point to around 23 and the white point to around 179 while keeping the gray point at 1.0 gives a good, bright result:
This adjustment also ensures that blacks are actually black and not brown and whites are white instead of gray. Here's a comparison shot:
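The levels adjustment can be sketched like so, assuming the usual input-levels mapping on 8-bit values: everything at or below the black point goes to 0, everything at or above the white point goes to 255, and the gray point acts as a gamma exponent on what's in between. I haven't checked this against Photoshop's exact internals, so take it as an approximation.

```python
def levels(value, black=23, white=179, gamma=1.0):
    """Map an 8-bit channel value through black/white/gray points."""
    t = (value - black) / float(white - black)
    t = min(max(t, 0.0), 1.0)           # clip outside the black/white points
    return int(round(255 * t ** (1.0 / gamma)))

if __name__ == "__main__":
    for v in (0, 23, 101, 179, 255):
        print(v, "->", levels(v))
```

With the gray point left at 1.0, this is just a linear stretch of the 23-179 range out to the full 0-255 range.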
While we're looking at this Super Metroid shot anyway, let's zoom in and take a better look at the luminance-based glow created by our blur:
You can see where the brighter pixels bleed into the adjacent phosphors, particularly on the glowing suit highlights.

Using a 5 px gaussian blur bleeds the colors even more but ends up making the entire picture more hazy (3 px blur followed by 5 px blur):

You can see from the thumbnails that the 5 px picture looks nice from a distance, while the 3 px looks a lot better up close. So, you could make a valid argument for either one depending on your preferences and intended viewing distance.

The above process could work alright at a 5x scale, using a bilinear reduction on the LUT and a 1.5 px blur, and not look too bad:
But dropping it to 4x starts to really cause some problems with the colors, giving the appearance of too much gamma: 
At these scales and lower, cgwg's CRT shader looks much, much better, due to his compensation for LCD subpixel aliasing (among other things, such as gamma interpolation, screen curvature, etc.).

The last thing I want to cover is the phosphor orientation. I oriented my LUT with red at the bottom and blue at the top because doing it the opposite way looks weird up close (notice how the red pixels on the shell create a sort of halo instead of having the intended cartoony black outline, as compared with the pictures above):
Likewise, using vertical phosphors--as shown in all of the phosphor images I could find online--looks like crap close up *and* far off (ignore the cutoff edge; click to embiggen):

At the moment, I've just been playing around with Photoshop, but I hope to do some work on a legit shader that performs this process in real time. Unfortunately, I suck ass at writing GLSL, so it may never get anywhere. If anyone else would like to take a stab at it, I'm happy to provide any information I have.

Update: For the Photoshop steps, Romain Dura has helpfully converted the math functions to GLSL, and the gaussian blur steps from cgwg's halated CRT shader can drop right in.

Update 2: I cobbled together some code that works at least partially and I'm pleased with the outcome so far. I don't have a high dpi screen, so I'm stuck working at 4x, but I think the code should work fine with a larger resolution.

Currently, I can get it to the point where it has the shadow mask and phosphor lenses, as well as gaussian blur (though the blur is happening underneath the phosphors, which isn't exactly what I wanted):
This is obviously much too dark, but when I tried to lighten it up, I couldn't get things looking quite right. It would either blast out the colors underneath or make the phosphors visible on a flat black screen, which is unacceptable. When I tried adding a bloom pass, the phosphor bleed looked good (if a bit exaggerated) and the brightness was pretty close but the colors got totally wrecked >.<
Oh well... I'll keep at it. If anyone wants to play around with the code I've got so far and/or play around with my phosphor LUT, you can get it here.

Update 3: As cgwg pointed out in the comments, mapping the phosphors 1:1 on the SNES pixels, like I did above, isn't really how TVs worked. They were designed to receive a 480i signal (i.e., 2 sets of 240 lines of resolution, alternating which lines get displayed for an effective 30 fps), as per the NTSC broadcast spec, which Nintendo and others tweaked to show just 240(ish) lines, non-alternating, at 60 fps. This format goes by many names, including non-interlaced mode, 240p (not a real spec, btw) and "double-strike."

I did some more mockups using the same procedure as above, but using a phosphor array that's ~480 pixels tall (I actually went with 448; i.e., double the SNES res) and started with an image that should match the "double-strike" idea at 10x, namely, a 20x image with half of the pixels blanked out, like this (note, it looks exactly like if you added a 100% black scanline filter; if this is somehow incorrect, I'd love to hear about it in the comments):
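Here's a rough sketch of how that starting image could be built: each source line is stretched to 20 rows, with the first half carrying the picture and the second half blanked. Which half gets blanked is an arbitrary choice on my part, and the function name is made up for illustration.

```python
def double_strike(img, factor=20, black=(0.0, 0.0, 0.0)):
    """Upscale by `factor`, blanking the lower half of each source line."""
    lit_rows = factor // 2
    out = []
    for row in img:
        # Stretch the line horizontally with nearest-neighbor.
        wide = [px for px in row for _ in range(factor)]
        for i in range(factor):
            if i < lit_rows:
                out.append(list(wide))            # drawn scanline
            else:
                out.append([black] * len(wide))   # undrawn gap
    return out

if __name__ == "__main__":
    red = (1.0, 0.0, 0.0)
    frame = double_strike([[red], [red]])
    print(len(frame[0]), "x", len(frame))
```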
Since the vertical resolution doubled, I also doubled the pixel radius of the gaussian blur, up to 6 px and, instead of adjusting the levels, I increased the brightness by 300% and the contrast by 25%, which resulted in this (crap: blogspot won't display that large of a file and insists on shrinking it and converting to jpg; here's a link to download the full-res if anyone wants it). And here it is shrunken bilinearly back down to 10x: 
 It still looks pretty nice. And here's a detail shot:
Interestingly, at this scale, the vertically oriented phosphors look a lot better and produce an image that's much less scanliney, so perhaps it is the better option:

Here's a link to download that vertically oriented shadow mask / phosphor LUT, if anyone's interested in playing around with it.

UPDATE (3/18/2013): After digging around in an old thread on byuu's forum, I came across a post from Themaister where he explained how to go about using and accessing various framebuffers in shaders, which allowed me to finish this one up, for the most part (at least to the point where I don't have to keep poking away at it). It's a real beast of a shader and doesn't run fullspeed on the Intel HD4000 I've been testing with, but it looks pretty good in slow motion :P

You can download the shader and a couple of LUTs here. (same as the link at the top of the post)

I included 240p and 480p LUTs with both horizontal and vertical phosphor orientations. The 240phoriz LUT will give you essentially the image I was going for at the start of this post:
Notice, these shots look a bit dark, due to the way the LUT gets shrunken on my 1080p/4x scale screen. In the shader, there's a line you can uncomment when you're on a low-resolution screen like this to make the colors a bit brighter, which gives you something like this:
For anyone on a larger screen (or if you'd like to play around with the higher-resolution LUTs), I also included 480p variants, the vertical version of which looks like this:
Again, these shots would be brighter with the low-res mode enabled.

Last, just for fun, I made some screenshots of Super Metroid using the low-res option with the 240phoriz LUT, just like the Photoshop renders I made above, and here's how it looks:
Not bad! :D

One thing that still needs work is that the gaussian blur is only happening vertically, I think, but I'm not really sure why. I'm assuming I didn't put the horizontal pass into the framebuffer properly, so I'll keep fiddling with that. Also, certain emulation cores (such as snes9x-next) don't really play nicely with it, presumably due to horizontal resolution issues. Genesis Plus GX seems to work fine, at least:
Update (3/29/2013): I made a couple of small changes to the shader and a few different variations. I added in a new variable called 'brightness' (toward the end of the file) that you can raise/lower to increase/decrease the brightness of the image if it's too dark (within a range). Here's a pastebin of the updated default shader (shown with 240pvert LUT):
And here's one with scanlines added (sry, no screenshot; I'll try to add one soon):

GPDP made some new LUTs, too, which is awesome. The first one mimics an aperture grill (click to download the LUT; shown here with scanlines added):
This one will look best at 9x scale instead of 10x.

The second LUT is actually pretty good at smaller scales, such as 4x, and takes the same idea as cgwg's CRT shader in that it tints alternating pixels green or magenta (click to download the LUT).
I also modified cgwg's CRT shader to use the phosphor LUTs instead of the built-in pink/green pixel tint (note: screen transformation doesn't work right in this version). It looks like this (click to download; shown with 240phoriz LUT):
Just as a warning: there's no reason to use this one instead of the regular CRT shader at less-than-huge scales. It looks significantly worse than the normal version. TBH, that's true for all of the shaders in this post...

And here's one with aliaspider's GTU shader replacing the gaussian passes (click to download; shown with 480pvert LUT):

Update (5/9/2013): Here's how the updated Cg port looks. I now consider this the best version and recommend users upgrade to it if they've been using the older GLSL version:

These shots were all taken at 4x, which would never have been possible with the GLSL version (without it looking like total garbage). It also handles pseudo-hires transparency decently well (see: Kirby shot), though you'll notice there's quite a bit of color mangling still...

You can download this shader here.


jelbo said...

Man, very, very interesting and I think a great contribution to CRT emulation :)

I hope this will be made into a real shader sometimes, so it could be incorporated into an injector like SweetFX. Boulotaur2024 ported cgwg's CRT shader to HLSL so you can apply that to any DirectX game. Wonderful :)

Hunter K. said...

Thanks man :)

I had no idea about the SweetFX port. That's awesome and I will definitely be using it!

Anonymous said...

You are amazing!

cgwg said...

Getting the right phosphor/shadow mask look is important, but this isn't what causes scanlines to appear; on monochrome CRTs (which don't have a shadow mask) the scanline appearance can be even more prominent.

Hunter K. said...

Hey man :)

The scanlines come from the "240p"/non-interlaced/double-strike format, right? Where only half of the lines are drawn so they can fit the full 60fps into the NTSC/480i bandwidth?

After I wrote this post, I did some more reading up on that, which led me to try starting with an image that already had scanlines applied. The output looked weird, though, so I assumed I was doing it wrong and didn't add any of it to the post.

cgwg said...

Well, any raster scan CRT will have scanlines due to the nature of how it works. I think they're more prominent when displaying a 240p game on a television because the TV would typically be designed to minimize the visibility of scanlines when displaying 480i content.

One thing that I should emphasize about the phosphor patterns is that I think they're generally independent of the scanlines, i.e. they won't line up and their vertical count won't equal that of the scanlines (this is obvious on a multiscan display like a computer CRT, since the number of scanlines can change significantly).

Unknown said...

Needs more 4:3...

Anonymous said...

I found your blog a few months ago before this extensive work you've done on this phosphor shader. Since then I've been trying to find a good shader that has this effect and found your blog yet again. This is amazing work you're doing for retro gaming. I'd like to thank you for all your hard work.

squall_83 said...

Sorry, I'm having a bit of trouble with this. I can't seem to figure out how to get this to work with BSNES. I have put the files in the shader folder, but it still doesn't show up in the emulator as a selection in the shader menu. I have no idea how to get this working. Is this file intended to work with BSNES to begin with? Also, is there any way this might be made into HLSL to be compatible with DX games and PCSX2 in particular?

Hunter K. said...

Aw, thanks dude. I really appreciate the kind words. I've got some (relatively minor) updates that I'll try to post up this weekend.

It won't work in bsnes/higan right now, but byuu's working on getting multipass shader support, so maybe soon. In the meantime, it works well with RetroArch's libretro-bsnes.

There's no reason it couldn't be converted to HLSL, so long as the emulator supports lookup textures, multipass shaders and arbitrary access to FBOs. I haven't looked into PCSX2's shader specifications, so I don't know if that stuff is available or not.

squall_83 said...

I've been using this for a while now and I gotta say, it really is amazing. The only thing I can say that's negative is it seems a bit bright. I've tried adjusting the gamma and brightness in the shader file and it just seems a bit washed out. It could be my display, I'm not totally sure. Otherwise, I'm super impressed with this. Stupendous work, bro.

Hunter K. said...

Thanks! I'm glad you like it :)

Which LUT are you using? I've had good luck just lowering the brightness on the LUT directly in Photoshop if it's too bright/bloomy (this is definitely a problem with the crtgeom LUT; I believe the shot from Super Mario All Stars uses a darkened version, in fact)

squall_83 said...

So I did some adjustments with the LUT. I loaded it up in Photoshop and did a color select on the individual RGB sections, feathered my selection and did a fill on that color to give it a nice "ghosty blend". Then I, very slightly, reduced the brightness of the image. I'm getting a nice effect now. Very contrasty and it just "pops" nicely. No more washed out color.

Anonymous said...

There are more and more shaders which emulate CRT TV displays.

But what about CRT computer monitors? Maybe you could explore this topic Hunter? Your articles are very interesting to read. :)

Anonymous said...

I made a try at emulating shaders with similar results to the latest shader posted, I did use gimp instead of photoshop though:
Set up pixel scanline grid image to a set size(9 by 9 px per cell in my case), blur by 1 px for viewing pleasure, base image done.
Scale(9x in my case), Invert and subtract rendered image like stated earlier.
Duplicate image and blur first image by 3px.
Blur second image by something like 9 by 9 px (taller blur looks good too).
Mess with the intensity of the second image (squared intensity and multiply by whatever feels good).
Add the second layer to the first using additive blending.
Scale to a suitable size.

That's how I emulated the result, might post an image tomorrow when it's not 2 am.

Hunter K. said...

Sounds great :D I look forward to checking it out!

mAIEaIE said...

Ok here it is, I changed the formula a little to use 4x scaling on base image and 4 sub pixels for every pixel. Still made by hand though.

I used my simple 9px pixel and made a 6x6 grid with that (every second line displaced), I then scaled that down to match the number of pixels I was going to use using interpolation (two pixels per pixel in both width and height). Then I cut the centre 2 by 2 "pixels" and pasted them with the fill tool, filling the entire screen (would have been way easier with uv coordinates and repeating textures and mipmaps).

I then did the steps as above, but with some tweaked settings, 1px blur on the bottom layer (close enough to 4px/2cells/3subPx), the top layer blurred 4 by 8 px (to make bleeding more visible). Then I messed with the intensity a bunch, just that even gave a "nice" effect like recording the tv with a video camera, the second set of the last steps was done with interpolation of the base layer (game). This gave a slightly more realistic look.

I might even incorporate this into my game-engine at some point too :P But that will gimp performance drastically, all 2kFps I will miss, even though it currently runs at about 22kFps...

Hunter K. said...

That looks great. Surely 2k fps from 22k is an acceptable drop for such a nice effect :P

mAIEaIE said...

Yes, but it will probably be less in the end :P and I also need to figure out how to actually draw it, but that will come later... But then again, 6 total render passes after the initial might not be the biggest drawback computer graphics has seen so far too :P

mAIEaIE said...

Found a way to amp the performance a bit, instead of inverting one of the pictures and then subtracting you can use a darken only filter, "min(F, B)" in shader terms. I still have to make my engine work, all ~500 lines of code I rewrote still don't want to work with more than two render passes (initial render + display render). Fun stuff...

mAIEaIE said...

I did some math, the average framerate after all filters are applied will probably be less than 2000 fps, and that's without any physics.

After upscaling the game from 256 by 144 to 1024 by 576 the game dropped from 22k to 10k fps. I have now been able to put a couple of the filters on the game and I'm at about 4800fps with the first gaussian blur applied, it's a two stage blur to improve performance, the second will theoretically drop the fps another 1800 fps bringing it to 2000 fps, then I need to apply the final pass that adds the blurred layers together, will probably take about 250 fps, which brings me to a final of around 1.75 thousand frames per second with no physics...

A regular render pass without any fancy effects takes around 200fps to render, but given how fps works a single draw call will reduce more and more the closer to zero one comes, so it might not be as bad as I think.

mAIEaIE said...

Also using my trusty school calculator I calculated I would get about 20 (+-10) render passes before/after effects and still reach 60fps, so I'm fine as far as rendering goes, and to 30 fps that would be about an extra 5 draws.

Hunter K. said...

Nice :D It sounds like you've got some wiggle room in there.

mAIEaIE said...

As long as I don't put TOO much physics in there :P But then again, if worst comes to worst I just have to divide the physics into several threads, and I will probably have asynchronous file loading sorted out as well by the end...

Also, I was going to use a 5 point gaussian blur filter for a single horizontal pass, not happening :P The scan lines would never disappear and doubling the passes would just make performance ridiculously bad for what it was :P so I changed the filter to a more traditional 9 point blur, then the scanlines disappeared, and doing it twice (emulating upper + lower layer blur) got me to 2000-2100 fps :D so my calculations weren't too far off :P

Now I have to sleep again, but after that I will continue, and probably design a class for all the create/use/destroy functions I'm going to need if I'm not gonna go insane :P

mAIEaIE said...

Ok, everything except those pesky functions are now written, will post some images of the filter running in full 60 fps when I wake up in 10 hours or so. It even has a crude way of tuning the overlay settings using three passed floats to the final mixing fragment shader, some serious tweaking is still needed before I'm happy with the result.

Hunter K. said...

Oh man, I'm excited to see it :D

mAIEaIE said...

I tested the program on the laptop our school provided, I got about 110 fps out of it with full effects (no vsync), the effects are still pretty bad on the intel hd graphics 4400, my gtx 970 pushes about a stable 1700 fps when I disable vsync, so I have to fabricate a way to turn off effects :P I think it will involve more efficient FBO bouncing too.

I think I will also use the alpha channel as an offset to fetch pixels that are on uneven rows too, shouldn't be too hard to implement.

Anyways, I named the files in the order I took the screenshots:

00: base render.
01: pc-crt small phosphors
02: pc-crt medium phosphors.
03: pc-crt big phosphors.
04: tv-crt small phosphors.
05: tv-crt medium phosphors.
06: tv-crt big phosphors.

The smallest phosphor sizes seem to screw with the coloration of the pixels for some reason, offsetting the phosphor textures changes the color, but all the images are actually rendered on the fly so I can probably amp the performance quite the bit in skipping the phosphors being re rendered every frame...

Hunter K. said...

Wow, that's looking really great. I also had issues with everything going magenta when the phosphor scale gets too small. :( Your pc-med and pc-big shots look especially fantastic. tv-med looks great, too.

mAIEaIE said...

I figured it was because of the linear approximation grabbing most color from the red pixels seen as doing it in gimp/photoshop didn't give the same discoloration at smaller scales, it might become better if I increase the preciseness of the floats, right now I'm doing most everything in the lowest possible precision floats

mAIEaIE said...

I have now been able to reduce the code about 40 lines so far.

I realized setting the already upscaled texture to nearest interpolation was a bad idea, setting the interpolation to linear approximation gave me the clear blurs I got in GIMP, I will also change my blur shaders so I only have to call them twice per blur instead of four times, even if that means amping up the sampling to a 17 point blur (simulated 17, only uses 13), that will probably also amp the performance even more and create less overhead.

mAIEaIE said...

Rewrote the blur shaders completely, got 2200 stable fps with vsync off and I can turn off blur for higher performance. Four pass blur gave better blurring at larger blur amounts, so I'll give an option to blur twice if needed. I also shrunk the code to around 350 lines, there are still many functions for ease of use to be written still.

Hunter K. said...

Dang, that sounds like some serious progress. Is your finished version going to be available somewhere? I'd like to steal some of your improvements :)

mAIEaIE said...

Yeah, I'm an open source guy myself, so I'll probably post the initial engine/tools I create on the way online :P If you're just talking about shaders I can post those with some comments so you see how I went about doing it, because the shaders are basically the only thing that's written with proper coding :D the code shortening things I do is just batching regularly occurring creation/destroying commands (usually around 5 individual commands) into super-functions (involves vector arrays) and put the super-functions inside appropriate namespaces so I can access buffers without a ready function. So do you know about fbo creation? You will if you are going to use my exploiting shaders :P

mAIEaIE said...

Ok, now I'm basically done with the shaders (and their solid handles). I feel the shaders don't need more work at all, I'll put together a sample code using the shaders the "normal" way which I'm doing, to prevent banding the blur shouldn't be larger than 8 pixels in any direction, that is my blur shaders' hard coding, the final resolution is whatever the screen is configured at launch too.

mAIEaIE said...

Ok, here's a demo of what I've done so far, don't remember if there's a README in this one but press V to disable Vsync and G for a surprise ;)
Look in the shaders folder to get a glimpse of what is going on, the texture2DArray could be a normal texture2D, but if I did that I wouldn't be able to create efficient tilemaps (to avoid artifacting from nearby texels on an atlas), also because my loading of images doesn't support an automatic 2d mode yet...

mAIEaIE said...

Merged this:
and this:
with my subtraction shader, works almost perfectly, running the extra shader code + increasing every fbo (except initial render) to fullscreen size (1080p) drastically reduced my fps to around 700 :P All I need to do is make the curvature compensate for the vertical offset it adds to the textures relative to the amount of lens distortion present... and finish off the vbo setup and draw command functions I never get around to programming.

mAIEaIE said...

I'm truly sorry for not posting an example for how to do the phosphor shader I did in opengl, I kinda sorta became an indie dev and forgot, I remembered when I went out searching for a gameboy shader for reference to also do in opengl/c++, which brought me back here.
So I'll make tutorial code tomorrow maybe and send it to you, depending on if I have some time for it.

Hunter K. said...

Sure, no worries :) I assume you found harlequin's Game Boy shader?

Eric DeSantis said...

"Should have... sent a poet..."
