So, you’ve been following some WebGL tutorials online, you’re playing around with shaders, and it’s time to incorporate them into your gulp build. (If you aren’t, and are interested, I highly recommend WebGL Fundamentals.)
A common technique (one the author of WebGL Fundamentals mentions) is inserting your shaders into your HTML source in non-JavaScript <script> tags, and loading them at runtime.
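As a rough sketch of that technique (the original listing isn't reproduced here, so the element ids, the type values, and the helper name getShaderSource are all assumptions for illustration), reading a shader back out of the page might look something like this:

```javascript
// Hypothetical markup (ids and type values are illustrative only):
//   <script id="vertex-shader" type="x-shader/x-vertex"> ...GLSL source... </script>
//   <script id="fragment-shader" type="x-shader/x-fragment"> ...GLSL source... </script>
// Browsers don't execute script tags whose type isn't a JavaScript type,
// so the GLSL simply sits in the DOM as text.

// `doc` is passed in (normally the global `document`) so the helper is easy to test.
function getShaderSource(doc, id) {
    const element = doc.getElementById(id);
    if (!element) {
        throw new Error(`No shader script tag with id "${id}"`);
    }
    // HTMLScriptElement exposes its contents through the `text` property.
    return element.text.trim();
}
```

In a browser you would call it as getShaderSource(document, 'vertex-shader') and hand the result to gl.shaderSource.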
This works, but as your shader complexity (and total number of shaders) goes up, it can be frustrating to keep staring at that HTML. We’d like to split that stuff into separate files, and wouldn’t it be nice to have dedicated syntax highlighting? (If you’re a VSCode user, check out the Shader languages extension!)
With gulp, we can do better. We’re going to use the same technique I covered in Optimizing for Space Part 2, but this time, for our shaders.
Inserting shaders at build time
The first thing to do is take all of your shaders out of the HTML file, and place them in separate files. What I recommend is naming your vertex shaders *.vert and your fragment shaders *.frag, and then also placing them in separate subfolders.
Here’s an example:
(We’ll see why the subfolders are useful in a few seconds.)
Add the gulp-glsl plugin to your build, and let’s add a task that will slurp up all these shader files. The plugin supports several output formats, but we’re going to use the json format.
What this gulp task does is take all these individual shader files and produce a single file, temp/ShaderData.json, that looks like this:
The gulp-glsl plugin will already do some very basic minification for us (mostly eliminating whitespace). As you can see, this is where those folders come in: by default, the plugin will use the folder structure of the input files to build a “tree” in the JSON output. If you were to load this JSON blob, and you wanted the fancy shaders, you could get them by grabbing .frag.fancy and .vert.fancy.
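To make the folder-to-tree idea concrete, here is a small standalone sketch. It does not call gulp-glsl or reproduce its implementation; it only mimics the behavior described above. The function name buildShaderTree, the file paths, and the shader bodies are all made up for illustration.

```javascript
// Builds a nested object from shader paths, so that "vert/fancy.vert"
// ends up at tree.vert.fancy. This only illustrates the shape of the output.
function buildShaderTree(files) {
    const tree = {};
    for (const [relativePath, source] of Object.entries(files)) {
        const folders = relativePath.split('/');
        const fileName = folders.pop();
        // Drop the extension: "fancy.vert" becomes the key "fancy".
        const leaf = fileName.replace(/\.[^.]+$/, '');
        let node = tree;
        for (const folder of folders) {
            if (!node[folder]) node[folder] = {};
            node = node[folder];
        }
        node[leaf] = source;
    }
    return tree;
}

// Hypothetical input: paths relative to the shader folder, already minified.
const shaderTree = buildShaderTree({
    'vert/fancy.vert': 'void main(){gl_Position=vec4(0.);}',
    'frag/fancy.frag': 'void main(){gl_FragColor=vec4(1.);}'
});
```

With that input, shaderTree.vert.fancy and shaderTree.frag.fancy hold the two shader strings, which is the lookup pattern described above.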
In fact, let’s turn this JSON file into an actual JavaScript source file and include it in our main js task.
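The conversion itself can be tiny. The sketch below is one possible way to do it, not necessarily how the build in this article does it: the helper name jsonToShaderModule is hypothetical, and the ShaderData variable name is simply borrowed from the file name.

```javascript
// Wraps a JSON string in a variable declaration so it can be concatenated
// with the rest of the JavaScript source before minification.
function jsonToShaderModule(json, globalName = 'ShaderData') {
    // Round-trip through JSON.parse so malformed input fails the build early.
    const data = JSON.parse(json);
    return `const ${globalName} = ${JSON.stringify(data)};\n`;
}
```

Because the result is ordinary source text, it can be placed ahead of the game code in the concatenated bundle, and the shader strings then travel through the same minification step as everything else.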
Our strategy here is a little different than last time: before, we were processing CSS and JavaScript, and inserting it into the HTML. Now we’re processing shader files, turning them into an imaginary source file called ShaderData.js, which is then packed into the rest of our source. At run time, we can now load our shaders, like so:
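The original listing isn't reproduced here, so what follows is only a sketch of what runtime loading could look like with the standard WebGL calls. The helper names compileShader and createProgramFromShaderData are assumptions, as is the vert/frag layout of the shader data object.

```javascript
// Compiles one shader stage and surfaces the driver's error log on failure.
function compileShader(gl, type, source) {
    const shader = gl.createShader(type);
    gl.shaderSource(shader, source);
    gl.compileShader(shader);
    if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
        throw new Error(gl.getShaderInfoLog(shader));
    }
    return shader;
}

// Looks up the named vertex/fragment pair in the shader data and links them.
function createProgramFromShaderData(gl, shaderData, name) {
    const program = gl.createProgram();
    gl.attachShader(program, compileShader(gl, gl.VERTEX_SHADER, shaderData.vert[name]));
    gl.attachShader(program, compileShader(gl, gl.FRAGMENT_SHADER, shaderData.frag[name]));
    gl.linkProgram(program);
    if (!gl.getProgramParameter(program, gl.LINK_STATUS)) {
        throw new Error(gl.getProgramInfoLog(program));
    }
    return program;
}
```

In the game itself this would be called as createProgramFromShaderData(gl, ShaderData, 'fancy'), with no DOM lookups involved.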
Make sense? At this point, we’re finished… unless we’re building a #js13k game. Hold on to your hats!
Mangling shader variables
If we want to shave off every byte possible, we’re going to want to mangle those shader variable names. Nobody wants variables like u_worldTransformReverseMatrix running around. This is not like normal mangling, though: shader programs are just strings, not something terser can handle for us. We can use a dedicated GLSL minifier, like glsl-minifier, but that tends to focus on internal variables, not the attribute and uniform inputs our game uses.
What we really want is a custom mangling step. We want to identify all references in our program to the “magic vars” (variables starting with u_, v_, and a_), and mangle them. We need to do this globally, so we can catch references inside the shader programs and in strings in our JavaScript, such as calls to getUniformLocation, etc., in the WebGL API. We also want to do this before we run terser, which might accidentally pre-mangle one of our magic vars so we can’t recognize it.
I’m using the convention that WebGL Fundamentals uses for my shader vars, which is what enables this particular approach. If you don’t like using u_/a_/v_, you can use a different naming strategy, but it has to be one that makes the variables easily distinguishable from all other text in your application.
So, we want to insert a new custom step right here:
You can see I’ve already required the imaginary file mangleShaderVars.js, so let’s go ahead and create it.
Now, to keep our gulpfile nice and clean, I’m treating this new module like a gulp plugin. Creating your own gulp plugin is interesting, but outside the scope of this article, and all we really want to do is replace the content of a single file (and we don’t need to worry about source maps, because we’re going to edit existing lines without adding or removing any). So we’re going to wrap a standard function with the gulp-modify-file plugin, to keep things simple.
Here’s our mangleShaderVars.js.
Now that the gulp plumbing is taken care of, we can focus on our code. There are many ways you can tackle this, but let’s go with a basic regex approach. We’ll repeatedly scan the source code for matching variables, replacing them as we find them. Each time we replace a variable, we’ll add it to a table of previous matches (so we can use the same replacement if we see it again).
Here’s the whole function, with comments:
It’s important that we not match partials (a variable like alpha_var would be seen as a_var), so we actually capture two groups: the first group will be some non-alphanumeric character (most likely syntax, like a quote, an operator, a space, etc.), and the second group is the matched variable name. We use substrings to insert the new mangled name into the source code, taking care to keep that errant character we captured at the beginning. And last, we fudge the lastIndex of our running regular expression.
If you aren’t that familiar with regular expressions, note that we’re using the regex.exec(string) approach and not the common string.match(regex) approach. The advantage is that exec (with a global regex) automatically keeps track of the last index matched, and continues marching down the string looking for new matches, which makes it ideal for string scanning. In our case, because we are modifying the string, we need to be careful to subtract the difference between the length of our old name and our new name. For example, if the old name was v_normal (8 characters), and our new name is v_a (3 characters), we need to subtract 5 from the lastIndex of our regular expression. Otherwise it will “jump ahead” 5 characters into the modified string, and possibly skip one of the variables we want to replace.
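Putting the last few paragraphs together, a minimal sketch of the scanning loop might look like the following. This is a reconstruction from the description above rather than the original listing: the exact regex, the return shape, and the placeholder defaultName generator (prefix plus one letter) are assumptions.

```javascript
// Placeholder name generator: keeps the two-character prefix and appends
// a, b, c... based on how many names have been handed out so far.
function defaultName(oldName, index) {
    return oldName.slice(0, 2) + String.fromCharCode(97 + index);
}

function mangleShaderVars(source, nextName = defaultName) {
    // Group 1: one non-identifier character (or the start of the input), so
    // that "alpha_var" is never mistaken for "a_var".
    // Group 2: the shader variable itself.
    const regex = /(^|[^a-zA-Z0-9_])([uva]_[a-zA-Z0-9_]+)/g;
    const table = new Map(); // old name -> mangled name
    let match;
    while ((match = regex.exec(source)) !== null) {
        const lead = match[1];
        const oldName = match[2];
        if (!table.has(oldName)) {
            table.set(oldName, nextName(oldName, table.size));
        }
        const newName = table.get(oldName);
        // Keep the leading character and swap only the variable name.
        const start = match.index + lead.length;
        source = source.slice(0, start) + newName + source.slice(start + oldName.length);
        // The string just changed length, so pull lastIndex back by the difference.
        regex.lastIndex -= oldName.length - newName.length;
    }
    return { source, table };
}
```

One thing this sketch does not guard against is a mangled name colliding with an original name that is already that short (a shader that really does declare u_a, say), so it assumes descriptive names going in.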
Alright, we’re close. Now we just need to define what our “mangled names” look like. Here’s a simple approach:
Here, we’re intentionally keeping the prefix (u_, a_, or v_), and then selecting a single lowercase letter (a, b, c, and so on). This’ll work fine as long as your program has no more than 26 variables; if your shader is large, you’d want to support 2-character names (aa, ab, etc.). A good example of a very robust mangler is terser’s implementation, but for now, I’ll stick with our simple mangle function.
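If you do outgrow 26 variables, the two-character extension is only a few lines. The sketch below is one way to do it (the createNameGenerator name and the closure-based counter are my own choices here, not something from terser): it counts a through z, then aa, ab, and so on, while still keeping the prefix.

```javascript
// Returns a function that hands out a, b, ... z, aa, ab, ... in order,
// prefixed with the first two characters of the original name (u_, a_, v_).
function createNameGenerator() {
    let counter = 0;
    return function nextName(oldName) {
        let n = counter++;
        let suffix = '';
        // Bijective base-26: after "z" comes "aa", not "ba".
        do {
            suffix = String.fromCharCode(97 + (n % 26)) + suffix;
            n = Math.floor(n / 26) - 1;
        } while (n >= 0);
        return oldName.slice(0, 2) + suffix;
    };
}
```

The counter is shared across all three prefixes, so every mangled suffix is unique across the whole program, not just within one prefix.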
I’ll go ahead and build this, on a small sample 3D project:
And, sure enough, in the output app.js, all string references to u_worldViewProjection have been replaced with u_g.
Respecting mangled variables in terser
Right now, we have a working shader program mangler. However, we do have a problem looming on the horizon - this only works as long as all of our references to shader variables are inside strings. As soon as we start doing anything fancy (for example, defining helper methods or getter/setter methods to represent our uniforms), we are going to have issues with terser.
This is actually an issue anytime you mix strings and property names and use an aggressive property mangler. For example, leaving shaders behind for a second, imagine this snippet of a keyboard input handler:
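The original snippet isn't reproduced here, so here is a hypothetical handler in the same spirit (the keys object and both function names are invented for illustration):

```javascript
// Hypothetical input state, keyed by the browser's KeyboardEvent.key values.
const keys = { ArrowUp: false, ArrowDown: false };

function onKey(event, isDown) {
    // String access: the name arrives from the browser at runtime,
    // so no minifier can ever rename it.
    keys[event.key] = isDown;
}

function isMovingUp() {
    // Property access: an aggressive property mangler is free to rename this
    // to something like `keys.a`, which would then never be set by onKey.
    return keys.ArrowUp;
}
```

Unmangled, the two access styles agree. Once property mangling renames one side and not the other, isMovingUp quietly returns false forever.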
OK, so mixing strings and property names would get messy… But, why would that happen? Is there any reason we’d be using the name of a shader variable as a property or method name? Usually, it’s when you start wrapping up annoying pieces of WebGL into a nice program handler. Imagine some wrapper code like the following:
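As a stand-in for the original listing, here is a hypothetical wrapper of that kind. The ShaderProgram class and its method names are invented for this sketch; only getUniformLocation and uniformMatrix4fv are real WebGL calls.

```javascript
// A tiny program wrapper that looks up each uniform once and stores the
// location on itself under the uniform's own name.
class ShaderProgram {
    constructor(gl, program, uniformNames) {
        this.gl = gl;
        this.program = program;
        for (const name of uniformNames) {
            // The name is used as a string here (for WebGL) and also becomes
            // a property name on this object.
            this[name] = gl.getUniformLocation(program, name);
        }
    }

    setMatrix(location, matrix) {
        this.gl.uniformMatrix4fv(location, false, matrix);
    }
}

// Elsewhere in the game, the same name shows up as a plain property:
//   shader.setMatrix(shader.u_worldViewProjection, matrix);
```

The string passed to the constructor and the dotted property access have to stay in sync, which is exactly what two independent manglers won't guarantee.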
In this example we only looked at uniforms, but you can imagine similar code that wraps up the process of obtaining locations, setting values, etc., for our buffers as well. However, the code above won’t work if our custom shader mangler turns u_worldViewProjectionLocation throughout the source file into u_a, but then terser turns the property u_a into ab. So, we need to be able to tell terser not to mess with our pre-mangled values.
Terser supports this functionality out of the box (with the reserved option). All we need to do when we turn on property mangling is pass it in. Chances are we’re already reserving some property names for other reasons, so this might look something like this:
Here, we’ve added an array of property names to reserve, and are passing a reference to that array to both our custom mangler and to terser’s options.
We will take our initial implementation of mangleShaderVars from before (sans most of the comments), with a couple new tweaks.
We’ve wired up a new parameter (the array of reserved words), and we append all of our mangled variable names to the array once we’re done with our own logic. Now, anywhere in our code that we use one of our output names as a property, it will be left alone by terser.
We’re cheating a little here, and taking advantage of the fact that terser accepts its options object at initialization time, but does not parse those options until it begins minifying. That’s why we can get away with appending a bunch of new values to the array we passed to terser, in the step right before tersing.
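The shared-array trick is easy to get subtly wrong, so here is a small sketch of just that part. The reserved entries, the reserveMangledNames helper, and the example table are hypothetical; the mangle.properties.reserved option itself is terser's.

```javascript
// One array, created once, referenced from terser's options.
const reserved = ['ArrowUp', 'ArrowDown']; // hypothetical names reserved for other reasons
const terserOptions = {
    mangle: {
        properties: { reserved }
    }
};

// Called at the end of our own mangling step, with the old -> new name table.
function reserveMangledNames(table, reservedNames) {
    for (const newName of table.values()) {
        if (!reservedNames.includes(newName)) {
            // Mutate in place: terser holds a reference to this exact array.
            reservedNames.push(newName);
        }
    }
}

// Example: pretend our mangler produced these two names.
reserveMangledNames(new Map([
    ['u_worldViewProjection', 'u_a'],
    ['a_position', 'a_b']
]), reserved);
```

The important detail is push rather than reassignment: something like reserved = reserved.concat(...) would create a new array that the options object never sees.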
Testing our theory
Something really important when doing space optimizations (especially for js13k) is testing whether they even work. It’s really easy to make obvious “improvements” to your source code, only to discover that the original was actually better once you take tersing and optimized zipping into account.
For a test case, I’ll use a simple little 3D demo I’ve been working on (following along with the WebGL Fundamentals link at the top of this blog post). For this particular demo, here’s the debugging output from our mangleShaderVars method:
I’ve added comments showing the expected space savings for each variable. We would expect a fully optimized zip file to store every variable only once in its string dictionary (potentially even less, for variables that share a common prefix). Based on the numbers above, in theory, our maximum potential savings for adding our custom mangling should be an ~89 byte difference.
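The per-variable arithmetic is simple enough to automate. The helper below is a sketch (theoreticalSavings is a made-up name, and the two-entry table is only an example, not the demo's real output): it assumes each name is stored exactly once, so the saving per variable is the old length minus the new length.

```javascript
// Sums (old length - new length) over the old -> new name table.
function theoreticalSavings(table) {
    let total = 0;
    for (const [oldName, newName] of table) {
        total += oldName.length - newName.length;
    }
    return total;
}

// Two names used as examples earlier in this article:
// u_worldViewProjection (21) -> u_g (3) saves 18, v_normal (8) -> v_a (3) saves 5.
const exampleSavings = theoreticalSavings(new Map([
    ['u_worldViewProjection', 'u_g'],
    ['v_normal', 'v_a']
])); // 23
```

This is an upper bound on what the zip step can gain, not a prediction.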
Let’s test, using advzip -z -4 -i 1000 for optimization:
According to these numbers, I got roughly half of the ideal savings, which isn’t bad!
Why wouldn’t we get the full expected savings? It all has to do with the ZIP dictionary, which we are optimizing to be as smart as possible. As a simple example, notice that u_worldViewProjection and u_worldInverseTranspose both start with u_world; almost certainly the dictionary already optimized that prefix out, reducing our expected savings by 5 characters. Another issue is existing variables that can’t be optimized out, such as the built-in gl_Position: because it shares 7 characters with a_position, that dictionary entry won’t go away at all, meaning those 7 characters of savings are lost. That’s why it’s important to actually test our optimizations!
Conclusions
In this article we built some custom WebGL shader infrastructure, including program variable mangling, and proved it works and can reduce overall game size. Whether you use any of this for your own GL-based js13k game depends on how badly you need those extra bytes, but I hope the journey was interesting!