SilkTouch: Slots & VTables

Please read the previous blog post, which introduces SilkTouch and its core features, Invokes & Marshalling.

This time, we are going to talk about a small yet essential part of SilkTouch and Silk.NET as a whole, which we ignored last time for simplicity: (CurrentVTable as GeneratedVTable).Load(721, "glGetDebugMessageLog"). The quick breakdown is this: first, we cast CurrentVTable to GeneratedVTable to allow inlining, because the call is no longer a virtual call (GeneratedVTable is a private sealed class). Then, besides the native name, we pass a slot to the Load method.


A VTable is basically a "tiny" abstraction around INativeContext. It's defined as:

public interface IVTable : IDisposable
{
    void Initialize(INativeContext ctx, int maxSlots);
    IntPtr Load(int slot, string entryPoint);
    void Purge();
}

Pretty simple, right? You may wonder what INativeContext is. If you are unfamiliar with Silk.NET.Core, it's defined as:

public interface INativeContext : IDisposable
{
    IntPtr GetProcAddress(string proc, int? slot = default);
}

It is implemented by GLFW / SDL and some default / composite types, and provides platform-specific information on loading native symbols. It also powers overrides (see the addendum).

Note that INativeContext does not cache anything; it is expected to always call directly down to whatever loading mechanism it wraps. This is where IVTable comes in.
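To make that contract concrete, here is a minimal sketch of a context that forwards every request to a symbol-lookup function, roughly the shape of a getProcAddress-style function exposed by a windowing library. LambdaContextSketch and the delegate it wraps are hypothetical names for illustration, not Silk.NET types; the interface is repeated so the block compiles on its own.

```csharp
using System;

public interface INativeContext : IDisposable
{
    IntPtr GetProcAddress(string proc, int? slot = default);
}

// Hypothetical context: forwards every request to the supplied lookup function.
public sealed class LambdaContextSketch : INativeContext
{
    private readonly Func<string, IntPtr> _lookup;

    public LambdaContextSketch(Func<string, IntPtr> lookup)
    {
        _lookup = lookup ?? throw new ArgumentNullException(nameof(lookup));
    }

    // No caching on purpose: each call goes straight down to the loader.
    // The slot is only a hint for contexts that can use it; this sketch ignores it.
    public IntPtr GetProcAddress(string proc, int? slot = default) => _lookup(proc);

    // Nothing to release in this sketch.
    public void Dispose() { }
}
```

Asking this context for the same symbol twice invokes the lookup function twice, which is exactly the cost a VTable is meant to remove.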


The "default" IVTable we used was ConcurrentDictionaryVTable, and while it's still there, we don't use it anymore. We only use the generated VTables from SilkTouch. But it still serves as a good example of the basic functionality here.

So, let's go back to our interface. First of all, we know we have to store the INativeContext somewhere so we can load from it later. We also already know the maximum number of unique slots we are ever going to be queried for, and while it's not necessary, we can pass that in as the initial capacity of our ConcurrentDictionary. I think it's obvious by now that the key of that dictionary is going to be an int (a slot) and that the value is going to be an IntPtr (the address we get back from INativeContext.GetProcAddress).

Next, the Load method. It's pretty simple: we just use the GetOrAdd method on our dictionary to either get the existing address, or load it from the INativeContext and then store it in the dictionary.

Last and also least important, Purge. First of all, Purge does not reset the Initialize parameters: the INativeContext and the number of slots stay the same! So in this implementation, it just calls Clear() on the dictionary.
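Put together, the three methods described above could look roughly like the sketch below. It is not the actual ConcurrentDictionaryVTable source. DictionaryVTableSketch and FakeContext are hypothetical names, the concurrency level passed to the dictionary is an assumption, and what Dispose should do is an assumption as well (here it only drops the cache).

```csharp
using System;
using System.Collections.Concurrent;

public interface INativeContext : IDisposable
{
    IntPtr GetProcAddress(string proc, int? slot = default);
}

public interface IVTable : IDisposable
{
    void Initialize(INativeContext ctx, int maxSlots);
    IntPtr Load(int slot, string entryPoint);
    void Purge();
}

// Hypothetical stand-in for a real context: hands out fake addresses and counts calls.
public sealed class FakeContext : INativeContext
{
    public int Calls;

    public IntPtr GetProcAddress(string proc, int? slot = default)
    {
        Calls++;
        return new IntPtr(1000 + (slot ?? 0));
    }

    public void Dispose() { }
}

public sealed class DictionaryVTableSketch : IVTable
{
    private INativeContext _ctx;
    private ConcurrentDictionary<int, IntPtr> _entryPoints;

    public void Initialize(INativeContext ctx, int maxSlots)
    {
        _ctx = ctx;
        // maxSlots is only used as the initial capacity; the dictionary can still grow.
        _entryPoints = new ConcurrentDictionary<int, IntPtr>(Environment.ProcessorCount, maxSlots);
    }

    // Return the cached address, or ask the context once and remember the answer.
    public IntPtr Load(int slot, string entryPoint)
        => _entryPoints.GetOrAdd(slot, _ => _ctx.GetProcAddress(entryPoint, slot));

    // Forget the cached addresses but keep the context and the slot count.
    public void Purge() => _entryPoints.Clear();

    // Assumption: disposing only drops the cache and leaves the context to its owner.
    public void Dispose() => _entryPoints?.Clear();
}
```

With this in place, loading the same slot twice reaches the context only once, and a Purge forces the next Load to go back to the context.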

How SilkTouch does it

Now, SilkTouch knows a lot more about the slots. Basically, a generated VTable is just a compile-time dictionary. In a nutshell, we generate one field per entry point and use the slot to find the appropriate field. Purge just zeroes all those fields, and Initialize just stores the context in _ctx.

This allows the whole lookup to be inlined, which allows maximum performance. Now, at first you'd think this would result in just one huuuuge Load function with lots of if (slot == x) return _...; statements. That was my first approach too. I then quickly learned that the JIT does not care how hard you tell it to aggressively inline and optimize: it refuses to inline a method like that.

Making JIT inline

Now, it's pretty simple; it works like a binary search. We sort all the slots and generate code for a section of slots (at the beginning, this section contains all the slots). We then check whether the section contains an odd number of slots; if it does, we load the lowest one and remove it from the section. As a fallback, we put an unreachable throw helper + return default.

Then, we take the mid slot and split again, and we do that over and over again, recursively. This may be a bit of a surprise, but it means the JIT is happy to inline the whole thing. Loading, in this case, means checking whether the field that was found is IntPtr.Zero; if it is, we call _ctx.Load, otherwise we just return it.
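As an illustration, here is a hand-written approximation of what such generated code could look like for five slots (0 to 4). It is a sketch, not real SilkTouch output. The class, field and helper names are hypothetical, it calls GetProcAddress from the interface shown earlier, and it uses a plain throw where the generator emits a throw helper plus return default. Whether the JIT actually inlines this particular sketch has not been verified.

```csharp
using System;
using System.Runtime.CompilerServices;

public interface INativeContext : IDisposable
{
    IntPtr GetProcAddress(string proc, int? slot = default);
}

// Hypothetical stand-in for a real context: hands out fake addresses and counts calls.
public sealed class FakeContext : INativeContext
{
    public int Calls;

    public IntPtr GetProcAddress(string proc, int? slot = default)
    {
        Calls++;
        return new IntPtr(1000 + (slot ?? 0));
    }

    public void Dispose() { }
}

public sealed class GeneratedVTableSketch
{
    private INativeContext _ctx;
    private IntPtr _slot0, _slot1, _slot2, _slot3, _slot4; // one field per entry point

    public void Initialize(INativeContext ctx, int maxSlots) => _ctx = ctx;

    // "Loading": return the cached field, asking the context only while it is still zero.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private IntPtr LoadField(ref IntPtr field, int slot, string entryPoint)
    {
        if (field == IntPtr.Zero) field = _ctx.GetProcAddress(entryPoint, slot);
        return field;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public IntPtr Load(int slot, string entryPoint)
    {
        // Section {0,1,2,3,4} has an odd size: handle the lowest slot, leaving {1,2,3,4}.
        if (slot == 0) return LoadField(ref _slot0, slot, entryPoint);
        if (slot < 3) // split {1,2,3,4} at the mid slot into {1,2} and {3,4}
        {
            if (slot < 2) { if (slot == 1) return LoadField(ref _slot1, slot, entryPoint); }
            else { if (slot == 2) return LoadField(ref _slot2, slot, entryPoint); }
        }
        else
        {
            if (slot < 4) { if (slot == 3) return LoadField(ref _slot3, slot, entryPoint); }
            else { if (slot == 4) return LoadField(ref _slot4, slot, entryPoint); }
        }
        // Fallback for slots the generator never emitted.
        throw new ArgumentOutOfRangeException(nameof(slot));
    }

    // Zero every field so the next Load goes back to the context.
    public void Purge() => _slot0 = _slot1 = _slot2 = _slot3 = _slot4 = IntPtr.Zero;
}
```

Each lookup walks a few comparisons instead of a long chain of if statements, and every leaf ends in the same zero check followed by a return.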

AOT (Ahead of Time Compilation)

I've recently added support for AOT to Silk.NET, and with it came the idea to add a feature to SilkTouch: preloading. Basically, it gets rid of all those branches in the loading and preloads all addresses when Initialize is called. This means that the Load call to the IVTable is as good as it gets.

Right now we use the Cdecl calling convention in all cases. That's not hard-coded into SilkTouch (it's configurable per method), but it does mean some overhead. Also, every call incurs a GC transition, but again, there's not much we can do about that. In the middle of the cdecl clutter and the GC entry/exit is our amazing call rax, which, to my knowledge, is the best we can do right now. (If you know any way to improve this, please contact me.)
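To picture the preloading idea, here is a rough sketch with three slots. It is an assumption about the shape of such code, not real SilkTouch output: the class name is hypothetical, the slot numbers assigned to the entry points are made up, and Purge is left out because the text does not say how it interacts with preloading.

```csharp
using System;
using System.Runtime.CompilerServices;

public interface INativeContext : IDisposable
{
    IntPtr GetProcAddress(string proc, int? slot = default);
}

// Hypothetical stand-in for a real context: hands out fake addresses and counts calls.
public sealed class FakeContext : INativeContext
{
    public int Calls;

    public IntPtr GetProcAddress(string proc, int? slot = default)
    {
        Calls++;
        return new IntPtr(1000 + (slot ?? 0));
    }

    public void Dispose() { }
}

public sealed class PreloadedVTableSketch
{
    private IntPtr _slot0, _slot1, _slot2;

    // Every address is resolved up front; a generator would emit one line per entry point.
    public void Initialize(INativeContext ctx, int maxSlots)
    {
        _slot0 = ctx.GetProcAddress("glClear", 0);
        _slot1 = ctx.GetProcAddress("glClearColor", 1);
        _slot2 = ctx.GetProcAddress("glViewport", 2);
    }

    // Same slot search as before, but the leaves no longer check for IntPtr.Zero.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public IntPtr Load(int slot, string entryPoint)
    {
        if (slot == 0) return _slot0;
        if (slot < 2) { if (slot == 1) return _slot1; }
        else { if (slot == 2) return _slot2; }
        throw new ArgumentOutOfRangeException(nameof(slot));
    }
}
```

The trade-off is that Initialize now pays for every entry point, including ones that are never called, while each Load becomes a plain field read.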

Well yeah and that's kind of it. Thanks again, and if you're interested, I encourage you to look at the source yourself, see

Addendum: Overrides

So overrides are the solution to conditionally changing the INativeContext in some cases.

On some platforms, specifically iOS, the entire application has to be statically linked. Trying to load functions from "glfw3.dll" won't work there, since the library is linked into the application itself. The only way around this is __Internal, which can only be used from P/Invoke. So we had to go back to using P/Invoke.

We didn't want to rely on P/Invoke marshalling, so all P/Invoke signatures are what you would expect to find in the invoke function pointer: the native types, names, etc. Now, to override the context, but only sometimes, SilkTouch generates an override, which is then used in a function called CreateDefaultContext. It just compares a parameter with the condition given in the attribute and returns either the matching override or the default context. In the generated override, we just put all the P/Invoke definitions, followed by a binary tree similar to the one used in VTables, which returns the pointer to the correct P/Invoke definition.

And that's it. It's pretty short, but there's not much to it. Once again, check out the relevant code. You'll kind of get to see this running if you get the iOS or Android demo (included in preview3) running, both of which use the P/Invoke override (the latter due to loading bugs). iOS (and Android) support is still pretty experimental, so it's not yet in its full glory :)

