Intel Details Inner Workings of XeSS

Intel has released an explainer video for its upcoming XeSS AI upscaling technology, showing how the tech works on its Arc Alchemist GPUs, which are nearly ready for public release. Intel used the Arc A770, the fastest card in the lineup, for the demonstrations, though the limited performance details shown make it difficult to say how it will stack up against the best graphics cards.

If you’re at all familiar with Nvidia’s DLSS, which has been around for four years now in various incarnations, the video should spark a keen sense of déjà vu. Tom Petersen, who formerly worked for Nvidia and gave some of the old DLSS presentations, walks through the XeSS fundamentals. Long story short, XeSS sounds very much like a mirrored version of Nvidia’s DLSS, except it’s designed to work with Intel’s deep learning XMX cores rather than Nvidia’s tensor cores. The tech can also run on other GPUs using a DP4a mode, however, which might make it an interesting alternative to AMD’s FSR 2.0 upscaler.

In the demos shown by Intel, XeSS looked to be working well. Of course, it’s difficult to say for sure when the source video is a compressed 1080p version of the actual content, so we’ll save detailed image quality comparisons for another time. Performance gains look to be similar to what we’ve seen with DLSS, with a frame rate boost of over 100% in some situations when using XeSS Performance mode.

How It Works

If you already know how DLSS works, you’ll find Intel’s solution is largely the same, with some minor tweaks. XeSS is an AI-accelerated resolution upscaling algorithm designed to increase frame rates in video games.

It starts with training, the first step in most deep learning algorithms. The AI network takes lower resolution sample frames from a game and processes them, generating what should be upscaled output images. Then the network compares the results against the desired target image and backpropagates weight adjustments to try to correct any “errors.” At first, the resulting images won’t look very good, but the AI algorithm slowly learns from its mistakes. After thousands (or more) of training images, the network eventually converges toward ideal weights that will “magically” generate the desired results.
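
To make the training loop concrete, here is a minimal sketch in plain Python. It is not Intel’s network or training setup; it is a toy 1D "upscaler" with just four weights, and the function names, sample size, and learning rate are all assumptions chosen for illustration. It only shows the predict, compare, and adjust cycle described above.

```python
import random

def upscale(low, w):
    """Toy 2x upscaler: each output pair is a weighted mix of two neighboring inputs."""
    out = []
    for i in range(len(low) - 1):
        a, b = low[i], low[i + 1]
        out.append(w[0] * a + w[1] * b)
        out.append(w[2] * a + w[3] * b)
    return out

def make_sample(rng, n=8):
    """Return (low_res, target). The target stands in for a 'ground truth' high-res frame."""
    low = [rng.random() for _ in range(n)]
    target = []
    for i in range(n - 1):
        target.append(low[i])                         # keep the original sample
        target.append(0.5 * (low[i] + low[i + 1]))    # midpoint between neighbors
    return low, target

def train(steps=2000, lr=0.5, seed=0):
    """Gradient descent on mean squared error between upscaled output and target."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(4)]    # start from "bad" weights
    losses = []
    for _ in range(steps):
        low, target = make_sample(rng)
        pred = upscale(low, w)
        grad = [0.0, 0.0, 0.0, 0.0]
        loss = 0.0
        for i in range(len(low) - 1):
            a, b = low[i], low[i + 1]
            e0 = pred[2 * i] - target[2 * i]          # error on the first output of the pair
            e1 = pred[2 * i + 1] - target[2 * i + 1]  # error on the second output
            loss += e0 * e0 + e1 * e1
            grad[0] += 2 * e0 * a
            grad[1] += 2 * e0 * b
            grad[2] += 2 * e1 * a
            grad[3] += 2 * e1 * b
        m = len(pred)
        # Adjust each weight against its share of the error.
        w = [wi - lr * g / m for wi, g in zip(w, grad)]
        losses.append(loss / m)
    return w, losses

if __name__ == "__main__":
    weights, losses = train()
    print("first loss:", losses[0], "last loss:", losses[-1])
    print("learned weights:", weights)
```

In this toy setup the weights that reproduce the target exactly are 1, 0, 0.5, and 0.5, so the loss should shrink toward zero as training proceeds. A real upscaling network has vastly more weights and also consumes motion vectors and frame history, but the loop has the same shape.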

Once the algorithm has been fully trained, using samples from lots of different games, it can in theory take any image input from any video game and upscale it almost perfectly. As with DLSS (and FSR 2.0), the XeSS algorithm also takes on the role of anti-aliasing and replaces classical solutions like temporal AA.


Again, nothing so far is particularly noteworthy. DLSS, FSR 2.0, and even standard temporal AA algorithms share a lot of the same core functionality, minus the AI component in the case of FSR and TAA. Games will integrate XeSS into their rendering pipeline, typically after the main render and initial effects are done but before post-processing effects and GUI/HUD elements are drawn. That way the UI stays sharp while the difficult task of 3D rendering gets to run at a lower resolution.
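
The ordering described above can be sketched as a list of stages and the resolution each one runs at. This is purely illustrative: the stage names and the `frame_stages` helper are hypothetical and not part of any XeSS SDK, and the 2x-per-axis scale factor in the usage example is an assumption rather than a figure from Intel’s video.

```python
def frame_stages(target_res, scale):
    """Return (stage, resolution) pairs in the order a frame would pass through them.

    target_res: (width, height) of the final output.
    scale: per-axis upscaling factor (hypothetical value supplied by the caller).
    """
    render_res = (int(round(target_res[0] / scale)),
                  int(round(target_res[1] / scale)))
    return [
        ("render_scene", render_res),   # expensive 3D rendering at reduced resolution
        ("upscale", target_res),        # upscaler (also handles anti-aliasing) outputs full res
        ("post_process", target_res),   # post-processing effects run on the upscaled image
        ("draw_hud", target_res),       # UI drawn last at native resolution, so it stays sharp
    ]

if __name__ == "__main__":
    for stage, (w, h) in frame_stages((3840, 2160), 2.0):
        print(f"{stage:>14}: {w}x{h}")
```

With a 4K target and a 2x scale, only the first stage works on 1920x1080 pixels, a quarter of the final pixel count, which is where the frame rate gains come from.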

XeSS operates on Intel’s Arc XMX cores, but it can also run on other GPUs in a slightly different mode. A DP4a instruction packs four INT8 (8-bit integer) values into a single 32-bit register and computes their dot product with four other INT8 values, which is the sort of operation you’d typically have access to via a GPU shader core. XMX cores, meanwhile, natively support INT8 and can operate on 128 values at once.
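
For readers unfamiliar with DP4a, the following sketch emulates the operation in Python: four signed 8-bit values packed into one 32-bit word, multiplied pairwise with four others, and summed into an accumulator. The helper names are made up for this example, and Python integers don’t overflow, so the 32-bit accumulator behavior of real hardware is not modeled.

```python
def pack_int8x4(values):
    """Pack four signed 8-bit integers into one 32-bit word, lowest byte first."""
    if len(values) != 4:
        raise ValueError("expected exactly four values")
    word = 0
    for i, v in enumerate(values):
        if not -128 <= v <= 127:
            raise ValueError("value out of INT8 range")
        word |= (v & 0xFF) << (8 * i)   # two's complement byte in lane i
    return word

def unpack_int8x4(word):
    """Split a 32-bit word back into four signed 8-bit integers."""
    out = []
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        out.append(byte - 256 if byte >= 128 else byte)
    return out

def dp4a(a_word, b_word, acc):
    """Emulate DP4a: four-element INT8 dot product added to an accumulator."""
    a = unpack_int8x4(a_word)
    b = unpack_int8x4(b_word)
    return acc + sum(x * y for x, y in zip(a, b))

if __name__ == "__main__":
    a = pack_int8x4([1, -2, 3, -4])
    b = pack_int8x4([5, 6, -7, 8])
    print(dp4a(a, b, 10))  # 5 - 12 - 21 - 32 + 10 = -50
```

The point is that one instruction on one 32-bit register does the work of four multiplies and four adds, which is why DP4a is a workable fallback for INT8 inference on GPUs without dedicated matrix units.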

That might seem very lopsided, but as an example, an Arc A380 has 1024 shader cores that could each do four INT8 operations at the same time, for 4,096 in total. Alternatively, the A380 has 128 XMX units that can each do 128 INT8 operations, for 16,384 in total. That makes the XMX throughput four times the DP4a throughput, but apparently DP4a mode should still be sufficient for some level of XeSS goodness.
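
The comparison is simple enough to check by hand, but here it is spelled out using only the figures quoted above. It counts INT8 operations that can be issued at once and ignores clock speeds, memory bandwidth, and scheduling, so treat it as a back-of-the-envelope estimate rather than a performance prediction.

```python
# Figures for the Arc A380 as quoted in the text.
SHADER_CORES = 1024
INT8_OPS_PER_SHADER = 4      # one DP4a instruction covers four INT8 values
XMX_UNITS = 128
INT8_OPS_PER_XMX = 128

dp4a_ops = SHADER_CORES * INT8_OPS_PER_SHADER   # 4096
xmx_ops = XMX_UNITS * INT8_OPS_PER_XMX          # 16384
ratio = xmx_ops / dp4a_ops                      # 4.0

if __name__ == "__main__":
    print(f"DP4a path: {dp4a_ops} INT8 ops at once")
    print(f"XMX path:  {xmx_ops} INT8 ops at once")
    print(f"XMX advantage: {ratio:.0f}x")
```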

Note that DP4a appears to use a different trained network, one that’s perhaps less computationally intensive. How that will translate into real-world performance and image quality remains to be seen, and it sounds like game developers will need to explicitly include support for both XMX and DP4a modes if they want to support non-Arc GPUs.

Intel XeSS Performance Expectations
