Skip to content

What is SIMD NEON?

John Galvin edited this page Oct 28, 2024 · 9 revisions

What is sse2neon.h and why do we need it for OLC PGE 2.0 Mobile development?

  • The sse2neon.h is required as mobile CPU's and GPU's are nowhere near as powerful as their Desktop cousins.
  • Therefore we need to use SIMD (Single Instruction Multiple Data) technology in order to make our games playable on these weak devices.
  • _The sse2neon.h allows use to write one set of SIMD instructions sets that will be supported across multiple CPU types (X86/X64/ARM/ARM64) and multiple instruction set types (SSE2, SSE4.2, NEON).

What is SIMD wikipedia

Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal (part of the hardware design) and it can be directly accessible through an instruction set architecture (ISA), but it should not be confused with an ISA. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.

What is NEON wikipedia

The Advanced SIMD extension (also known as Neon or "MPE" Media Processing Engine) is a combined 64- and 128-bit SIMD instruction set that provides standardised acceleration for media and signal processing applications.


In short but SIMD and NEON are the same thing but use different syntax Therefore we use a header file that can reinterpcast our SIMD syntax to neon, this header file is sse2neon.h


SIMD:

SIMD stands for Single Instruction Multiple Data; it is a hardware circuitry embedded into the CPU (Central Processing Unit). This complex circuit allows for the same instruction to be executed across multiple APU’s (Arithmetic Processing Unit, also known as MPU Maths Processing Unit).

How this circuit is created and how it works is out of scope for this document. All we need to know is we access and execute SIMD instructions using Intrinsic Functions .

Javidx9 (OneLoneCoder.com creator) has an amazing video that clearly explains how SIMD & Intrinsic Functions works here.

There are multiple versions of SIMD (see Later Versions) , we are only interested in SSE4, as this is the instruction sets implemented in the PGE 2.0 Mobile.

**Finally: You do not need to understand how intrinsic functions work, nor understand the code to be able to use the new SIMD instructions sets. it is all build in and hidden away withing the OLC PGE 2.0 Mobile Engine and will automatically apply SIMD against your graphic code. **


All the speed none of the headache


DEMO:

You can download a Demo of how all the PGE SIMD Instructions set work here .

You can use this demo to learn more on how SIMD for PGE works. The implementation in the OLC PGE 2.0 Mobile is using the same idea but is implemented very differently.

Please note I do not own the copyright to the images displayed in the demo and this documents, these images are used for education purposes only, and cannot be copied or redistributed image

By pressing S you can enabled/disable the SIMD instruction set, compare the frame rate when SIMD is disabled to when it is enabled.

SIMD Disabled:

image

SIMD Enabled:

image

It is not always the case you will achieve double the frame rate, as in the example above, this will depend on what else is set to execute during the OnUserUpdate execution.  

You can also press E to enable/disable Storing of Sub Sprites (this is explained in Section “A Little More Technical”), again compare the frame rate when SIMD is enabled, and Store Sub Sprites are Enabled / Disabled. Store Sub Sprites Disabled:

Store Sub Sprites Enabled:

This demo showcases all the available SIMD options available, such as DrawPartailSprite_SIMD, getting Sprites from a Sprite Sheet, flip and resize options.

NOTE: Frame Rate should not be used as an indication of how fast your game is running, a low frame rate does not mean a slow game. However, if you have a frame rate of < 24 then you do need to look at your architecture. A good frame rate is anything above 30FPS, believe it or not, most consoles are set at 30FPS, only the newer PlayStation 5, X Box Series X now have 60FPS. However, PC gamers have had 60FPS for a very long time. Reference GPU Mag

A high frame rate just means you can fit more into your OnUserUpdate without affecting the user experience. SIMD will not always increase your frame rate; however, it should stabilize it. SIMD is more likely lessen the frame rate decrease impact, when more executions are been ran in the OnUserUpdate.

Example only (not realistic):

  • Two Sprites been drawn with DrawSprite() lessen the frame rate by 10: 40FPS --> 30FPS
  • Two Sprites been drawn with DrawSprite_SIMD() lessen the frame rate by 3: 40FPS --> 37FPS
  • However, in a lot of cases it will increase it, but it all depends again on what is been executed in the OnUserUpdate

The SIMD code is there for you in its lowest form, it is up to your imagination to bring it to life

Clone this wiki locally