Releases: dougbinks/enkiTS
v1.11 Breaking C API change to Completion Actions - added Pre and Post completion functions
New Features
This release adds a breaking API feature to the C interface: pre-completion and post-completion functions. The C++ interface is unchanged.
A pre-completion function is called before the complete action task is 'complete', which means this is prior to dependent tasks being run. This function can thus alter any task arguments of the dependencies.
A post-completion function is called after the complete action task is 'complete'. Dependent tasks may have already been started. This function can delete the completion action if needed as it will no longer be accessed by other functions.
It is safe to set either of these to NULL if you do not require that function.
See CompletionAction_c.c for an example showing both how to modify following tasks as well as free memory from the completion action and previous tasks.
The C++ equivalent was already possible, so I have updated the example CompletionAction.cpp to demonstrate how.
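The ordering described above can be sketched without enkiTS at all. The following is a minimal single-threaded model, not the enkiTS implementation or its C API: CompletionActionModel, RunCompletionAction and DemoOrder are hypothetical names used only for illustration, and dependents run inline here whereas a real scheduler would launch them on task threads.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of a completion action; none of these names are enkiTS API.
struct CompletionActionModel
{
    std::function<void()> preComplete;   // may be left empty, like passing NULL
    std::function<void()> postComplete;  // may be left empty, like passing NULL
    std::vector<std::function<void()>> dependents; // tasks depending on this action
};

// Runs the action in the order described in these notes:
// pre-completion, then dependents, then post-completion.
inline void RunCompletionAction( CompletionActionModel* pAction_ )
{
    if( pAction_->preComplete ) { pAction_->preComplete(); }

    // Copy what we still need, since post-completion is allowed to delete the action.
    std::vector<std::function<void()>> dependents = pAction_->dependents;
    std::function<void()> post = pAction_->postComplete;

    for( auto& dependent : dependents ) { dependent(); }
    if( post ) { post(); } // pAction_ must not be touched after this call
}

// Records the call order; the pre-completion function alters the dependent's argument.
inline std::vector<std::string> DemoOrder()
{
    std::vector<std::string> log;
    int dependentArg = 0;
    auto* pAction = new CompletionActionModel;
    pAction->preComplete = [&]{ dependentArg = 42; log.push_back( "pre" ); };
    pAction->dependents.push_back(
        [&]{ log.push_back( "dependent:" + std::to_string( dependentArg ) ); } );
    pAction->postComplete = [&log, pAction]{ log.push_back( "post" ); delete pAction; };
    RunCompletionAction( pAction );
    return log;
}
```

In this model the dependent sees the argument written by the pre-completion function, and the action is deleted only in the post-completion function, once nothing else refers to it.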
Fixes
In addition this release adds the following fixes:
- Allow ENKITS_API to be defined externally + Initialisation order warning fix
- enki::ThreadDataStore padding changes and improved static_assert - fixes #67
- Fix for WaitforTask running tasks if pCompletable_ already complete and not nullptr
Thanks to @BobbyAnguelov, @TurtleSimos and @makuto for their issue reports, PRs, and testing which have helped bring this release together. I'd also like to thank our Patreon and Github supporters for their financial assistance.
Support development of enkiTS through Github Sponsors or Patreon
v1.10 Wait For Pinned Tasks
This release adds a major new feature, WaitForNewPinnedTasks, suitable for performing work which could block at the OS level, such as IO.
The motivation for this feature is that some calls, such as file and network IO, may block the thread whilst it waits on an external event, and that wait does not consume CPU resources. If such a call ran on a standard enkiTS thread, the task scheduler would have fewer threads able to perform computational work. Creating more threads than CPU cores offers one solution, but this results in OS scheduling overhead which we wish to minimize. The WaitForNewPinnedTasks() function permits an enkiTS thread to block at the OS level until it explicitly receives new work via a PinnedTask. Developers can thus create extra threads which loop calling WaitForNewPinnedTasks() and RunPinnedTasks() to perform IO/blocking work, minimizing OS scheduling overhead whilst keeping all CPU cores active with enkiTS threads.
WaitForPinnedTasks thread usage in C++:
- full example in example/WaitForPinnedTasks.cpp
- C example in example/WaitForPinnedTasks_c.c
#include "TaskScheduler.h"

enki::TaskScheduler g_TS;

struct RunPinnedTaskLoopTask : enki::IPinnedTask
{
    void Execute() override
    {
        while( g_TS.GetIsRunning() )
        {
            g_TS.WaitForNewPinnedTasks(); // this thread will 'sleep' until there are new pinned tasks
            g_TS.RunPinnedTasks();
        }
    }
};

struct PretendDoFileIO : enki::IPinnedTask
{
    void Execute() override
    {
        // Do file IO
    }
};

int main(int argc, const char * argv[])
{
    enki::TaskSchedulerConfig config;

    // In this example we create more threads than the hardware can run,
    // because the IO thread will spend most of its time idle or blocked
    // and therefore not scheduled for CPU time by the OS
    config.numTaskThreadsToCreate += 1;
    g_TS.Initialize( config );

    // in this example we place our IO threads at the end
    RunPinnedTaskLoopTask runPinnedTaskLoopTasks;
    runPinnedTaskLoopTasks.threadNum = g_TS.GetNumTaskThreads() - 1;
    g_TS.AddPinnedTask( &runPinnedTaskLoopTasks );

    // Send pretend file IO task to the IO thread
    PretendDoFileIO pretendDoFileIO;
    pretendDoFileIO.threadNum = runPinnedTaskLoopTasks.threadNum;
    g_TS.AddPinnedTask( &pretendDoFileIO );

    // ensure runPinnedTaskLoopTasks complete by explicitly calling shutdown
    g_TS.WaitforAllAndShutdown();
    return 0;
}
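To illustrate what blocking at the OS level until new work arrives means, here is a minimal standalone sketch using only the C++ standard library. It is not how enkiTS implements pinned tasks: PinnedQueueModel and DemoPinnedQueue are hypothetical names, and a mutex with a condition variable is assumed here purely as one simple way to put a thread to sleep until work is added.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical single-thread pinned task queue; not enkiTS API.
class PinnedQueueModel
{
public:
    void Add( std::function<void()> task_ )
    {
        {
            std::lock_guard<std::mutex> lock( m_Mutex );
            m_Tasks.push_back( std::move( task_ ) );
        }
        m_NewTask.notify_one(); // wake the sleeping thread
    }

    void Shutdown()
    {
        {
            std::lock_guard<std::mutex> lock( m_Mutex );
            m_Running = false;
        }
        m_NewTask.notify_one();
    }

    // Blocks without using CPU until there is work or shutdown is requested.
    // Returns false only once shut down with no tasks left to run.
    bool WaitForNew()
    {
        std::unique_lock<std::mutex> lock( m_Mutex );
        m_NewTask.wait( lock, [this]{ return !m_Tasks.empty() || !m_Running; } );
        return !m_Tasks.empty();
    }

    void RunAll()
    {
        for(;;)
        {
            std::function<void()> task;
            {
                std::lock_guard<std::mutex> lock( m_Mutex );
                if( m_Tasks.empty() ) { return; }
                task = std::move( m_Tasks.front() );
                m_Tasks.pop_front();
            }
            task(); // run outside the lock so Add() is not held up by slow IO
        }
    }

private:
    std::mutex                        m_Mutex;
    std::condition_variable           m_NewTask;
    std::deque<std::function<void()>> m_Tasks;
    bool                              m_Running = true;
};

// Starts one 'IO' thread, sends it three tasks, and returns how many ran.
inline int DemoPinnedQueue()
{
    PinnedQueueModel queue;
    int numRun = 0; // written only by the IO thread, read after join()
    std::thread ioThread( [&]{ while( queue.WaitForNew() ) { queue.RunAll(); } } );
    for( int i = 0; i < 3; ++i ) { queue.Add( [&numRun]{ ++numRun; } ); }
    queue.Shutdown();
    ioThread.join();
    return numRun;
}
```

The loop in the IO thread has the same shape as RunPinnedTaskLoopTask above: wait, run whatever arrived, and repeat until shutdown.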
Screenshot of enkiTS in action in Avoyd Voxel Editor whilst CPU path tracing a scene. Avoyd uses additional enkiTS threads which call WaitForNewPinnedTasks() to wait for PinnedTasks which perform blocking IO (not shown in the profile, as these are hard to capture in a nice looking screenshot).
v1.9 Another Bugfix Edition
This release adds no new features, but incorporates a number of bug fixes:
- Fixes error in link to dependencies example #53
- Fixes files ending without a new line #54
- Fixes Fails to compile in XCode/iOS with precompiled headers enabled #55
- Fixes Occasional crash when waiting on task that's stored as a local variable
- Fixes Crash in Android 11 beta #57
- Fixes Tasks are dispatched with invalid ranges #60
Many thanks to @brunochampoux, @aaronfranke, @craigsteyn, @mrdooz, @MSFT-Chris-Barrett, and @cstamford for the issue reports, PRs and testing which have helped to bring this release together.
v1.8 Dependencies
NOTE: Breaking changes to C API
This release of enkiTS adds a major feature - Dependencies, and a minor feature - Completion Actions.
Dependencies introduce an alternative approach to waiting on tasks to create a sequence of tasks, or task graph. See example below along with links.
Completion Actions are dependencies which execute immediately after a task has completed, and do not add themselves to the task scheduler. They involve less overhead than a normal task, and can be used to delete the completed task as well as themselves, as they are not referenced after their completion function is called. See example/CompletionAction.cpp and example/CompletionAction_c.c.
The following breaking changes to the C API were required:
- enkiDelete* functions now require the task scheduler as a parameter.
- enkiAddTaskSet replaced with enkiAddTaskSetArgs so that enkiAddTaskSet can be used for a new function which does not set task arguments, intended for use along with the new enkiSetParams* functions.
Dependency usage in C++:
- full example in example/Dependencies.cpp
- C example in example/Dependencies_c.c
#include "TaskScheduler.h"

enki::TaskScheduler g_TS;

// define a task set, can ignore range if we only do one thing
struct TaskA : enki::ITaskSet {
    void ExecuteRange( enki::TaskSetPartition range_, uint32_t threadnum_ ) override {
        // do something here, can issue tasks with g_TS
    }
};

struct TaskB : enki::ITaskSet {
    enki::Dependency m_Dependency;
    void ExecuteRange( enki::TaskSetPartition range_, uint32_t threadnum_ ) override {
        // do something here, can issue tasks with g_TS
    }
};

int main(int argc, const char * argv[]) {
    g_TS.Initialize();

    // set dependencies once (can set more than one if needed).
    TaskA taskA;
    TaskB taskB;
    taskB.SetDependency( taskB.m_Dependency, &taskA );

    g_TS.AddTaskSetToPipe( &taskA ); // add first task, when complete TaskB will run
    g_TS.WaitforTask( &taskB );      // wait for last
    return 0;
}
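For readers new to task graphs, the idea behind dependencies can be sketched as a count of unmet dependencies per task: a task becomes runnable when that count reaches zero. The code below is a standalone single-threaded illustration under that assumption, not the enkiTS implementation; TaskNodeModel, SetDependencyModel, RunTaskModel and DemoDependencies are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical task node; not the enkiTS ITaskSet or Dependency types.
struct TaskNodeModel
{
    std::function<void()>       work;
    std::vector<TaskNodeModel*> dependents;           // tasks waiting on this one
    std::atomic<int>            unmetDependencies{0}; // tasks this one still waits on
};

// Records that dependent_ must not run until dependency_ has completed.
inline void SetDependencyModel( TaskNodeModel& dependent_, TaskNodeModel& dependency_ )
{
    dependency_.dependents.push_back( &dependent_ );
    dependent_.unmetDependencies.fetch_add( 1 );
}

// Runs a task, then runs each dependent whose last dependency has just completed.
inline void RunTaskModel( TaskNodeModel& task_ )
{
    task_.work();
    for( TaskNodeModel* pDependent : task_.dependents )
    {
        // fetch_sub returns the previous value, so 1 means we were the last dependency
        if( pDependent->unmetDependencies.fetch_sub( 1 ) == 1 )
        {
            RunTaskModel( *pDependent );
        }
    }
}

// C depends on both A and B, so it runs only after the second of them completes.
inline std::string DemoDependencies()
{
    std::string order;
    TaskNodeModel a, b, c;
    a.work = [&]{ order += 'A'; };
    b.work = [&]{ order += 'B'; };
    c.work = [&]{ order += 'C'; };
    SetDependencyModel( c, a );
    SetDependencyModel( c, b );
    RunTaskModel( a ); // C still has one unmet dependency
    RunTaskModel( b ); // last dependency done, so C runs
    return order;
}
```

As in the enkiTS example above, only the first tasks are started explicitly; the dependent task follows once everything it depends on has completed.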
v1.7 Bugfix Edition and new TestAll.cpp smoke testing
This release of enkiTS fixes a number of issues and adds a new smoke test, TestAll.cpp. Additionally, if you use profiling code based on enkiTSMicroprofileExample.cpp in enkiTSExample, check out the latest code for a stack based tick store required by the v1.2 callbacks.
Screenshot of enkiTSMicroprofileExample.cpp showing two stacked waitForTaskComplete callbacks (at top) which requires a tick stack for correct Microprofile profiling.
As a helper for catching issues earlier I've also added a smoke test TestAll.cpp for Travis CI which now runs on Linux x64, Linux ARM64, OSX x64 and Windows x64.
- Fixed issue #44 Fixed GCC warnings.
- Merged Pull Request #47 (Some fixes), which adds Xbox CreateSemaphoreExW() and fixes the uninitialized m_WaitingForTaskCount.
- Fixed issue #48 Pinned task not being woken.
- The added WakeSuspendedThreadsWithPinnedTasks() function increases the workload on threads which have no tasks and are about to suspend waiting for new tasks or task completion, but since these threads are about to suspend this has no detectable performance penalty.
- Compile fix for when NOMINMAX is globally defined.
- Fixed issue #49 Valgrind errors on OSX and mach semaphore exception.
- The v1.6 change to add custom allocators caused an OSX crash in placement new of mach semaphores, so I have switched to dispatch semaphores.
- Fix for initializing TaskScheduler multiple times without a shutdown and with different configuration parameters.
- Fixed issue #50 Valgrind warning thread states for all threads not initialized prior to thread launch.
- Fixed issue #51 WaitforAll() and external threads not waking.
- Fixed issue #52 ThreadSanitizer (TSAN) reporting data race.
- A note on ThreadSanitizer (TSAN) & Intel Inspector: currently neither TSAN nor Intel Inspector support all the primitives used by enkiTS and thus there will be false data race reports. I hope to keep these to a minimum, but this is not always possible.
Thanks to @Vuhdo, @boxerab, @Pagghiu, and Bobby Anguelov.
v1.6 Custom allocators API
You can configure enkiTS with custom allocators using the
Developed in response to the request "Feature suggestion: custom allocators" (#41) by @sergeyreznik.
In addition you can now sponsor enkiTS through Github Sponsors, and to boost community funding, GitHub will match your contribution!
Custom Allocator usage in C++:
- full example in example/CustomAllocator.cpp
- C example in example/CustomAllocator_c.c
#include "TaskScheduler.h"
#include <stdio.h>

using namespace enki;

TaskScheduler g_TS;

struct ParallelTaskSet : ITaskSet
{
    void ExecuteRange( TaskSetPartition range_, uint32_t threadnum_ ) override
    {
        // threadnum_ is unsigned, so print it with %u
        printf(" This could run on any thread, currently thread %u\n", threadnum_);
    }
};

struct CustomData
{
    const char* domainName;
    size_t totalAllocations;
};

// Tracks the total allocated size, then forwards to the default allocator
void* CustomAllocFunc( size_t align_, size_t size_, void* userData_, const char* file_, int line_ )
{
    CustomData* data = (CustomData*)userData_;
    data->totalAllocations += size_;
    printf("Allocating %g bytes in domain %s, total %g. File %s, line %d.\n",
        (double)size_, data->domainName, (double)data->totalAllocations, file_, line_ );
    return DefaultAllocFunc( align_, size_, userData_, file_, line_ );
}

// Tracks the total allocated size, then forwards to the default free function
void CustomFreeFunc( void* ptr_, size_t size_, void* userData_, const char* file_, int line_ )
{
    CustomData* data = (CustomData*)userData_;
    data->totalAllocations -= size_;
    printf("Freeing %p in domain %s, total %g. File %s, line %d.\n",
        ptr_, data->domainName, (double)data->totalAllocations, file_, line_ );
    DefaultFreeFunc( ptr_, size_, userData_, file_, line_ );
}

int main(int argc, const char * argv[])
{
    enki::TaskSchedulerConfig config;
    config.customAllocator.alloc = CustomAllocFunc;
    config.customAllocator.free = CustomFreeFunc;
    CustomData data{ "enkiTS", 0 };
    config.customAllocator.userData = &data;
    g_TS.Initialize( config );

    ParallelTaskSet task;
    g_TS.AddTaskSetToPipe( &task );
    g_TS.WaitforTask( &task );
    g_TS.WaitforAllAndShutdown(); // ensure we shutdown before user data is destroyed.
    return 0;
}
v1.5 Traditional post-release bugfix edition: name clash and GetConfig() fixes.
As is now a defining tradition of enkiTS, we have a post-release bugfix:
- Fixed CACHE_LINE_SIZE name clashes with macro from major console SDK #43
- Fixed GetConfig() so that it returns the actual config (it was incorrectly returning the default).
v1.4 Register external threads to use with enkiTS
You can now configure enkiTS with numExternalTaskThreads, which sets the number of external threads that can be registered to use the enkiTS API with the RegisterExternalTaskThread function.
Developed in response to the request "Feature suggestion: running tasks from non main/task threads" (#39) by @Vuhdo.
- This also introduces a new way to configure the task scheduler using the enki::TaskSchedulerConfig.
- C++11 branch is now deprecated (C++11 functionality now in master branch) and will no longer be updated as master is identical. Please switch to master if you are on this legacy branch.
- GetProfilerCallbacks is now deprecated, as you should use the enki::TaskSchedulerConfig.
External thread usage in C++:
- full example in example/ExternalTaskThread.cpp
- C example in example/ExternalTaskThread_c.c
#include "TaskScheduler.h"
#include <chrono>
#include <thread>

enki::TaskScheduler g_TS;

struct ParallelTaskSet : enki::ITaskSet
{
    void ExecuteRange( enki::TaskSetPartition range, uint32_t threadnum ) override
    {
        // Do something
    }
};

void threadFunction()
{
    g_TS.RegisterExternalTaskThread();

    // sleep for a while instead of doing something such as file IO
    std::this_thread::sleep_for( std::chrono::milliseconds( 100 ) );

    ParallelTaskSet task;
    g_TS.AddTaskSetToPipe( &task );
    g_TS.WaitforTask( &task );

    g_TS.DeRegisterExternalTaskThread();
}

int main(int argc, const char * argv[])
{
    enki::TaskSchedulerConfig config;
    config.numExternalTaskThreads = 1; // we have one extra external thread
    g_TS.Initialize( config );

    std::thread exampleThread( threadFunction );
    exampleThread.join();
    return 0;
}
v1.3 Profiler callback bugfix
This release is a bugfix for the profiling callbacks.
v1.2 Wait functions now relinquish CPU resources when idle, and breaking change to ProfilerCallbacks
The wait functions now relinquish CPU resources when idle, so other threads can run, similar to the behaviour of task threads which have no tasks to run. This lowers CPU power consumption, and can improve performance when the CPU is oversubscribed (for example when other threads or processes are consuming resources). Thanks to @zhaijialong for the feature request and follow up testing. See issue #31 for more information.
The wait functions try to run tasks whilst waiting, and if there are none they first spin then perform an OS blocking wait for a task complete event allowing other threads to run. The task complete event system may spuriously wake waiting threads, but they will then go back to a blocking wait.
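The spin-then-block pattern described above can be sketched with the C++ standard library alone. This is an illustration under stated assumptions rather than the enkiTS code: CompletionEventModel, SignalComplete and SpinThenBlock are hypothetical names, a condition variable stands in for the task complete event, and the step where enkiTS tries to run other tasks whilst waiting is left out.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical completion event; not enkiTS API.
struct CompletionEventModel
{
    std::atomic<bool>       complete{false};
    std::mutex              mutex;
    std::condition_variable condition;
};

// Marks the event complete and wakes any threads blocked on it.
inline void SignalComplete( CompletionEventModel& event_ )
{
    {
        std::lock_guard<std::mutex> lock( event_.mutex );
        event_.complete.store( true, std::memory_order_release );
    }
    event_.condition.notify_all();
}

// Spins for up to spinCount_ checks, then falls back to a blocking wait.
// Returns true if completion was seen whilst spinning, false if the
// blocking path was taken.
inline bool SpinThenBlock( CompletionEventModel& event_, int spinCount_ )
{
    for( int i = 0; i < spinCount_; ++i )
    {
        if( event_.complete.load( std::memory_order_acquire ) ) { return true; }
        std::this_thread::yield(); // give other threads a chance to run
    }

    std::unique_lock<std::mutex> lock( event_.mutex );
    // The predicate is re-checked on every wake-up, so a spurious wake
    // simply goes back to the blocking wait.
    event_.condition.wait( lock,
        [&]{ return event_.complete.load( std::memory_order_acquire ); } );
    return false;
}
```

Spinning first keeps latency low when the task is about to finish, while the blocking fallback frees the core for other threads when it is not.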
This update also has a breaking change to the ProfilerCallbacks struct, which should be a one or two line change. Please see the struct declaration and comments for details.
Finally, in addition to a few bug fixes, this release also deprecates the C++98 branches and support for non C++11 compatible compilers (C support is still available through the C headers).