Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve NEST performance by revised connection exchange and spike delivery #2926

Merged
merged 146 commits into from
Sep 13, 2023
Merged
Show file tree
Hide file tree
Changes from 130 commits
Commits
Show all changes
146 commits
Select commit Hold shift + click to select a range
a4fbdbe
Remove while-loop and multi-threading (except for deliver) in gather_…
suku248 Mar 7, 2022
472affb
Ensure spike register is thread-local.
heplesser Mar 9, 2022
e8d52f4
Fixed formatting.
heplesser Mar 9, 2022
9d993f9
Merge pull request #2 from heplesser/suku_test_single_threading_in_ga…
suku248 Mar 10, 2022
5ee6d82
Remove reading thread dimension in spike register for testing
suku248 Mar 10, 2022
e647604
Merge branch 'test_single_threading_in_gather_spike_data' of github.c…
suku248 Mar 10, 2022
4b772b1
Change spike register to use thread-specific pointers to inner vectors
suku248 Mar 10, 2022
ece715d
Merge branch 'master' of github.com:nest/nest-simulator into test_sin…
heplesser Mar 30, 2022
f0610ea
Fixed merge error
heplesser Mar 30, 2022
eb5851f
Process spike receive buffer in a batchwise fashion
suku248 Mar 31, 2022
ce0ab45
Merge branch 'batchwise_processing_spike_receive_buffer' of github.co…
heplesser Mar 31, 2022
5e06435
Compressed spike mapping with fixed slot per spike.
heplesser Apr 1, 2022
c1883dd
Merge remote-tracking branch 'origin/single_batchwise' into hep_test_…
heplesser Apr 1, 2022
b4a1283
Fixed thread bugs.
heplesser Apr 1, 2022
2302845
More brute force serialization.
heplesser Jun 8, 2022
afb2ec7
Added SIONLIB omp barrier skip
JoseJVS Aug 4, 2022
f743f03
Rearranged deliver events and gather events.
JoseJVS Aug 24, 2022
cef6d8a
Moved deliver events conditional block.
JoseJVS Aug 24, 2022
02cee95
Shift event_delivery after updating nodes and gather_spikes
med-ayssar Sep 20, 2022
c7a1e50
Shift spikes timestemaps
med-ayssar Sep 27, 2022
79d4ebd
Merge branch 'debug_nest' of github.com:med-ayssar/nest-simulator int…
med-ayssar Sep 27, 2022
3c3caab
Put the delivery event back at the top of the main while-loop
med-ayssar Sep 27, 2022
98196e8
Substract lag from mindelay in deliver_event
med-ayssar Sep 28, 2022
09f5446
Fix static check error
med-ayssar Sep 28, 2022
d7052e0
Fix assertion error
med-ayssar Sep 28, 2022
0d20dfd
Fix static check error
med-ayssar Sep 28, 2022
daccac5
Clean implementation of time stamps on delivery.
heplesser Oct 2, 2022
d257e6c
Attempt to fix time stamps for secondary events.
heplesser Oct 2, 2022
f3b8033
Added minimal unit test for problems when using one-to-one connectivity.
heplesser Oct 3, 2022
f009e45
Converted one-to-one-multithreading test to pytest
heplesser Oct 4, 2022
9fc6d93
Update mip_corrdet test for delivery at beginning of slice
heplesser Oct 6, 2022
d077ac4
Attempt solution for multithreading problem
heplesser Oct 19, 2022
4e6d118
Revised compressed_spike_data_map_, removing load balancing over threads
heplesser Oct 19, 2022
2a67dc8
Fix formatting
heplesser Oct 19, 2022
c9114fb
Extended onetone_multithreaded test to spike_transmission test
heplesser Oct 20, 2022
17b8bf4
Fixed formatting
heplesser Oct 20, 2022
e7adff1
Added comment about fixed MPI buffer size for experimentation
heplesser Oct 21, 2022
8b4e7e3
Disabled all CI steps except MPI_ONLY for time of development
heplesser Oct 21, 2022
a25b26f
Test direct target buffer construction from compressed_spike_data; no…
heplesser Oct 23, 2022
caec3f8
Set 'more_targets' flag correctly. Some MPI problems, likely du to si…
heplesser Oct 23, 2022
2c612aa
Added blockwise communication of connectivity to new compressed schem…
heplesser Oct 24, 2022
ebffb39
Single-threaded MPI now working correctly
heplesser Oct 28, 2022
15aee17
Temporary fix to AssignedRanks to ensure correct MPI behavior
heplesser Oct 30, 2022
4a84375
Merging with main
med-ayssar Dec 14, 2022
7a36be2
Passing all pytests
med-ayssar Dec 14, 2022
5f702b7
Merge pull request #26 from med-ayssar/deliver_events_first_
heplesser Dec 15, 2022
0ff22e0
Re-added support for growing MPI spike buffers.
heplesser Dec 15, 2022
3aa6821
Remove spurious buffer resizing and unnecessary emitted flag
heplesser Dec 15, 2022
932c289
Fix end vs complete flag use in deliver_events
heplesser Dec 15, 2022
04aa3fc
Add comments on how spike gathering and delivery works.
heplesser Dec 16, 2022
1cac86b
Added shrinking of spike buffers
heplesser Dec 16, 2022
ff37082
Revised shrink/grow scheme
heplesser Dec 16, 2022
81a9025
Correctly place sw_comm_spikes.stop()
heplesser Dec 19, 2022
4625687
Merge branch 'master' into deliver_events_first
heplesser Dec 21, 2022
02965f1
Support SecondaryEvents in deliver_events_first branch
heplesser Feb 16, 2023
902d690
Fix communication of receive buffer position for 2ndry events with co…
heplesser Feb 18, 2023
c54b0a2
Update connection infrastructure correctly from GetConnections
heplesser Feb 19, 2023
622037b
Remove incorrect compressed_spike_data_map_ resizes
heplesser Feb 19, 2023
596f54e
Adapt rate-neuron input delay handling to deliver events first logic.
heplesser Feb 19, 2023
fe79f37
Introduce extra growth for spike buffers, improve comments; side effe…
heplesser Feb 20, 2023
92f467b
Revised scheme for exchanging buffer size information for resizing
heplesser Feb 20, 2023
055e577
Fix formatting
heplesser Feb 20, 2023
946b0b2
Merge branch 'master' into deliver_events_first
heplesser Feb 20, 2023
28c3a0f
Fix merge problem
heplesser Feb 20, 2023
5843e82
Remove unused return value
heplesser Feb 20, 2023
3d88c68
Add copy constructor and assignment op for OffGridSpikeData
heplesser Feb 20, 2023
58196ae
Change CSDMapEntry to bitfield
heplesser Feb 20, 2023
169bcb2
Activate off grid spikes
heplesser Feb 20, 2023
8311350
Collocate from offgrid register if needed
heplesser Feb 20, 2023
29291bc
Make test more focused and efficient
heplesser Feb 20, 2023
6ae042c
Adapt offgrid sending to three-dim register
heplesser Feb 21, 2023
896b2b1
Adjust test to deliver_events_first time logic
heplesser Feb 21, 2023
3daea1f
Fix delivery of both on-grid and off-grid spikes
heplesser Feb 21, 2023
eb2e595
Made buffer grow/shrink user configurable
heplesser Feb 21, 2023
08a0ade
Reactivate full set of github test
heplesser Feb 21, 2023
d2afe39
Fixed formatting
heplesser Feb 21, 2023
74c7341
Fix warnings arising from signedness
heplesser Feb 22, 2023
cd78045
Fix remaining compiler warnings
heplesser Feb 22, 2023
03845a5
Experimentally re-activate OpenMP for FullMacos CI testing
heplesser Feb 22, 2023
2d9fcfc
Revert "Experimentally re-activate OpenMP for FullMacos CI testing"
heplesser Feb 22, 2023
bbe5558
Fixed code formatting
heplesser Feb 23, 2023
1d6abad
Add debuggigng output
heplesser Feb 28, 2023
c7cbb3a
Add more debugging output
heplesser Mar 1, 2023
afd56bb
Adding target table dumping
heplesser Mar 1, 2023
49b6f48
Yet more debugging output
heplesser Mar 2, 2023
e5cbeee
Added debugging output for target exchange
heplesser Mar 27, 2023
d588fc4
Correct target exchange for compressed case
heplesser Mar 28, 2023
028851d
Disable debug output and tweak batch size for testing
heplesser Mar 28, 2023
9762c84
Deactivate source dumping
heplesser Mar 28, 2023
d08d903
Make full logging compile-time configurable
heplesser Mar 28, 2023
00b179f
Fixed C++ code formatting
heplesser Mar 28, 2023
84cb496
Set BATCH_SIZE back to 8
heplesser Apr 4, 2023
d5ad1c9
Fix comment; fix warning due to incorrect logging wrapping.
heplesser Apr 6, 2023
92f9b64
More careful macro-protection of full-logging-only code to avoid comp…
heplesser Apr 6, 2023
8803865
Remove variable never used and commented-out code.
heplesser Apr 6, 2023
d0a2016
Do not deliver events during first slice of simulation.
heplesser Apr 6, 2023
63a88ea
Add comment
heplesser Apr 6, 2023
f61cec3
Fix int/uint problem
heplesser Apr 7, 2023
f1e7c8e
Skip tests requiring multiple threads
heplesser Apr 10, 2023
79304fe
Skip also connect outdegree test when running without threads
heplesser Jul 3, 2023
d12e10d
Merging main into deliver_events_first.
heplesser Jul 3, 2023
50efeaf
Fixed data types
heplesser Jul 3, 2023
3cd6eaf
Ensure test_stdp_synapse reference data has not too many spikes; conv…
heplesser Jul 6, 2023
c204eec
Adjusted mip corrdet test to deliver events first scheme
heplesser Jul 6, 2023
2284546
Fix formatting issues
heplesser Jul 7, 2023
5bc829e
Restrict test_spike_transmission to single thread if NEST does not su…
heplesser Jul 7, 2023
3fe13c4
Removing AssignedRanks for spike buffers and simplifying SendBufferPo…
heplesser Aug 8, 2023
03615f9
Remove lag dimension from emitted_spikes_register
heplesser Aug 10, 2023
4a7333c
Merge branch 'master' into def_nolag_mrg
heplesser Aug 10, 2023
39c8c56
Merge branch 'master' into def_nolag_mrg
heplesser Sep 4, 2023
59dd535
Fix lcid and syn_id boundary checks; see also #2529 for suggested fol…
heplesser Sep 7, 2023
3251eee
Remove spurious thread id argument
heplesser Sep 7, 2023
b54e5c3
Remove debugging output
heplesser Sep 7, 2023
4d09fe3
Tidy up assertion
heplesser Sep 7, 2023
816e1ad
Remove sort_connections flag from tests
heplesser Sep 7, 2023
7ae78cc
Remove adaptive_spike_buffers flag and max_buffer_size_spike_data ker…
heplesser Sep 7, 2023
2a31d5c
Improve comment
heplesser Sep 7, 2023
bc4aaa1
Update PyNEST KernelAttributes wrt to removed and added kernel proper…
heplesser Sep 7, 2023
0acca52
Slight refactoring of deliver_events_() for greater clarity
heplesser Sep 7, 2023
d90d0e1
Reformatted C++ files
heplesser Sep 7, 2023
e1b9fee
Improve buffer resize logging
heplesser Sep 7, 2023
c977914
Better variable names
heplesser Sep 7, 2023
b1ccd18
Avoid technical problems in test
heplesser Sep 7, 2023
020efb5
Removed no-longer used methods and fields and improved comments
heplesser Sep 10, 2023
d518c4f
Add comments
heplesser Sep 10, 2023
85a6c51
Remove FULL_LOGGING calls
heplesser Sep 11, 2023
060ed1f
Correct placement of timers
heplesser Sep 11, 2023
59e301b
Fix buffer parameter names and stopwatch code location
heplesser Sep 11, 2023
87906ee
Set spike_buffer_shrink_limit to 0 to turn off shrinking by default
heplesser Sep 11, 2023
85c7312
Moved ResizeLog to separate file and rename BufferResizeLog
heplesser Sep 11, 2023
e909f1a
Apply suggestions from code review
heplesser Sep 11, 2023
afba22f
Merge branch 'def_nolag_mrg' of github.com:heplesser/nest-simulator i…
heplesser Sep 11, 2023
042b5b2
Update cmake/ProcessOptions.cmake
heplesser Sep 11, 2023
05ce424
Update libnestutil/config.h.in
heplesser Sep 11, 2023
44900f6
Update nestkernel/event_delivery_manager.h
heplesser Sep 11, 2023
87cfdc7
Update nestkernel/event_delivery_manager.h
heplesser Sep 11, 2023
073d3d7
Some improvements around FULL_LOGGING_ONLY
heplesser Sep 11, 2023
f92b21a
Merge branch 'def_nolag_mrg' of github.com:heplesser/nest-simulator i…
heplesser Sep 11, 2023
503b7de
Merge branch 'master' of github.com:nest/nest-simulator into def_nola…
heplesser Sep 11, 2023
f7a7af3
Fix C++ formatting
heplesser Sep 11, 2023
a92b995
Fix tests after merge
heplesser Sep 11, 2023
2a17bdd
Fix formatting
heplesser Sep 11, 2023
8b3488a
Update nestkernel/spike_data.h
heplesser Sep 11, 2023
c4c00fe
Fix formatting of kernel SLI documentation
heplesser Sep 13, 2023
bca6f28
Merge branch 'def_nolag_mrg' of github.com:heplesser/nest-simulator i…
heplesser Sep 13, 2023
e57b9a1
Fix typos
heplesser Sep 13, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,8 @@ set( with-defines OFF CACHE STRING "Additional defines, e.g. '-DXYZ=1' [default=
set( with-userdoc OFF CACHE STRING "Build user documentation [default=OFF]")
set( with-devdoc OFF CACHE STRING "Build developer documentation [default=OFF]")

set( with-full-logging OFF CACHE STRING "Write much debug output to rank-and-thread-specific log file [default=OFF]")
jougs marked this conversation as resolved.
Show resolved Hide resolved

################################################################################
################## Project Directory variables ##################
################################################################################
Expand Down Expand Up @@ -153,6 +155,7 @@ nest_process_with_hdf5()
nest_process_target_bits_split()
nest_process_userdoc()
nest_process_devdoc()
nest_process_full_logging()

nest_process_models()

Expand Down
8 changes: 7 additions & 1 deletion cmake/ProcessOptions.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -615,7 +615,6 @@ function( NEST_PROCESS_WITH_HDF5 )
set( HDF5_VERSION "${HDF5_VERSION}" PARENT_SCOPE )
set( HDF5_HL_LIBRARIES "${HDF5_HL_LIBRARIES}" PARENT_SCOPE )
set( HDF5_DEFINITIONS "${HDF5_DEFINITIONS}" PARENT_SCOPE )

include_directories( ${HDF5_INCLUDE_DIRS} )

endif ()
Expand Down Expand Up @@ -703,3 +702,10 @@ function( NEST_PROCESS_DEVDOC )
set( BUILD_DOCS ON PARENT_SCOPE )
endif ()
endfunction ()

function( NEST_PROCESS_FULL_LOGGING )
if ( with-full-logging )
message( STATUS "Configuring full logging" )
set( FULL_LOGGING ON PARENT_SCOPE )
heplesser marked this conversation as resolved.
Show resolved Hide resolved
endif ()
endfunction ()
11 changes: 11 additions & 0 deletions lib/sli/unittest.sli
Original file line number Diff line number Diff line change
Expand Up @@ -1289,6 +1289,10 @@ FirstVersion: 2012-05-22

SeeAlso: unittest::distributed_assert_or_die, unittest::distributed_collect_assert_or_die, nest_indirect, unittest::mpirun_self, unittest::assert_or_die
*/

/distributed_process_invariant_events_assert_or_die << /show_results false >> Options


/distributed_process_invariant_events_assert_or_die
[/arraytype /proceduretype]
{
Expand All @@ -1312,6 +1316,13 @@ SeeAlso: unittest::distributed_assert_or_die, unittest::distributed_collect_asse
} Map

% we now have array of arrays, should be identical
/distributed_process_invariant_events_assert_or_die /show_results GetOption
{
dup
{ == } forall
}
if

dup First /reference Set
Rest true exch { reference eq and } Fold
}
Expand Down
3 changes: 3 additions & 0 deletions libnestutil/config.h.in
Original file line number Diff line number Diff line change
Expand Up @@ -182,4 +182,7 @@
/* Whether to enable detailed NEST internal timers */
#cmakedefine TIMER_DETAILED 1

/* Whether to do full logging */
#cmakedefine FULL_LOGGING 1
heplesser marked this conversation as resolved.
Show resolved Hide resolved

#endif // #ifndef CONFIG_H
2 changes: 1 addition & 1 deletion models/music_event_out_proxy.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -217,7 +217,7 @@ nest::music_event_out_proxy::handle( SpikeEvent& e )
#pragma omp critical( insertevent )
{
#endif
for ( int i = 0; i < e.get_multiplicity(); ++i )
for ( size_t i = 0; i < e.get_multiplicity(); ++i )
{
V_.MP_->insertEvent( time, MUSIC::GlobalIndex( receiver_port ) );
}
Expand Down
2 changes: 1 addition & 1 deletion models/rate_neuron_ipn_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -434,7 +434,7 @@ void
nest::rate_neuron_ipn< TNonlinearities >::handle( DelayedRateConnectionEvent& e )
{
const double weight = e.get_weight();
const long delay = e.get_delay_steps();
const long delay = e.get_delay_steps() - kernel().connection_manager.get_min_delay();

size_t i = 0;
std::vector< unsigned int >::iterator it = e.begin();
Expand Down
2 changes: 1 addition & 1 deletion models/rate_neuron_opn_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -407,7 +407,7 @@ void
nest::rate_neuron_opn< TNonlinearities >::handle( DelayedRateConnectionEvent& e )
{
const double weight = e.get_weight();
const long delay = e.get_delay_steps();
const long delay = e.get_delay_steps() - kernel().connection_manager.get_min_delay();

size_t i = 0;
std::vector< unsigned int >::iterator it = e.begin();
Expand Down
2 changes: 1 addition & 1 deletion models/rate_transformer_node_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -288,7 +288,7 @@ void
nest::rate_transformer_node< TNonlinearities >::handle( DelayedRateConnectionEvent& e )
{
const double weight = e.get_weight();
const long delay = e.get_delay_steps();
const long delay = e.get_delay_steps() - kernel().connection_manager.get_min_delay();

size_t i = 0;
std::vector< unsigned int >::iterator it = e.begin();
Expand Down
2 changes: 1 addition & 1 deletion models/spike_recorder.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@ nest::spike_recorder::handle( SpikeEvent& e )
{
assert( e.get_multiplicity() > 0 );

for ( int i = 0; i < e.get_multiplicity(); ++i )
for ( size_t i = 0; i < e.get_multiplicity(); ++i )
{
write( e, RecordingBackend::NO_DOUBLE_VALUES, RecordingBackend::NO_LONG_VALUES );
}
Expand Down
2 changes: 1 addition & 1 deletion nestkernel/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@ set ( nestkernel_sources
target_table.h target_table.cpp
target_table_devices.h target_table_devices.cpp target_table_devices_impl.h
target.h target_data.h static_assert.h
send_buffer_position.h
send_buffer_position.h send_buffer_position.cpp
source.h
source_table.h source_table.cpp
source_table_position.h
Expand Down
167 changes: 139 additions & 28 deletions nestkernel/connection_manager.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,6 @@ nest::ConnectionManager::ConnectionManager()
, keep_source_table_( true )
, connections_have_changed_( false )
, get_connections_has_been_called_( false )
, sort_connections_by_source_( true )
, use_compressed_spikes_( true )
, has_primary_connections_( false )
, check_primary_connections_()
Expand All @@ -96,7 +95,6 @@ nest::ConnectionManager::initialize()
const size_t num_threads = kernel().vp_manager.get_num_threads();
connections_.resize( num_threads );
secondary_recv_buffer_pos_.resize( num_threads );
sort_connections_by_source_ = true;
keep_source_table_ = true;
connections_have_changed_ = false;
get_connections_has_been_called_ = false;
Expand Down Expand Up @@ -167,19 +165,7 @@ nest::ConnectionManager::set_status( const DictionaryDatum& d )
"to false." );
}

updateValue< bool >( d, names::sort_connections_by_source, sort_connections_by_source_ );
if ( not sort_connections_by_source_ and kernel().sp_manager.is_structural_plasticity_enabled() )
{
throw KernelException(
"If structural plasticity is enabled, sort_connections_by_source can not "
"be set to false." );
}

updateValue< bool >( d, names::use_compressed_spikes, use_compressed_spikes_ );
if ( use_compressed_spikes_ and not sort_connections_by_source_ )
{
throw KernelException( "Spike compression requires sort_connections_by_source to be true." );
}

// Need to update the saved values if we have changed the delay bounds.
if ( d->known( names::min_delay ) or d->known( names::max_delay ) )
Expand All @@ -204,7 +190,6 @@ nest::ConnectionManager::get_status( DictionaryDatum& dict )
const size_t n = get_num_connections();
def< long >( dict, names::num_connections, n );
def< bool >( dict, names::keep_source_table, keep_source_table_ );
def< bool >( dict, names::sort_connections_by_source, sort_connections_by_source_ );
def< bool >( dict, names::use_compressed_spikes, use_compressed_spikes_ );

def< double >( dict, names::time_construction_connect, sw_construction_connect.elapsed() );
Expand Down Expand Up @@ -843,12 +828,13 @@ nest::ConnectionManager::increase_connection_count( const size_t tid, const syni
num_connections_[ tid ].resize( syn_id + 1 );
}
++num_connections_[ tid ][ syn_id ];
if ( num_connections_[ tid ][ syn_id ] >= MAX_LCID )
if ( num_connections_[ tid ][ syn_id ] > MAX_LCID - 1 )
{
// MAX_LCID is used as invalid marker an can therefore not be used as a proper value
throw KernelException(
String::compose( "Too many connections: at most %1 connections supported per virtual "
"process and synapse model.",
MAX_LCID ) );
MAX_LCID - 1 ) );
}
}

Expand Down Expand Up @@ -1329,7 +1315,7 @@ void
nest::ConnectionManager::sort_connections( const size_t tid )
{
assert( not source_table_.is_cleared() );
if ( sort_connections_by_source_ )
if ( use_compressed_spikes_ )
{
for ( synindex syn_id = 0; syn_id < connections_[ tid ].size(); ++syn_id )
{
Expand Down Expand Up @@ -1365,14 +1351,7 @@ nest::ConnectionManager::compute_target_data_buffer_size()
const size_t min_num_target_data = 2 * kernel().mpi_manager.get_num_processes();

// Adjust target data buffers accordingly
if ( min_num_target_data < max_num_target_data )
{
kernel().mpi_manager.set_buffer_size_target_data( max_num_target_data );
}
else
{
kernel().mpi_manager.set_buffer_size_target_data( min_num_target_data );
}
kernel().mpi_manager.set_buffer_size_target_data( std::max( min_num_target_data, max_num_target_data ) );
heplesser marked this conversation as resolved.
Show resolved Hide resolved
}

void
Expand Down Expand Up @@ -1545,7 +1524,8 @@ nest::ConnectionManager::deliver_secondary_events( const size_t tid,
std::vector< unsigned int >& recv_buffer )
{
const std::vector< ConnectorModel* >& cm = kernel().model_manager.get_connection_models( tid );
const Time stamp = kernel().simulation_manager.get_slice_origin() + Time::step( 1 );
const Time stamp =
kernel().simulation_manager.get_slice_origin() + Time::step( 1 - kernel().connection_manager.get_min_delay() );
const std::vector< std::vector< size_t > >& positions_tid = secondary_recv_buffer_pos_[ tid ];

const synindex syn_id_end = positions_tid.size();
Expand Down Expand Up @@ -1670,7 +1650,6 @@ nest::ConnectionManager::collect_compressed_spike_data( const size_t tid )
{
if ( use_compressed_spikes_ )
{
assert( sort_connections_by_source_ );

#pragma omp single
{
Expand All @@ -1685,3 +1664,135 @@ nest::ConnectionManager::collect_compressed_spike_data( const size_t tid )
} // of omp single; implicit barrier
}
}

bool
nest::ConnectionManager::fill_target_buffer( const size_t tid,
const size_t rank_start,
const size_t rank_end,
std::vector< TargetData >& send_buffer_target_data,
TargetSendBufferPosition& send_buffer_position )
{
// At this point, NEST has at least one synapse type (because we can only get here if at least
// one connection has been created) and we know that iteration_state_ for each thread
// contains a valid entry.
const auto& csd_maps = source_table_.compressed_spike_data_map_;
auto syn_id = iteration_state_.at( tid ).first;
auto source_2_idx = iteration_state_.at( tid ).second;

if ( syn_id >= csd_maps.size() )
{
return true; // this thread has previously written all its targets
}

do
{
const auto& conn_model = kernel().model_manager.get_connection_model( syn_id, tid );
const bool is_primary = conn_model.has_property( ConnectionModelProperties::IS_PRIMARY );

while ( source_2_idx != csd_maps.at( syn_id ).end() )
{
const auto source_gid = source_2_idx->first;
const auto source_rank = kernel().mpi_manager.get_process_id_of_node_id( source_gid );
if ( not( rank_start <= source_rank and source_rank < rank_end ) )
{
// We are not responsible for this source.
++source_2_idx;
continue;
}

if ( send_buffer_position.is_chunk_filled( source_rank ) )
{
// When the we have filled the buffer space for one rank, we stop. If we continued for other ranks,
// we would need to introduce "processed" markers to avoid multiple insertion (similar to base case).
// Since sources should be evenly distributed, this should not matter very much.
//
// We store where we need to continue and stop iteration for now.
iteration_state_.at( tid ) =
std::pair< size_t, std::map< size_t, CSDMapEntry >::const_iterator >( syn_id, source_2_idx );

return false; // there is data left to communicate
}

TargetData next_target_data;
next_target_data.set_is_primary( is_primary );
next_target_data.reset_marker();
next_target_data.set_source_tid(
kernel().vp_manager.vp_to_thread( kernel().vp_manager.node_id_to_vp( source_gid ) ) );
next_target_data.set_source_lid( kernel().vp_manager.node_id_to_lid( source_gid ) );

if ( is_primary )
{
TargetDataFields& target_fields = next_target_data.target_data;
target_fields.set_syn_id( syn_id );
target_fields.set_tid( 0 ); // meaningless, use 0 as fill
target_fields.set_lcid( source_2_idx->second.get_source_index() );
}
else
{
const auto target_thread = source_2_idx->second.get_target_thread();
const SpikeData& conn_info =
compressed_spike_data_[ syn_id ][ source_2_idx->second.get_source_index() ][ target_thread ];
assert( target_thread == static_cast< unsigned long >( conn_info.get_tid() ) );
const size_t relative_recv_buffer_pos =
get_secondary_recv_buffer_position( target_thread, syn_id, conn_info.get_lcid() )
- kernel().mpi_manager.get_recv_displacement_secondary_events_in_int( source_rank );

SecondaryTargetDataFields& secondary_fields = next_target_data.secondary_data;
secondary_fields.set_recv_buffer_pos( relative_recv_buffer_pos );
secondary_fields.set_syn_id( syn_id );
}

send_buffer_target_data.at( send_buffer_position.idx( source_rank ) ) = next_target_data;
send_buffer_position.increase( source_rank );

++source_2_idx;
} // end while

++syn_id;
if ( syn_id < csd_maps.size() )
{
source_2_idx = csd_maps.at( syn_id ).begin();
}
} while ( syn_id < csd_maps.size() );

// Store iteration state for this thread. If we get here, ther is nothing more to do for
// this thread so we store a non-existing syn_id with a meaningless iterator to inform that
// this thread has nothing to do in the next round.
iteration_state_.at( tid ) =
std::pair< size_t, std::map< size_t, CSDMapEntry >::const_iterator >( syn_id, source_2_idx );

// Mark end of data for this round
for ( size_t rank = rank_start; rank < rank_end; ++rank )
{
if ( send_buffer_position.idx( rank ) > send_buffer_position.begin( rank ) )
{
// We have written data for the rank, mark last written entry with END marker
send_buffer_target_data.at( send_buffer_position.idx( rank ) - 1 ).set_end_marker();
}
else
{
// We have not written anything, mark beginning of chunk with INVALID marker
send_buffer_target_data.at( send_buffer_position.begin( rank ) ).set_invalid_marker();
}
}

// If we get here, this thread has written everything.
return true;
}

void
nest::ConnectionManager::initialize_iteration_state()
{
const size_t num_threads = kernel().vp_manager.get_num_threads();
iteration_state_.clear();
iteration_state_.reserve( num_threads );

// This method only runs if at least one connection has been created,
// so we must have at least one synapse model and we can start iteration
// at the beginning of its compressed spike data map.
auto begin = source_table_.compressed_spike_data_map_.at( 0 ).cbegin();
for ( size_t t = 0; t < num_threads; ++t )
{
iteration_state_.push_back( std::pair< size_t, std::map< size_t, CSDMapEntry >::const_iterator >( 0, begin ) );
}
}
Loading