Multiple connections for boards with multiple ethernet connected chips #142
Comments
There is of course the issue of how you parallelise all your reads/writes (i.e. using multiple Ethernets at once) with a nice-ish API... Asyncio-type approaches seem to be the main candidate here but slooowww! That aside, you may find the following functions from the defunct asyncio branch useful for getting the set of candidate ethernet-connected chips: Lines 176 to 254 in b666e80
When I was playing with this, I built the following machine discovery implementation: rig/rig/machine_control/machine_controller.py Lines 231 to 327 in b666e80
If you're worrying about IP address and ethernet connectivity stuff, this code should have you covered. Note the
Awesome, thank you!
On my "to test" list is making
Following from a recent offline discussion with @mundya about speeding up I/O for large machines using multiple Ethernet connections, the following conclusions were reached:
Possible options:
Clearly all this requires some new centralised state:
This is going to prove faffy to get right...
The following is an informal design document for multi-connection support. Feedback highly encouraged.

High-Throughput IO Plans

Getting data in and out of SpiNNaker machines with high throughput is very

How many connections?

Though ideally Rig should be able to drive all the Ethernet connections to an
I'm not sure which of these two issues will strike first. Initially I'll hope

Which connection should be used?

I would suggest that loading is always performed via the Ethernet connection on
Option 1 is very easy but may lead to significant load imbalance: in principle

Option 2 in principle is ideal but in practice is expensive since it requires

Option 3 and 4 potentially work reasonably well and probably differ mainly in

Overall, option 3 is probably the easiest to implement and stands the best

How should multiple connections be used?

In principle it could be done with a single select statement and some clever

If this were C, the select option would be the way to go but since this is

I propose having a thread pool with each thread allocated a particular SCP

As for how to select the number of threads; initially this can just be fixed

How should data be supplied?

Two obvious possibilities exist:
The former is potentially the easiest to implement and has the cleanest API.

The latter is much more fiddly to implement but obviously is more scalable since

I think due to safety and scalability I'll regrettably have to go with the

How should errors be treated?

For a large portion of applications, a single error during read/write is

Sadly, for larger-scale applications, a partial failure may not be as bad as

The correct solution here is probably to have two alternative modes of

How should the API look?

Files on Linux are famously poor at dealing with multiple asynchronous

AsyncIO/Trollius could be used here however it has famously dreadful

An idea:
The above works nicely for upgrading existing applications where you can afford
With this modified API, callback style usage could be achieved at the expense

By default, if an error occurs (including in a callback function) the end_burst

Alternatively, end_burst could accumulate exceptions and tracebacks, continuing

What will happen to the internals?
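The thread-pool proposal above (one thread per SCP connection) could be sketched roughly as follows. This is only an illustration under stated assumptions: `ConnectionWorker`, `submit`, `join_jobs`, `stop` and the `send` callable are hypothetical names, not part of Rig, and `send` stands in for a blocking SCP write on a single connection. The chip co-ordinates and address are arbitrary example values.

```python
import queue
import threading


class ConnectionWorker(threading.Thread):
    """Hypothetical worker thread that owns exactly one connection."""

    def __init__(self, send):
        super().__init__(daemon=True)
        self._send = send           # stand-in for a blocking SCP write
        self._jobs = queue.Queue()  # work destined for this connection only
        self.errors = []            # exceptions raised by send()

    def submit(self, address, data):
        """Queue a write; returns immediately."""
        self._jobs.put((address, data))

    def join_jobs(self):
        """Block until every queued job has been processed."""
        self._jobs.join()

    def stop(self):
        """Ask the thread to exit once earlier jobs are done."""
        self._jobs.put(None)

    def run(self):
        while True:
            job = self._jobs.get()
            try:
                if job is None:  # sentinel queued by stop()
                    return
                self._send(*job)
            except Exception as exc:
                self.errors.append(exc)
            finally:
                self._jobs.task_done()


# Demo with fake connections that just record what they were asked to send.
written = {}
lock = threading.Lock()


def make_fake_send(name):
    def send(address, data):
        with lock:
            written.setdefault(name, []).append((address, data))
    return send


workers = {chip: ConnectionWorker(make_fake_send(chip))
           for chip in [(0, 0), (8, 4)]}
for worker in workers.values():
    worker.start()

workers[(0, 0)].submit(0x60000000, b"abc")
workers[(8, 4)].submit(0x60000000, b"def")

for worker in workers.values():
    worker.join_jobs()
    worker.stop()
```

Giving each thread its own queue means a connection is only ever touched by one thread, so the connection objects themselves would not need to be thread-safe.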
An alternative API idea:
This alternative has the following advantages:
I like this, with the modification that the additional argument to

We can add

Thoughts? The advantages are the same as above but with fewer API changes (and with complete backwards compatibility).
What would your use-case for the callback function be? My understanding of most loads is that a sync/flush function is the simplest thing which could possibly work and should result in minimal changes to existing code. Certainly much simpler than callbacks etc. for this style of application. My reluctance to expose a callback is that the "write and flush/sync later" API does not imply any sort of asynchronous programming mechanism while a callback does, and I don't have a good feeling about that given Python's lack of a good de facto asynchronous programming mechanism.
Which callback? The read callback I can see being particularly handy (over, say

The write callback was mostly for symmetry over just saying

[I guess
Indeed, but if we provide the callback we are introducing an implied concurrency model to the user's application. This is not a step I'd take too lightly and as mentioned there is no clear "good solution" here. I can't think of an application where you'd want to be notified when some (but not all) of your reads have completed -- what sort of thing are you thinking of?
Fair, though my plan was to have exceptions in reads/writes pop out of the sync function to prevent silent errors in most apps. Having to write a NOP function is also vaguely irritating ;).
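To make the "write and flush/sync later" idea concrete, here is a minimal sketch in which failures surface from `sync()` rather than being lost silently. Everything here is an assumption for illustration: `DeferredWriter`, `write`, `sync` and `fake_write` are hypothetical names rather than Rig's API, and the queued writes are simply replayed in order, where a real implementation would presumably hand them to background connections.

```python
class DeferredWriter:
    """Hypothetical 'write now, sync later' wrapper; not Rig's actual API."""

    def __init__(self, write_fn):
        self._write_fn = write_fn  # callable performing one blocking write
        self._pending = []         # writes queued since the last sync()

    def write(self, address, data):
        """Queue a write and return immediately."""
        self._pending.append((address, bytes(data)))

    def sync(self):
        """Perform all queued writes, then raise if any of them failed."""
        pending, self._pending = self._pending, []
        errors = []
        for address, data in pending:
            try:
                self._write_fn(address, data)
            except Exception as exc:
                errors.append((address, exc))
        if errors:
            raise IOError("%d write(s) failed: %r" % (len(errors), errors))


# Demo against a fake memory that rejects negative addresses.
memory = {}


def fake_write(address, data):
    if address < 0:
        raise ValueError("bad address")
    memory[address] = data


writer = DeferredWriter(fake_write)
writer.write(0x100, b"hello")
writer.write(-1, b"oops")

try:
    writer.sync()
    sync_failed = False
except IOError:
    sync_failed = True
```

Existing code only needs an extra `sync()` call at the end, and no callback or NOP function is required to find out that something went wrong.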
Not immediately required (it would be nice to have for Canada, possibly) but definitely necessary for Spaun.
As far as I can tell, the IP address of a chip can be read from the `sv` struct (`ip_addr`) and is, I assume, formatted the same way as IP addresses for IP tags (i.e., `!4B`, bewaring that the default appears to be 255.255.0.0).

The real challenge is in determining the co-ordinates of likely chips, though I guess a safe approach would be to always jump 8* (north or east) from the last found ethernet chip and then read `eth_addr` to get the co-ordinates of the nearest ethernet connected chip.

I initially thought that this process should be part of `get_machine`... but now I'm not so sure.

Once we have a map of co-ordinates to ethernet addresses it becomes a drop in replacement to `MachineController.get_connection`, though we should also expose the mapping so that other software (like Nengo) can transmit SDP to SpiNNaker efficiently.

* or 7?
Also: "bewaring" is an awesome word.
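For illustration, decoding the four address bytes and choosing a connection from a co-ordinate map might look something like the sketch below. It assumes, as guessed above, that `ip_addr` is four bytes unpackable with `!4B`. `decode_ip` and `pick_connection` are hypothetical helpers, and the connection values are placeholder strings rather than real connection objects.

```python
import struct


def decode_ip(raw):
    """Turn four raw bytes (unpacked as '!4B') into a dotted-quad string."""
    return ".".join(str(octet) for octet in struct.unpack("!4B", raw))


def pick_connection(connections, nearest_eth, default=(0, 0)):
    """Return the connection for the given ethernet chip.

    `connections` maps (x, y) of ethernet-connected chips to connections.
    Falls back to the `default` chip when no connection is known.
    """
    if nearest_eth in connections:
        return connections[nearest_eth]
    return connections[default]


# The apparent default value mentioned above.
default_ip = decode_ip(b"\xff\xff\x00\x00")

# Placeholder connection objects keyed by ethernet chip co-ordinates.
connections = {(0, 0): "conn-a", (8, 4): "conn-b"}
chosen = pick_connection(connections, (8, 4))
fallback = pick_connection(connections, (4, 8))
```

Exposing the `connections` mapping itself, as well as the lookup, would let other software send SDP to the nearest ethernet chip directly.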