The current BOA model simplifies the Rubicon model into distinct stages of goal-selection vs. goal-engaged pursuit, without considering the need for various updates in the goal-engaged state. This is also reflected in the trace-based learning rules in BLA and BG needed to bridge the temporal span of the goal-engaged state: the BLA trace currently updates primarily at the time of CS gating (as a function of ACh), meaning that changes in sensory input are not tracked at all. This is a problem for eboa, as reflected in the following two key cases:
During initial curiosity-driven learning, the model will approach anything, and > 50% of the time the CS in fixation at the time of initial gating differs from the CS in fixation at the time of the US. It really needs to learn about the US state in this case, because the initial gating was basically random exploration. It is not necessary to "abandon" the curiosity goal to learn about the actual outcome -- that was the whole point of the curiosity goal in the first place.
After the BLA has learned the CS -> US associations, then if the initial CS changes over the course of approach (e.g., due to getting a better, closer view), that is an opportunity to re-evaluate the goal. If the new CS is not actually desired, then it should abandon and re-explore to find a new one. If it is desired, updating to new expectations might be in order. This case is distinct from the curiosity case, because the proper accounting relative to the initial goal must be handled: you need to abandon old then update to new.
So, how can we manage these two cases (among other possible scenarios) within a consistent set of mechanisms?
First, we need to keep in mind the proposed hemispheric primary / secondary goal framework: the non-dominant hemisphere can track changes in non-selected options, and if that ends up being better than the current dominant engaged goal, the usual hemispheric competition dynamic can drive an update.
Even prior to implementing that mechanism, there are possible ACh-level mechanisms that might help achieve these conflicting cases:
CS novelty ACh should be modulated by BLA activity, not just raw stimulus diffs. If a new CS drives a new BLA US association, that should be more significant (higher ACh) than just a novel CS with no such US assoc.
Thus, basic CS novelty can drive a level of ACh sufficient to engage BG gating under a curiosity drive, but, with MaintInhib of ACh (BOA: shift balance on SC novelty / ACh more to MaintInhib #236), it is not enough to disrupt engaged goal pursuit. Further, this lower initial ACh level leaves "room" for more US-time learning of CS in the synaptic traces.
Known CS -> US onset drives a higher level of ACh that more strongly anchors learning to the initial gating conditions, and more strongly disrupts ongoing goal pursuit, enabling consideration / gating to update to the new CS.
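The three points above can be sketched as a single graded ACh function. This is only an illustrative sketch, not the BOA / axon implementation: the names `AChDrive`, `ach_level`, the gain values, and the multiplicative form of the inhibition are all assumptions chosen to make the qualitative ordering visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AChDrive:
    """Hypothetical gains; none of these names come from the BOA code."""
    novelty_gain: float = 0.4      # ACh from raw CS novelty alone
    bla_gain: float = 0.6          # extra ACh when the novel CS drives a BLA US association
    maint_inhib_gain: float = 0.7  # assumed < 1, so goal maintenance only partially inhibits ACh


def ach_level(novelty: float, bla_us: float, maint: float,
              p: AChDrive = AChDrive()) -> float:
    """Graded ACh in [0, 1]; all inputs are assumed normalized to [0, 1]."""
    # Novelty is scaled up by how strongly the new CS activates a BLA US association.
    drive = novelty * (p.novelty_gain + p.bla_gain * bla_us)
    # MaintInhib scales the drive down while a goal is engaged (maint > 0).
    inhibited = drive * (1.0 - p.maint_inhib_gain * maint)
    return min(1.0, max(0.0, inhibited))


# (novelty, bla_us, maint) for the cases discussed above.
cases = {
    "novel CS, no US assoc, no goal engaged": (1.0, 0.0, 0.0),
    "novel CS, no US assoc, goal engaged": (1.0, 0.0, 1.0),
    "known CS -> US, goal engaged": (1.0, 1.0, 1.0),
}
for name, args in cases.items():
    print(f"{name}: ACh = {ach_level(*args):.2f}")
```

With these assumed gains, a merely novel CS yields less ACh during engaged pursuit than a CS with a known US association does, which is the ordering the bullets call for.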
To explore this space, we need the following new mechanisms:
ACh mod needs min, range params for defining the sensitivity to ACh levels, for each case (gating, learning).
ACh will be more graded overall as a function of novelty, CS-US activation in BLA, and inhibition -- MaintInhib cannot fully suppress, so in-flight updates can happen. Need to manage dynamic range effectively. Kind of a PITA but necessary.