Inference Engine Sequence Discussion

denishaskin edited this page Jan 20, 2015 · 4 revisions

The following is a sequence-based walkthrough of the Inference Engine, abbreviated as IE in the rest of this page. An accompanying diagram is a TODO.

As mentioned in the Inference Engine page, the IE is an implementation of a particle filter. If you are new to particle filters, please take a moment to view this YouTube video for a non-mathematical introduction.

PartitionedInputQueueListenerTask

The entry point into the IE is currently PartitionedInputQueueListenerTask.processMessage(...). This class is responsible for subscribing to the queue and deserializing data off of it. The other important aspect of this class is its implementation of partitioning. The IE may not be able to process the entire data set of an agency, and because its implementation is stateful, it does not lend itself well to load balancing. The solution is to divide (partition) the dataset by vehicle: a vehicle is only processed if it belongs to a depot that has been assigned to the configured IE. See acceptMessage(...) for further details of the implementation.
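
The depot check can be sketched roughly as follows. The class name, constructor, and the shape of acceptMessage(...) here are illustrative assumptions, not the actual OneBusAway implementation:

```java
import java.util.Set;

// Hypothetical sketch of depot-based partitioning: this IE instance is
// configured with a set of depots, and a message is accepted only if the
// vehicle's depot is in that set.
class DepotPartition {
    private final Set<String> assignedDepots;

    DepotPartition(Set<String> assignedDepots) {
        this.assignedDepots = assignedDepots;
    }

    // Accept a message only if the vehicle belongs to a depot assigned
    // to this IE instance; everything else is another partition's work.
    boolean acceptMessage(String vehicleDepot) {
        return vehicleDepot != null && assignedDepots.contains(vehicleDepot);
    }
}
```

Because the check is a simple set membership test per message, multiple stateful IE instances can run side by side without sharing any state.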

VehicleLocationService

The above processMessage(...) calls into VehicleLocationService.handleRealtimeEnvelopeRecord(...), which takes the deserialized data structure of the queue message and translates it into a NycRawLocationRecord. Although NycRawLocationRecord is rather poorly named, it is a key data model, allowing observation data to flow into the IE. VehicleLocationService provides the IE capabilities as a service, with handleRealtimeEnvelopeRecord(...) taking care of the necessary synchronization around an internal thread pool. The synchronization concern is this: vehicles are independent of one another, so they simply compete for server resources (the thread pool). However, with realtime data you cannot guarantee that messages are spaced such that a vehicle will not compete against itself -- hence synchronization is also needed at the vehicle level. The size of the thread pool is calculated as a function of the CPUs available, and has been tuned based on real-world testing.
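
The two-level scheme above -- a shared, CPU-sized pool plus a per-vehicle lock -- can be sketched like this. The class and method names are invented for illustration, and the real service does not block the caller the way this simplified version does:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: vehicles compete for a fixed pool sized from the
// available CPUs, while a per-vehicle lock guarantees that two updates for
// the same vehicle never run concurrently.
class VehicleDispatcher {
    private final ExecutorService pool =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    private final Map<String, Object> vehicleLocks = new ConcurrentHashMap<>();

    // Runs the update on the pool, serialized per vehicle. Unlike the real
    // service, this sketch waits for completion so the effect is observable.
    void process(String vehicleId, Runnable update) {
        Object lock = vehicleLocks.computeIfAbsent(vehicleId, id -> new Object());
        try {
            pool.submit(() -> {
                synchronized (lock) { // vehicle-level synchronization
                    update.run();
                }
            }).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```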

Once an instance of VehicleInferenceInstance has been secured, it is passed to a work thread named ProcessingTask which runs in the thread pool.

ProcessingTask

ProcessingTask contains an instance of VehicleInferenceInstance and updates it with the latest observation from the NycRawLocationRecord. The methods that query the state of the VehicleInferenceInstance are not thread safe, so handleUpdate synchronizes the smallest possible block of contended data. Once complete, it returns a NycQueuedInferredLocationBean, the second key data model inside the IE. Where NycRawLocationRecord populates the input side of the observation, the equally poorly named NycQueuedInferredLocationBean represents the output of the IE as it will be placed on the queue. Returning this data completes the PartitionedInputQueueListenerTask.processMessage(...) method.
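
The "smallest possible block" idea can be illustrated with a toy example (the class, fields, and rounding step are invented stand-ins, not the real handleUpdate): do all per-message work outside the lock, and only guard the touch on shared state.

```java
// Illustrative sketch: expensive, thread-local preprocessing happens
// outside the lock; only the read/write of shared state is synchronized.
class ProcessingSketch {
    private final Object stateLock = new Object();
    private double lastLat, lastLon; // stand-in for shared inferred state

    double[] handleUpdate(double lat, double lon) {
        // Per-message work outside the contended section
        // (here, just rounding to six decimal places).
        double snappedLat = Math.round(lat * 1e6) / 1e6;
        double snappedLon = Math.round(lon * 1e6) / 1e6;
        synchronized (stateLock) { // smallest possible contended block
            lastLat = snappedLat;
            lastLon = snappedLon;
            return new double[] { lastLat, lastLon };
        }
    }
}
```

Keeping the synchronized section this small is what lets queries against the instance interleave with updates without long stalls.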

VehicleInferenceInstance

VehicleInferenceInstance is colloquially known as the "wrapper" for the IE. It does a large amount of setup to create an Observation that the particle filter requires, and calls directly into ParticleFilter.updateFilter(...) with that Observation.

Observation

An Observation wraps the NycRawLocationRecord along with additional state about the observation. It is just a container that holds the raw record plus several state variables (all set externally), as well as references to the previous Observation, the RunResults, and a comparator.
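
A stripped-down illustration of that container shape (field names here are placeholders, and the real Observation carries considerably more state):

```java
// Minimal sketch of the Observation pattern: an immutable wrapper around
// the raw record, externally supplied state, and a link to the previous
// Observation in the sequence.
class ObservationSketch {
    final String rawRecordId;         // stands in for the NycRawLocationRecord
    final long timestamp;             // externally set state variable
    final ObservationSketch previous; // reference to the prior Observation

    ObservationSketch(String rawRecordId, long timestamp, ObservationSketch previous) {
        this.rawRecordId = rawRecordId;
        this.timestamp = timestamp;
        this.previous = previous;
    }
}
```

The backward link is what lets the motion model reason about change since the last observation without any other lookup.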

BlockStateObservation

A BlockStateObservation wraps an Observation and a BlockState. From the doc: Specifically, it contains information about the BlockState that is conditional on when it was observed.

ParticleFilter

ParticleFilter.runSingleStep(...) delegates to updateFilter(...), which updates the particle filter based on the single observation and then potentially resamples based on that observation -- potentially, because some particles are dropped for performance reasons. runSingleStep(...) invokes MotionModel.move(...), which incorporates the current observation into the existing state of the IE, and then invokes computeBestState(...). computeBestState(...) computes the probability of each particle and selects the most likely state for return.
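
In spirit, computeBestState(...) is an argmax over the weighted particle set. This toy version assumes a much simpler particle shape than the production code:

```java
import java.util.List;

// Toy illustration of best-state selection: each particle carries a state
// and a weight (its probability), and the heaviest particle wins.
class BestStateSketch {
    static class Particle {
        final String state;
        final double weight;
        Particle(String state, double weight) {
            this.state = state;
            this.weight = weight;
        }
    }

    // Returns the state of the highest-weight particle, or null if empty.
    static String computeBestState(List<Particle> particles) {
        Particle best = null;
        for (Particle p : particles) {
            if (best == null || p.weight > best.weight) best = p;
        }
        return best == null ? null : best.state;
    }
}
```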

MotionModel

MotionModel is an implementation of a model of the bus's behaviour based on the dimensions, or measures, listed below. This is achieved by calculating the probability of each Likelihood and summing those probabilities to create the overall probability for a particle. That particle then becomes the parent of the new particles created by sampling from it; thus the term "parent", used throughout the IE, refers to the instance of a particle before the current observation.
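
One common way to combine per-Likelihood probabilities into a single particle weight is to work in log space, where a product of probabilities becomes a sum -- which is one plausible reading of the "summing" above, though that is an assumption here rather than a description of the production code:

```java
// Sketch of combining per-Likelihood probabilities into one particle weight
// via a log-space sum (equivalent to multiplying the probabilities).
class WeightSketch {
    static double combine(double... probabilities) {
        double logSum = 0.0;
        for (double p : probabilities) {
            // An impossible measurement (p == 0) drives the sum to -infinity,
            // zeroing the weight -- the particle can then be dropped.
            logSum += Math.log(p);
        }
        return Math.exp(logSum); // back to a plain probability
    }
}
```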

Likelihood

Likelihoods represent 9 measurements about the vehicle. A Likelihood is a model of the behaviour of a particular measurement as it applies to a particle's weight. Some are really simple, like MovedLikelihood; others are more complex, like SchedLikelihood.

Likelihoods rarely deal in absolute truth; all rules are formulated so that the particle set does not completely converge, because you want multiple possibilities to remain under consideration. That said, Likelihoods do often consider a measurement impossible, in which case the particle can be dropped for performance reasons. The weightings of these probabilities are adjusted by hand based on experience, observation, and the established Integration Test Traces.
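
The shape of such a rule might look like the following mini-Likelihood in the spirit of MovedLikelihood. The thresholds and return values are entirely invented for illustration; the point is that "possible" outcomes never get a probability of exactly 1 or 0 (keeping hypotheses alive), while an outright impossibility returns 0 so the particle can be dropped:

```java
// Hypothetical mini-Likelihood: maps a movement measurement to a
// probability contribution for a particle's weight.
class MovedLikelihoodSketch {
    static double probability(double metersMoved, double secondsElapsed) {
        double speed = secondsElapsed > 0 ? metersMoved / secondsElapsed : 0.0;
        if (speed > 60.0) return 0.0; // faster than any bus: impossible, drop
        if (speed > 1.0)  return 0.9; // clearly moving: likely, but not certain
        return 0.4;                   // ambiguous: keep the hypothesis alive
    }
}
```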

If you have read to this point and are still interested, look into the implementation of the Likelihoods, and the services responsible for sampling choices. It is in these details of the overall motion model that the accuracy of the system lives.
