-
Notifications
You must be signed in to change notification settings - Fork 69
PTMapper algorithm and config parameters
To map the stop sequences of a public transit route to a network, a reference link for each stop and the path between those referenced links have to be defined. To generate the paths, the following input data is usually available for each transit route: stop sequence, stops with coordinates and a network as a directed graph.
The mapping algorithm in pt2matsim looks at the whole route to define the best reference link for each stop. It is also possible that more than one optimal link can be found for a stop. The algorithm calculates the least cost path from the transit route's first to its last stop with the constraint that the path must contain a link candidate of every stop.
For each transit route, the algorithm consists of the following steps:
- Find link candidates for each stop.
- Create a pseudo graph using the link candidates as nodes. Add a dummy source and destination node to the pseudo graph.
- Calculate the least cost path between each link candidate pair. This path is represented by an edge in the pseudo graph, connecting two link candidate nodes. The edge weight is the path's travel cost plus half the travel cost of the two link candidates it connects.
- Calculate the pseudo least cost path from the source node to the destination node in the pseudo graph. The resulting least cost path contains the best link candidate for each stop.
- Create the link sequence. Each stop is referenced to a link (given by the best link candidate that is part of the pseudo least cost path). The least cost path on the real network between the referenced links is used to create the network path for the transit route.
The algorithm works without using any additional GPS data. It is assumed that a public transit vehicle can access a stop by passing a link which is referenced to the stop. Depending on the input data, stop locations are not available, i.e. multiple stop locations on different roads are combined to one parent stop with the same name in the schedule. When two lines use a stop at an intersection there might be one to four actual stop locations.
Step 4 & 5: Calculate least cost path on the pseudo graph. Use this path to create link sequence on network
The following sections explain how the algorithm is implemented and what parameters are used in the PTMapper config file.
First, a set of link candidates is created for each stop facility and schedule transport mode. For nodes around the stop coordinate, the in- and outlinks are fetched and sorted in ascending order by their distance from the stop facility. All links within the "candidate distance" are considered as link candidates. The candidate distance is calculated by multiplying the distance from stop facility to the n-th link (defined by nLinkThreshold) and the candidateDistanceMultiplier factor. For example, if the 6th link is located 12 meters from the stop facility, given a multiplier of 1.5, all links within 18 meters from the stop facility are considered as link candidates. However, no links farther than maxLinkCandidateDistance from the stop facility are used. Stop facilities with no link within maxLinkCandidateDistance are given a dummy loop link at their coordinates: A node is added at the coordinate and a dummy loop link is added to the network with the added node as source and destination. The loop link is referenced to the stop facility and is set as its only link candidate. Note: If maxLinkCandidateDistance is 0, all routes use artificial links. That way an separate network for public transit can be created.
The config parameterset transportModeAssignment defines for a transport mode, what links a transit route is allowed to use. For example, transit routes with schedule mode bus can only use links with "bus" or "car" as modes. Similarly, all transit routes with the transport mode rail can only use rail links. If no assignment for a schedule transport mode is given, all transit routes using that mode are mapped with artificial links between stops.
To calculate the least cost paths as part of pseudo routing, a router is needed for every transport mode. To create these routers, mode separated networks are generated. Then an A* router for each of these networks is initialized. The networks are filtered according to the transportModeAssignment. The router uses one of two link travel costs: either linkLength or travelTime, defined in parameter travelCostType.
During this step the best sequence of link candidates for each transit route is calculated. While routing on the network uses an A* router, least cost path search on the pseudo graph is done with a separate Dijkstra implementation.
Artificial links connect the toNode of a link candidate with the fromNode of the next link candidate. Artificial links are added to the network in two cases: When no path on the network between two link candidates can be found or if the least cost path has costs greater than a threshold. This threshold is defined as maxTravelCostFactor times the minimal travel costs. The minimal travel costs depend on the parameter travelCostType: If it is linkLength, the beeline distance between the two stops is used. If it is travelTime, the minimal travel cost is equivalent to the travel time needed based on the arrival and departure offsets of the two stops. All artificial links (including loop links for stop facilities) and nodes have the prefix "pt_".
The step "PseudoRouting" creates PseudoRoutes for each transit route, each of which contains a sequence of PseudoRouteStops. A PseudoRouteStop contains information on the stop facility, the link candidate as well as departure and arrival offsets.
This pseudo routing step can be parallelized using multiple threads (numOfThreads). For each thread a queue of transit lines is handled. However, the search for the shortest path between link candidates uses the routing algorithms provided in the MATSim core which are not thread safe. Access to the mode separated routers had to be synchronized.
Since the shortest paths on the network have been calculated, the optimal link sequence for each transit route is known. After this step, all artificial links are added to the network.
After all pseudoRoutes have been created, most likely there are multiple best links for a stop facility because different routes use different links. For each of these links a "child stop facility" is created. It has the same name and coordinate as its parent stop facility but it has a link referenced. The id of the child stop facility is generated by combining the parent stop id and the link id, connected by the string ".link:". For example the child stop facility of parent stop 64587 which is connected to the link 432 would get the id "64587.link:432". Using the same connection string for all parts of the package allows to infer the parent stop id based on the given stop facility id.
Since it is now known which link candidates and thus child stop facility each transit route uses, route profiles (stop sequences) for all transit routes can be created.
Pseudo routing is finished after the previous steps. However, some artifacts remain in the schedule and the network. By default, stop facilities that are not used by any transit route are removed (removeNotUsedStopFacilities). The freespeed of artificial links is set according to the schedule: For all transit routes the minimal necessary freespeed to ensure they are on schedule is calculated. The highest minimal freespeed of all transit routes of a link is used as the link's freespeed. This process can be done for other link transport modes as well (defined in scheduleFreespeedModes). It is recommended to do this for rail.
The transport mode of each transit route is assigned to its used links. Links that are not used by a transit route are removed. This can clean up and simplify rail networks. Links which have a mode defined in modesToKeepOnCleanup are kept regardless of public transit usage.