Last week, we used the taxi data to determine what strategies the highest-performing drivers were using. There, the best "strategy" essentially boiled down to a list of locations and times that led to profitable pickups. (More precisely, we found that whether or not a particular driver was in the top 5 percent of all cabbies depended strongly on the number of pickups that driver made at these specific spots.)
However, this list of hot spots is ultimately of limited use to a driver. If I'm a cabbie with a weekend shift, it doesn't do me much good to know that picking up fat cats at City Hall during the work week is a good "strategy." Even if I know which spots are hopping on the weekend after midnight, that leaves me little guidance for the rest of my shift. And finally, even if I know all the best spots at every hour of the day, what happens if I need to drop a passenger off far away from those spots? Do I waste gas driving back to them, or do I try to make another pickup nearby? These are the issues that we'll try to address with our analysis this week.
So what do we actually want to be the end product of our analysis? We're trying to play the game of maximizing profit as a taxi driver. Clearly, a "strategy" for this game that consists of a completely general set of rules to follow would be quite useful! That is, if I'm a cabbie who happens to find himself at location X at time T, these rules should tell me exactly what action to take next. If there's a passenger available, should I pick her up and take her to wherever she wants to go? Or should I leave her on the curb---and if so, where should I go next to give myself the best chance at making a more profitable pickup (without wasting too much time or gas)?
The taxi data gives us the board on which this game is played---it tells us the probabilities of making pickups at each location and time, as well as the payoffs or costs for making each move. As an astrophysicist, it's most natural for me to think of this game board as existing in a sort of spacetime. Recall that we split our 2 spatial dimensions into 251 zones:
We can further split our time dimension into different slices:
From this diagram, we can see that a taxi trip consists of a move on this game board from one point on the spacetime grid to another. Again, the data tells us the probability of making a pickup at each point, as well as the probability distribution for making trips to subsequent points if a pickup is made. It also tells us the distribution of rewards (if a pickup is made) and costs (for gas) associated with moves between all of the points. Note that some moves are not "legal"---cabbies have to obey the speed limit, and they can only move forward in time! (Physicists will recognize this requirement as a consequence of being confined to the future "light cone," which contains all future points that can be causally connected to a current point given the finite speed of light. Here, our "cabbie cone" is determined by the typical taxi speed, which we'll take to be around 15 mph.)
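The cabbie-cone rule is easy to make concrete. Here's a minimal sketch of the legality check, assuming we know each zone's centroid coordinates in miles (the 15 mph speed and 10-minute slices come from the setup above; the zone coordinates are invented for illustration):

```python
# "Cabbie cone" check: a move between spacetime points is legal only if it
# goes forward in time and covers a distance reachable at typical taxi speed.
# Zone centroid coordinates (in miles) are assumed inputs for illustration.

TAXI_SPEED_MPH = 15.0
SLICE_MINUTES = 10.0

def in_cabbie_cone(origin_xy, dest_xy, slices_ahead):
    """Return True if dest_xy is reachable from origin_xy in slices_ahead time slices."""
    if slices_ahead <= 0:          # cabbies can only move forward in time
        return False
    dx = dest_xy[0] - origin_xy[0]
    dy = dest_xy[1] - origin_xy[1]
    distance_miles = (dx * dx + dy * dy) ** 0.5
    max_miles = TAXI_SPEED_MPH * (slices_ahead * SLICE_MINUTES / 60.0)
    return distance_miles <= max_miles

# At 15 mph, one 10-minute slice buys 2.5 miles of range:
print(in_cabbie_cone((0.0, 0.0), (2.0, 0.0), 1))   # True: 2 miles in one slice
print(in_cabbie_cone((0.0, 0.0), (4.0, 0.0), 1))   # False: outside the cone
```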
Clearly, our game on this spacetime board contains some element of chance. Whether or not we score a passenger at each spacetime point is random, as are the moves that passengers force us to make. The problem is then to find an optimal strategy for this game of chance, given that we know the probabilities for these random processes. It turns out that our game is simply an instance of a Markov Decision Process:
Markov Decision Processes can typically be solved by writing the problem as a matrix equation (involving matrices for move probabilities and expected payoffs), and then using linear or dynamic programming to solve the equation to find the optimal strategy. However, this is difficult to do in our case, because: 1) the size of the spacetime grid is large (251 spatial zones x many 10-minute time slices), and 2) the number of possible actions is large (given any spacetime point, there are many points contained in the cabbie cone that we could choose to go to next). This means that our matrices are large and unwieldy.
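To see what the standard approach looks like (and why it blows up at taxi scale), here's a toy value-iteration sketch on a tiny made-up MDP. The transition matrices P and payoff matrices R stand in for the ones we'd estimate from the taxi data; all numbers here are invented:

```python
import numpy as np

# Toy value iteration on a tiny MDP, to illustrate the matrix form that
# becomes unwieldy at taxi scale. P[a] is the transition matrix under
# action a; R[a, s] is the expected immediate payoff. All values invented.
n_states, n_actions, gamma = 3, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # rows sum to 1
R = rng.uniform(0, 10, size=(n_actions, n_states))

V = np.zeros(n_states)
for _ in range(500):
    # Bellman backup: Q[a, s] = R[a, s] + gamma * sum_s' P[a, s, s'] * V[s']
    Q = R + gamma * (P @ V)
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-8:   # stop once values converge
        break
    V = V_new

policy = Q.argmax(axis=0)   # best action in each state
print(V.round(2), policy)
```

With 3 states this is instant; with 251 zones times hundreds of time slices, and dozens of legal actions per point, the P matrices alone become enormous, which is why we look for another route.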
Instead, we'll use a different approach to find near-optimal strategies on our spacetime game board. First, we note that we can use the data to test strategies, by simply simulating trips and finding their corresponding payoffs under each strategy. For example, here's a simulated taxi shift that starts at Penn Station at 7AM (a.k.a. "spacetime zone 10914") and ends near City Hall at 7PM (a.k.a. "spacetime zone 28414"):
Starting in spacetime zone 10914
Make pickup: moving to 11428 and making 10.04
Make pickup: moving to 12046 and making 27.42
No pickup: moving to 12797 and losing -0.84
No pickup: moving to 13295 and losing -0.84
No pickup: moving to 14029 and losing -0.84
Make pickup: moving to 13982 and making 6.18
Skip pickup: moving to 14487 and losing -0.98
Make pickup: moving to 15060 and making 39.57
Make pickup: moving to 15060 and making 6.59
Make pickup: moving to 15060 and making 6.59
No pickup: moving to 15813 and losing -0.69
Make pickup: moving to 16315 and making 20.12
Make pickup: moving to 16901 and making 20.78
Make pickup: moving to 17179 and making 9.72
Make pickup: moving to 17179 and making 3.52
No pickup: moving to 17428 and losing -0.28
No pickup: moving to 18228 and losing -1.14
Make pickup: moving to 18689 and making 12.54
Make pickup: moving to 18946 and making 10.15
No pickup: moving to 19731 and losing -1.19
Make pickup: moving to 20227 and making 18.36
Make pickup: moving to 20634 and making 15.68
Make pickup: moving to 21507 and making 21.34
Make pickup: moving to 21788 and making 7.15
Make pickup: moving to 21995 and making 10.48
No pickup: moving to 22244 and losing -0.28
Skip pickup: moving to 23021 and losing -0.44
Make pickup: moving to 23788 and making 33.12
Skip pickup: moving to 24277 and losing -0.44
Make pickup: moving to 24688 and making 28.50
No pickup: moving to 24939 and losing -0.00
Make pickup: moving to 24938 and making 4.70
No pickup: moving to 25706 and losing -1.41
Make pickup: moving to 26293 and making 19.95
Skip pickup: moving to 27026 and losing -0.44
Make pickup: moving to 27563 and making 29.77
Skip pickup: moving to 28041 and losing -0.89
Make pickup: moving to 28414 and making 32.33
Total / hour = 31.99
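A stripped-down simulator in the spirit of that log might look like the following. The pickup probability, fare range, and gas costs here are invented stand-ins for the distributions we estimate from the taxi data, not the actual estimates:

```python
import random

# Toy shift simulator: each 10-minute slice either yields a pickup (fare,
# trip lasting 1-3 slices) or an empty cruise (small gas cost). All of the
# probabilities and dollar amounts below are made-up placeholders.
random.seed(42)

def simulate_shift(n_slices=72, p_pickup=0.6):
    """Simulate one 12-hour shift (72 ten-minute slices); return $/hour."""
    total = 0.0
    t = 0
    while t < n_slices:
        if random.random() < p_pickup:
            fare = random.uniform(5, 35)       # fare net of gas
            trip_len = random.randint(1, 3)    # trip occupies 1-3 slices
            total += fare
            t += trip_len
        else:
            total -= random.uniform(0.2, 1.5)  # cruise empty, burn gas
            t += 1
    return total / (n_slices / 6.0)            # 6 slices per hour

print(round(simulate_shift(), 2))
```

Running many such simulated shifts under a candidate strategy gives us its payoff distribution, which is exactly what we need to compare strategies.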
To search for near-optimal strategies, we can use sparsely sampled look-ahead trees (as discussed in this paper). That is, given a starting spacetime point, we can randomly sample combinations of possible future passenger trips and actions. We then calculate the corresponding payoffs for these combinations, and choose the action that yields the highest payoff on average. The set of such actions at all possible spacetime points then constitutes our strategy; since we don't sample all possible actions infinitely far into the future, the overall strategy is only near-optimal.
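In code, the idea reduces to a recursive estimate. Here's a minimal sketch; the `step` world model below is a toy stand-in for the pickup/payoff distributions from the data, and the action set of 10 choices is arbitrary:

```python
import random

random.seed(0)

def step(state, action):
    """Toy stochastic world model: returns (next_state, payoff).
    Stand-in for the pickup/payoff distributions from the taxi data."""
    return state + 1, random.gauss(action, 1.0)  # higher actions pay more on average

def estimate(state, action, depth, width):
    """One sampled rollout: take `action`, then act greedily over sampled
    actions for `depth - 1` more moves; return the total payoff."""
    nxt, payoff = step(state, action)
    if depth > 1:
        payoff += max(estimate(nxt, a, depth - 1, width)
                      for a in random.sample(range(10), width))
    return payoff

def choose_action(state, depth=3, width=4):
    """Sample `width` candidate actions and pick the best look-ahead estimate."""
    actions = random.sample(range(10), width)
    return max(actions, key=lambda a: estimate(state, a, depth, width))

print(choose_action(0))
```

The paper's version averages many rollouts per action rather than the single sample taken here, and the cost grows roughly as width^depth, which is why the trees must be sampled sparsely.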
As one might expect, the performance of near-optimal strategies found using sparsely sampled look-ahead trees depends on the tree width (the number of candidate moves and actions we sample at each step) and depth (the number of moves we look ahead). Ideally, we'd like to compare our near-optimal strategies to those employed by real-life drivers (see the histogram of driver performance from last week). However, the fact that our spacetime game board is just an approximation to the real-life game board makes the comparison a bit unfair---for example, our simulated drivers are only allowed to make a single move every 10 minutes. So we'll have to do a little bit of scaling in order to compare apples to apples. We can also compare the performance of our near-optimal strategies to a random strategy, in which the driver simply moves at random if no pickup is made.
Interestingly enough, we see that typical real-life drivers perform no better than a random strategy! Furthermore, it's clear that increasing the width and depth of our trees does indeed lead to better strategies---even sampling just 32 possible actions at each of 4 moves ahead (i.e., at most 32^4, or roughly a million, sampled futures) already yields about a $10/hour boost over a real or random strategy. However, it does seem that the variance in performance that arises from our strategies is a little larger than the real variance.
Finally, we also see that for strategies with a depth of 2 or more, the average performance is comparable to that of the top 5 percent of real drivers. When asked how far he thinks ahead in a typical chess position, Kasparov stated: "Normally, I would calculate three to five moves. You don't need more..." Perhaps the best NYC cabbies do the same.