Taxi Company Dispatch Simulation

As a part of the course in “Simulation of Complex Systems” I suggested a project. The project was to simulate the dispatch system of a taxi cab company. The goal was to essentially simulate where I live in Sweden accurate enough to predict how the customers would react to the change of internal “dispatch rules”.

For the shortest path codes I used MATLAB BGL http://www.mathworks.com/matlabcentral/fileexchange/10922-matlabbgl.

I wrote a report on this project but it only reflect the results and some findings, but nothing on the MATLAB code itself. The code is too long and complicated and I don’t believe it is very useful for anyone if I share it here. I can however give the outline of the algorithms used and my findings. This might help others who wish to simulate traffic or a similar complex system.

This project was ambitious. Firstly, to make the simulation as accurate as I could, I interviewed my father, who is a taxi driver. I asked him about the different probabilites of missing a customer, how long a customer usually waits for a cab and so on. I used real statistics for travel distance from over 200 instances to create a distribution function to build the new simulation from. All of this data was used to create a more accurate simulation.

At this point you should probably head on and read the report. The examples below complements the report and explains some of the variables used briefly.

For the source, see my github account link to my github.

The simulator

I wrote the simulation in MATLAB, using three matrices, C, Q, T. These matrices describes C – customer data, Q – the queue in a zone and T, the Taxi cab data.

To enqueue a taxi cab we can write:

z = ZoneID(T(1, cab));
q(z, cab) = max( q( z, : ) ) + 1;

ZoneID and NodeID are helper functions that return the ID of the current zone and node, respectively.

The project required 3 coordinate systems. Why? -One for visualization (co), one for zones (zone) and one for the cab movements (mod). The last coordinate system is called mod, as in modulo. The reason for this is that is the way I create the world matrix, using modulo, specifically:

A(i,j) = mod(i,2)*mod(j,2);

with looping over i,j

In the final code, the following was ultimately used

D    = [1, 0; 0, 0.8];
b    = repmat(D, n);
b    = b(1:n, 1:n); %get the submatrix

In the above code, 1 is used to denote a node, 0.8 is ’empty’, but important to define for visualization.

These values was used to denote direction in the mod matrix

dir_value_SW = 0.2; %South/West
dir_value_NE = 0.3; %North/East
edge_value   = 0.6; %the value of a double directed edge (default)

Note: dir_value_SW*dir_value_NE == edge_value.

Below shows how I converted from b to the sparse adjacency matrix spAM:

[u, v, w] = find(b);

for i=1:size(u, 1)

    if w(i) == edge_value || w(i) == dir_value_NE || w(i) == dir_value_SW  %if edge

        if mod(u(i), 2) == 1 %horizontal
            from   = nodeID(u(i), v(i)-1);
            to     = nodeID(u(i), v(i)+1);

            if w(i) == dir_value_NE %right
                spAM(to, from) = 1; %link one direction
            elseif w(i) == dir_value_SW %left
                spAM(from, to) = 1;
            elseif w(i) == edge_value %double direction
                spAM(from, to) = 1;
                spAM(to, from) = 1;
            end
        end

        if mod(u(i), 2) == 0 %vertical
            from   = nodeID(u(i)-1, v(i));
            to     = nodeID(u(i)+1, v(i));

            if w(i)==dir_value_NE %up
                spAM(from, to) = 1;
            elseif w(i)==dir_value_SW %down
                spAM(to, from) = 1;
            else
                spAM(from, to) = 1;
                spAM(to, from) = 1;
            end
        end
    end
end

b contains a regular grid of nodes. Each node can have an edge (street) to any of its 4 adjacent nodes. It can also have directed edges. This feature was ultimately not used in the final simulation.

The simulation works by using time stamps. Each iteration is a new unit of time in the simulation. So, each event is defined at a specific time. The cabs know before the dispatch picks up when the person is going to be there. I also have some timings for customers, because they can change their mind.

The code is actually not concurrent, using message passing or anything. The agents are independent and there are no locks, or any need to add them. There is a lot of housekeeping however.

One of the behaviors of the cab drivers was to stay within one zone of the city as long as there weren’t too many other taxi cabs in that zone. The random driving around in the zone was accomplished by randomly choosing one of five points in a zone: one of the four corners or center.

In order to keep the cab within a zone, I use my auxillary function ZoneID, which returns the zone identity, given a set of nodes. By filtering out the nodes of a specific zone, we can remove the other nodes to define the nodes the taxi cab has to drive to OR create a new sparse matrix and keep the specified nodes.

The reason for doing this is to force the cab to only drive within a zone, so we need a sparse adjacency matrix for only those nodes. If we don’t do this and choose nodes within the same zone and let the cab drive between them, the shorter distance between two nodes may cross over to another zone, because we minimize the distance for each travel.

Consider the following codes

A = sparse( (1:3)'*ones(1,3) );

A1         = A;
keep_nodes = [1, 3];
A1         = A1(keep_nodes, keep_nodes)
rm_nodes         = [2];
A2               = A;
A2(rm_nodes, : ) = 0;
A2(:, rm_nodes)  = 0

Gives the results

A1 =
     1     1
     3     3

A2 =
     1     0     1
     0     0     0
     3     0     3

This is enough to illustrate how the sparse version of the above codes will differ. The A1 version would destroy the important zero elements of a adjacency matrix, so it would be incorrect.

Here is the final version:

all_nodes    = 1:length(spAM);
node_keepers = all_nodes( find(ZoneID(all_nodes) == current_zone) );
complement_nodes = setdiff(all_nodes, node_keepers);

spAMtmp = spAM;

spAMtmp(complement_nodes, : ) = 0;
spAMtmp(:, complement_nodes)  = 0;

Conclusions

What conclusions can we draw from a computer simulation? The simulation is a clean, simplified version of the real world. It provides us with an insight into a complex world that may or may not lends itself to be simplified. On the other hand, maybe we are only interested in a statistical version and not a detailed true version of the world.

By setting the rules and measuring the reactions as they occur is reality, we can find the variables in the simulation to fit the real world data. Extrapolation of a complex system is prone to errors. More sophisticated statistical tools and empirical studies are needed to be able to get sensible data from a simulation. Although it is a valuable tool, further research is needed to assess the accuracy of the simulation and the conclusions one can draw from them.

2 Comments:

  1. Hi, sir, I’m a student learning Matlab. I’m deal with a problem about taxis simulation. Could you please share the codes in this article to me? Thank you very much.

    • Hello Jin. Nice of you to comment. I don’t think that my code will be of any help, it is quite complex and I have not really written any documentation. I can put it up on github later today, when I get the time to sort everything out, so you can see.

      If I did the project today, given all I know now, I would probably choose another language that is better suited for handling scheduling issues. In my solution I solve this by doing the housekeeping myself. That’s why the code is relatively messy. Maybe check out concurrent programming languages such as JR.

Leave a Reply

Your email address will not be published. Required fields are marked *