next up previous contents index
Next: Current Limitations and Future Up: PRouST Switch Components Previous: Aggregator   Contents   Index

Subsections


More on Aggregation

Possible Approaches to Aggregation

There are two paradigms for topology aggregation currently supported in PNNI: the Symmetric-node approach and the Star approach. In a Symmetric-node model the peer group topology is collapsed to a single node, and one value (usually the worst-case value), called the radius, is advertised for each parameter. While this scheme greatly reduces the information propagated, it may result in very poor routing decisions, particularly if the number of border nodes and the variance in their connectivity is large. The Star model augments the Symmetric-node approach as follows: topology is aggregated into a representation with one central virtual node called the nucleus and $n$ border nodes. This central node is explicitly connected to all of the border nodes via spokes. The spokes may be given weights different from the radius, in which case they are referred to as exceptions. In the final level of sophistication, PNNI aggregation allows the aggregated representation to specify logical links directly between borders; such links are called bypasses. A desirable aggregation scheme must
  1. Transform a peer group structure into a compact form. This is necessary to capitalize on the intended purpose of aggregation, namely to reduce background traffic required to maintain routing information for large networks.
  2. Retain information about those aspects of the topology that are most critical to making good routing decisions. The aggregation process results in each switch seeing a suitably coarsened (and hopefully adequate) view of the entire network topology: A switch sees the true topology only for its own peer group, but has progressively more aggregated views of farther regions of the network. Since this aggregated topology information will be used to select the routes for connection requests, we must make sure that our aggregation scheme does not have too detrimental effect on route optimality.
  3. Be efficiently computable. Aggregation will need to be re-performed every time peer group topology changes significantly. Having a lightweight aggregation algorithm ensures that we can have a reasonable criterion defining a ``significant change'', without putting an unacceptable burden on the network switches that have to compute the aggregated representation.
To date, no specific Symmetric-node or Star aggregation schemes have been defined in the PNNI standard. In this paper, we present two approaches for topology aggregation with respect to a single metric parameter. Specifically, we describe two techniques for computing good Star representations of a peer group.

Mathematical Model

We represent a peer group as a multi-weighted undirected graph $G(V,E,B,{P_1,\cdots, P_k})$ with nodes $V$, edges $E$, where $B\subseteq V$ is the fixed set of $n$ border nodes, and ${P_1,\cdots,
P_k}$ are the required PNNI topology state parameters. The edge vector $e_{ij} =({P_1}^{ij}, {P_2}^{ij},\cdots, {P_k}^{ij})$ consists of the values of the parameters for the link joining $V_i$ and $V_j$.

Preliminaries

Let $\delta_{P_i}(W, B_i)$ be the value of the best path between the non-border node $W$ and border node $B_i$ with respect to $P_i$. Specifically,

\begin{displaymath}
\begin{array}{l}
${\bf for additive metric:} best path mea...
...g the edges that form the
path\footnotemark ).$
\end{array}
\end{displaymath}

Define $bd(B_i, B_j)$, the boolean distance between a pair of border nodes $B_i$ and $B_j$;

\begin{displaymath}bd(B_i, B_j) = \left\{
\begin{array}{l}
\phi \mbox{{, if } ...
...B_j$\}}\mbox{{, if } $B_i \neq B_j$} \\
\end{array}
\right. \end{displaymath}

Note: The boolean distance between two nodes is a set. It is easy to see that the boolean distance is a metric, defined in terms of equality, containment and union of sets. Define $\omega$, the set of candidate nodes as

\begin{displaymath}\omega = \{W \;\;\;\vert \;\;\; W\in V-B, \;\; W\not\in \bigcup_{1\leq i<j\leq n}bd(B_i, B_j) \} \end{displaymath}

. It is clear from this definition, that $\omega$ consists of all non-border nodes in the peer group that lie on a simple path between some pair of border nodes. We define the status of every candidate node $W\in\omega$ with respect to the parameter $P_i$, and denote this quantity as $S_{P_{i}}(W)$. We also define the status of the peer group with respect to parameter $P_i$, and denote this quantity by $r_i$. The precise definitions of $r_i$ and $S_{P_{i}}(W)$ depend on the particular parameter $P_i$, and are given below:

\begin{displaymath}
\begin{array}{rll}
P_i$ is an {\bf additive metric}:$\ & S...
..._i}(W, B_i) & r_i=\min_{W \in \omega} S_{P_i}(W)
\end{array}
\end{displaymath}

The central status nodes of the peer group relative to the pair $(B, P_i)$ are defined to be the set of nodes $W \in
\omega$ such that $S_{P_{i}}(W)=r_i $. A central status node may be thought of as a node on which one can center the smallest radius sphere (with respect to parameter $P_i$), and still capture all of the border vertices.

Algorithms

Preprocessing

Whenever we need to aggregate, we first generate a weighted complete graph on border nodes from the original peer group. We present two methods for generating the weights $d_{ij}$ (between borders $B_i$ and $B_j$) on the edges of this complete graph:
(I)
${d_{ij}}$ is the measure of the shortest path between $B_i$ and $B_j$. Complexity. We compute the best paths from every border node, to every other node in the network, by running Dijkstra's algorithm $n$ times, starting once from each of the border nodes. This requires $O(n\vert V\vert\log\vert V\vert + n\vert E\vert)$ steps. The space needed is $O(\vert V\vert+\vert E\vert)$, which is what is required to hold the adjacency list graph representation of the peer group.
(II)
${d_{ij}}$ is the measure of the best path between $B_i$ and $B_j$ through a central status node. Complexity. First, we run Dijkstra's algorithm from all border nodes, in $O(n\vert V\vert\log \vert V\vert + n\vert E\vert)$ steps, as in (I). This gives us the $\delta_{P_i}(W, B_i)$. Then for each non-border node $W$, we use the computed distances to the border nodes to compute $S_{P_{i}}(W)$, requiring $O(n\vert V\vert)$ additional steps. We select a central status node appropriately, by keeping track of the best value of the $S_{P_{i}}(W)$ as we compute them. Finally, since we have already computed the distances from the border nodes to the chosen central status node, we can compute the pairwise shortest distances between all border nodes in an additional $O(n^2)$ steps. The space requirement is the same as for method (I) above.
If $d_{ij}$ is the weight on the edge joining $B_i$ and $B_j$ in the complete graph then this is the value that we wish to advertise in the parent group as the cost of traversing the peer group between border nodes $B_i$ and $B_j$.

Figure 7.3: Basic steps for topology aggregation
\begin{figure}
\centering {\mbox{\epsfbox {DevRef/Components/aggr1.eps}}}
\end{figure}

Approach 1

We would like to represent the original peer group as a star graph with weights $x_i$ advertised on the edges such that $x_i+x_j =
d_{ij}$ for all $i,j(i\neq j)$. In general, with $n$ border nodes, such system of equations can be represented as $AX=D$ where, A is $ {n \choose 2 }\times n$ matrix, D is ${n \choose 2 }\times 1$ column vector of the costs to be advertised, and X is $n\times 1$ column vector of unknown weights to be determined for the star representation of the Logical Group Node. This system $AX=D$ is an over determined system of equations which is usually an inconsistent system. That is, given an $m\times n$ system $AX=D$ with $m>n$, we cannot in general find a vector $x\in {R^n}$ for which $AX=D$. In such cases, one resorts to finding a vector, $\hat{x}\in {R^n}$ for which $A\hat{x} =\hat{D}$ is closest to $D$ in the least square solution to the system $AX=D$. Indeed, $\hat{x}$ in $R^n$ is such that ${\Vert r(\hat{x})\Vert}^2={\Vert D-A\hat{x}\Vert}^2$ is minimum over all $x\in
R^n$. Thus to solve $AX=D$, we solve

\begin{eqnarray*}
A^TAX=A^TD
\end{eqnarray*}



The above represents an $n\times n$ system of linear equations, referred to as the normal equations. The $\hat{x}$ obtained are the values advertised as exception spoke edges of the star graph that represents the peer group in the next higher level. The size of this aggregated representation, and hence the communication incurred per peer group, is $O(n)$. This approach is well-suited for peer groups which experience unpredictable traffic patterns, since it computes the star representation that minimizes, in a least squares sense, the average distortion over all transits of the peer group.

Computational Complexity of Approach 1

We compute $(A^TA)^{-1}A^T$ only once (off-line) beforehand in $O(n^4)$ steps7.2. Each time we need to re-aggregate: we multiply the precomputed $(A^TA)^{-1}A^T$ matrix by the current value of the vector $D$. This multiplication can be done in $O(n^3)$ time. The space required to hold the precomputed value of $(A^TA)^{-1}A^T$ and to perform the necessary multiplication is $O(n^3)$. Thus the time requirement to perform an aggregation using Approach 1 (including preprocessing) is $O(n\vert V\vert\log \vert V\vert + n\vert E\vert + n^3)$, and the space requirement is $O(\vert V\vert+\vert E\vert+n^3)$.

Approach 2

A transit is an undirected path through the peer group whose endpoints are distinct border nodes. We call an unordered pair of border nodes $(B_i,B_j)$ critical if we cannot tolerate any discrepancy between the information advertised in the aggregated representation, and the true value $d_{ij}$ of the parameter for the transit between $B_i$ and $B_j$. Note: In the entire discussion that follows, we deal with unordered pairs, implying that the element denoted $(B_i,B_j)$ is equivalent to the element denoted $(B_j,B_i)$. Select a set of preserved transits $P^+ = \{ (i,j) \;\vert\; (B_i,B_j)
\; $is critical$\}$. Our approach requires $\vert P^+\vert \leq n$; the precise conditions on this set $P$ will be explained shortly. Define the set of non-critical transits $P^- = \{(p,q) \;\vert\; B_p, B_q
$ distinct border nodes, $ (p,q) \not \in P^+\}$. Note that $\vert P^+\vert +
\vert P^-\vert = {n \choose 2}$. Now for each element $(i,j) \in P^+$ define the linear equation $\phi_{ij}$ $\phi_{ij}:\;\;\; x_i + x_j = d_{ij}$ And for each element $(p,q) \in P^-$ define the linear equation $\psi_{pq}$ $\psi_{pq}:\;\;\; x_p + x_q + k_{pq} = d_{pq}$ The $\psi_{pq}$ and $\phi_{ij}$ together specify a system of ${n
\choose 2}$ linear equations in ${n \choose 2}$ unknowns, where $n$ is the number of border nodes. The $k_{pq}$ and $x_{i}'s$ are the unknowns which have to be determined. We represent this system in matrix form ${\bf A} \vec{x} = {\bf D}$,

\begin{displaymath}\left[ \begin{array}{c}
\vdots\\
\cdots {{\phi^{*}}_{ij}...
...\
\vdots\\
d_{pq}\\
\vdots\\
\end{array}
\right] \end{displaymath}

where ${{\phi^{*}}_{ij}}$ and ${{\psi^{*}}_{pq}}$ are the $0/1$ coefficients corresponding to the linear equations ${\phi_{ij}}$ and ${\psi_{pq}}$ respectively. Since each ${\psi_{pq}}$ introduces a new free variable ${k_{pq}}$, it is clear that the coefficient matrix ${\bf A}$ of the above system is non-singular if and only if the system consisting of just the ${\phi_{ij}}$ has a solution. We can now more fully describe how to select the set $P^+$: We sample the actual probabilities of the transits, (or get some estimate of the traffic patterns between every pair of borders nodes $B_i$ and $B_j$). Then we choose the $n$ most demanding of these traffic routes, with the additional constraint that the resulting system of ${\phi_{ij}}$ be non-singular. We compute the unique solution ${\bf\vec{x} }$, and give the following interpretation to the values of the components of the solution vector: The value of ${x_i}$ will be the quantity advertised on the exception spoke between the nucleus and border node $B_i$ in the star graph. The value of $k_{pq}$ is the additive distortion between the truth and what is advertised about the (non-critical) transit between border nodes $B_p$ and $B_q$. For any fixed border node $B_i$ we consider the multiset $E_i$ consisting of the $n-1$ transit errors,

\begin{displaymath}E_i = \left\{ k_{iq} \;\vert\; (i,q) \in P^- \right\}
\cup
\left\{ 0 \;\mbox{{ for each} $(i,q) \in P^+$}\right\}
\end{displaymath}

We define $k_i$ = $f(E_i)$, where $f$ is some aggregation function7.3, and consider ${k_i}$ to be a measure of the average error inherent in the star representation for paths through border ${B_i}$ of the peer group. The final representation of the peer group is the set of $x_i$ (on the exception spokes of the star), and the set of errors $k_i$. These $k_i$ values serve as a ``disclaimer'', specifying the average error inherent in the star representation for transits through border $B_i$. The representation constructed by this approach is of size at most $2n = O(n)$. Unfortunately, the current version of the PNNI standard does not have any provision for advertising the set of errors $k_i$ . We recommend that the PNNI aggregation model be augmented to permit the assignment of topology state parameters to the nucleus. If such a scheme is adopted, the set of errors $k_i$ could be associated with the nucleus and included as part of the aggregated representation. Remote switches trying to decide whether or not to route through a particular peer group could then take into account not only its aggregated representation, but also the $k_i$, which are the peer group's self-assessment of the fidelity of its advertised representation. In the absence of such an extension to PNNI, we can still use Approach 2, and only advertise the set of $x_i$ on the exception spokes of the star (without the $k_i$). This approach is well-suited for peer groups which experience regular, characterizable traffic patterns, since it computes the star representation that completely preserves the truth about the most commonly occurring transits.

Computational Complexity of Approach 2

We compute ${\bf A}^{-1}$ only once (off-line) beforehand in $O(n^6)$ steps, just as in Approach 1. Each time we need to re-aggregate: we multiply the precomputed $A^{-1}$ matrix by the current value of the vector $D$. This multiplication can be done in $O(n^4)$ time, and the total space required to store $A^{-1}$ and perform this multiplication is $O(n^4)$. Finally, we compute the $k_i$ using the aggregation function $f$ as above, (and if $f$ runs in time linear in the size of its argument set $E_i$), this is completed in at most $O(n^2)$ time. Thus the time requirement to perform an aggregation using Approach 2 (including preprocessing) is $O(n\vert V\vert\log \vert V\vert + n\vert E\vert + n^4)$, and the space requirement is $O(\vert V\vert+\vert E\vert+n^4)$.

Remarks

Lowering Distortion Further

If this representation is still found to deviate unacceptably from the true topology, it can be augmented in a systematic way to reduce the distortion, but at the cost of a larger representation. Let us assume that we desire to finish with a representation that is within $\epsilon$ multiplicative distortion of the true topology. To achieve this we can add bypass links between borders $B_i$ and $B_j$ if

\begin{eqnarray*}
{\bf {{1}\over {1+\epsilon}}\leq {{d_{ij}}\over{{d^*}_{ij}}}
\leq{1+\epsilon}}
\end{eqnarray*}



where $d_{ij}$ is the actual cost of the $(B_i,B_j)$ transit and ${{d^*}_{ij}}$ is the cost currently advertised. The weight of such the additional bypass link would be ${d_{ij}}$.

Ensuring Conservative Advertisement

We want to advertise $d_{ij}$ as the cost of traversing the peer group between border nodes $B_i$ and $B_j$. The least square solution in Approach 1 provides the best $\hat{x}=(\hat{x_1},\cdots,\hat{x_n})$, and results in $\hat{x_i} + \hat{x_j}$ being advertised as the distance between border $B_i$ and $B_j$. In some instances we may find $\hat{x_i} + \hat{x_j} \ge d_{ij}$. If we desire this advertisement to be conservative, we can add a bypass link with value exactly $d_{ij}$ between border nodes $B_i$ and $B_j$.

Measuring the Performance of an Aggregation Scheme:

We measure performance of an aggregation scheme $K$ as follows: Define

\begin{displaymath}{\bf A = {\sum_{1\leq i,j\leq n,i\neq j} \vert {d_{ij} - d^*_{ij}}} \vert}
\end{displaymath}

where $d_{ij}$ is the true cost of traversing the peer group through the borders $B_i$ and $B_j$ and $d^*_{ij}$ is what the aggregated representation advertises. For perfect aggregation one would like $d_{ij}$ and $d^*_{ij}$ to be equal for all i and j $(i \neq j)$. To quantify the extent to which this is achieved, we define for each pair $B_i$, $B_j$ of border nodes,

\begin{displaymath}{\bf
m_{ij} = 1 - {{ {\vert d_{ij}-d_{ij}^{*} \vert } c_{ij}} \over {A}} }\end{displaymath}

where $c_{ij}$ are real numbers between $0$ and $1$, chosen to represent the frequency of transits between borders $B_i$ and $B_j$. A value of $c_{ij}$ that is closer to $1$ signifies more frequent transits between the borders $B_i$ and $B_j$, making it more important to minimize the distortion in the compact representation for the cost of that transit. Specifically, we view $m_{ij}$ as the degree to which the advertised cost matches the true cost of the transit between borders $B_i$ and $B_j$. It is easily verified that for any such transit, $m_{ij}$ equals $1$ if and only if the advertised cost $d^*_{ij}$ exactly matches its true value $d_{ij}$. Furthermore, any deviation of the advertised value from the true value is reflected in the the corresponding $m_{ij}$ to an extent that is linearly proportional to the importance assigned to the transit (as specified by the scalar $c_{ij}$). We now define the performance of an aggregation scheme K with respect to a peer group with N border nodes as:

\begin{displaymath}
{\bf
Perf(K) = {{\sum_{(i, j)} {( m_{ij})}} \over {N \choose 2}}.
}\end{displaymath}

Then $0 \leq \mbox{Perf}(K) \leq 1$, and higher values of $\mbox{Perf}(K)$ indicate better performance. Also, it is clear that if the aggregation scheme is perfect, that is, $d_{ij}=d_{ij}^{*}$, then $\mbox{Perf}(K) = 1$.

Conclusion

Both approaches presented are viable, according to the criteria presented in section 2: The proposed schemes transform a peer group structure into a compact form; the final representation requires only $O(n)$ space, where $n$ is the number of border nodes of the peer group. Within the confines of an $O(n)$ space star, the proposed schemes optimally retain the information about peer group structure that is critical for making good routing decisions. Approach 1 seeks to minimize the average distortion over all transits in a least squares sense; approach 2 builds a star that completely preserves the $n$ most critical transits. The schemes are dual: the first is best suited for peergroups where traffic patterns are unpredictable; the second is suited for peergroups where traffic patterns can be characterized. The representations are efficiently computable since the most expensive operations (like inverting large matrices) are performed off-line and only need to be done once. In fact, if we we assume, not unreasonably, that the number of border nodes is significantly smaller than the number of nodes in the peer group, say $n \leq \vert V\vert^{1\over
3}$, then the time to aggregate using either of our schemes becomes7.4 $O(n\vert V\vert\log
\vert V\vert + n\vert E\vert)$. Note further that when a connection request arrives at the peer group, a route computation must be carried out to select a path for the connection. This route computation itself, using Dijkstra's algorithm, would take $O(\vert V\vert\log \vert V\vert + \vert E\vert)$ time. Thus, if the peer group leader performs re-aggregation every time $n$ new connection requests have entered the peer group, then the computational work required to do the re-aggregation can be amortized against the unavoidable work required to compute the routes for the connections themselves. Our schemes can therefore be executed scalably in a temporal sense: larger peer groups will need to re-aggregate their topology less frequently if switches are not to be excessively burdened by the computational demands of re-aggregation, but the length of this minimum time interval between re-aggregations grows only linearly in $n$, the number of border nodes.
next up previous contents index
Next: Current Limitations and Future Up: PRouST Switch Components Previous: Aggregator   Contents   Index
proust-dev@cmf.nrl.navy.mil