RAN Journal

Thursday, April 19, 2007

Evolution and Structure of the Internet

"Evolution and Structure of the Internet: A Statistical Physics Approach", by Romualdo Pastor-Satorras and Alessandro Vespignani. Cambridge University Press, 2004.

In my recent study on the structure of Internet, I happened to read most of this book. It is a very concise book on the structure of the Internet topology, its characterizing parameters, their statistical models, descriptions of topology creation tools available in the field and analysis of their performances. It is an easy read for the people who want to get the idea of the structure of the Internet as well as an impressive read for the people who want to understand it in a statistical point-of-view.

Internet can be visualized as a connected graph with the routers as the graphs node and their connections as the edges in the graphs. I want to short-list the defining characteristics of the Internet topology:
  • degree: number of edges branches from a node in the Internet topology (property of nodes). Internet shows heavy-tail distribution in degree distribution which means that, while a huge number of nodes has very small degree, there are a small number of nodes with very large degree. Which also implies the "popular-get-more-popular" concept.
  • shortest path length: the number of edges in the shortest paths between two end-points (property of node-pairs). Internet shows small world phenomena, that is, the average shortest path length in Internet is small.
  • betweenness: the number of shortest paths goes through a particular node (property of nodes).
  • clustering coefficient: an index indicating how well a node's neighbors are connected among themselves (property of nodes).
  • scale-free behavior: it is a property of the topology as a whole. It says that, at whatever granularity we look at the Internet topology (AS-topology, router topology, intra-AS router topology), the properties of the topology follows the same statistical model.
  • rich-club phenomena: this says that larger degree nodes are connected to each other with high probability.
The book also makes the following observations on the geographical characteristics of the Internet:
  • router-density is correlated with the population density. It is further correlated with the wealth of the regions.
  • Logical Internet distance (hop count) does not correlate
From the book, it is obvious that none of the existing Internet topology generators take all the above characteristics of the Internet topology into consideration. It is important to carefully go through the documentation of the topology generators and choose one wisely that models the parameter(s) of your interest correctly.

Read more!

Wednesday, April 18, 2007

BID-Scheme

It is an entry after a very long time. After my internship at INRIA, France, my research direction suddenly changed into bandwidth based resource discovery. As the first step towards it, I have been working on a landmarking scheme for distributed bandwidth prediction.

The bandwidth landmarking scheme I developed, which is named as BID-scheme, in inspired from landmarking schemes for network latency prediction and another bandwidth prediction scheme called BRoute. I will make another detailed entry on BRoute. I build the node states for bandwidth prediction similar to the latency-based landmarking scheme, but uses a deduction scheme similar to the one used in BRoute.

The landmarks in my BID-scheme are not some chosen nodes as in the latency-based landmarking schemes. BID-scheme uses some selected autonomous system (AS) domains as landmark domains (not landmark nodes). Another difference between the BID-scheme and the latency-based landmarking schemes are that, in BID-scheme, nodes are not placed in a Cartesian space -- it is impossible because bandwidth is not an additive measure like latency (and neither abides by the triangular inequality). Instead, the nodes in the BID-scheme probes the landmark domains to construct BID tables. Each row in a BID table belongs to a landmark AS and records information about it -- distance to it in terms of AS hop counts, average bottleneck bandwidth observed in reaching the AS and a bloom filter that stores information about the landmark AS's probability of being in the route towards other ASes.

Deduction of bandwidth between two end-points works in two phases: first to guess the most probable AS domain (from the landmark set) to be in between the two end-points; and then to deduce the end-to-end bandwidth using the two links -- source point to middle AS and middle AS to sink point. The bloom filters from the BID-table from both end-nodes are used to determine the best probable middle AS and the respective bandwidths are used to determine the end-to-end bandwidth.

One definite advantage of the BID-scheme over the BRoute scheme is that the BID-tables of the nodes can be used to create bandwidth identifiers (BIDs) of the nodes which can be used to create BID-based routing, in other words, a bandwidth-based discovery scheme. This BID-based routing/discovery scheme is a direct derivative of the DHT-based discovery schemes like Pastry.

I performed a simulation study on the scheme which showed promising results. We have submitted a paper to GlobeCom 2007 on this research. A journal version is currently under process.

Read more!