A Study of the Structure of the Web

Title	A Study of the Structure of the Web
Publication Type	Journal Articles
Year of Publication	1999
Authors	Deshpande A, Huang R, Raman V, Riggs T, Song D, Subramanian L
Journal	University of California, Berkeley
Date Published	1999
Abstract	The World Wide Web is a huge, growing repository of information on a wide range of topics. It is also becoming important, commercially and sociologically, as a place of human interaction within different communities. In this paper we present an experimental study of the structure of the Web. We analyze link topologies of various communities, and patterns of mirroring of content, on 1997 and 1999 snapshots of the Web. Our results give insight into patterns of interaction within communities and how they evolve, as well as patterns of data replication. We also describe the techniques we have developed for performing complex processing on this large data set, and our experiences in doing so. We present new algorithms for finding partial and complete mirrors in URL hierarchies; these are also of independent interest for search and redirection. In order to study and visualize link topologies of different communities, we have developed techniques to compact these large link graphs without much information loss.
