It really exists: the Terra Incognita of the Web.

Story by Edial Dekker

Edial Dekker is a New Media student at the University in Amsterdam. He works as a freelancer and specializes in the art of data visualization. He is also co-founder of BLOG08 and is involved in many side projects that have to do with new media. If you are interested in data visualizations, be sure to drop him an e-mail.

See his LinkedIn profile and Blog for more information.

No round-trips

Most search engines do not even try to reach the full Web, because indexing as many websites as possible isn’t necessarily the best way to provide the best search results. The Web is big yet small. But the small world behind the Web is a bit misleading. The Web is a scale-free network, dominated by hubs: nodes with a very large number of links. The World Wide Web also has a directed structure: a link points from one page to another, but not back. Andrei Broder, Vice President of Emerging Search Technology for Yahoo!, was the first person to notice how this directed network had consequences for the topology of the Web itself. For example, if you want to go from website A to website D, you can start from node A, then go to node B, which has a link to node C, which points to D. But you can’t simply make the round trip. Most likely you would have to find a different route from node D back to node A, if one exists at all.
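The A-to-D example can be sketched in a few lines of Python. This is only an illustration: the four-page `links` graph and the `find_path` helper are made up for this sketch, and a real crawler works on a vastly larger graph.

```python
from collections import deque

# Hypothetical four-page web: each page maps to the pages it links to.
links = {
    "A": ["B"],
    "B": ["C"],
    "C": ["D"],
    "D": [],
}

def find_path(graph, start, goal):
    """Breadth-first search that follows links only in their own direction."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        page = path[-1]
        if page == goal:
            return path
        for target in graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(path + [target])
    return None  # no directed path exists

forward = find_path(links, "A", "D")   # ['A', 'B', 'C', 'D']
backward = find_path(links, "D", "A")  # None: D links to nothing
```

The same three links that carry you from A to D are useless for the return trip, because none of them can be followed backwards.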

The four different continents of the Web

Albert-László Barabási, a Hungarian scientist famous for his insights on network theory, has tried to map the Web into four different continents.

The first is the Strongly Connected Component, or Central Core (SCC): this contains a quarter of all websites, is home to all indexed websites and is easily navigable. This does not mean there is a direct link between every pair of nodes, but there are paths that allow you to surf from any node to any other.

Then there are the IN and the OUT continents: each of these is just as large as the Central Core, but they are much harder to navigate. From the IN continent you can easily reach the SCC, but there is no path taking you back to the IN continent. In contrast, the OUT continent can easily be reached from the SCC, but has no links to take you back to the core (where all the magic happens). The OUT continent is mostly populated by corporate websites that can easily be reached from outside, but once you get in, there is no way out.

The fourth continent is made up of Tendrils and disconnected Islands: interlinked groups that are unreachable from the SCC and have no links back to it. These groups can contain thousands of documents. Which continent a website ends up on has nothing to do with its content; it depends only on its relation to other documents.
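The four continents can be computed for a small graph with two reachability passes, one following links forward and one following them backward. The sketch below assumes you already know one page that sits in the core (here the hypothetical `hub1`); the page names and the `continents` helper are invented for illustration.

```python
def reachable(graph, sources):
    """Return every node reachable from `sources` by following links forward."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        node = stack.pop()
        for target in graph.get(node, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def reverse(graph):
    """Return a copy of the graph with every link pointing the other way."""
    flipped = {node: [] for node in graph}
    for node, targets in graph.items():
        for target in targets:
            flipped.setdefault(target, []).append(node)
    return flipped

def continents(graph, core_page):
    """Split pages into SCC, IN, OUT and the rest, relative to one core page."""
    downstream = reachable(graph, [core_page])         # pages the core can reach
    upstream = reachable(reverse(graph), [core_page])  # pages that can reach the core
    scc = downstream & upstream
    all_pages = set(graph) | {t for ts in graph.values() for t in ts}
    return {
        "SCC": scc,
        "IN": upstream - scc,
        "OUT": downstream - scc,
        "TENDRILS_AND_ISLANDS": all_pages - downstream - upstream,
    }

# Hypothetical miniature web.
web = {
    "hub1": ["hub2"],
    "hub2": ["hub3"],
    "hub3": ["hub1", "corp"],   # the three hubs form a cycle: the core
    "newblog": ["hub1"],        # links into the core, nothing links to it
    "corp": [],                 # linked from the core, links nowhere
    "island1": ["island2"],
    "island2": ["island1"],     # linked only to each other
}
parts = continents(web, "hub1")
```

Here `parts["SCC"]` holds the three hubs, `newblog` lands in IN, `corp` in OUT, and the two islands in the remaining group. Nothing about a page's content enters the calculation, only its links.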

There’s no way you can reach it

These four continents significantly limit the Web’s navigability. Where you can go depends on the continent you start your search from. No matter how many times you click, when you are in the Central Core there is no way you can reach the IN continent or the Islands that surround it. Ever wondered why search engines give users the option to submit websites? It’s because the crawlers can then sniff into those isolated islands, which otherwise could never be found.

Is this fragmented structure here to stay? Barabási thinks it is. As long as links remain directed, homogenization will never occur. Tim Berners-Lee, one of the founding fathers of the Web, has for many years been stressing the importance of links that track back to where they are linked from. The track-back system that blogs use could also be used to connect the IN and OUT continents. The bottom line is that directed networks always break into the same four continents. The only way to change that is to reorganize the relations documents have with each other. Semantic web, anyone?
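A rough sketch of why track-backs would matter: if every link automatically gained a link in the opposite direction, the IN and OUT continents would merge with the core. The three-page graph and the `with_backlinks` helper below are hypothetical and only model the idea; they do not describe how any real track-back protocol works.

```python
def reachable(graph, start):
    """Return every page reachable from `start` by following links forward."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for target in graph.get(node, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def with_backlinks(graph):
    """Add a reverse link for every existing link, as a track-back would."""
    two_way = {node: set(targets) for node, targets in graph.items()}
    for node, targets in graph.items():
        for target in targets:
            two_way.setdefault(target, set()).add(node)
    return two_way

# Hypothetical miniature web.
web = {
    "newblog": ["hub"],  # IN: links to the core, nothing links back
    "hub": ["corp"],     # stands in for the core
    "corp": [],          # OUT: reachable from the core, no way back
}

before = reachable(web, "hub")                 # hub and corp only
after = reachable(with_backlinks(web), "hub")  # all three pages
```

Islands that share no link with the rest of the Web in either direction would still stay out of reach, since there is no link to mirror.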
