Exploring Metro Systems Worldwide on Wikipedia

by Anastasia Salnikova

 

For my final project, I chose to analyze a corpus consisting of all of the completed metro systems in the world. I compiled the data from the Wikipedia page titled “List of metro systems”, which contains a link for each metro system in the world, including those that are still in development. I decided to analyze metro systems because I am very interested in urban planning and development, and metro systems represent some of the highest levels of urban development that cities can achieve. I am also interested in the diverse history of these systems: some have been around for over a century, while others have emerged only in the past few decades as a result of broader economic development in countries such as China and India. For the purposes of this analysis, I chose to analyze only those metro systems that are complete, as they would have the most extensive page histories and information. Since there are around 170 completed metro networks in the world, the corpus contains around 170 pages, making it a good candidate for extensive analysis and comparison. The pages on the Docs page follow the same order as the Wikipedia page: the metro systems are listed by the country in which each city is located. This makes it easy to see which countries have more metro systems than others.

On the “List of metro systems” page, Wikipedia notes which metro system is the longest by route length (Shanghai), which has the most riders (Beijing), which has the most stations (New York City), and which is the oldest (London). One observation I made was that the London Underground page has revisions dating all the way back to 2001. Given that Wikipedia was founded in January of 2001, this page must have been one of the first to make it onto the site, whereas many of the other pages have revision histories dating back only to 2003 or 2004 at the earliest. Another observation I made about the London Underground page was that it has been translated into 74 languages, the highest total of any metro system page. The page is also the longest, with 1865 words. In this case, the extensive history of the London Underground system appears to be linked to the page's long revision history, scope of translation, and length.

Another observation relates to the differences in the pictures chosen for different pages. There seem to be four different trends: the image is either a picture of a train in the system, a map of the system, the logo of the system, or a collage of pictures of different stations within the system. The last of these applies only to metro systems in Russia, which I thought was interesting and reflective of the culture of metro systems there. In my external research on metro systems and public transportation networks, I have found that in Russia in particular there is a heavy focus on the architecture of metro stations, dating back to the Soviet tradition. For example, Moscow has some of the most beautiful and elaborate metro stations in the world. I found it fascinating that the Wikipedia pages reflected this focus on architecture and station aesthetics, and that the Russian pages were the only ones with these collages of stations.


A collage of Moscow metro stations.

On the same topic of image choice, there doesn’t seem to be a fully distinct trend in which pages have which type of image. However, it seems to me that many of the lesser-known metro systems simply have a picture of an indistinct train arriving, whereas the better-known systems have a picture of their logo. Most of the pages in this corpus have a train as their first image, but pages such as the New York City Subway and the London Underground are clearly marked by their logos rather than by arriving trains. To me this is somewhat counterintuitive, as I would expect lesser-known systems to be better identified by their logos, and better-known systems by their trains. Another trend I noticed was that the newer systems that have developed in recent decades, particularly in China, all have a map of the system as their first image rather than a train or a logo. I can speculate that because these systems are so new, many of their Wikipedia pages were created around the same time (around 2009) and thus in similar formats. In addition, many of these systems are continuing to develop, and having a map of the system as the identifier is helpful in tracking their evolution.

To examine changes over time in the first paragraph, I decided to focus on the Berlin U-Bahn page. I found that in the first few years after the creation of the page, the first paragraph mentions the Berlin S-Bahn and the two networks’ relationship. However, from 2006 to 2015 the paragraph makes no mention of the S-Bahn network and focuses only on characteristics of the U-Bahn. In German-speaking countries, the S-Bahn refers to trains that move from the suburbs to the city center and the U-Bahn refers to trains that move within the city center. Given that the S-Bahn reference was removed for close to a decade, I could speculate that the authors of the page found it less relevant to mention the other system so early on in the page. To make a better comparison, I added the Berlin S-Bahn page to my corpus to see whether it mentions the U-Bahn in its first paragraph. For every year, the S-Bahn page does mention the U-Bahn. This made me think that the U-Bahn is possibly a more freestanding entity, considering that it moves within the actual city, whereas the S-Bahn is dependent on the existence of the U-Bahn system since it stops in only a few places within the city of Berlin.

Overall, my analysis of this corpus touches on four main themes: the relationship between system history and Wikipedia page development, the importance of system aesthetics to the Wikipedia page, the relationship between time of page creation and similarity to other pages, and the way in which different systems relate to each other as indicated by their Wikipedia pages. I found that this corpus in particular exhibited high similarity between pages related to the same country, which may reflect the nationalist nature of Wikipedia as a whole. Presumably it is mainly individuals who have close experience with the country in question who are capable of writing the most in-depth Wikipedia pages. As a result, pages on the same topic for a given country may carry these innate similarities.