MA: Analyzing Censorship Behavior: Differences in and between Autonomous Systems

Abstract:

Many state actors around the world perform some form of censorship; this thesis focuses on the censorship of websites. Research on website censorship and censorship circumvention has become increasingly important in recent years. One of the main challenges in measuring and analyzing censorship is that it is often unclear at which specific point censorship is applied. Another challenge is that it is often difficult to identify patterns or autonomous systems that behave similarly within a country. In this Master’s thesis, we tackle those challenges. For this purpose, we utilize ZMap and extend Censor Scanner as scanning tools to obtain censorship measurements, which are then analyzed with a newly proposed automatic analysis. This analysis utilizes various data analysis techniques, such as clustering, which we use to find patterns and similarly behaving autonomous systems as well as potential outliers with unique censorship behavior. We demonstrate the applicability of clustering by using it in censorship analyses of Russia, India, and the USA.

Our analysis reveals new insights into specific countries as well as into general censorship behavior. We find that in none of the three countries does a central entity control all censorship. Instead, with the help of our automatic analysis, we identify autonomous systems as a point of control for India and the USA. The reason is that, in these two countries, the Echo IPs within an autonomous system are very consistent and behave the same the majority of the time. For Russia, the autonomous systems are less stable overall, which is why we are not confident in identifying autonomous systems as a point of control for this country. Our analysis additionally reveals four main censorship patterns. The first pattern, which occurs often, comprises autonomous systems that show little or no censoring behavior. The second pattern comprises private autonomous systems that censor everything. The third pattern is present when an autonomous system specifically censors websites that the government has designated for censorship; this pattern is particularly visible in the case of Russia. Finally, as a fourth pattern, we reveal autonomous systems that censor specific websites for their own gain. In the case of India in particular, we identify an autonomous system that censors specific websites for financial gain. The obtained results show that the newly proposed automatic analysis, with the help of clustering, works well and can detect censorship patterns as well as the unique censorship behavior of outliers.
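
To illustrate the idea of grouping similarly behaving autonomous systems and flagging outliers, the following is a minimal sketch and not the analysis implemented in the thesis. It assumes that each autonomous system is summarized by a vector of per-website blocking rates between 0 and 1, and it uses a simple threshold-based single-linkage grouping as a stand-in for whichever clustering technique the thesis applies. The names `distance`, `cluster_systems`, and `threshold`, as well as the AS labels and all values, are hypothetical.

```python
from itertools import combinations


def distance(a, b):
    """Mean absolute difference between two blocking-rate vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cluster_systems(profiles, threshold=0.2):
    """Group systems whose profiles lie within `threshold` of each other.

    This is single-linkage grouping implemented with a union-find
    structure: two systems share a cluster if a chain of sufficiently
    similar systems connects them.
    """
    parent = {name: name for name in profiles}

    def find(name):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(profiles, 2):
        if distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical blocking rates for four test websites per autonomous system.
profiles = {
    "AS-A": [0.0, 0.0, 0.1, 0.0],  # barely any censorship
    "AS-B": [0.0, 0.1, 0.0, 0.0],  # barely any censorship
    "AS-C": [1.0, 1.0, 1.0, 1.0],  # blocks everything
    "AS-D": [0.0, 1.0, 0.0, 1.0],  # blocks a specific subset
    "AS-E": [0.1, 0.9, 0.0, 1.0],  # blocks the same subset
}

clusters = cluster_systems(profiles)
# Clusters with a single member are treated as outliers in this sketch.
outliers = [members[0] for members in clusters if len(members) == 1]

print(clusters)
print(outliers)
```

In this sketch, systems with similar blocking profiles end up in the same group, while a system that matches no other one forms a singleton cluster and is reported as an outlier. The threshold is an arbitrary choice here and would need tuning against real measurements.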