BA: Censorship Probe Scanner with Recommendations

Abstract:

Internet censorship disrupts the free flow of information, impacting users worldwide by restricting access to content and communication channels. This thesis presents the development and evaluation of ProbeScanner, a tool designed to detect, analyze, and circumvent Internet censorship mechanisms. Implemented in Kotlin and leveraging the TLS-Attacker framework, ProbeScanner employs a modular architecture with probes targeting the Domain Name System (DNS), Transmission Control Protocol (TCP), and Transport Layer Security (TLS) protocols. It systematically applies 52 different manipulations across several categories, including extension, message, record, Server Name Indication (SNI), and version manipulations, to identify blocked content and assess the effectiveness of circumvention strategies. A key feature of ProbeScanner is its graphical user interface, which lets users enter websites for analysis and configure scanning parameters. The tool not only detects censorship but also recommends circumvention tools based on the manipulations that succeeded, so that users can bypass restrictions with methods suited to the specific censorship they encounter. We evaluate the effectiveness of ProbeScanner through testing in controlled environments, including China’s Great Firewall (GFW), one of the most sophisticated censorship systems globally. The evaluation was conducted from a vantage point in Zhengzhou, China, where ProbeScanner scanned 2,572 domains from the Citizen Lab and Tranco lists. Among these, 714 were censored, and ProbeScanner successfully circumvented censorship for 79 of them. Although connection issues limited the size of the data set, the results show that manipulations such as TCP fragmentation and SNI modifications exploit vulnerabilities in censorship infrastructure.
These findings provide insights into the behavior of modern censorship systems and validate the practical application of targeted circumvention techniques. A comparative analysis with GFWeb, a large-scale GFW measurement platform, shows that while GFWeb excels in long-term, large-scale data collection and trend analysis, ProbeScanner focuses on immediate user-level detection and intervention. Insights from recent research, including techniques such as TLS record fragmentation and tools such as DPYProxy, further support the efficacy of these strategies. This work bridges the gap between theoretical research and practical application in Internet censorship analysis. ProbeScanner provides effective detection and circumvention strategies but is limited by its focus on TLS protocols. Future enhancements may include support for protocols such as Datagram Transport Layer Security (DTLS) and Quick UDP Internet Connections (QUIC), as well as the integration of machine learning for advanced detection. These improvements aim to support access to an open Internet, particularly in countries such as China, where censorship is prevalent.