PG: Securing E-Mail with Automated Learning (SS24)

E-Mail is used on a daily basis. Most websites require an E-Mail address at registration, and that address often serves as the mechanism to recover the account. Further, a lot of personal and confidential data is exchanged by E-Mail. Because of this, E-Mail accounts and all transmitted mails are of immense value to attackers and should be protected. To this end, multiple security mechanisms have been added to the E-Mail protocols (IMAP, POP3, SMTP) over the years. The most notable mechanisms are TLS (IMAPS, POP3S, SMTPS) and End-to-End encryption (PGP, S/MIME).

These security mechanisms were not available from the start, but were added after the fact. The mechanisms therefore had to be designed with backwards compatibility in mind. In the case of TLS, this resulted in two modes: explicit TLS ("STARTTLS") and implicit TLS. Implementations then need to handle unencrypted connections, explicitly encrypted connections, and implicitly encrypted connections. This adds even more complexity and leaves room for bugs [1].
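
To make the added complexity concrete, the following is a minimal sketch of one bug class that the plaintext-to-TLS upgrade makes possible: commands that arrive in plaintext together with STARTTLS stay in the read buffer and are processed as if they had arrived over TLS. The class `ToySmtpServer` and its flag `flush_on_starttls` are hypothetical names for illustration; the model is an assumption and does not reproduce any specific implementation or finding.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToySmtpServer:
    """Toy model of a STARTTLS-capable server (hypothetical, not a real SMTP stack)."""

    flush_on_starttls: bool  # True models a server that discards buffered plaintext
    encrypted: bool = False
    buffer: bytes = b""
    # Each entry: (command, was the channel considered encrypted when it was processed?)
    log: List[Tuple[str, bool]] = field(default_factory=list)

    def receive(self, data: bytes) -> None:
        """Append one network read to the buffer and process all complete lines."""
        self.buffer += data
        while b"\r\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\r\n", 1)
            command = line.decode("ascii").upper()
            self.log.append((command, self.encrypted))
            if command == "STARTTLS" and not self.encrypted:
                # The TLS handshake would take place here.
                self.encrypted = True
                if self.flush_on_starttls:
                    # Drop everything that was received before the handshake.
                    self.buffer = b""


# An attacker on the network appends a command to the plaintext STARTTLS line.
injected = b"STARTTLS\r\nNOOP\r\n"

buggy = ToySmtpServer(flush_on_starttls=False)
buggy.receive(injected)

fixed = ToySmtpServer(flush_on_starttls=True)
fixed.receive(injected)

print(buggy.log)
print(fixed.log)
```

In the buggy variant, the injected command is logged as processed on an encrypted channel although it was sent in plaintext; the flushing variant never processes it.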

In addition to this, TLS itself is already a complex protocol with its own state machine. Joeri de Ruiter has analyzed the state machines of TLS in the past and uncovered flaws in several implementations [2].

There have also been attacks on End-to-End encryption. For example, EFAIL [3] uncovered several implementation flaws in clients that handle HTML mails. Other attacks target the visual representation of End-to-End encryption in E-Mail clients [4].

Other attacks do not target the encryption itself, but allow sending E-Mails with a spoofed sender because servers do not follow the protocol exactly [6].
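
As an illustration of how small protocol deviations can enable spoofing, the sketch below models two servers that disagree on where the SMTP DATA section ends. It assumes that the outbound server only accepts the standard end-of-data sequence `<CR><LF>.<CR><LF>`, while the inbound server additionally accepts the non-standard `<LF>.<LF>`. The function `split_messages` and the example addresses are hypothetical, and the model is heavily simplified compared to the attack described in [6].

```python
from typing import List, Sequence

STANDARD_END = b"\r\n.\r\n"  # end-of-data sequence defined by SMTP
LENIENT_END = b"\n.\n"       # non-standard sequence that some parsers might accept


def split_messages(stream: bytes, markers: Sequence[bytes]) -> List[bytes]:
    """Split a raw DATA byte stream at the earliest occurrence of any marker."""
    parts = []
    rest = stream
    while rest:
        hits = []
        for marker in markers:
            pos = rest.find(marker)
            if pos != -1:
                hits.append((pos, marker))
        if not hits:
            parts.append(rest)  # no terminator left, keep the remainder as-is
            break
        pos, marker = min(hits, key=lambda hit: hit[0])
        parts.append(rest[:pos])
        rest = rest[pos + len(marker):]
    return parts


# One message body as submitted by the attacker to the outbound server.
stream = (
    b"Subject: hi\r\n\r\nhello"
    b"\n.\n"                                  # ignored by the strict parser
    b"MAIL FROM:<ceo@example.org>"            # treated as body text or as a command
    b"\r\n.\r\n"
)

seen_by_outbound = split_messages(stream, [STANDARD_END])
seen_by_inbound = split_messages(stream, [STANDARD_END, LENIENT_END])

print(len(seen_by_outbound), len(seen_by_inbound))
```

The strict parser sees a single message, whereas the lenient parser ends the message early and interprets the remaining bytes as a new SMTP command with an attacker-chosen sender.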

All of the stated attacks [1, 3, 4, 6] were found and tested (semi-)manually. The goal of this PG is to test E-Mail servers and clients automatically. We want to reproduce prior findings and possibly discover new issues with our automated approach.

High-Level Goals

  • Build a library of servers and clients that can be evaluated automatically
  • Implement a driver to automatically control servers and clients
  • Learn state machine of servers and clients
    • Goal: Find bugs from [1] automatically
  • Optional: Fuzz S/MIME and PGP implementations (esp. GUI)
    • Also some manually defined test cases
    • Should be automatically evaluated
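
The idea behind learning and checking a state machine can be sketched with a toy example. The code below does not implement a real learning algorithm such as L*; it simply queries a black-box system with every input word up to a fixed length and checks the recorded outputs against a policy. `ToyServer`, its three abstract input symbols, and the policy "AUTH must not succeed before STARTTLS" are assumptions made for illustration only.

```python
from itertools import product

ALPHABET = ("STARTTLS", "AUTH", "MAIL")  # abstract input symbols (assumed)


class ToyServer:
    """Hypothetical system under test; answers every symbol with OK or ERR."""

    def __init__(self, require_tls_for_auth: bool):
        self.require_tls_for_auth = require_tls_for_auth
        self.tls = False
        self.authenticated = False

    def step(self, symbol: str) -> str:
        if symbol == "STARTTLS":
            if self.tls:
                return "ERR"
            self.tls = True
            return "OK"
        if symbol == "AUTH":
            if self.require_tls_for_auth and not self.tls:
                return "ERR"
            self.authenticated = True
            return "OK"
        if symbol == "MAIL":
            return "OK" if self.authenticated else "ERR"
        return "ERR"


def explore(make_sut, alphabet, max_len):
    """Query a fresh instance with every word up to max_len and record the outputs."""
    table = {}
    for length in range(1, max_len + 1):
        for word in product(alphabet, repeat=length):
            sut = make_sut()  # reset: every word starts from the initial state
            table[word] = tuple(sut.step(symbol) for symbol in word)
    return table


def plaintext_auth_words(table):
    """Return all words in which AUTH succeeded before any successful STARTTLS."""
    hits = []
    for word, outputs in table.items():
        tls = False
        for symbol, output in zip(word, outputs):
            if symbol == "STARTTLS" and output == "OK":
                tls = True
            if symbol == "AUTH" and output == "OK" and not tls:
                hits.append(word)
                break
    return hits


strict_table = explore(lambda: ToyServer(True), ALPHABET, 2)
buggy_table = explore(lambda: ToyServer(False), ALPHABET, 2)
print(plaintext_auth_words(strict_table))
print(plaintext_auth_words(buggy_table))
```

A real setup would replace `ToyServer` with a driver that talks to an actual server or client and would use a proper learning algorithm instead of exhaustive enumeration, which grows exponentially with the word length.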

Planned Steps and Probable Obstacles

This is not a fixed plan. We may diverge from it, adjust it, or extend it. Some steps can, or definitely should, be carried out in parallel.

  1. Create docker/podman/... library of E-Mail servers and clients
    • Clients often rely on GUIs
    • GUIs in a container are complicated
    • Some servers or clients only run on Windows or macOS
    • Android would be interesting too
  2. Classify which servers exist in the wild
    • Large-scale scan using zgrab (+zdns and/or zmap)
  3. Create an interface to control the clients inside the containers/...
    • Programmatically controlling a GUI is harder than a CLI
    • It might make sense to adopt existing protocols (e.g., something WebDriver-like; Microsoft did this with WinAppDriver, a now dead project...)
  4. Allow defining unit tests against clients/servers and run them automatically against multiple implementations
    • Recreate some findings manually to verify test execution
  5. Recreate prior findings using fuzzing
    • (Most likely we will focus on one or two of the following points)
    • Learn state machines of servers and clients (cf. [1,2])
    • Fuzz mails with GUI feedback (cf. [4,5])
    • Recreate SMTP Smuggling [6]
      • May require configuring multiple SMTP servers after each other.
    • Maybe: Recreate EFAIL [3]
      • This may be hard as the feedback for the fuzzer would be the called backchannel.
  6. Optional: Fuzz S/MIME and PGP implementations
    • CLIs and GUIs
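
Step 4 above, running the same test definitions against multiple implementations, could look roughly like the following sketch. All names (`ProtocolTest`, `run_matrix`, the two stand-in servers) are hypothetical; in the project, each target would wrap a containerised server or client instead of a Python function, and the bare `AUTH` command is a simplification of the real SMTP exchange.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# A target takes a command sequence and returns one reply code per command.
Target = Callable[[Sequence[str]], List[str]]


@dataclass(frozen=True)
class ProtocolTest:
    name: str
    commands: Tuple[str, ...]
    expected: Tuple[str, ...]


def run_matrix(
    targets: Dict[str, Target], tests: List[ProtocolTest]
) -> Dict[Tuple[str, str], bool]:
    """Run every test against every target; True means the replies matched."""
    results = {}
    for target_name, send in targets.items():
        for test in tests:
            actual = tuple(send(test.commands))
            results[(target_name, test.name)] = actual == test.expected
    return results


def strict_server(commands: Sequence[str]) -> List[str]:
    """Stand-in that refuses AUTH before STARTTLS (reply 530)."""
    tls = False
    replies = []
    for command in commands:
        if command == "STARTTLS":
            tls = True
            replies.append("220")
        elif command == "AUTH":
            replies.append("235" if tls else "530")
        else:
            replies.append("500")
    return replies


def lenient_server(commands: Sequence[str]) -> List[str]:
    """Stand-in that accepts AUTH regardless of the TLS state."""
    codes = {"STARTTLS": "220", "AUTH": "235"}
    return [codes.get(command, "500") for command in commands]


TESTS = [
    ProtocolTest("auth-needs-tls", ("AUTH",), ("530",)),
    ProtocolTest("auth-after-starttls", ("STARTTLS", "AUTH"), ("220", "235")),
]

RESULTS = run_matrix({"strict": strict_server, "lenient": lenient_server}, TESTS)
for (target, test), passed in sorted(RESULTS.items()):
    print(f"{target:8} {test:20} {'pass' if passed else 'FAIL'}")
```

Keeping the test definitions as plain data makes it easy to add new implementations later without touching the tests themselves.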

Prerequisites

  • Programming skills
    • Language is somewhat open to discussion
    • Java/Kotlin will most likely be used to integrate with the TLS-Attacker tools
    • Python may be used because of prior tools
  • Interest in Fuzzing/State Learning
  • Interest in E-Mail Protocols
  • Recommended: Finished RWCE Lecture

References

[1]: Why TLS is better without STARTTLS: A Security Analysis of STARTTLS in the Email Context

[2]: Protocol State Fuzzing of TLS Implementations

[3]: EFAIL

[4]: “Johnny, you are fired!” – Spoofing OpenPGP and S/MIME Signatures in Emails

[5]: Security Analysis of the 3MF Data Format

[6]: SMTP Smuggling (info website, 37C3 website, 37C3 video)