DAITA v1 and v2 defenses

At the time of writing, Mullvad is transitioning DAITA from v1 to v2. The latest releases on all platforms now fully support v2. Besides some UX improvements to the Mullvad VPN app, defense updates are mostly transparent to users. The move to v2 should lead to significantly better performance, especially on mobile.

I collaborate closely with Mullvad on DAITA. This blog post shares some details on how DAITA v1 works and what is new in v2. I’m sorry in advance if it gets a bit specialized/academic at times. If you’d like to discuss DAITA, please reach out! I’m happy to answer questions and discuss.

What is a defense?

Before getting into v1 and v2, we need to define what a defense is exactly. This is a super-important detail with a twist later.

DAITA builds on Maybenot, a framework for traffic analysis defenses. One instance of Maybenot runs at the client and another at the VPN server (or Tor relay, or HTTPS server, etc.).

An instance of Maybenot takes as input a list of probabilistic state machines and limits on padding and blocking actions. The limits can stop machines from sending padding traffic or blocking outgoing traffic. Machines encode logic for when to pad and/or block. Currently, Mullvad’s integration of Maybenot does not support blocking, but can get similar effects by spamming padding.

A defense (note singular) consists of the inputs to the instances of Maybenot at the client and the VPN server/relay. These inputs are different. Clients and relays run different machines with different limits. This is especially important because typically clients download a lot more than they upload.
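To make the definition concrete, here is a minimal Python sketch of a defense as a pair of per-side inputs. The class names, field names, and example numbers are all hypothetical and do not mirror Maybenot's actual Rust API. They only illustrate that one defense bundles different machines and limits for each side.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Machine:
    """Placeholder for one probabilistic state machine (details omitted)."""
    name: str


@dataclass(frozen=True)
class Limits:
    """Hypothetical caps on how much one side may pad or block."""
    max_padding_frac: float   # upper bound on padding as a fraction of traffic
    max_blocking_frac: float  # upper bound on the fraction of time spent blocking


@dataclass(frozen=True)
class SideConfig:
    """Everything one Maybenot instance takes as input."""
    machines: Tuple[Machine, ...]
    limits: Limits


@dataclass(frozen=True)
class Defense:
    """A single defense: one input set for the client, another for the relay."""
    client: SideConfig
    relay: SideConfig


# Illustrative values only. Blocking is set to 0.0 on both sides because
# Mullvad's integration does not support blocking.
example = Defense(
    client=SideConfig((Machine("client-a"), Machine("client-b")), Limits(0.5, 0.0)),
    relay=SideConfig((Machine("relay-a"),), Limits(0.2, 0.0)),
)
```

The point of the sketch is the shape: "a defense" always means both halves together, even though each side only ever runs its own half.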

DAITA v1 defenses

For DAITA v1, we have a single hardcoded configuration at the client that is shared among multiple defenses (that differ at the relay).

Client-side

At the client, we hardcoded four state machines:

1. Sends a padding packet if no data has been sent on the tunnel in [1.5, 9.5] seconds.
2. Ensures that a packet is sent for every 3rd packet received.
3. Ensures that a packet is sent for every 5th packet received.
4. A machine that with some probability sends padding packets after randomized delays.

Machine 1 collapses NetFlow records by ensuring that the flow appears active, which reduces the resolution of these records. It is based on Tor’s connection-level padding.
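The idle timer behind machine 1 can be sketched in a few lines of Python. This is a toy model, not the deployed machine. It assumes that the timeout is sampled uniformly from [1.5, 9.5] seconds and that the timer restarts after every sent packet, real or padding. The function name and parameters are made up for illustration.

```python
import random
from typing import List


def idle_padding_times(real_sends: List[float], end: float,
                       rng: random.Random,
                       lo: float = 1.5, hi: float = 9.5) -> List[float]:
    """Return the times at which the idle timer would inject padding.

    real_sends: sorted timestamps (seconds) of real outgoing packets.
    end: time at which the connection closes.
    """
    padding: List[float] = []
    pending = list(real_sends)            # copy so the caller's list is untouched
    now = 0.0
    deadline = now + rng.uniform(lo, hi)  # assumption: uniform timeout
    while True:
        next_real = pending[0] if pending else float("inf")
        if min(next_real, deadline) > end:
            break
        if next_real <= deadline:
            now = pending.pop(0)          # real packet sent before the timer fired
        else:
            now = deadline                # tunnel was idle: send padding
            padding.append(now)
        deadline = now + rng.uniform(lo, hi)  # restart the timer after any send
    return padding


rng = random.Random(0)
real = [0.0, 0.5, 1.0, 30.0]
pads = idle_padding_times(real, 40.0, rng)
```

In this model, the long silence between 1.0 s and 30.0 s gets filled with padding so that no gap between consecutive outgoing packets exceeds 9.5 seconds.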

Machines 2 and 3 aim to make the client’s outgoing traffic a function of the incoming traffic. This is inspired by the DynaFlow/RegulaTor clients. If the relationship between incoming and outgoing traffic were perfect, there would be no extra information in the outgoing traffic. This is a neat design for website fingerprinting defenses that can block and that are simulated. However, neither holds for DAITA.

You might have spotted above that if machine 2 sends a packet for every 3rd packet, then machine 3 would never send anything (since it would be reset by machine 2, never leading to 5 packets received without having sent anything). In isolation, this is correct (for example, if you simulate the machines). However, the integration of Maybenot for DAITA v1 (particularly on Windows) aggregates the events that WireGuard reports to Maybenot and adds small, integration-specific delays to that reporting. Machine 3 is therefore useful whenever aggregation takes place.
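A toy simulation shows the effect. The model below is my simplification, not the Maybenot integration: received packets arrive in batches, a machine fires when it has counted N received packets since the last send it knows about, and the resulting "sent" event only reaches the other machines once the current batch has been processed. The function name and the batch sizes are hypothetical.

```python
from typing import Dict, List, Tuple


def simulate(batches: List[int], thresholds: Tuple[int, ...] = (3, 5)) -> Dict[int, int]:
    """Count how often each "every Nth packet" machine fires.

    batches: number of received packets reported together in each batch.
    """
    counts = {n: 0 for n in thresholds}  # packets received since last known send
    fired = {n: 0 for n in thresholds}   # padding packets triggered per machine
    for batch in batches:
        sent_in_batch = False
        for _ in range(batch):
            for n in thresholds:
                counts[n] += 1
                if counts[n] == n:
                    fired[n] += 1
                    counts[n] = 0        # the firing machine restarts right away
                    sent_in_batch = True
        if sent_in_batch:
            # The delayed "sent" report resets every machine's counter.
            counts = {n: 0 for n in thresholds}
    return fired


print(simulate([1] * 30))  # no aggregation: {3: 10, 5: 0}
print(simulate([6] * 5))   # batches of six: {3: 10, 5: 5}
```

With one event per batch, machine 3 (threshold 5) never fires, as argued above. Once six receive events are reported together, it fires once per batch.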

Instead of fighting event aggregation, integration delays, and the lack of blocking of outgoing traffic for DAITA v1, we turn this into a feature by embracing randomized defenses (unlike DynaFlow and RegulaTor, which aim to “regularize” traffic). This is where machine 4 comes in. Machine 4 can both create completely fake traffic when the connection is otherwise idle and add additional padding when real traffic is running.

Relay-side

The relay has multiple defenses. In DAITA v1, the relay randomly picks from many defenses. Each of those defenses has the same client-side machines and configuration, as just described. The relay side differs between defenses, though, and consists of three state machines per defense.

In short, each relay-side defense consists of a common NetFlow machine and an Interspace-inspired padding machine. Interspace was evaluated a couple of years ago as a decent website fingerprinting defense. The third machine in each defense is unique to that defense, generated to improve the accuracy reduction against Deep Fingerprinting (DF) and Robust Fingerprinting (RF), and not shared with clients.

Note here that the simple client machines will now behave differently for each of the many defenses, because the client-side machines are event driven by input (padding) from the many randomized relay machines.

We shared an evaluation of the first eight DAITA servers a while back. While DAITA is far from perfect, it is clearly much better than WireGuard without DAITA, even in the strong scenario evaluated. I write “strong scenario” because the assumed adversary has trained a classifier that specifically targets DAITA v1, takes the peculiarities of DAITA into account, and faces no background traffic from the client (Spotify, other active tabs in the browser, etc.).

Sorry for being a bit vague above on the design of the relay-side of defenses, but it is intentional. Unlike traditional settings of traffic analysis defenses, here, the defenses are actually more akin to cryptographic keys than cryptographic algorithms. There is value to an adversary targeting DAITA v1 to get access to the machines. This brings us to v2!

DAITA v2 defenses

For DAITA v2, we’ve made some updates. There is now a simple protocol between client and relay to set up the defense when the connection is established. This is similar to how post-quantum secure tunnels are negotiated (some details indirectly here).

When a v2 client negotiates DAITA, the relay will select from a database of defenses and send appropriate machines and limits (padding budgets) to the client. No more hardcoded machines at the client. Defenses are dynamic!
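The flow can be sketched as follows. Everything specific here is an assumption made for illustration: the database layout, the uniform random selection, the JSON encoding, and the function names. The actual protocol and wire format are not described in this post.

```python
import json
import random
from typing import Dict, List, Tuple

# Hypothetical database. Real machines would be serialized state machines;
# here they are opaque strings.
DATABASE: List[Dict] = [
    {"client": {"machines": ["c0-m1", "c0-m2"], "max_padding_frac": 0.4},
     "relay": {"machines": ["r0-m1", "r0-m2"], "max_padding_frac": 0.2}},
    {"client": {"machines": ["c1-m1"], "max_padding_frac": 0.3},
     "relay": {"machines": ["r1-m1"], "max_padding_frac": 0.1}},
]


def relay_select(db: List[Dict], rng: random.Random) -> Tuple[Dict, bytes]:
    """Relay side: pick a defense, keep the relay half, serialize the client half."""
    defense = rng.choice(db)  # assumption: uniform selection
    message = json.dumps(defense["client"]).encode()
    return defense["relay"], message


def client_apply(message: bytes) -> Dict:
    """Client side: load whatever machines and limits the relay sent."""
    return json.loads(message.decode())


rng = random.Random(7)
relay_cfg, msg = relay_select(DATABASE, rng)
client_cfg = client_apply(msg)
```

In this sketch only the client half of the selected defense crosses the wire, and the client needs no built-in knowledge of any machine.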

There are many advantages here. For one, we now have an architecture that enables us to continuously improve our traffic analysis defenses and rapidly get them out to DAITA users. There is an inherent trade-off in the space between overhead costs and protection. Attacks are getting much better. Internet connections are getting faster. There is much work here to be done to optimize the user experience and protection against state-of-the-art attacks.

Another big advantage is that dynamic defenses are a novel defense strategy. Typically, in good adherence to Kerckhoffs’s principle, we consider defenses to be public and available to adversaries. With (unpublished) methods, we can generate many effective and efficient defenses. This allows us to deprive the adversary of direct access to defenses as a form of defense in depth. It also opens the door to weaponizing drift. Dynamic defenses fit the VPN setting perfectly, since trust is already placed in VPN servers.

Suppose an adversary spends significant resources on tailoring a traffic analysis attack to DAITA. By periodically updating our database of defenses, we invalidate their models for attacking future traffic. Keeping an attack up to date requires more effort (and Mullvad VPN subscriptions 😉).

We still have some way to go to reach fully ephemeral (databases of) defenses, but DAITA v2 is an important stepping stone. Today, there are many more defenses in DAITA v2 than in v1. The defenses will change over time. If you are a researcher evaluating DAITA, I recommend that you keep detailed notes about your experiments.

When we evaluate DAITA, we still always give the adversary full access to defenses. Here are evaluation results comparing DAITA v1 and v2 defenses in November 2024, simulated on the BigEnough dataset:

In short, v2 gives the same level of protection (y-axis) as v1, but at half the average bandwidth overhead (x-axis). The defenses deployed in November 2024, when we started beta-testing v2, are circled.

Wrapping up

DAITA v2 improves on v1 with dynamic defenses. Performance should be better on average due to the reduced bandwidth overhead, especially on mobile. Because defenses are dynamic, if you end up with a slow connection, just reconnect, as you would with an overloaded VPN relay. You’ll get a new DAITA defense (with overwhelming probability).

We are wrapping up an academic paper on the details and theory behind ephemeral defenses. It will be open access. Together with the paper, we will release a library and a CLI tool for creating them. Just like Maybenot, both will be open source and available as Rust crates.