15:59:43 <cohosh> #startmeeting tor anti-censorship meeting
15:59:43 <MeetBot> Meeting started Thu Sep 23 15:59:43 2021 UTC.  The chair is cohosh. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:59:43 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
15:59:45 <cohosh> hello!
15:59:49 <cohosh> and welcome :)
15:59:57 <meskio> hello
15:59:58 <cohosh> here is our meeting pad: https://pad.riseup.net/p/tor-anti-censorship-keep
16:00:24 <cohosh> please feel free to add items to the agenda
16:02:33 <cohosh> dcf1: is the first item on the agenda yours for this week?
16:02:46 <dcf1> yes, just a quick followup to what we discussed in last week's meeting
16:03:06 <dcf1> the mysterious error message was caused by using a git that was too old (circa 2010)
16:03:29 <dcf1> the error message only included goptlib because it was first in the go.mod file, but it would have happened with any other module
16:03:37 <cohosh> :o
16:03:38 <dcf1> that is all
16:03:38 <meskio> wow, I didn't expect that
16:04:17 <cohosh> awesome, heh
16:04:28 <meskio> I guess it was a headache to find out
16:04:39 <dcf1> yes it was quite a debugging adventure
16:05:18 <cohosh> XD
16:05:37 <cohosh> anything else before we move on to the reading group?
16:08:09 <meskio> I think we can move on
16:08:16 <cohosh> okay great
16:08:28 <dcf1> I invited Sambhav, they may be here
16:08:43 <meskio> nice
16:08:43 <cohosh> :)
16:09:37 <cohosh> should we wait a few minutes for them?
16:09:38 <dcf1> https://dl.acm.org/doi/10.1145/3473604.3474564 is the PDF
16:09:49 <dcf1> posted a summary this morning https://github.com/net4people/bbs/issues/86
16:09:53 <alladin> That's me. I'm happy to try and answer any questions you might have.
16:10:13 <cohosh> hey alladin welcome :)
16:10:15 <dcf1> coolio
16:10:26 <meskio> hello o/
16:10:37 <alladin> Hi everyone o/
16:10:45 <dcf1> So the core of this work is: do the initial handshake over an unobservable proxy, then immediately resume it without the proxy
16:11:14 <dcf1> the TLS handshake has most of the identifying features (DNS, SNI, cert), a session resumption has fewer (just SNI)
16:11:37 <dcf1> many servers tolerate using a different SNI during resumption, which is how you get around that
16:12:06 <meskio> I think the TLS resumption trick is pretty neat, I didn't know actual server implementations would reply properly even if you don't provide the right domain in the resumption SNI :)
16:12:27 <cohosh> +1
16:12:39 <dcf1> Some of the RFC language is https://datatracker.ietf.org/doc/html/rfc6066#section-3
16:13:05 <dcf1> "The client SHOULD include the same server_name extension in the session resumption request as it did in the full handshake that established the session. A server that implements this extension MUST NOT accept the request to resume the session if the server_name extension contains a different name."
16:13:30 <dcf1> So by my reading, it's actually non-standards-conforming for the server to accept a modified SNI, but:
16:13:56 <dcf1> according to the RFC for TLS 1.3, there has historically been confusion on that issue
16:14:33 <dcf1> https://datatracker.ietf.org/doc/html/rfc8446#section-4.2.11
16:15:15 <dcf1> "In TLS versions prior to TLS 1.3, the Server Name Identification (SNI) value was intended to be associated with the session, with the server being required to enforce that the SNI value associated with the session matches the one specified in the resumption handshake.  However, in reality the implementations were not consistent on which of two supplied SNI values they would use, leading to the
16:15:21 <dcf1> consistency requirement being de facto enforced by the clients."
16:15:43 <dcf1> Confusion like this may partially explain why BlindTLS was tested to work with some TLS servers and not others.
16:16:09 <alladin> It was precisely this line which led to the idea.
16:16:13 <meskio> the paper shows a ~50% success rate
16:16:41 <dcf1> I think "which of two supplied SNI values" = 1. the SNI stored with the session by the server, 2. the SNI sent by the client with the session resumption
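The two server behaviours discussed above can be sketched as a toy model. This is only an illustration of the RFC 6066 rule versus a lenient server, not code from the paper or from any TLS library; `StoredSession` and `accepts_resumption` are hypothetical names, and the domains are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredSession:
    """What a server remembers about a session (hypothetical, simplified)."""
    session_id: bytes
    sni: str  # SNI sent in the full handshake that established the session


def accepts_resumption(stored: StoredSession, resumption_sni: str, strict: bool) -> bool:
    """Decide whether a resumption request is accepted.

    strict=True  models the RFC 6066 section 3 rule: the server MUST NOT
                 resume if the server_name differs from the original one.
    strict=False models a lenient server that does not compare the two
                 SNI values, which is the behaviour BlindTLS relies on.
    """
    if strict:
        return resumption_sni == stored.sni
    return True


# Placeholder domains, for illustration only.
session = StoredSession(session_id=b"\x01\x02", sni="blocked.example")
print(accepts_resumption(session, "innocuous.example", strict=True))   # False
print(accepts_resumption(session, "innocuous.example", strict=False))  # True
```

The roughly 50% success rate mentioned above would correspond to about half of the tested servers behaving like the `strict=False` case, under this simplified view.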
16:17:16 <dcf1> alladin: in your experiments, did you limit to TLS 1.2 servers, or might there have been TLS 1.3 servers in the pool as well?
16:17:28 * meskio is amazed by all this digging into the details of the RFCs to come up with these kinds of hacks :)
16:18:28 <alladin> I just had openssl s_client use TLS 1.2. There might have been TLS 1.3-only servers in the pool, but I don't think that's very likely
16:18:44 <dcf1> ok, that's reasonable
16:19:54 <meskio> I found it pretty interesting that the ISP tested was not blocking the client hello requests based on the SNI, but on the domain name in the cert
16:20:09 <cohosh> is the first "bootstrapping" step where the original DNS + TLS connection happens something that can be done over DoH or AMP cache or domain fronting?
16:20:10 <meskio> I'm wondering how many other domains they block because of certs having multiple domains on them
16:20:27 <dcf1> yeah that detail was wild
16:20:48 <dcf1> it looks like patches on top of patches on the detector side
16:20:54 <cohosh> lol
16:21:30 <alladin> meskio: so it's a mixed usage. In the TLS 1.2 initial handshake, they ignore the SNI value and instead look at the certificate. In the TLS 1.2 resumption though, they do look at SNI since that's the only defining trait. Hats off to them for thinking of this
16:21:36 <dcf1> cohosh: it can be any kind of tunnel that supports looking up a DNS name and establishing a TLS connection. this paper leads the tunnel open.
16:21:49 <dcf1> *leaves the choice of tunnel open
16:21:55 <alladin> In TLS 1.3 though, it's straight-up filtering on SNI in both init and resumption
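The censor behaviour alladin describes can be written down as a small decision function. This is a toy model of the description in this meeting, assuming a simple set-based blocklist; `censor_blocks` and the example domains are hypothetical, and the real middlebox logic is not known beyond what is said here.

```python
def censor_blocks(tls_version, is_resumption, sni, cert_names, blocklist):
    """Return True if the modeled censor would block this handshake.

    tls_version   -- "1.2" or "1.3"
    is_resumption -- True for a session resumption, False for a full handshake
    sni           -- server name sent in the ClientHello
    cert_names    -- domain names visible in a plaintext certificate
                     (empty when no certificate is visible on the wire)
    blocklist     -- set of blocked domain names
    """
    if tls_version == "1.2" and not is_resumption:
        # TLS 1.2 initial handshake: SNI is ignored, the plaintext
        # server certificate is inspected instead.
        return any(name in blocklist for name in cert_names)
    # TLS 1.2 resumption (no certificate is sent) and every TLS 1.3
    # handshake (certificate is encrypted): only the SNI is checked.
    return sni in blocklist


blocklist = {"blocked.example"}

# Full TLS 1.2 handshake with a spoofed SNI is still caught via the cert.
print(censor_blocks("1.2", False, "innocuous.example", ["blocked.example"], blocklist))  # True
# TLS 1.2 resumption with a spoofed SNI passes, which is the BlindTLS case.
print(censor_blocks("1.2", True, "innocuous.example", [], blocklist))  # False
# TLS 1.3 is filtered on SNI in both cases.
print(censor_blocks("1.3", False, "blocked.example", [], blocklist))  # True
```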
16:22:10 * meskio has servers with multiple domains in the same cert, should stop doing that seeing how they block stuff around
16:22:40 <dcf1> yeah so TLS 1.3 removes one of the features, because it encrypts the server certificate (it's plaintext in 1.2)
16:22:56 <dcf1> DoH/DoT/etc. can additionally remove the DNS feature.
16:23:25 <dcf1> That leads ECH to hide the SNI, in terms of an official TLS extension, but how successful ECH will be remains to be seen
16:24:03 <dcf1> *leaves
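The feature accounting in this part of the discussion (DNS, SNI, and certificate in a full handshake, fewer with resumption, TLS 1.3, DoH/DoT, and ECH) can be summarized in a short sketch. It is a simplification under the assumptions stated in the comments; `visible_features` is a hypothetical helper, not part of any tool mentioned here.

```python
def visible_features(resumption=False, tls13=False, encrypted_dns=False, ech=False):
    """Return the set of identifying features an on-path observer can see.

    Simplified model of the discussion: a full TLS 1.2 handshake with
    plaintext DNS exposes the DNS query, the SNI, and the certificate.
    """
    features = {"dns", "sni", "cert"}
    if resumption:
        # BlindTLS case: the lookup and full handshake went through the
        # proxy, and a resumption carries no certificate, so only SNI is left.
        features -= {"dns", "cert"}
    if tls13:
        features.discard("cert")  # TLS 1.3 encrypts the server certificate
    if encrypted_dns:
        features.discard("dns")   # DoH/DoT hide the DNS query
    if ech:
        features.discard("sni")   # ECH is meant to hide the SNI
    return features


print(sorted(visible_features()))                 # ['cert', 'dns', 'sni']
print(sorted(visible_features(resumption=True)))  # ['sni']
print(sorted(visible_features(tls13=True, encrypted_dns=True, ech=True)))  # []
```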
16:25:26 <cohosh> i only got partially through the paper, but was there any similarity in the sites that did not accept the spoofed SNI value?
16:25:43 <cohosh> like what their tls server implementation was?
16:26:28 <cohosh> it was interesting that the proportion of top alexa sites for which it didn't work was greater than the proportion of randomly selected blocked sites
16:27:07 <alladin> Unfortunately, I do not have the answer to that. I just never got around to digging into it. I think that's a clear next step to understand, and even gain further confidence in, the results
16:27:41 <dcf1> I'm curious what server implementations actually do in TLS 1.3 too
16:28:05 <dcf1> If you make such a survey, it might be interesting to the IETF tls working group
16:28:10 <cohosh> yea good point, do we have 1.3 rollout stats?
16:28:17 <cohosh> like how many sites support it now?
16:29:42 <alladin> re: alexa, I agree. And one possible explanation is perhaps the top 100 having wildly different server configs/implementations, while the blocked ones might share similar configs (understandable given how frequently they have to move domains).
16:29:51 <meskio> I found this, but it is from Dec 2019: https://www.ietf.org/blog/tls13-adoption/
16:32:46 <dcf1> a criticism of this paper I have is the assumption of no IP blocking, either of the TLS server or of the handshake proxy
16:33:02 <dcf1> it is a good and necessary assumption, but I think it can be framed/understood better
16:33:13 <dcf1> this is how I think of it:
16:33:43 <dcf1> suppose you have a highly unblockable proxy, but it has drawbacks: it is slow, expensive, etc.
16:34:35 <dcf1> in fact we have many such proxies, e.g. DNS tunnel, domain fronting
16:35:21 <dcf1> BlindTLS can be seen as an optimization: it is a way of leveraging a high degree of blocking resistance only where it is most needed
16:35:50 <dcf1> use the slow, expensive tunnel to protect the TLS handshake, then do the remainder of the connection direct, for better efficiency
16:36:27 <dcf1> (you still need the assumption of no IP blocking of the TLS server itself, which is an okay assumption in the case of a CDN)
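The efficiency argument can be illustrated with some back-of-the-envelope arithmetic. All numbers below are made up for illustration (handshake size, payload size, and both throughputs are assumptions), and the model ignores latency and the cost of the resumption handshake itself; `tunnel_only` and `tunnel_then_direct` are hypothetical helper names.

```python
def transfer_seconds(num_bytes, bytes_per_second):
    """Time to move num_bytes at a fixed throughput (latency ignored)."""
    return num_bytes / bytes_per_second


def tunnel_only(handshake_bytes, payload_bytes, tunnel_rate):
    """Everything, handshake and payload, goes through the slow tunnel."""
    return transfer_seconds(handshake_bytes + payload_bytes, tunnel_rate)


def tunnel_then_direct(handshake_bytes, payload_bytes, tunnel_rate, direct_rate):
    """Only the handshake uses the slow tunnel; the payload goes direct."""
    return (transfer_seconds(handshake_bytes, tunnel_rate)
            + transfer_seconds(payload_bytes, direct_rate))


# Hypothetical figures: 5 kB handshake, 1 MB payload,
# 10 kB/s through the tunnel, 1 MB/s on the direct path.
print(tunnel_only(5_000, 1_000_000, 10_000))                    # 100.5
print(tunnel_then_direct(5_000, 1_000_000, 10_000, 1_000_000))  # 1.5
```

Under these assumed numbers, protecting only the handshake with the expensive tunnel costs half a second, while the bulk of the transfer runs at the direct-path rate.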
16:36:49 <meskio> from Tor's perspective this is a key point, Tor relays usually get blocked by IP
16:37:24 <meskio> we need to convince cloudflare to run tor relays on their CDN IPs ;D
16:37:51 <dcf1> It's a bit like how, in Snowflake, we use domain fronting or AMP cache (which are slow or expensive) for communication with the broker, then use WebRTC (fast, cheap) for the actual data transfer.
16:38:58 <dcf1> trading off blocking resistance for efficiency
16:39:52 <dcf1> it could also be the case that the handshake proxy's cover protocol remains inconspicuous only when it is used for small data transfers, and it would be detected if used to tunnel a whole connection
16:43:45 <cohosh> dcf1: so to summarize/put another way: slow, careful bootstrapping to enable a high throughput tunnel is an anti-censorship design pattern that this technique uses
16:43:45 <dcf1> those are the points I wanted to cover, are there any others?
16:44:06 <dcf1> cohosh: that's a good way to put it
16:44:44 <dcf1> the security of BlindTLS in some way reduces to the security of the handshake proxy
16:47:48 <cohosh> alladin: this is very cool work :D thanks for joining today
16:47:59 <alladin> Thanks cohosh
16:48:13 <dcf1> yes thanks for this research and for the discussion
16:48:21 <meskio> yes, that was a very interesting conversation, and I good read :)
16:48:33 <meskio> s/I/a/
16:48:51 <alladin> thanks everyone! A quick question though: Are there any deployments which would make it easier to test BlindTLS at larger scale? Does OONI allow for things like this?
16:49:44 <alladin> Or does it boil down to us setting up our own nodes behind different ISPs/countries?
16:49:58 <cohosh> i think you could modify the ooni web connectivity tests to check for whether servers allow a spoofed SNI
16:50:31 <cohosh> but you'd have to ask them about deploying it
16:50:34 <dcf1> You have some options
16:51:04 <dcf1> with OONI I think you'd need to write a custom test. you might want to leave a query on their Slack to get a response from them.
16:51:46 <dcf1> ICLab (https://github.com/net4people/bbs/issues/52) uses commercial VPNs, not sure if they take outside experiments, but presumably it could just be a normal program
16:52:52 <dcf1> You could of course pay for your own VPN nodes, probably cheaper than renting VPSs, but there is always the criticism that VPNs in data centers may be censored differently than residential connections
16:53:34 <dcf1> There are commercial proxy services, e.g. the ones used by https://github.com/net4people/bbs/issues/29, though I would want to do an ethical review first to be sure of where they source the proxies from
16:54:25 <dcf1> I'm guessing RIPE Atlas wouldn't support the level of control you would need.
16:54:42 <dcf1> That's what I can think of right now.
16:55:04 <dcf1> You could ask people who specialize in censorship measurement, e.g. people from https://ooni.org/post/2020-internet-measurement-village/
16:55:36 <alladin> This is great -- thank you!
16:56:07 <dcf1> Do we want to choose another reading group paper, or wait until next week?
16:56:27 <cohosh> i'm down to keep it going
16:56:34 <meskio> +1
16:56:37 <dcf1> there are still 4 FOCI 2021 short papers (one of them is cohosh's)
16:56:45 <dcf1> https://github.com/net4people/bbs/wiki/Reading-list
16:57:40 <cohosh> well my 4 co-authors were the lead authors on that work
16:57:52 <cohosh> it would be cool to invite them to come chat about it
16:58:09 <dcf1> I wrote a summary of that one already, waiting on the authors to comment on the draft
16:58:16 <meskio> sounds good, then we have a paper for the next reading group
16:58:32 <meskio> I guess we are talking about: https://dl.acm.org/doi/10.1145/3473604.3474563
16:58:46 <cohosh> yep
16:59:33 <cohosh> okay, i'll reach out to them about joining in two weeks
16:59:47 <cohosh> let's end the meeting here
16:59:54 <cohosh> #endmeeting