15:59:43 <cohosh> #startmeeting tor anti-censorship meeting
15:59:43 <MeetBot> Meeting started Thu Sep 23 15:59:43 2021 UTC. The chair is cohosh. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:59:43 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
15:59:45 <cohosh> hello!
15:59:49 <cohosh> and welcome :)
15:59:57 <meskio> hello
15:59:58 <cohosh> here is our meeting pad: https://pad.riseup.net/p/tor-anti-censorship-keep
16:00:24 <cohosh> please feel free to add items to the agenda
16:02:33 <cohosh> dcf1: is the first item on the agenda yours for this week?
16:02:46 <dcf1> yes, just a quick followup to what we discussed in last week's meeting
16:03:06 <dcf1> the mysterious error message was caused by using a git that was too old (circa 2010)
16:03:29 <dcf1> the error message only included goptlib because it was first in the go.mod file, but it would have happened with any other module
16:03:37 <cohosh> :o
16:03:38 <dcf1> that is all
16:03:38 <meskio> wow, I didn't expect that
16:04:17 <cohosh> awesome, heh
16:04:28 <meskio> I guess it was a headache to find out
16:04:39 <dcf1> yes it was quite a debugging adventure
16:05:18 <cohosh> XD
16:05:37 <cohosh> anything else before we move on to the reading group?
16:08:09 <meskio> I think we can move on
16:08:16 <cohosh> okay great
16:08:28 <dcf1> I invited Sambhav, they may be here
16:08:43 <meskio> nice
16:08:43 <cohosh> :)
16:09:37 <cohosh> should we wait a few minutes for them?
16:09:38 <dcf1> https://dl.acm.org/doi/10.1145/3473604.3474564 is the PDF
16:09:49 <dcf1> posted a summary this morning https://github.com/net4people/bbs/issues/86
16:09:53 <alladin> That's me. I'm happy to try and answer any questions you might have.
16:10:13 <cohosh> hey alladin welcome :)
16:10:15 <dcf1> coolio
16:10:26 <meskio> hello o/
16:10:37 <alladin> Hi everyone o/
16:10:45 <dcf1> So the core of this work is: do the initial handshake over an unobservable proxy, then immediately resume it without the proxy
16:11:14 <dcf1> the TLS handshake has most of the identifying features (DNS, SNI, cert), a session resumption has fewer (just SNI)
16:11:37 <dcf1> many servers tolerate using a different SNI during resumption, which is how you get around that
16:12:06 <meskio> I think is pretty neat the trick of the TLS resume, I didn't know actual server implementations will reply properly even if you don't provide the right domain in the resume SNI :)
16:12:27 <cohosh> +1
16:12:39 <dcf1> Some of the RFC language is https://datatracker.ietf.org/doc/html/rfc6066#section-3
16:13:05 <dcf1> "The client SHOULD include the same server_name extension in the session resumption request as it did in the full handshake that established the session. A server that implements this extension MUST NOT accept the request to resume the session if the server_name extension contains a different name."
16:13:30 <dcf1> So by my reading, it's actually non-standards-conforming for the server to accept a modified SNI, but:
16:13:56 <dcf1> according to the RFC for TLS 1.3, there has historically been confusion on that issue
16:14:33 <dcf1> https://datatracker.ietf.org/doc/html/rfc8446#section-4.2.11
16:15:15 <dcf1> "In TLS versions prior to TLS 1.3, the Server Name Identification (SNI) value was intended to be associated with the session, with the server being required to enforce that the SNI value associated with the session matches the one specified in the resumption handshake. However, in reality the implementations were not consistent on which of two supplied SNI values they would use, leading to the
16:15:21 <dcf1> consistency requirement being de facto enforced by the clients."
16:15:43 <dcf1> Confusion like this may partially explain why BlindTLS was tested to work with some TLS servers and not others.
16:16:09 <alladin> It was precisely this line which led to the idea.
16:16:13 <meskio> the paper shows ~50% of success
16:16:41 <dcf1> I think "which of two supplied SNI values" = 1. the SNI stored with the session by the server, 2. the SNI sent by the client with the session resumption
16:17:16 <dcf1> alladin: in your experiments, did you limit to TLS 1.2 servers, or might there have been TLS 1.3 servers in the pool as well?
16:17:28 * meskio is amazed by all this digging into the details of the RFCs to come up with this kind of hacks :)
16:18:28 <alladin> I just had openssl s_client use TLS1.2. There might have been tls1.3 only servers in the pool, but I don't think that's highly likely
16:18:44 <dcf1> ok, that's reasonable
16:19:54 <meskio> I found pretty interesting the fact that the ISP tested was not blocking the client hello requests based on the SNI, but on the domain name in the cert
16:20:09 <cohosh> is the first "bootstrapping" step where the original DNS + TLS connection happens something that can be done over DoH or AMP cache or domain fronting?
16:20:10 <meskio> I'm wondering how many other domains they block by certs having multiple domains on them
16:20:27 <dcf1> yeah that detail was wild
16:20:48 <dcf1> it looks like patches on top of patches on the detector side
16:20:54 <cohosh> lol
16:21:30 <alladin> meskio: so it's a mixed usage. In TLS1.2 init handshake, they ignore SNI value and instead look at certificate. In the TLS1.2 resumption though, they do look at SNI since that's the only defining trait. Hats off to them for thinking of this
16:21:36 <dcf1> cohosh: it can be any kind of tunnel that supports looking up a DNS name and establishing a TLS connection. this paper leads the tunnel open.
16:21:49 <dcf1> *leaves the choice of tunnel open
16:21:55 <alladin> In TLS1.3 though, it's a straight up filtering for SNI in both, init and resumption
16:22:10 * meskio has servers with multiple domains in the same cert, should stop doing that seeing how they block stuff around
16:22:40 <dcf1> yeah so TLS 1.3 removes one of the features, because it encrypts the server certificate (it's plaintext in 1.2)
16:22:56 <dcf1> DoH/DoT/etc. can additionally remove the DNS feature.
16:23:25 <dcf1> That leads ECH to hide the SNI, in terms of an official TLS extension, but how successful ECH will be remains to be seen
16:24:03 <dcf1> *leaves
16:25:26 <cohosh> i only got partially through the paper, but was there any similarity in the sites that did not accept the spoofed SNI value?
16:25:43 <cohosh> like what their tls server implementation was?
16:26:28 <cohosh> it was interesting that the proportion of top alexa sites for which it didn't work was greater than the proportion of randomly selected blocked sites
16:27:07 <alladin> Unfortunately, I do not have the answer to that. I just never got around to digging into it. I think that's a clear next step to understand, and even gain further confidence in, the results
16:27:41 <dcf1> I'm curious what server implementations actually do in TLS 1.3 too
16:28:05 <dcf1> If you make such a survey, it might be interesting to the IETF tls working group
16:28:10 <cohosh> yea good point, do we have 1.3 rollout stats?
16:28:17 <cohosh> like how many sites support it now?
16:29:42 <alladin> re: alexa, I agree. And 1 possible explanation is perhaps top100 having wildly different server configs/implementations. While the blocked ones might share similar configs. (understandable given how frequently they have to move domains).
16:29:51 <meskio> I found that, but is from Dec 2019: https://www.ietf.org/blog/tls13-adoption/
16:32:46 <dcf1> a criticism of this paper I have is the assumption of no IP blocking, either of the TLS server or of the handshake proxy
16:33:02 <dcf1> it is a good and necessary assumption, but I think it can be framed/understood better
16:33:13 <dcf1> this is how I think of it:
16:33:43 <dcf1> suppose you have a highly unblockable proxy, but it has drawbacks: it is slow, expensive, etc.
16:34:35 <dcf1> in fact we have many such proxies, e.g. DNS tunnel, domain fronting
16:35:21 <dcf1> BlindTLS can be seen as an optimization: it is a way of leveraging a high degree of blocking resistance only where it is most needed
16:35:50 <dcf1> use the slow, expensive tunnel to protect the TLS handshake, then do the remainder of the connection direct, for better efficiency
16:36:27 <dcf1> (you still need the assumption of no IP blocking of the TLS server itself, which is an okay assumption in the case of a CDN)
16:36:49 <meskio> from Tor perspective this is a key point, Tor relays get usually blocked by IP
16:37:24 <meskio> we need to convince cloudflare to run tor relays on their CDN IPs ;D
16:37:51 <dcf1> It's a bit like how, in Snowflake, we use domain fronting or AMP cache (which are slow or expensive) for communication with the broker, then use WebRTC (fast, cheap) for the actual data transfer.
16:38:58 <dcf1> trading off blocking resistance for efficiency
16:39:52 <dcf1> it could also be the case that the handshake proxy's cover protocol remains inconspicuous only when it is used for small data transfers, and it would be detected if used to tunnel a whole connection
16:43:45 <cohosh> dcf1: so to summarize/put another way: slow, careful bootstrapping to enable a high throughput tunnel is an anti-censorship design pattern that this technique uses
16:43:45 <dcf1> those are the points I wanted to cover, are there any others?
16:44:06 <dcf1> cohosh: that's a good way to put it
16:44:44 <dcf1> the security of BlindTLS in some way reduces to the security of the handshake proxy
16:47:48 <cohosh> alladin: this is very cool work :D thanks for joining today
16:47:59 <alladin> Thanks cohosh
16:48:13 <dcf1> yes thanks for this research and for the discussion
16:48:21 <meskio> yes, that was a very interesting conversation, and I good read :)
16:48:33 <meskio> s/I/a/
16:48:51 <alladin> thanks everyone! A quick question though: Are there any deployments which would make it easier to test BlindTLS at larger scale? Does OONI allow for things like this?
16:49:44 <alladin> Or does it boil down to us setting up our own nodes behind different ISPs/countries?
16:49:58 <cohosh> i think you could modify the ooni web connectivity tests to check for whether servers allow a spoofed SNI
16:50:31 <cohosh> but you'd have to ask them about deploying it
16:50:34 <dcf1> You have some options
16:51:04 <dcf1> with OONI I think you'd need to write a custom test. you might want to leave a query on their Slack to get a response from them.
16:51:46 <dcf1> ICLab (https://github.com/net4people/bbs/issues/52) uses commercial VPNs, not sure if they take outside experiments, but presumably it could just be a normal program
16:52:52 <dcf1> You could of course pay for your own VPN nodes, probably cheaper than renting VPSs, but there is always the criticism that VPNs in data centers may be censored differently than residential connections
16:53:34 <dcf1> There are commercial proxy services, e.g. the ones used by https://github.com/net4people/bbs/issues/29, though I would want to do an ethical review first to be sure of where they source the proxies from
16:54:25 <dcf1> I'm guessing RIPE Atlas wouldn't support the level of control you would need.
16:54:42 <dcf1> That's what I can think of right now.
16:55:04 <dcf1> You could ask people who specialize in censorship measurement, e.g. people from https://ooni.org/post/2020-internet-measurement-village/
16:55:36 <alladin> This is great -- thank you!
16:56:07 <dcf1> Do we want to choose another reading group paper, or wait until next week?
16:56:27 <cohosh> i'm down to keep it going
16:56:34 <meskio> +1
16:56:37 <dcf1> there are still 4 FOCI 2021 short papers (one of them is cohosh's)
16:56:45 <dcf1> https://github.com/net4people/bbs/wiki/Reading-list
16:57:40 <cohosh> well my 4 co-authors were the lead authors on that work
16:57:52 <cohosh> it would be cool to invite them to come chat about it
16:58:09 <dcf1> I wrote a summary of that one already, waiting on the authors to comment on the draft
16:58:16 <meskio> sounds good, then we have a paper for the next reading group
16:58:32 <meskio> I guess we are talking about: https://dl.acm.org/doi/10.1145/3473604.3474563
16:58:46 <cohosh> yep
16:59:33 <cohosh> okay, i'll reach out to them about joining in two weeks
16:59:47 <cohosh> let's end the meeting here
16:59:54 <cohosh> #endmeeting
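
Note appended to the log: the discussion of which handshake features a censor can see (DNS, SNI, certificate) and of how servers treat a changed SNI on resumption can be summarized as a small model. This is only a sketch under stated assumptions, not code from the paper or from any TLS library. The names `Feature`, `observable_features`, and `server_accepts_resumption` are hypothetical, and the model assumes that a resumed connection needs no fresh DNS lookup because the lookup already happened through the handshake proxy.

```python
from enum import Enum


class Feature(Enum):
    """Plaintext features named in the discussion (hypothetical names)."""
    DNS = "plaintext DNS query"
    SNI = "SNI in ClientHello"
    CERT = "server certificate"


def observable_features(tls_version, resumed, encrypted_dns=False, ech=False):
    """Return the features a passive on-path observer could see.

    Simplified model of the points made in the meeting:
    - a TLS 1.2 full handshake exposes DNS, SNI, and the certificate;
    - a session resumption exposes only the SNI;
    - TLS 1.3 encrypts the server certificate;
    - DoH/DoT removes the DNS feature;
    - ECH (a TLS 1.3 extension) would hide the SNI.
    """
    if tls_version not in ("1.2", "1.3"):
        raise ValueError("model only covers TLS 1.2 and 1.3")
    if ech and tls_version != "1.3":
        raise ValueError("ECH is only defined for TLS 1.3")

    features = set()
    # Assumption: on resumption the name was already resolved via the proxy.
    if not resumed and not encrypted_dns:
        features.add(Feature.DNS)
    if not ech:
        features.add(Feature.SNI)
    # The certificate is sent in plaintext only in a TLS 1.2 full handshake;
    # a resumption does not send a certificate at all.
    if tls_version == "1.2" and not resumed:
        features.add(Feature.CERT)
    return features


def server_accepts_resumption(session_sni, hello_sni, strict_rfc6066):
    """Model the server-side choice quoted from RFC 6066 section 3.

    A strict server refuses to resume when the SNI differs from the one
    stored with the session; a lenient server (the kind the technique
    relies on) resumes regardless.
    """
    if strict_rfc6066:
        return session_sni == hello_sni
    return True


if __name__ == "__main__":
    for version, resumed in (("1.2", False), ("1.2", True), ("1.3", False)):
        names = sorted(f.name for f in observable_features(version, resumed))
        print(f"TLS {version} resumed={resumed}: {names}")
```

In this model the technique works against a given server only when `server_accepts_resumption` returns true for a cover SNI, which matches the observation in the log that it worked with some TLS servers and not others.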