15:59:27 #startmeeting anti-censorship meeting
15:59:27 Meeting started Thu Nov 12 15:59:27 2020 UTC. The chair is cohosh. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:59:27 Useful Commands: #action #agreed #help #info #idea #link #topic.
15:59:32 hey everyone
15:59:44 hi
15:59:45 hi
16:00:02 here is our meeting pad: https://pad.riseup.net/p/tor-anti-censorship-keep
16:00:44 phw: you around?
16:02:08 i wonder if we should move the meeting later now with the time change
16:02:36 might be a good idea
16:03:37 ugh, i forgot about the different time.. again
16:04:53 heh covid time + daylight savings time is a rough combo
16:06:05 we can give a few mins for updating the pad in case you want to add something to the agenda
16:06:28 there are again not very many items but we have a reading group discussion today \o/
16:08:06 brief announcement/discussion item: i'm planning on setting up rdsys/bridgestrap on polyanthum soon, but only for the sake of bridge testing
16:08:29 the idea is to test our bridges and expose a status page that operators can take a look at. no distributors for now
16:08:50 the status page tells you if your bridge is 1) untested, 2) tested + works, 3) tested + does not work
16:09:31 cool!
16:10:27 sorry for asking but polyanthum is the bridges.torproject.org host, right?
16:10:52 agix: yes, exactly
16:11:24 i believe our hostnames are named after flowers and onions
16:11:59 I see :D
16:14:02 any other discussion items?
16:14:08 i see we have a new default bridge
16:14:22 tor-browser#40212
16:14:42 nothing from my side
16:14:52 okay, any needs help with?
16:15:01 it looks like someone caught the bridge outage on the bug reporting pad
16:15:10 https://pad.riseup.net/p/tor-anti-censorship-bugs-keep
16:15:21 but then commented later that it started working again
16:15:29 yeah I'm going to copy to a comment
16:15:33 i'm blocking on other teams for now
16:15:37 thanks dcf1
16:17:32 cohosh: fwiw, i spent quite some time looking into i18n and prometheus-based metrics in go, just in case that may come in handy in snowflake or other projects
16:17:46 i'll sum up all i learned in the respective issues
16:17:59 cool :) most of the snowflake pieces that need i18n are not in Go
16:18:23 but that sounds useful for GetTor and just to have in case that changes
16:19:28 GetTor for sure
16:20:16 anything else before we move on to reading group?
16:20:44 not from me
16:21:04 yey, reading group time
16:21:19 anyone have a summary ready? i can do a quick one if no
16:21:26 i prepared one
16:21:31 agix: oh yay!
16:21:50
16:21:56 The paper analyzes real-world TLS traffic from over 11.8 billion TLS connections in order to identify which TLS client implementations are actually used on the Internet. The data included counts and time-stamps of unique Client Hello messages, a sample of SNI and metadata for each Client Hello and Server Hello responses.
16:22:05 For each connection a fingerprint was generated by calculating the SHA1 hash over several specific fields including the TLS record version, handshake version, cipher suite list, compression method list, elliptic curve list, EC point format list, extension list, signature algorithm list and ALPN list.
16:22:19 The collected fingerprints are then used to analyze how distinguishable certain censorship circumvention tools are from real-world traffic.
16:22:19 In total, 230000 unique fingerprints were collected.
16:22:24 Some of the key findings:
16:22:33 Some TLS implementations generate several fingerprints, like Google Chrome, which generates at least 4 fingerprints even from the same device, due to sending different combinations of extensions depending on the context and size of TLS requests.
16:22:47 To measure how quickly fingerprints change and how this might impact a censor using a whitelist approach, a list of new fingerprints (1 week old) was compiled and compared to the collected amount in the following weeks, showing a steady but small increase of 0.33%. However, TLS updates in Chrome or iOS would cause a whitelist approach to block half of all connections after 6 months.
16:23:00 As of August 2018 Psiphon was able to mimic Chrome 58-64 making it less likely to be blocked, followed by Outline (which uses a randomized protocol to look like nothing), meek, Snowflake, Lantern, Tapdance and Signal.
16:23:18 In order to assist censorship circumvention tools, a TLS library named uTLS (fork of Golang's TLS library) was created, which allows developers to mimic arbitrary Client Hello messages. As of August 2018 the library has been adopted by Psiphon, Lantern and TapDance.
16:23:24
16:26:10 one of the things that really stands out in this paper to me is the amount of real world data used
16:27:03 They've used their CU Boulder network tap in other research as well
16:27:13 9 months of collecting TLS fingerprint data is impressive
16:27:34 Like "Detecting Probe-resistant Proxies" which we've discussed
16:28:51 i'd be interested in a study that compares differences in the fingerprints from CU Boulder to another location
16:29:08 obviously this type of data collection is difficult
16:29:37 Yeah there's a big opportunity there.
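The fingerprint construction from the summary above (a SHA1 hash over selected Client Hello fields) can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the field names, the length-prefixed serialization, and the sample byte values are all assumptions made for the example.

```python
import hashlib
import struct

# Field order follows the list given in the summary; the names themselves
# are hypothetical labels chosen for this sketch.
FIELDS = (
    "record_version", "handshake_version", "cipher_suites",
    "compression_methods", "elliptic_curves", "ec_point_formats",
    "extensions", "signature_algorithms", "alpn",
)


def client_hello_fingerprint(hello: dict) -> str:
    """Return a hex SHA1 digest over the selected Client Hello fields.

    Each field is a bytes value. It is length-prefixed before hashing so
    that shifting bytes from one field to the next changes the digest
    (a choice made for this sketch, not taken from the discussion).
    """
    h = hashlib.sha1()
    for name in FIELDS:
        value = hello.get(name, b"")
        h.update(struct.pack(">I", len(value)))
        h.update(value)
    return h.hexdigest()


# Made-up example values, only to show the call.
example = {
    "record_version": b"\x03\x01",
    "handshake_version": b"\x03\x03",
    "cipher_suites": b"\x13\x01\x13\x02\xc0\x2b",
    "compression_methods": b"\x00",
    "elliptic_curves": b"\x00\x1d\x00\x17",
    "ec_point_formats": b"\x00",
    "extensions": b"\x00\x00\x00\x0b\x00\x0a",
    "signature_algorithms": b"\x04\x03\x08\x04",
    "alpn": b"h2,http/1.1",
}
print(client_hello_fingerprint(example))
```

Because the lists are hashed in order, reordering cipher suites or extensions yields a different fingerprint, which is consistent with one browser producing several fingerprints depending on which extensions it sends.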
16:30:02 I wonder how difficult the talks might have been with the university staff to permit the experiment
16:30:28 heh
16:30:36 One of the strengths of "Seeing Through Network-Protocol Obfuscation" (https://censorbib.nymity.ch/#Wang2015a) is that they used some real-world traces to evaluate false classification rates.
16:30:40 is sergey here this time?
16:33:59 this paper is also interesting because it carves out an edge case in the parrot is dead rule
16:34:24 uTLS hides TLS traffic by mimicking the fingerprints of other tools
16:34:50 IMO it's not really an edge case; parrot is dead was always somewhat overstated
16:34:56 the way i explain why this works better than the other types of mimicry called out in that paper (like skype) is because TLS is relatively simple
16:35:18 dcf1: that's fair, it was a theoretical result, not censorship seen in the wild
16:35:50 But yeah, uTLS demonstrably works at what it is trying to do
16:36:10 meek (obfs4proxy meek_lite) in Tor Browser uses uTLS since October 2019
16:36:20 Could anyone give me a short overview on how complex it is to adjust the fingerprint of transports like meek or Snowflake?
16:37:02 meek uses uTLS, which makes it a lot simpler from the perspective of someone using the uTLS library
16:37:27 i wonder how much maintenance work is put into keeping uTLS up to date with current fingerprints
16:37:37 agix: you can choose from one of the premade fingerprints with something like `utls=hellorandomized`, `utls=hellofirefox_65`, `utls=helloios_auto` in the bridge line
16:37:43 the paper gave some measurements on how fingerprints change over the timeline of the data collection
16:38:45 Currently it's obfs4proxy lagging behind, as its built-in list of fingerprints isn't up to date with what uTLS offers now
16:39:16 Though the "auto" fingerprints will still use the most recent, as I understand it, even if obfs4proxy's built-in list doesn't know about all the ones available
16:39:29 that sounds like a good 'first contribution' ticket
16:39:57 Well, except obfs4proxy isn't maintained by us or at gitlab.torproject.org, really
16:40:03 dcf1: is hellofirefox_65 still commonly used?
16:40:05 that requires pulling the latest versions of uTLS right?
16:40:14 yeah
16:40:17 i don't see the problem. ask yawning to merge and if that doesn't work out, we can fork
16:40:18 we need to do that manually for tor browser builds
16:40:30 so that is something we can do
16:40:41 But it's true it's a concern with keeping uTLS up to date. The last commit is from August.
16:40:50 * cohosh nods
16:41:19 There's some good community involvement, for example from Psiphon contributing code https://github.com/refraction-networking/utls/issues?q=is%3Aissue+author%3Arod-hynes
16:41:51 when did meek start to use uTLS?
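The `utls=` option mentioned above is one of several key=value arguments on a bridge line. A small sketch of pulling it out is shown below. The parser is illustrative only, and the sample bridge line (address, `url`, and `front` values) is a made-up placeholder rather than a real bridge.

```python
def parse_bridge_line(line: str) -> dict:
    """Split a bridge line into transport, address, fingerprint, and k=v args.

    Assumed layout for this sketch:
        <transport> <addr:port> [<fingerprint>] [key=value ...]
    """
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("expected at least a transport name and an address")
    parsed = {
        "transport": tokens[0],
        "address": tokens[1],
        "fingerprint": None,
        "args": {},
    }
    for token in tokens[2:]:
        if "=" in token:
            # Split on the first '=' only, so URLs with query strings survive.
            key, value = token.split("=", 1)
            parsed["args"][key] = value
        elif parsed["fingerprint"] is None:
            parsed["fingerprint"] = token
        else:
            raise ValueError(f"unexpected token: {token!r}")
    return parsed


# Placeholder bridge line; 192.0.2.1 is a documentation-only address.
line = ("meek_lite 192.0.2.1:443 url=https://example.com/ "
        "front=front.example.net utls=hellofirefox_65")
bridge = parse_bridge_line(line)
print(bridge["transport"], bridge["args"].get("utls"))
```

Switching the imitated Client Hello is then a one-token change on the bridge line, which is why the fingerprint list shipped with the transport matters more than any per-user configuration.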
16:42:38 It was work too, to check the fingerprint with each new ESR release, and sometimes it required configuration changes for the headless browser, like re-enabling TLS session tickets https://trac.torproject.org/projects/tor/ticket/26241
16:43:11 cohosh: dcf1: phw: we can discuss that process after your meeting (whether relying on new obfs4proxy versions or updating it in tor-browser-build)
16:43:26 agix: https://blog.torproject.org/new-release-tor-browser-90 https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues/29430
16:43:40 dcf1: nice thanks!
16:44:09 Both the mainline meek code and obfs4proxy have support for uTLS. At the same time, Tor Browser switched from the mainline to obfs4proxy.
16:44:55 obfs4proxy's internal transport name is meek_lite, because it originally did not support TLS camouflage, even though it was compatible with the protocol otherwise. Now that there's uTLS support, meek_lite is effectively the same as meek.
16:45:35 i've been meaning to ask what the story behind meek_lite is
16:45:47 maybe this is a digression
16:46:36 That's the story, Yawning implemented the meek protocol in obfs4proxy but without browser camouflage, and gave it an incompatible transport name so people wouldn't be fooled into thinking it had all the same blocking resistance.
16:47:19 Then with uTLS, Tor Browser started using obfs4proxy because it permitted getting rid of some project dependencies.
16:48:52 phw: Oh, one other complication, obfs4proxy doesn't actually use the mainline uTLS, but a fork (with an incompatible license) also made by Yawning. https://gitlab.com/yawning/utls
16:49:10 lol gdi that license thing
16:49:59 that's a great readme in yawning's fork
16:50:01 "Your tears are delicious, and your code will burn."
16:51:23 The fork is also unchanged for half a year
16:51:52 So yeah, we may be approaching a situation with meek as before, where it accurately imitated the fingerprint it intended to imitate, but that fingerprint was out of date and no longer common
16:52:56 that's frustrating, it's not that uTLS is out of date
16:53:03 right?
16:53:26 just that licenses, and project maintainership are making it difficult to easily update
16:54:32 Well both forks have unsurprisingly diverged
16:54:45 Each has some fixes that the other does not, I believe
16:55:47 so where are the biggest pain points in keeping uTLS up to date?
16:56:03 you mentioned finding new fingerprints for new versions of browsers is a lot of work
16:56:11 I don't know. I've asked sergey but I don't remember anything specific.
16:56:18 cohosh: no, that's not what I meant
16:56:23 i thought the discussions of GREASE in the paper are interesting
16:56:25 ah
16:57:07 I haven't done it, but there is apparently code in uTLS where you can give it a pcap and it will generate code to give you that fingerprint
16:57:22 aha wow!
16:57:31 but it's not always that easy, as I understand it uTLS required a lot of refactoring to support the changes in crypto/tls for TLS 1.3
16:57:34 https://tlsfingerprint.io/pcap
16:58:36 It looks like both versions have the same fingerprint list, last added Chrome 83 in June 2020. https://github.com/refraction-networking/utls/commits/master/u_parrots.go https://gitlab.com/yawning/utls/-/commits/obfs4proxy-dev/u_parrots.go
16:58:58 But they have different fixes beyond that, "Fix GREASE repeating values" in one;
16:59:16 "Support more than one KeyShare extension correctly", "Add support for TLS Certificate Compression" in the other
16:59:35 It's a bummer how it worked out
17:00:49 yeah
17:01:43 as far as snowflake, it looks like they were using the old chrome-based version
17:02:12 That shouldn't matter for TLS purposes, for domain fronting, then and now, Snowflake uses crypto/tls
17:02:27 oh this was for the communication with the broker
17:02:33 ?
17:02:45 Yes, nothing to do with WebRTC in this paper
17:03:06 hmm so we might want to consider moving to uTLS for snowflake
17:03:09 We need to add uTLS (or equivalent) to Snowflake at some point
17:03:22 cool
17:03:52 It is not that hard, but unfortunately using uTLS with HTTP/2 is the most complicated use case and requires some tricks that are not entirely satisfying
17:03:52 i was wondering, since we're using pion/webrtc if a uTLS-like equivalent for DTLS would be useful later on
17:04:07 (for the webrtc part of the connection)
17:04:54 It was actually Yawning that figured out a decent way to do it https://lists.torproject.org/pipermail/tor-dev/2019-January/013633.html
17:04:59 dcf1: is it because of APLN? I don't see immediately why HTTP/2 would be difficult
17:05:13 *ALPN
17:05:35 This is what the meek implementation looks like, there are many comments here: https://gitweb.torproject.org/pluggable-transports/meek.git/commit/?id=b8fb876145cda3c14d335a3fc88b5e422a926150
17:06:35 lol now that i look at it i think i reviewed this commit awhile ago
17:06:55 I believe that even this scheme can fail if an HTTPS server switches from HTTP/1 to HTTP/2 in between requests (which could conceivably happen with a load balancer or something, though it's unlikely with the kinds of CDNs we are interested in)
17:07:58 But at any rate, it's more complicated than https://github.com/refraction-networking/utls#migrating-from-cryptotls
17:10:27 Oh yeah you did https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/meek/-/issues/29077#note_2602629
17:10:35 :)
17:16:46 seems like discussion is winding down
17:17:17 anything else before we end the meeting?
17:18:12 okay let's end it here
17:18:21 thanks agix for the summary and for choosing this reading!
17:18:46 #endmeeting
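The HTTP/2 complication raised in the discussion comes down to ALPN: the client only learns whether to speak HTTP/1.1 or HTTP/2 after the TLS handshake completes. The toy model below illustrates that idea and the failure mode mentioned at 17:06:55. It is a simplification written for this log, not the meek code: `ProtocolPinnedClient` and the `dial` callback are hypothetical names, and `dial` merely stands in for a TLS handshake that reports the negotiated ALPN string.

```python
class ProtocolPinnedClient:
    """Pick an HTTP flavor from the first handshake's ALPN result and keep it."""

    def __init__(self, dial):
        # dial() stands in for a TLS dial with a mimicked Client Hello; it
        # returns the negotiated ALPN string ("h2", "http/1.1", or "").
        self._dial = dial
        self._protocol = None

    def connect(self) -> str:
        # An empty ALPN result is treated as plain HTTP/1.1.
        negotiated = self._dial() or "http/1.1"
        if self._protocol is None:
            # First connection: remember which protocol the server chose.
            self._protocol = negotiated
        elif negotiated != self._protocol:
            # Later connection negotiated something else: the pinned
            # choice no longer matches, so give up rather than guess.
            raise RuntimeError(
                f"server switched from {self._protocol} to {negotiated}"
            )
        return self._protocol


# Simulated handshakes: the server answers h2 twice, then falls back.
results = iter(["h2", "h2", "http/1.1"])
client = ProtocolPinnedClient(lambda: next(results))
print(client.connect())
print(client.connect())
try:
    client.connect()
except RuntimeError as err:
    print("failed:", err)
```

With plain crypto/tls the HTTP stack makes this choice internally, which is why swapping in a custom handshake is harder for HTTP/2 than the simple migration path suggests.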