16:00:27 <onyinyang> #startmeeting tor anti-censorship meeting
16:00:27 <MeetBot> Meeting started Thu Jun 26 16:00:27 2025 UTC. The chair is onyinyang. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:27 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:42 <meskio> hello
16:01:48 <meskio> while people update the pad a reminder:
16:02:12 <meskio> Next week is mid-year Tor break, so all of us will be (mostly) AFK and there will not be any meeting
16:02:26 <meskio> next meeting is July 11
16:02:30 <meskio> is it 11th?
16:03:06 <meskio> yes, 11th, I'll not be around that week either
16:03:23 <shelikhoo> hi~hi~
16:04:51 <cohosh> hi
16:06:14 <onyinyang> ok, let's start
16:06:40 <onyinyang> We have a couple of discussion points and we'll end with the reading group :)
16:06:54 <onyinyang> The first discussion point is on the Iran network shutdown
16:07:07 <meskio> I guess we should update the title
16:07:12 <meskio> as the network is coming back
16:07:22 <meskio> and we can see how snowflake is being overloaded
16:07:25 <meskio> again
16:08:07 <meskio> and now we have access to a vantage point in the country, so maybe we can investigate if they are blocking it by fingerprint or listing proxies
16:08:29 <shelikhoo> yes, I have yet to check that vantage point...
16:08:50 <meskio> I ssh'd into it and it was slow but worked, but I didn't have the time to run anything from there
16:08:56 <cohosh> the broker is very overloaded
16:09:07 <cohosh> or perhaps just the proxy pool
16:10:05 <cohosh> the restricted proxy pool is totally used up and it looks like we have an equal number of matched vs idle proxies overall
16:10:53 <cohosh> but the bridge also shows that the number of currently connected clients has almost doubled since the 20th: https://metrics.torproject.org/userstats-bridge-transport.html?start=2025-06-01&end=2025-06-26&transport=snowflake
16:11:50 <meskio> and 10k of those ~35k are from Iran: https://metrics.torproject.org/userstats-bridge-combined.html?start=2025-03-28&end=2025-06-26&country=ir
16:11:56 <onyinyang> woah
16:11:59 <cohosh> it could be that it isn't blocked, and difficulty connecting is due to proxy pool and broker capacity
16:13:06 <meskio> but we've seen more users in the past
16:13:14 <shelikhoo> or maybe everyone just got internet back and needs to access the network at the same time
16:13:28 <shelikhoo> let's say someone checks email once per week
16:13:30 <meskio> now we are at 35k, we hit 90k in 2023: https://metrics.torproject.org/userstats-bridge-transport.html?start=2021-03-01&end=2025-06-26&transport=snowflake
16:13:46 <cohosh> ah good point
16:14:06 <shelikhoo> when the network is back everyone will check email at the same time
16:14:07 <shelikhoo> over
16:14:38 <meskio> I agree, everybody is trying to connect at once from Iran, but I also have the feeling that there is something happening there depleting our proxy pool
16:15:11 <cohosh> the broker does seem disproportionately overwhelmed
16:15:41 <ggus> re: snowflake lack of proxies: i believe ac-team needs to feed comms team with more specific needs, just saying 'we need more proxies' is not very engaging or appealing for potential volunteers outside our community, and right now, we need more volunteers outside the tor community.
is there a number of how many proxies are needed or how many snowflake users we have in iran right now or even, how many
16:15:47 <ggus> users in iran are being rejected because of lack of proxies? i can create a ticket about finding new volunteers, but i'm lacking that information from ac-team.
16:15:52 <dcf1> the broker seems ok on resource consumption currently, though I will put it back to the configuration it was at after this month so it remains under the credit
16:16:40 <ggus> that said, i asked arturo to post a call for snowflake proxies in our social channels: https://mastodon.social/@torproject/114745109066226826
16:17:55 <meskio> ggus: you lack information from us, because we don't know exactly what is happening
16:18:09 <meskio> we need to investigate it, and we couldn't do it without a vantage point
16:18:17 <meskio> that is why we've been so vague
16:18:33 <meskio> thanks for the call for proxies
16:19:16 <shelikhoo> yes! we do need more proxies
16:19:42 <meskio> but it was not even clear if more standalone proxies improve or create more problems...
16:19:54 <cohosh> ggus: yes, thanks.
i think a ticket that we can update as we get more information would be best
16:20:56 <cohosh> more info to potential volunteers is a good idea
16:21:34 <cohosh> it's unfortunate our scraped prometheus metrics aren't publicly available, that's the best source of info we have on real-time proxy pool usage
16:21:57 <cohosh> but we could maybe give updated screen shots to show the number of client polls that are matched
16:22:31 <cohosh> there are also some things we could do with the snowflake-stats metrics in CollecTor
16:24:11 <ggus> ^yeah, i think screenshots to show the number of client polls is a good idea; it's like when you have a crowdfunding campaign and you see the project reaching the goal in a status bar
16:25:32 <meskio> I plan to be partially online until next thursday, I can help producing those screenshots, if someone is going to be around to use them
16:27:46 <ggus> https://gitlab.torproject.org/tpo/community/relays/-/issues/116
16:28:28 <ggus> cohosh: we don't know if the clients rejected because of the lack of enough proxies are from iran or another country, right?
16:29:10 <meskio> I'll keep this ticket updated until I go AFK
16:29:38 <shelikhoo> thanks
16:29:46 <cohosh> ggus: from prometheus we do have country information for client polls
16:30:55 <meskio> is this from the new broker metrics change?
16:31:05 <meskio> or have I missed it the whole time?
16:31:20 <cohosh> no, it's been there the whole time
16:31:25 <meskio> wow, nice
16:31:25 <cohosh> i think, let me check heh
16:31:44 <meskio> I can update the dashboard so we have a country selector
16:32:03 <cohosh> yeah https://snowflake-broker.torproject.net/prometheus
16:32:24 <cohosh> snowflake_rounded_client_poll_total has a "cc" field
16:32:51 <meskio> :)
16:33:16 <onyinyang> anything else on this topic?
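Editor's sketch, relating to the `snowflake_rounded_client_poll_total` metric and its "cc" field mentioned above: per-country client poll totals could be summed from a Prometheus text-format dump roughly as follows. This is only an illustration under assumptions. The `status` label and all numbers in `SAMPLE` are made up, and nothing here fetches from the broker.

```python
import re
from collections import Counter

METRIC = "snowflake_rounded_client_poll_total"

# One sample line of the Prometheus text exposition format:
#   name{label="value",...} number
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def polls_by_country(text, metric=METRIC):
    """Sum the values of `metric` grouped by its "cc" label."""
    totals = Counter()
    for line in text.splitlines():
        if line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        m = SAMPLE_RE.match(line)
        if not m or m.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(m.group("labels")))
        totals[labels.get("cc", "??")] += float(m.group("value"))
    return totals


# Hypothetical input: the "status" label and the numbers are invented.
SAMPLE = '''\
# TYPE snowflake_rounded_client_poll_total counter
snowflake_rounded_client_poll_total{cc="ir",status="matched"} 1200
snowflake_rounded_client_poll_total{cc="ir",status="denied"} 800
snowflake_rounded_client_poll_total{cc="ru",status="matched"} 304
other_metric{cc="ir"} 5
'''

if __name__ == "__main__":
    for cc, total in polls_by_country(SAMPLE).most_common():
        print(cc, total)
```

Grouping only on "cc" keeps the sketch independent of whatever other labels the real metric carries; a dashboard country selector would do the equivalent filtering in a query.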
16:33:32 <shelikhoo> nothing from me
16:33:40 <meskio> after the break I hope someone can investigate the situation in Iran
16:33:54 <meskio> if I find the time I'll have a look and document whatever I find
16:34:31 <meskio> nothing more from me
16:35:03 <onyinyang> hopefully next week will be uneventful o_o
16:35:41 <onyinyang> anyway, the next topic is the webtunnel bridges block in Russia
16:36:04 <meskio> for a couple of days there have been many user reports saying that webtunnel bridges are being blocked in Russia
16:36:10 <shelikhoo> I have tested this and found the block was based on SNI
16:36:35 <meskio> so the censor is listing bridges and blocking them
16:36:37 <shelikhoo> so, it is likely that the censor has collected bridge lines and blocked their SNI name
16:36:54 <meskio> makes sense, it was even surprising it took them so long to do it
16:37:03 <shelikhoo> yes, listing bridges is my assumption
16:37:05 <dcf1> because connecting to the bridge with an altered SNI works, correct?
16:37:15 <meskio> yes, people are doing that
16:37:41 <meskio> there is a modified version of webtunnel that does domain fronting
16:37:52 <meskio> and there are reports that it works for half of the bridges
16:38:53 <ggus> i think it's worth sharing the context that this block is on top of the obfs4 block (and that's why we did a webtunnel campaign in december 2024 - feb 2025), so mobile tor users are practically blocked
16:41:15 <meskio> I assume the bridges that are not working with this domain fronting patch might be because their webserver rejects requests with an SNI they don't host
16:41:26 <meskio> what I'm surprised by is that it actually works for many
16:41:56 <meskio> people are just using google.com as SNI or youtu.be
16:44:16 <shelikhoo> actually in the background I was developing a protocol that could bypass tls sni based blocking while looking like tls
16:44:32 <shelikhoo> but it is still in development, and has not reached a working stage yet
16:44:44 <shelikhoo> but I think this event does make
something like this more important
16:44:53 <meskio> :)
16:45:20 <onyinyang> indeed :)
16:45:31 <onyinyang> are there any other actions we can take in the meantime, or anything else to discuss on this topic?
16:46:15 <shelikhoo> nothing from me other than we should keep monitoring the situation
16:46:20 <meskio> not from me, I'll try to look more into this and see if I have any concrete proposals
16:46:27 <shelikhoo> and maybe ingest the patch into main
16:46:46 <shelikhoo> if it does cover a use case unsupported by our main branch
16:47:03 <meskio> some of the things in the patch are already in lyrebird, like uTLS support
16:47:18 <meskio> or it is adding cert pinning that you have already mostly done
16:47:37 <meskio> but the host http header setting should be included, I agree
16:47:38 <shelikhoo> maybe it is cert pinning? I think it might still be missing in lyrebird
16:47:48 <shelikhoo> yes, and the http header setting
16:48:26 <meskio> yes
16:48:32 <shelikhoo> that's all from me on this topic
16:48:49 <onyinyang> ok well. . .that leaves 10 min for the reading group, which isn't very much
16:49:11 * meskio can do another 10min of overtime if needed, but not way more
16:49:44 <shelikhoo> I am happy with pushing it a week as well, if this paper is important
16:49:49 <shelikhoo> and needs more discussion
16:49:58 <onyinyang> I'm good with either
16:50:41 <meskio> I think it is an interesting paper, but maybe there is not a long discussion on it, as not so many things there affect us
16:50:49 <onyinyang> I think probably we need more than 10 minutes, so if we are all ok with extending the discussion by 10 extra minutes, let's go ahead
16:50:55 <onyinyang> otherwise, let's push it to next time?
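Editor's sketch, relating to the webtunnel discussion above (16:36 to 16:48): the reports describe a TLS connection whose SNI differs from the name carried in the HTTP Host header. The separation of the two names can be illustrated with Python's standard `ssl` module as below. This is not the webtunnel patch or lyrebird code; the function names and hostnames are placeholders, and certificate verification is switched off only because the bridge's certificate will not match a front SNI, which is presumably why cert pinning comes up in the log.

```python
import socket
import ssl


def build_request(host_header, path="/"):
    """Build an HTTP/1.1 request whose Host header names the real site."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")


def fetch_with_sni(addr, sni, host_header, path="/", timeout=10):
    """Connect to addr:443, send `sni` in the ClientHello, `host_header` inside TLS.

    Assumption: verification is disabled for illustration only. A real
    client would pin the expected certificate instead.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # must be cleared before CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((addr, 443), timeout=timeout) as raw:
        # server_hostname controls the SNI extension that an on-path
        # observer sees; the Host header travels encrypted.
        with ctx.wrap_socket(raw, server_hostname=sni) as tls:
            tls.sendall(build_request(host_header, path))
            return tls.recv(4096)


if __name__ == "__main__":
    # Placeholder names; no network traffic is generated here.
    print(build_request("bridge.example", "/hidden-path").decode("ascii"))
```

Whether a given server answers such a request depends on its configuration, which matches meskio's guess that some webservers reject an SNI they do not host.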
16:51:35 <meskio> let's do it now
16:52:03 <meskio> my fast TL;DR is: the authors found two things:
16:52:29 <meskio> * chinese censorship is no longer centralized on the edge of the country and now there is one region with its own extra firewall
16:52:42 <dcf1> https://gfw.report/publications/sp25/en/
16:52:57 <meskio> * the GFW is not perfectly bidirectional and some things are only censored in outgoing connections
16:53:21 <meskio> the paper mostly looks into web blockades
16:53:41 <meskio> did I miss any interesting key point?
16:54:08 <shelikhoo> one interesting thing I wish to discuss is the censor's limitations
16:54:16 <dcf1> the partial bidirectionality is interesting. they say it was first observed by GFWeb, 2024 https://censorbib.nymity.ch/#Hoang2024a
16:54:21 <shelikhoo> like it assumes tcp header length
16:54:37 <shelikhoo> and is unable to process fragmentation
16:54:54 <dcf1> also the fact that the Henan firewall seems totally different, technically, than the GFW. like its blocking behavior and network fingerprinting is not even close.
16:55:01 <onyinyang> I'm not sure this was ever _not_ the case. I'm not sure I read it correctly but I understood it more as: ignoring regional differences in censorship may be missing a lot
16:55:03 <shelikhoo> both of them can give us some hint about how to avoid this censorship in unprivileged userspace
16:55:08 <dcf1> yeah, like those qualities for example.
16:55:16 <dcf1> and the TCP header length = 20 thing is so so weird
16:56:06 <shelikhoo> onyinyang: yes, there are also different levels of censorship for different isps
16:56:20 <dcf1> Figure 3 shows cross-province connections as well as international connections https://gfw.report/publications/sp25/en/#fig:3-client-to-sink-server-data-matrix
16:56:51 <dcf1> at least according to this, regional firewalls are not something widespread, but only in Henan.
(with the caveat that they say they were not able to test all provinces)
16:57:18 <onyinyang> one thing that struck me as strange/suspicious was that they used a different vps for henan only, it would have been interesting if they used the same vps and/or multiple vps' to compare behaviour
16:57:25 <onyinyang> but I also don't know if this would have mattered at all
16:57:32 <dcf1> The TCP header of 20 bytes thing is especially weird considering a further experiment they did: "the Henan Firewall did parse the TCP header length field in the TCP header, but had a condition to only block a connection when its TCP header length is 20 bytes."
16:57:36 <shelikhoo> actually there are also reports of regional firewalls in different regions as well
16:58:15 <shelikhoo> however, henan is the one that is easier to publish a paper on
16:58:22 <dcf1> hmm, ok
16:58:40 <shelikhoo> since in other places like Fujian, getting a vps with regional censorship is much harder
16:59:00 <shelikhoo> and typically requires a real residential network
16:59:18 <shelikhoo> which has ethical concerns
16:59:21 <shelikhoo> so...
16:59:35 <meskio> the paper clearly states why they avoided that, so as not to put people at risk
16:59:42 <shelikhoo> yes
17:00:23 <onyinyang> ah ok
17:01:16 <shelikhoo> but otherwise that 20 byte tcp header assumption is a very interesting point as well
17:01:43 <shelikhoo> I suspect that this works mostly fine for censoring windows machine's traffic
17:02:08 <shelikhoo> which works well enough, at least for people checking how the censorship is working
17:02:39 <dcf1> shelikhoo: what makes you think it is specific to windows?
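Editor's sketch, relating to the "TCP header length is 20 bytes" condition quoted above: a 20-byte TCP header is one whose data-offset field equals 5, meaning the segment carries no TCP options. The field can be read from raw header bytes as shown here. The helper names are made up for illustration, and the code checks only the quoted condition; it does not model how the firewall actually works.

```python
import struct


def tcp_header_length(segment: bytes) -> int:
    """Return the TCP header length in bytes, read from the data-offset field."""
    if len(segment) < 20:
        raise ValueError("need at least 20 bytes of TCP header")
    # Byte 12: the upper 4 bits are the data offset, counted in 32-bit words.
    data_offset_words = segment[12] >> 4
    return data_offset_words * 4


def has_tcp_options(segment: bytes) -> bool:
    """True when the header is longer than the fixed 20 bytes."""
    return tcp_header_length(segment) > 20


def make_header(options: bytes = b"") -> bytes:
    """Build a synthetic TCP header (test helper); options padded to 4 bytes."""
    options += b"\x00" * (-len(options) % 4)
    offset_words = 5 + len(options) // 4
    fixed = struct.pack(
        "!HHIIBBHHH",
        49152, 443,          # source port, destination port
        1, 1,                # sequence number, acknowledgment number
        offset_words << 4,   # data offset in the high nibble
        0x18,                # flags: PSH|ACK, as on a payload segment
        65535, 0, 0,         # window, checksum (unset), urgent pointer
    )
    return fixed + options


if __name__ == "__main__":
    timestamps = b"\x01\x01\x08\x0a" + b"\x00" * 8  # NOP, NOP, Timestamps option
    print(tcp_header_length(make_header()))            # header with no options
    print(tcp_header_length(make_header(timestamps)))  # header with timestamps
```

A segment carrying the common timestamps option has a 32-byte header, so it would fall outside the quoted condition, which is the source of dcf1's surprise.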
17:03:00 <dcf1> this is the Nmap TCP/IP fingerprint database: https://svn.nmap.org/nmap/nmap-os-db
17:03:12 <dcf1> The `O=` fields record TCP options: https://nmap.org/book/osdetect-methods.html#osdetect-o
17:03:38 <dcf1> The `%` is a delimiter, so you can search for OSes with empty TCP options by searching for `%O=%`
17:03:40 <shelikhoo> I downloaded a few pcap files captured from windows machines, and their first payload packet has no options
17:04:17 <shelikhoo> I didn't say it is specific to windows
17:04:27 <shelikhoo> I just said it works well enough for the censor
17:04:59 <dcf1> "this works mostly fine for censoring windows machine's traffic" I'm curious if you know that windows tends to use zero TCP options sometimes, or something like that
17:05:30 <shelikhoo> no... I just inspected a few downloaded pcaps from wireshark's website
17:05:45 <shelikhoo> https://gitlab.com/wireshark/wireshark/-/wikis/uploads/__moin_import__/attachments/SampleCaptures/smb-on-windows-10.pcapng
17:05:46 <dcf1> In nmap-os-db, look at the OPS line "contains the TCP options received for each of the probes (the test names are O1 through O6)"
17:05:54 <shelikhoo> https://gitlab.com/wireshark/wireshark/-/wikis/uploads/__moin_import__/attachments/SampleCaptures/nspi.pcap
17:06:05 <dcf1> E.g.
17:06:09 <dcf1> Fingerprint Microsoft Windows 10 1607 - 11 23H2
17:06:12 <dcf1> OPS(O1=MFFD7NW8ST11%O2=MFFD7NW8ST11%O3=MFFD7NW8NNT11%O4=MFFD7NW8ST11%O5=MFFD7NW8ST11%O6=MFFD7ST11)
17:06:43 <dcf1> This is the part that doesn't make sense to me. Things like TCP timestamps are ubiquitous on all but the tiniest network stacks, at least according to my understanding.
17:07:11 <dcf1> I would have thought that limiting censorship to TCP segments without options would affect almost no traffic.
17:07:20 <dcf1> But clearly it must have affected enough traffic for users to notice.
17:07:49 <dcf1> There might be some common situation where TCP connections are set up without options that I'm not aware of.
They have a graph showing it's about 20%.
17:08:09 <shelikhoo> we should maybe check more about Windows XP or Windows 7
17:08:17 <dcf1> nothing really to say about it, just it's quite against my intuition
17:08:40 <shelikhoo> I think a lot of enterprise users don't really upgrade to the most recent OS
17:08:49 <dcf1> those are represented in the database as well
17:09:10 <dcf1> it's not like the TCP timestamp option, for example, is new technology: https://www.rfc-editor.org/rfc/rfc1323 is from 1992!
17:09:59 <dcf1> that's the part that stood out to me the most, because it's so weird. but it's probably not the most important point.
17:10:16 <meskio> maybe they are recycling an old version of the GFW for these local firewalls :P
17:10:33 <dcf1> The 01020304050607080900 RST payload is quite a strange thing too.
17:11:16 <dcf1> The Nmap OS detection documentation comments on that too: https://nmap.org/book/osdetect-methods.html#osdetect-rd
17:11:24 <shelikhoo> at least in smb-on-windows-10.pcapng
17:11:30 <dcf1> "Some operating systems return ASCII data such as error messages in reset packets. This is explicitly allowed by section 4.2.2.12 of RFC 1122." "Some of the few operating systems that may return data in their reset packets are HP-UX and versions of Mac OS prior to Mac OS X."
17:11:33 <shelikhoo> the first payload packet does not have an option
17:11:50 <dcf1> shelikhoo: yes, I assume there must be something I don't understand
17:11:51 <shelikhoo> while subsequent packets might have these options
17:12:17 <shelikhoo> so for nmap, it detects whether it EVER has these options set
17:12:35 <dcf1> But the MSS option, for example, goes on the SYN packet, and it's pretty common
17:12:39 <shelikhoo> but for the censorship, it just needs to make sure the client hello message is tcp length = 20
17:12:58 <shelikhoo> syn packet is not the first payload packet
17:13:02 <dcf1> shelikhoo: no, the Nmap data are from individual response packets. not full established connections.
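Editor's sketch, relating to dcf1's suggestion to search nmap-os-db for `%O=%`: the same search can be scripted by splitting each test line of the form `NAME(key=value%key=value)` on `%` and reporting option fields with an empty value. The parsing rules below are an assumption based on the lines quoted in the log. The OPS line in `SAMPLE` is the one dcf1 pasted; the T2 line is a made-up illustration of the format, not copied from the database.

```python
import re

# A test line looks like NAME(key=value%key=value...), e.g. OPS(...) or T2(...).
TEST_LINE_RE = re.compile(r'^([A-Z][A-Z0-9]*)\((.*)\)$')
# Option fields are named O (single-probe tests) or O1..O6 (the OPS line).
OPTION_KEY_RE = re.compile(r'^O\d?$')


def empty_option_fields(fingerprint_text):
    """Return (test, field) pairs whose TCP options value is empty."""
    found = []
    for line in fingerprint_text.splitlines():
        m = TEST_LINE_RE.match(line.strip())
        if not m:
            continue  # Fingerprint/Class/CPE lines and comments
        test, body = m.groups()
        for field in body.split("%"):
            key, sep, value = field.partition("=")
            if sep and OPTION_KEY_RE.match(key) and value == "":
                found.append((test, key))
    return found


SAMPLE = """\
Fingerprint Microsoft Windows 10 1607 - 11 23H2
OPS(O1=MFFD7NW8ST11%O2=MFFD7NW8ST11%O3=MFFD7NW8NNT11%O4=MFFD7NW8ST11%O5=MFFD7NW8ST11%O6=MFFD7ST11)
T2(R=Y%DF=Y%T=80%W=0%S=Z%A=S%F=AR%O=%RD=0%Q=)
"""

if __name__ == "__main__":
    for test, field in empty_option_fields(SAMPLE):
        print(f"{test}: {field} is empty")
```

Reporting the test name alongside the field matters for the point dcf1 makes at 17:13:02: each entry describes one individual response packet, so an empty `O=` in one test says nothing about the options seen in the others.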
17:13:13 <shelikhoo> okay...
17:13:24 <shelikhoo> sorry I'm not an expert on nmap
17:13:25 <dcf1> I apologize. I didn't mean to start an argument. As I said, there must be something I don't understand.
17:13:43 <dcf1> I don't have anything else to add.
17:13:48 <shelikhoo> yes... I think we should look at packet captures to find out
17:14:00 <shelikhoo> rather than looking at rfcs
17:14:01 <shelikhoo> over
17:14:34 <onyinyang> I guess we can end it there for today then
17:15:07 <shelikhoo> yes thanks~
17:15:08 <onyinyang> Thanks everyone for the discussion!
17:15:28 <onyinyang> #endmeeting