16:00:45 <meskio> #startmeeting tor anti-censorship meeting
16:00:45 <MeetBot> Meeting started Thu May  8 16:00:45 2025 UTC.  The chair is meskio. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:45 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:49 <meskio> hello everybody
16:00:54 <meskio> here is our meeting pad: https://pad.riseup.net/p/r.9574e996bb9c0266213d38b91b56c469
16:00:56 <onyinyang> hihi o/
16:00:56 <meskio> ask me in private to give you the link of the pad to be able to edit it if you don't have it
16:00:58 <meskio> I'll wait a few minutes for everybody to add what you've been working on and put items on the agenda
16:01:05 <cohosh> hi
16:01:41 <shelikhoo> hi~
16:02:12 <shelikhoo> sorry I was trying to hold the meeting, but slightly forgot about time...
16:02:49 <meskio> no prob, I think it should be my turn anyway
16:03:14 <shelikhoo> yes..
16:03:18 <meskio> let's start with the first topic:
16:03:23 <meskio> about snowflakestaging
16:03:27 <meskio> shelikhoo: ???
16:03:31 <shelikhoo> yes!
16:03:34 <shelikhoo> that is from me
16:03:54 <shelikhoo> right now snowflake staging environment is up and running
16:04:33 <shelikhoo> and snowflake packet transport mode is one of the ongoing trials
16:04:50 <shelikhoo> I would like to know if there is any feedback about snowflake packet transport mode
16:05:12 <cohosh> nice!
16:05:16 <shelikhoo> hehe!
16:05:19 <cohosh> do you have instructions for how to run it?
16:05:34 <shelikhoo> https://gitlab.torproject.org/shelikhoo/snowflakestaging#tldr
16:05:43 <shelikhoo> yes, there is a readme file with all the documents
16:06:46 <cohosh> thanks for putting that together
16:06:55 <cohosh> what kind of time frame are you hoping to receive feedback in?
16:07:55 <shelikhoo> I don't have a specific time frame but I do have 2 tasks blocked on this
16:08:09 <shelikhoo> I think maybe just share the feedback when you are ready
16:08:13 <cohosh> okay, so as soon as we get a chance to try it out :)
16:08:26 <shelikhoo> and there is no need to aggregate them together
16:09:02 <cohosh> is the original issue a good place for giving feedback? i don't have anything to share yet
16:09:54 <cohosh> or merge request
16:10:13 <shelikhoo> https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40398
16:10:27 <cohosh> oh i mean on the packet transport mode
16:10:37 <cohosh> or are you looking for feedback on the staging environment itself?
16:10:52 <shelikhoo> https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/merge_requests/315
16:11:00 <shelikhoo> I have 2 blocked tasks
16:11:03 <shelikhoo> here
16:11:09 <shelikhoo> one is the staging environment
16:11:19 <shelikhoo> the other one is the udp transport mode
16:11:45 <shelikhoo> so both of them kind of need feedback or comment
16:12:20 <shelikhoo> but a quicker thing is that I am planning to relicense the staging environment repo under the MIT license
16:12:49 <cohosh> ok thanks, i will take a look and aim to have feedback by early next week
16:12:50 <shelikhoo> are there alternative plans that are better than relicensing it under the MIT license?
16:12:55 <shelikhoo> yes, thanks!
16:13:04 <meskio> I wonder if some of the readme should be moved to a survival guide in the team wiki, or maybe we could just link it from the wiki once the repo has a permanent home
16:13:04 <cohosh> i'm ok with whatever license you decide
16:13:43 <meskio> I think the default we've been using for licensing is BSD 3-Clause
16:13:44 <shelikhoo> I think we could move it to wiki once we have a stable link for it
16:13:55 <meskio> but I'm ok with MIT if you prefer that one
16:14:15 <shelikhoo> meskio: are you aware of the practical difference between 3BSD and MIT
16:14:23 <meskio> not really
16:14:33 <meskio> I find them fairly interchangeable
16:14:43 <meskio> I just see most stuff at Tor being 3BSD
16:15:11 <shelikhoo> okay then it is an inconsequential choice...
16:15:23 <shelikhoo> I am happy to license it as 3BSD as well
16:15:48 <shelikhoo> ohhhh... so meskio what is your opinion, I am happy to go either way
16:16:24 <meskio> let's do 3BSD to keep it the same as other projects
16:16:39 <shelikhoo> okay, let's use 3BSD here
16:17:00 <shelikhoo> I don't have too much to discuss on this topic
16:17:16 <meskio> good, let's move to the next one
16:17:26 <meskio> find a long term solution for utls downgrade check
16:17:33 <meskio> shelikhoo: again?
16:17:41 <shelikhoo> yes...
16:17:50 <shelikhoo> https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/lyrebird/-/merge_requests/104
16:18:07 <shelikhoo> a week ago, when there was a tor browser release
16:18:29 <shelikhoo> I did an emergency patch of a known security issue in utls
16:19:36 <shelikhoo> My current plan is to wait for Tor browser to use a more recent version of the go toolchain
16:19:41 <shelikhoo> and resync with upstream
16:20:09 <meskio> sounds good to me
16:20:28 <meskio> AFAIK the apps team does plan to update the go version in the next major release
16:20:53 <onyinyang> we have had a renovate bot MR for the same change for conjure
16:20:54 <shelikhoo> yes... I think there are also patched golang toolchains
16:21:14 <onyinyang> but there also aren't any mirrored go containers for 1.24
16:21:22 <shelikhoo> that despite being more recent, have breaking changes removed
16:21:50 <cohosh> meskio: when is the next major release?
16:22:05 <cohosh> i can't remember if we discussed it before
16:22:12 <meskio> I don't recall, let me see, I think I wrote something in the renovate MR in lyrebird
16:22:33 <shelikhoo> I think it is in a few months
16:24:13 <meskio> mmm, I can't find it, but I recall it to be in a few months, yes
16:24:36 <cohosh> shelikhoo: i think your plan is good, do you think a few months is reasonable to wait for the toolchain upgrade?
16:25:10 <cohosh> i guess we will need to respond like you did to any security issues in the short term until then
16:25:24 <shelikhoo> cohosh: I think this is reasonable. I would appreciate some additional eyes on my backported fix just in case I made a mistake
16:25:44 <shelikhoo> the part I patched is in a rather sensitive code path
16:26:01 <cohosh> ok, and that's here: https://gitlab.torproject.org/shelikhoo/utls-temporary
16:26:21 <cohosh> https://gitlab.torproject.org/shelikhoo/utls-temporary/-/commit/55c892a09c813abfaf11e445cf1a19126ea19cc1
16:26:25 <shelikhoo> https://gitlab.torproject.org/shelikhoo/utls-temporary/-/commit/55c892a09c813abfaf11e445cf1a19126ea19cc1
16:26:26 <shelikhoo> yes
16:26:30 <meskio> https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/merge_requests/510#note_3163693
16:26:34 <meskio> September 2025
16:26:34 <cohosh> heh irc race condition
16:27:27 <cohosh> shelikhoo: i'll look after this meeting and leave a comment on the lyrebird MR this afternoon
16:27:47 <shelikhoo> yes... and thanks to onyinyang for the review of my merge request on short notice!!!
16:28:09 <shelikhoo> cohosh: yes! if you do find an issue please let me know in private
16:28:12 <cohosh> hopefully there won't be too many security issues in the next 4 months :)
16:28:17 <cohosh> shelikhoo: will do
16:28:27 <shelikhoo> hahaha... that patch is quite.... scary
16:28:40 <shelikhoo> i was unable to cherrypick the change
16:28:58 <shelikhoo> and had to find the place to apply the code in a few hours...
16:29:15 <shelikhoo> but anyway I do need to highlight that unlike the browser team
16:29:37 <shelikhoo> we don't know the security issue until it becomes public
16:30:02 <cohosh> yeah it's a tough position to be in
16:30:14 <shelikhoo> so we kind of need to race against the clock every time
16:30:15 <shelikhoo> yes
16:30:26 <meskio> thanks for the fast response
16:30:26 <shelikhoo> anyway that is everything I have about this topic
16:30:57 <meskio> good, let's move to the next one:
16:30:59 <meskio> RFC: Collect Snowflake Proxy Pool Health Information
16:31:01 <shelikhoo> hehe
16:31:09 <shelikhoo> this is from me as well
16:31:10 <shelikhoo> okay
16:31:31 <shelikhoo> https://gitlab.torproject.org/tpo/anti-censorship/connectivity-measurement/logcollector/-/issues/7#note_3197038
16:32:02 <shelikhoo> so right now we are trying to understand why it sometimes becomes really slow to connect to the snowflake network
16:32:35 <shelikhoo> which could contribute to decreasing user counts in some regions
16:33:03 <shelikhoo> and one of the ways we could achieve this is by having a machine readable structured log
16:33:46 <shelikhoo> that helps us understand the reason each connection attempt on the vantage point fails
16:34:16 <shelikhoo> which could include failure to connect to broker
16:34:18 <shelikhoo> or
16:34:42 <shelikhoo> failure to connect to a particular peer as that peer is blocked by network address
16:35:11 <shelikhoo> this would be implemented as a system to output all the events emitted in the snowflake client
16:35:29 <shelikhoo> with additional events added to collect sufficient amount of information
16:36:00 <cohosh> this is something that we've tried to do with the log message we pass to tor and it was the original motivation for the events code that we already have
16:36:09 <cohosh> is your intention to build on this?
16:36:17 <shelikhoo> yes
16:36:36 <shelikhoo> I should note that this log would include the ip address and port the client tries to connect to
16:36:54 <shelikhoo> and make it easier for others to harvest proxy ip addresses
16:36:56 <cohosh> nice, it's been helpful already in looking at user reports with copy-pasted tor logs but we didn't think about making it easily machine parseable
16:37:02 <shelikhoo> which they are already doing
16:37:33 <cohosh> ah, i see
16:38:03 <cohosh> i suppose it's something we can let our safelog mask for user reports still but safe logging would be turned off for the vantage point tests?
16:38:23 <dcf1> Yes I'm confused, how is it different from unsafe-logging?
16:39:09 <shelikhoo> the difference is that the new data would make it much easier to process the output
16:39:22 <cohosh> ah easier for us to process = easier for censors too, i suppose
16:39:27 <shelikhoo> yes
16:39:31 <cohosh> but the data was always available
16:39:54 <dcf1> "easier to process" = is that because it contains more information, information that is more structured, or because it is automatically uploaded?
16:39:57 <shelikhoo> and I don't think we want snowflake client to output json data to tor's user log
16:40:16 <dcf1> Is the machine-readable part an attempt to circumvent the default safe logging that would otherwise redact IP addresses?
16:40:57 <dcf1> https://gitlab.torproject.org/tpo/anti-censorship/connectivity-measurement/logcollector/-/issues/7#note_3197038 "upload structured log alongside binary packet capture"
16:41:00 <shelikhoo> no, previously, each time we updated snowflake, there was a chance that we would break someone else's text-matching regex
16:41:14 <dcf1> Uploading pcaps?!? Do I understand that right? On first reading that seems unreasonable.
16:41:33 <shelikhoo> it is running on vantage points
16:41:39 <cohosh> dcf1: we have pcaps already available for a limited time
16:41:40 <shelikhoo> it is not running on user's device
16:41:43 <dcf1> Aha, I see.
16:41:53 <cohosh> but you need access to download them
16:42:11 <shelikhoo> this structured log will not be generated by default as well
16:42:28 <shelikhoo> so we have to use a command line switch to generate it on vantage point
16:43:00 <shelikhoo> dcf1: https://gitlab.torproject.org/tpo/anti-censorship/connectivity-measurement/bridgestatus/-/blob/main/recentResult_iran-v2a02?ref_type=heads
16:43:17 <shelikhoo> the last field is a password protected pcap file
16:43:30 <shelikhoo> its access is restricted
16:44:07 <dcf1> So is the sole advantage that the structured logging format avoids needing to change grep patterns for the unstructured snowflake logs? The structured logging doesn't contain any new information that isn't already available?
16:44:35 <dcf1> If so, are changes in the unstructured log format really a big problem? Is it happening frequently enough that it's getting in the way of analysis?
16:45:03 <shelikhoo> if there is information we need that is currently unavailable, it will be added to the event emit system
16:45:08 <dcf1> I'm trying to understand if the tpo/anti-censorship/connectivity-measurement/logcollector#7 proposal is a blocker for some desired analysis, or if it's more of an abstraction for abstraction's sake.
16:45:11 <shelikhoo> even if the content is too verbose for log
16:45:43 <dcf1> "if" is there information you need now, or is this a speculative "might need it" change for the future?
16:46:05 <shelikhoo> if we don't need it now, then we don't need to add it now
16:46:11 <shelikhoo> we can add it in the future
16:46:36 <dcf1> IMO snowflake is already suffering from too much stuff added speculatively because it might be needed in the future
16:47:07 <shelikhoo> the structured log is designed to make sure we don't need to parse the unstructured log to get the content we need
16:47:15 <dcf1> Is this a case where, you started to do some analysis, looked at the state of the available information, got annoyed at the format, and decided it all needed to be reimplemented in a different way before any analysis can be done?
16:48:07 <dcf1> If there's no specific research question now, if we don't know for sure that currently, here's some information we need that we don't have, then it doesn't sound like adding a new, optional log format is going to help with analysis.
16:48:38 <shelikhoo> the specific research question is why snowflake connection time is slow
16:48:50 <dcf1> I'm worried about it becoming one more custom thing we need to maintain, and if the anticipated need to add a lot of new information to it never arises, in the future we'll be looking at the code and saying, "why did we add this again?"
16:49:19 <dcf1> shelikhoo: no, I mean more specific than that. A hypothesis like: snowflake bootstrap is slow because a subset of proxy IP addresses is getting blocked.
16:49:36 <dcf1> To falsify that hypothesis, you need logs of proxy IP addresses and the status of connection attempts
16:49:48 <dcf1> which is information already present in the --unsafe-logging logs, correct?
16:50:01 <shelikhoo> yes
16:50:36 <shelikhoo> it is also possible that some connection failures are the result of 5XX errors from the cdn
16:50:38 <dcf1> If the idea is, we don't know what questions to ask yet, so we better collect all information, and before we can do that we need to design a new information-collecting system just in case we might need to collect something that we don't currently have,
16:50:58 <dcf1> then I guess I am politely pushing back against the proposal.
16:51:20 <cohosh> in the past, i have also written patches specifically for experiments, which avoids adding bloat to the snowflake repo itself
16:52:04 <dcf1> Like I think a more prudent course of action would be to analyze the existing data (even convert existing logs into the proposed JSON format first if you like), and then if the analysis hits a roadblock, then consider changing the way things are logged.
16:52:23 <dcf1> Add complexity only as required, on demand, not proactively in anticipation of possible future needs.
16:52:40 <dcf1> Forgive me if I misunderstand the situation.
16:53:59 <dcf1> It's fun to do greenfield design of new logging subsystems etc., but it may not actually effectively serve the higher-level goal
16:54:28 <shelikhoo> I think the analysis of the situation was correct, if i add this structured log system, then the analysis will be easier, but maintenance of snowflake will be slightly more difficult; and vice versa
16:54:54 <dcf1> I agree: there are benefits/tradeoffs either way
16:55:46 <shelikhoo> yes, I think i could try to run the experiment and analysis without this change and come back later if I find it really necessary
16:56:05 <dcf1> The question is not: will this improve things, but what is the best use of limited resources
16:57:02 <shelikhoo> okay... I will give the analysis of the unstructured text a try and report my findings
16:57:13 <meskio> sounds like we have a decision for now
16:57:19 <meskio> let's move to the next topic
16:57:20 <shelikhoo> yes
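The suggestion above to convert existing logs into JSON first, before changing how snowflake logs, could look roughly like the sketch below. It is only an illustration, not the team's tooling: it assumes each log line starts with a Go-style `YYYY/MM/DD HH:MM:SS` prefix, the sample message text is hypothetical, and the field names (`time`, `message`, `ip`, `port`) are made up for this example.

```python
import json
import re

# Assumption: lines look like "YYYY/MM/DD HH:MM:SS message" (Go's default log prefix).
LINE_RE = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$")
# First IPv4 address with a port inside the message, if there is one.
ADDR_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})\b")


def to_event(line):
    """Turn one unstructured log line into a dict, or None if it does not parse."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    timestamp, message = match.groups()
    # Hypothetical field names, chosen for this sketch only.
    event = {"time": timestamp, "message": message}
    addr = ADDR_RE.search(message)
    if addr:
        event["ip"] = addr.group(1)
        event["port"] = int(addr.group(2))
    return event


def convert(lines):
    """Return one JSON string per parseable line, skipping the rest."""
    events = (to_event(line) for line in lines)
    return [json.dumps(event, sort_keys=True) for event in events if event is not None]
```

Calling `convert(open(path))` on an `--unsafe-logging` log from a vantage point would give JSON lines to analyze, without touching the snowflake client itself. If the real message formats change, only the regexes in this script need updating.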
16:57:39 <meskio> I'm planning to deprecate arm and 386 docker obfs4-bridge images
16:57:48 <meskio> snowflake doesn't support those archs in docker either
16:57:54 <meskio> there are very few users
16:58:02 <meskio> but does anyone have anything against it?
16:58:18 <meskio> I'll do a last release adding a deprecation message to the logs of those archs
16:58:32 <shelikhoo> https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/container_registry/11
16:58:36 <shelikhoo> I am not against it
16:58:51 <shelikhoo> but I wish to say that snowflake-proxy does support arm
16:59:00 <shelikhoo> over
16:59:04 <meskio> it does support arm64
16:59:09 <shelikhoo> yes
16:59:12 <meskio> arm was raspberry2
16:59:21 <meskio> raspberry3 is from 2016
16:59:40 <meskio> we do support arm64 in obfs4-bridge
16:59:49 <shelikhoo> okay, sorry for the noise
16:59:57 <meskio> :)
17:00:28 <meskio> and in the interesting links it seems that there are a lot of snowflake users in turkmenistan
17:00:46 <meskio> frontdesk told me that they are sharing snowflake bridgelines to turkmen users
17:00:59 <meskio> the reports say it works but is very slow
17:00:59 <cohosh> i'm glad it's working there still since the censorship has ramped up
17:01:19 <cohosh> better than what it was before lift of blocking
17:01:28 <meskio> we should probably open a new issue in the censorship-analysis repo for this
17:01:38 <dcf1> "a lot" = thousands, where previously it was never more than 100, something changed for sure.
17:02:44 <meskio> I guess we can finish the meeting here
17:02:55 <shelikhoo> eof from me
17:03:13 <meskio> #endmeeting