15:59:14 <onyinyang[m]> #startmeeting tor anti-censorship meeting
15:59:14 <MeetBot> Meeting started Thu Oct 19 15:59:14 2023 UTC.  The chair is onyinyang[m]. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:59:14 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
15:59:20 <shelikhoo> hi~
15:59:22 <meskio> hello
15:59:32 <onyinyang[m]> hello everyone!
15:59:45 <onyinyang[m]> here is our meeting pad: https://pad.riseup.net/p/tor-anti-censorship-keep
15:59:54 <onyinyang[m]> I am just going to restart my client quickly while we fill out the pad
16:03:57 <meskio> I removed the running flag discussion, I don't think we need to talk about it, but if someone disagrees let me know
16:04:37 <onyinyang[m]> Cool, I was just going to start with the Armored bridge line discussion topic since it seems to be the only one
16:05:19 <onyinyang[m]> So the first discussion topic today is continuing our discussion from last week:
16:05:19 <onyinyang[m]> * Armored Bridge line Spec(Oct-19: let's discuss again)
16:05:19 <onyinyang[m]> https://gitlab.torproject.org/tpo/anti-censorship/team/-/issues/126#note_2954127
16:06:13 <shelikhoo> yes!
16:06:54 <shelikhoo> last week we decided to read the spec and discuss it this week
16:07:08 <meskio> I think the spec looks good, thank you for the work there shelikhoo
16:07:19 <shelikhoo> it is about sharing bridges in a way that has better OS integration and detects errors
16:07:20 <meskio> there are a few things I think would be nice to add there:
16:07:43 <meskio> we should have some examples of bridgelines and their conversion, so people implementing it have something to test
16:08:09 <meskio> and I think we should decide on a domain name and include specifics on what the bridge URL will look like with it
16:08:25 <meskio> I have two domain names bought: brdg.es and bridge.st
16:08:38 <meskio> I think I like the first, but I'm ok with any or another
16:09:17 <shelikhoo> I have no preference on the domain name
16:09:28 <dcf1> With .es, it should be puent.es :)
16:09:51 <dcf1> just joking
16:09:52 <meskio> yes, I thought about it, but maybe confusing for non-spanish speakers :D
16:10:40 <shelikhoo> speaking of non-english languages, right now the armored bridge line does not support non-ascii characters
16:10:47 <shelikhoo> as a result of a compression step
16:11:04 <shelikhoo> do we wish to address this issue, or let it be
16:11:34 <meskio> I don't expect to have non-ascii bridgelines in the near future, things like webtunnel urls can be converted to ascii chars...
16:11:58 <meskio> but maybe it is something to say explicitly, so implementations throw an error if non-ascii is input
16:12:23 <dcf1> what is the data type of the arguments that bridges publish now in their descriptors?
16:12:48 <dcf1> i.e., in the bridge descriptor specs? Probably u8[] or UTF-8, I would guess.
16:13:08 <shelikhoo> yes, I will add that it should return an error if non-ascii characters are encountered
16:13:22 <dcf1> what happens if a bridge publishes some arguments that are non-ascii? Presumably there would be an error, at what stage of the pipeline would the error happen?
16:14:02 <meskio> good questions, I don't know, I'm not even sure which spec that should be in
16:14:57 <dcf1> when an obfs4 bridge publishes e.g. cert= and iat-mode= parameters, there's a protocol for that. I'm looking for it now.
16:16:09 <cohosh> it doesn't say in the extra-info spec: https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/spec/dir-spec/extra-info-document-format.md
16:16:13 <cohosh> it just says arglist
16:16:25 <meskio> so implementation specific...
16:16:54 <meskio> there is also the pt-spec
16:16:58 <dcf1> I think there may be a blanket requirement that descriptors have to be UTF-8 (added after the core team started experimenting with rust, since it's less convenient to deal with byte strings)
16:18:00 <dcf1> In goptlib, you call SmethodArgs to add arguments you want to have published in the descriptor:
16:18:03 <dcf1> https://pkg.go.dev/git.torproject.org/pluggable-transports/goptlib.git#SmethodArgs
16:18:22 <dcf1> The data type there is a Go string, which is a byte array.
16:18:44 * arma2 is nearby, listening in case anything needed from him
16:19:14 <dcf1> It gets communicated to the tor process using the SMETHOD message
16:19:15 <meskio> arma2: any idea if bridgelines were supposed to support UTF-8 chars?
16:19:15 <cohosh> ah the pt spec says ascii: https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/spec/pt-spec/ipc.md
16:19:16 <dcf1> https://spec.torproject.org/pt-spec/ipc.html#pluggable-transport-server-messages-server-messages
16:19:40 <arma2> meskio: the original plans were only 'normal' characters, but, i don't know if things have changed since then
16:19:57 <shelikhoo> the thing is we can, let's say, add a single bit at the beginning of the bridgeline
16:19:58 <cohosh> but, we are also planning to change how we deal with PTs and core tor
16:20:02 <shelikhoo> the armored one
16:20:04 <dcf1> cohosh: ASCII is for the variable name, the value is <ArgChar> ::= <any US-ASCII character but NUL or NL>
16:20:22 <shelikhoo> so that we could add unicode support later without a workaround
16:20:26 <cohosh> dcf1: ah i see
16:20:51 <dcf1> Yeah so as far as the PT spec goes, I think the type is byte arrays, not even guaranteed UTF-8.
16:21:02 <meskio> having an extra byte for the type at the beginning might make sense so we are ready to support other formats...
16:21:37 <shelikhoo> or even just a bit
16:21:59 <meskio> mmm, a bit is a bit limited as we might need extra ones in the future...
16:22:00 <dcf1> cannot contain newline or \0, though, the encoding of SMETHOD args doesn't handle that case
16:22:26 <shelikhoo> yes, then we could just add an empty byte
16:22:37 <shelikhoo> as a way to future proof
16:22:51 <dcf1> oh sorry cohosh, I pasted without thinking. ArgChar itself says US-ASCII, it doesn't contradict what you said
16:23:22 <meskio> maybe a byte is too much, as you work on making it small, I don't know
16:23:42 <dcf1> There are some PT messages where non-ASCII bytes can be escaped (e.g. STATUS), but SMETHOD ARGS uses its own custom escaping scheme that doesn't have support for that.
16:24:51 <cohosh> we have to worry about the dir spec more than the PT spec though if we are scrapping it soon anyway?
16:24:56 <cohosh> (for some definition of soon)
16:24:56 <shelikhoo> dcf1: the 0 byte or bit(s) will be removed during the bridgeline unwrap process, so it doesn't matter if it couldn't be processed under the pt spec
16:25:49 <dcf1> shelikhoo: no, what I'm worried about is that if bridgeline armoring supports a narrower data type than is generally permitted for bridge lines, then what kind of error results, is it easy to detect and debug, etc.
16:26:25 <dcf1> so I'm thinking of ways it would be possible to get a non-ASCII bridgeline, and I'm thinking one possible way would be if the server PT supplies non-ASCII args to tor which are then published in a descriptor
16:26:55 <shelikhoo> yes... I think the result of the discussion will be that we just add a type indicator, and the error will be encountered when we try to encode to the armored bridge line
16:27:12 <shelikhoo> this will happen when we try to distribute an armored bridgeline
16:27:22 <dcf1> (Of course one could circumvent the normal reporting mechanism and just maliciously post descriptors of whatever contents to Collector, but I'm not thinking about deliberate attacks here, I'm thinking of possible inadvertent consequences)
16:28:31 <meskio> the only inadvertent consequence that could already happen is webtunnel urls with utf chars, not sure if they are encoded before publishing them
16:28:33 <dcf1> Through our research just now, it looks like the way I was thinking of would not actually work to get non-ASCII into a descriptor. (goptlib enforces it, though who knows what tor does once it gets the bytes)
16:29:01 <dcf1> Yeah, so it seems like it may not actually be a serious problem in normal use then.
16:29:17 <meskio> the 7 bit encoding sounds good then
16:29:30 <dcf1> I will say, though, that the fact we are having this discussion makes me think that maybe the armoring is trying to be a bit too clever o_O
16:29:33 <shelikhoo> yes, so should we have a type indicator or not
16:30:07 <dcf1> and maybe could be solved by removing an element of encoding (trading compression for simplicity), rather than further complicating by adding a type indicator
16:30:10 <meskio> I think it is a good idea to have it for future-proofing, but I'm not sure if a bit, a byte or 4 bits...
16:30:31 <dcf1> But I haven't really looked at the proposal, this is just an outsider's impression. I'll support your decision.
16:30:49 <shelikhoo> (there are already 2 filters dropped to favor simplicity)
16:31:12 <shelikhoo> I think we can go with one byte, just in case
16:31:33 <shelikhoo> the extra length should not matter that much in the end
16:31:53 <shelikhoo> and processing things bitwise could get out of hand quickly
16:32:26 <shelikhoo> (although it is already so in the compression step)
16:32:42 <meskio> I'm ok with any solution (either having a type indicator or UTF-8 support)
16:33:24 <shelikhoo> okay I will come back with these suggestions adopted
16:33:40 <meskio> sounds good
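A minimal Python sketch of the outcome discussed above, under stated assumptions: the input is checked to be ASCII at encoding time, and a one-byte type indicator is prepended and then stripped again on unwrap. The names `TYPE_ASCII`, `wrap`, and `unwrap`, the indicator value `0x00`, and the example bridge line are all hypothetical; the spec's actual compression and armoring steps are not modeled here.

```python
# Hypothetical sketch: this is not the armored bridge line spec itself.
TYPE_ASCII = 0x00  # assumed value for the one-byte type indicator


def wrap(bridge_line: str) -> bytes:
    """Prefix an ASCII-only bridge line with a one-byte type indicator."""
    try:
        payload = bridge_line.encode("ascii")
    except UnicodeEncodeError as exc:
        # Fail at encoding time, i.e. when the line is about to be distributed.
        raise ValueError("bridge line contains non-ASCII characters") from exc
    return bytes([TYPE_ASCII]) + payload


def unwrap(blob: bytes) -> str:
    """Strip the type indicator and return the original bridge line."""
    if not blob:
        raise ValueError("empty input")
    if blob[0] != TYPE_ASCII:
        raise ValueError(f"unsupported type indicator: {blob[0]:#04x}")
    return blob[1:].decode("ascii")


# Made-up example line (documentation address, placeholder cert value).
line = "obfs4 192.0.2.1:443 cert=EXAMPLE iat-mode=0"
assert unwrap(wrap(line)) == line
```

Reserving a whole byte keeps the parsing byte-aligned, and an unknown indicator value gives a later format (for example one with UTF-8 support) a clean way to be rejected by older implementations.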
16:33:50 <onyinyang[m]> It seems like we've mostly come to a conclusion on this topic and can move on.
16:34:00 <meskio> should we decide on a domain name? I haven't heard opinions, should we use brdg.es as it is shorter?
16:34:21 <shelikhoo> I think we can go with brdg.es
16:34:30 <meskio> great
16:34:36 <meskio> onyinyang[m]: I'm finished now with this
16:34:51 <onyinyang[m]> cool :)
16:34:58 <onyinyang[m]> The next topic is from last week but is there anything further to discuss about the snowflake broker?
16:35:23 <onyinyang[m]> If not we can move to interesting links
16:35:28 <shelikhoo> I have already deployed a new version this monday. nothing to discuss from me
16:35:38 <meskio> nice
16:35:47 <onyinyang[m]> ok great. Let's discuss the interesting links then
16:36:02 <onyinyang[m]> The first is: "On Precisely Detecting Censorship Circumvention in Real-World Networks"
16:36:02 <onyinyang[m]> https://www.robgjansen.com/publications/precisedetect-ndss2024.html
16:36:12 <shelikhoo> yes there seem to be a lot of new interesting links...
16:36:19 <meskio> it looks like an interesting collection of papers? should we pick one for a reading group?
16:36:58 <onyinyang[m]> did a conference just post a bunch of accepted papers that I missed? XD
16:37:11 <onyinyang[m]> I think picking one for a reading group sounds like a good idea!
16:37:12 <dcf1> I found these 3 papers last week. I'll post them to the mailing list probably.
16:37:33 <dcf1> "On Precisely Detecting..." is the one I would recommend for a reading group.
16:37:49 <meskio> sounds good, and rob is around if we want to invite him
16:38:15 <onyinyang[m]> nice
16:38:41 <onyinyang[m]> what is the time frame we usually give for people to read the paper before the discussion?
16:38:52 <meskio> usually we give two weeks
16:39:14 <meskio> but I will be AFK in two weeks, and I see others in our team will be AFK in the following weeks
16:39:41 <meskio> ahh, no, nov 12 looks good
16:39:42 <onyinyang[m]> yes, I was just checking that
16:39:47 <meskio> this is 3 weeks from now
16:40:00 <onyinyang[m]> I will be away XD but can probably join for the discussion
16:40:19 <cohosh> do you mean 9th?
16:40:25 <shelikhoo> from a quick look at the chart, it seems snowflake rendezvous is the one considered to be the weakest part for censorship resistance
16:40:34 <meskio> ahh, true, you are AFK
16:40:54 <onyinyang[m]> yes but also the 12th is not a thursday, as cohosh points out lol
16:40:56 <cohosh> nov 12th is not a thursday, the 9th works for me
16:41:07 <onyinyang[m]> 9th also works for me :)
16:41:11 <meskio> ahh, yes I mean 9
16:41:16 <meskio> I was looking at october
16:41:17 <meskio> my head
16:41:28 <onyinyang[m]> hehe
16:41:31 <meskio> 9 sounds good to everyone
16:41:37 <onyinyang[m]> Ok, November 9th it is!
16:41:56 <meskio> I'll poke rob about it just in case he wants to join
16:42:35 <onyinyang[m]> Is there anything else from any of the interesting links or otherwise that anyone wants to bring up?
16:42:35 <cohosh> we should invite ryan too
16:43:15 <meskio> ahh, he is the main author, I saw rob's picture and I assumed...
16:43:24 <meskio> I don't know ryan
16:43:42 <meskio> I think...
16:44:11 <meskio> I'll drop them an email
16:44:12 <dcf1> https://www.rwails.org/research/wails_precisely_ndss24.pdf if you prefer, rob's page for it is nicer though :D
16:44:22 <meskio> :D
16:44:48 <dcf1> meskio: I've already emailed them about it btw
16:45:03 <meskio> ohh, great, one less thing in my queue, thanks
16:45:18 <dcf1> We added a citation to the snowflake paper last week after finding it
16:46:14 <dcf1> shelikhoo: no, you are reading the chart backwards, they say rendezvous is by far the most difficult to classify (using the features they use, anyway) because it is basically HTTPS
16:46:48 <dcf1> "Detecting the TLS connections to the broker performed the least-well among the four circumvention protocols (FPR = 0.18), which is expected: our network capture contains mostly TLS flows, and the Snowflake broker connections are genuine TLS connections."
16:47:18 <dcf1> DTLS data transfer, they say, is the most easily detectable, more than obfs4.
16:47:24 <shelikhoo> dcf1: oh no... I will read it more carefully...
16:47:59 <dcf1> But one of the main observations is that even with their enhanced classifiers, there are too many false positives to be practical, so they turn to multiple observations per host to increase precision
16:48:03 <dcf1> It's an interesting paper.
16:48:44 <dcf1> "For more realistic base rates, such as λ > 1 × 10⁶, the precision attained by any of the classifiers is near-zero."
16:49:40 <onyinyang[m]> that sounds promising
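The effect of the base rate on precision can be sketched numerically. This assumes λ means the number of non-circumvention flows per circumvention flow, which is a reading of the quote and is not confirmed here. The function name `precision` and the true positive rate of 1.0 are hypothetical; only FPR = 0.18 comes from the quote above. Under that assumption, precision = TPR / (TPR + λ · FPR).

```python
def precision(tpr: float, fpr: float, lam: float) -> float:
    """Precision when there are `lam` negative flows for every positive flow."""
    true_positives = tpr          # expected true positives per positive flow
    false_positives = fpr * lam   # expected false positives among `lam` negatives
    return true_positives / (true_positives + false_positives)


# Assumed perfect detection (TPR = 1.0) with the quoted broker FPR of 0.18.
for lam in (1e2, 1e4, 1e6):
    print(f"lambda={lam:.0e} precision={precision(1.0, 0.18, lam):.2e}")
```

Even with a perfect true positive rate, the false positives scale with λ while the true positives do not, which is why precision collapses as circumvention traffic becomes rarer.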
16:49:44 <shelikhoo> I think a real censor could in theory block all international dtls traffic from a certain ip if that ip contacted the broker's sni
16:50:31 <shelikhoo> but we haven't seen this kind of thing in practice
16:51:17 <shelikhoo> let's leave this to the discussion, I will need to read the paper in detail..
16:51:53 <onyinyang[m]> ok, I think that's it for today. I'm going to end the meeting now.
16:52:10 <onyinyang[m]> #endmeeting