17:59:13 <phw> #startmeeting anti-censorship meeting
17:59:13 <MeetBot> Meeting started Thu Jan 30 17:59:13 2020 UTC.  The chair is phw. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:59:13 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
17:59:26 <phw> here's our meeting pad: https://pad.riseup.net/p/tor-anti-censorship-keep
17:59:29 <dcf1> sorry, still doing some IRC debugging
18:00:07 <arma2> dcf1: let us know if you need help with the irc debugging :)
18:00:19 <dcf1> kyle_ must be the one we were trying to get connected
18:00:22 <phw> agix: feel free to add yourself to our meeting pad. we use it to keep track of what we did, what we plan to do, and what we need help with
18:00:31 <kyle_> hello
18:00:47 <dcf1> kyle_: here's the meeting agenda: https://pad.riseup.net/p/tor-anti-censorship-keep
18:01:12 <agix> alright thanks :)
18:01:15 <kyle_> Great! thanks
18:01:58 <phw> i would say let's get started by talking about kyle_'s thesis :)
18:02:07 <kyle_> sure thing
18:02:38 <kyle_> so for some background, im a senior undergrad at princeton working on an evaluation of snowflake
18:03:25 <kyle_> we're currently looking at the fingerprintability of snowflake with respect to other applications that use WebRTC to connect
18:04:14 <kyle_> right now im in the process of collecting enough data to then build a classifier
18:04:33 <phw> kyle_: who's your advisor at princeton?
18:04:38 <cohosh> good stuff kyle_ !
18:04:39 <kyle_> prateek mittal
18:04:45 <phw> ah, gotcha
18:05:19 <dcf1> that's interesting, if you have a project link or anything written up so far (or when you're ready), you can add a link to https://trac.torproject.org/projects/tor/wiki/doc/Snowflake/Fingerprinting, which you probably already have seen
18:05:39 <dcf1> kyle_: what can we do to help your research project?
18:06:07 <kyle_> so i was hoping to get your advice on a few things
18:06:50 <kyle_> right now ive been thinking about building the classifier using traffic from the snowflake connection and traffic from a Facebook Messenger/Google Hangouts call
18:07:24 <kyle_> initially to just create a binary classifier (Snowflake vs. not-Snowflake)
18:07:52 <kyle_> would you agree with this approach and do you have any suggestions?
18:08:19 <cohosh> it might be good to see if there are more WebRTC applications out there
18:08:32 <cohosh> maybe this is helpful: https://trac.torproject.org/projects/tor/ticket/30579#comment:15
18:08:47 <cohosh> it's a list of STUN servers, that may or may not be used for WebRTC or similar applications
18:08:56 <dcf1> I think it's a good place to start, it's good to have something implemented to compare against
18:09:20 <kyle_> cohosh: thanks! ill take a look there
18:09:34 <cohosh> specifically, this is a video conferencing thing that might be used in China: http://www.gotye.com.cn/
18:09:36 <dcf1> One challenge of this kind of work is that it's not clear that Facebook Messenger/Google Hangouts is the "right" thing to compare against; i.e., we don't really know quantitatively what fraction of WebRTC traffic those make up.
18:10:11 <dcf1> But it's fine to proceed with the assumption that those are the things you want to compare against, and build a classifier.
18:10:14 <arma2> right. i would expect that snowflake will look different from google hangouts, in a number of obvious ways. i guess the question for the snowflake folks is: is that traffic profile ultimately what we're aiming to look like, or are we instead hoping that there are a variety of webrtc implementations and we'll hide in the long tail of the noise
18:10:54 <arma2> it's fine if the answer is "either of those approaches would be great -- whichever one works"
18:10:57 <cohosh> google and facebook are also quite frequently blocked in a lot of places
18:10:58 <phw> some dpi vendors publish regular reports about what fraction of observed traffic a given protocol constitutes. if we're lucky, it can tell us the relative popularity of some webrtc-based applications.
18:11:03 <dcf1> right. because what the censor ultimately cares about is not "Snowflake or not-Snowflake", but "can I afford to block this connection, given my best guess about it?"
18:11:55 <arma2> it would be fun to have a pluggable transport that actually aims to look like an expected protocol. rather than one that is distinguishable but we argue that the protocol is so complicated, in real life deployment, that you can't afford to block weird implementations of it.
18:12:07 <cjb> I wonder whether there's any other popular user of datachannels, and whether snowflake traffic looks like that
18:12:32 <cohosh> Steam supposedly uses data channel for game networking
18:12:33 <dcf1> right. kyle_, if you have an angle on those types of questions too, that would be a valuable research contribution.
18:13:23 <cohosh> https://github.com/ValveSoftware/GameNetworkingSockets
18:13:51 <cohosh> hmm maybe that's not webrtc though but something else that uses STUN
18:14:06 <kyle_> yeah these are all things that ill have to consider as i continue my work
18:14:13 <arma2> kyle_: for another data point about the distant past: uproxy, the google ideas circumvention idea, originally planned to use webrtc as transport, but they wanted to use FTE to transform the webrtc traffic, "because what if somebody blocks webrtc?"
18:14:14 <dcf1> so it might be good to allow people to think about this and continue with some more discussion over email or something
18:14:51 <dcf1> kyle_: is it okay if we contact you (don't need to post contact details here and now, the channel is logged if that's a concern for you)
18:15:00 <dcf1> or can we start an email thread (public or private)?
18:15:05 <kyle_> yeah, absolutely!
18:15:21 <dcf1> or, this would be on-topic for public discussion forum like https://github.com/net4people/bbs or https://ntc.party/
18:16:10 <kyle_> i could post there as well
18:16:23 <arma2> the anti-censorship-team@ list is also public and a fine venue, especially for tor (or snowflake) specific topics
18:16:42 <arma2> depends what audience you want to reach :)
18:16:43 <dcf1> I wanted to give you a chance to meet the people involved with Snowflake. (kyle_ got a referral to me, but it's not just me working on it)
18:17:04 <dcf1> https://lists.torproject.org/cgi-bin/mailman/listinfo/anti-censorship-team
18:17:07 <dcf1> http://lists.torproject.org/pipermail/anti-censorship-team/
18:17:10 <cohosh> thanks for sharing kyle_, i'm excited to see what you come up with
18:17:21 <kyle_> thanks, it will definitely be helpful to bounce ideas and discuss things with everyone
18:17:27 <kyle_> thanks for having me
18:17:40 <dcf1> ok kyle_ I guess you and I can be in touch and discuss how you want to do communication as you work on the project
18:17:49 <kyle_> perfect sounds good!
18:17:55 <phw> thanks dcf1 and kyle_
18:17:56 <arma2> kyle_: my guess is that you won't even need a fancy machine learning classifier to distinguish snowflake from google hangouts. though using the ml gadget might be faster and simpler.
18:18:43 <dcf1> ooh, one other idea quick, you could try comparing against https://webtorrent.io/, that's more likely to use DataChannels and I hear it's popular.
18:19:11 <arma2> kyle_: yeah. one of the huge contributions you could make is trying to get a handle on the landscape of what uses webrtc in practice, and how, and where.
18:19:58 <kyle_> i think ill definitely have to start looking into webrtc more now
18:19:59 <arma2> whereas the censor, in each situation, just needs to understand use of webrtc on *their* network
18:20:25 <arma2> so it might be that snowflake stands out more on the internet connection from the refugee camp in sudan, than it does on china's gfw
18:20:43 <kyle_> i see what you mean
18:21:40 <arma2> so in an ideal world, the research conclusion would not be black or white, "snowflake works or it doesn't", but it would be more nuanced: *where* in the world is snowflake more likely to stand out vs blend in?
18:22:16 <kyle_> right and even that conclusion will probably change as new applications adopt webrtc
18:22:17 <arma2> and that requires the holy grail of censorship circumvention analysis, which is knowing what internet traffic looks like everywhere
18:22:24 <kyle_> haha
18:22:31 <arma2> yep. temporal issues are key too, you're right.
18:23:51 <phw> ok, let's move on with our agenda. thanks for showing up kyle_!
18:24:46 <phw> next up is our default-cc list on trac. i pasted our current configuration to our meeting pad. basically, cohosh and i are copied on all circumvention-related tickets, and dcf and arlolra are copied on a subset
18:24:55 <phw> please let me know if you want me to add/remove you from any of these
18:24:56 <kyle_> of course, thanks for the discussion, ill try and come for all subsequent meetings
18:26:27 <phw> also, it's the end of the month again and i'll soon be working on our monthly report
18:26:40 <phw> #action please add your monthly highlights to our report: https://pad.riseup.net/p/qSHmkenLXX5yE0pW3_2y
18:27:32 <phw> finally, the ndss'20 proceedings are out, with a ton of censorship-related papers. also, the iclab paper got into oakland and is finally out
18:28:04 <dcf1> more things to read :)
18:28:23 * phw is looking forward to net4people/bbs reading groups
18:28:26 <dcf1> "Detecting Probe-resistant Proxies" is partially about an active-probing vulnerability in obfs4proxy (patched now)
18:28:33 <arma2> amir's mass browser thing is in ndss too right?
18:29:03 <phw> arma2: it may be the thing that's called swarmproxy
18:29:14 <arma2> ah ha. ok. i guess he needs a new name for each new paper. :)
18:29:30 <arma2> if anybody here wanted to write a book report style summary of those papers, e.g. for the tor blog, the world would love you for it
18:29:46 <arma2> (or a subset of them :)
18:30:15 <dcf1> For the Russia one there's https://github.com/net4people/bbs/issues/20
18:30:34 <cohosh> i wish rss was still a thing
18:30:42 <dcf1> I haven't read any of the others yet
18:31:43 <phw> let's move on to our 'needs help with' sections
18:31:51 <dcf1> I don't think github has it, but ntc.party has RSS for topics: https://ntc.party/c/censorship-research-publications/22.rss
18:32:18 <cohosh> dcf1: ah nice
18:32:32 <phw> agix: is there anything we can help with related to your bridgedb work? or related to anything else, really?
18:34:17 <agix> not yet really. I have quite some reading up to do, but I am getting there :) I guess by the next meeting I will have some specific questions
18:34:20 <phw> cohosh: do you mind reviewing #31872 and #31427 for me?
18:34:44 <cohosh> phw: sure
18:34:46 <phw> agix: sounds good! you can reach all of us here on irc in the meanwhile, so you don't need to wait until next week's meeting
18:35:02 <cohosh> phw: can you review #33002
18:35:08 <agix> phw: cool thanks!
18:35:22 <cohosh> dcf1: i can also take a look at #33038
18:35:45 <cohosh> i usually try to get to those quickly, idk how that one slipped my attention
18:35:56 <dcf1> cohosh: thanks, no problem
18:36:02 <phw> cohosh: yup, added to my review pile
18:36:23 <cohosh> dcf1: don't worry about pinging me for reviews on #tor-dev if you want them more quickly for your work
18:37:23 <phw> cjb: for #31011, a network-team person would probably be best for a review
18:37:40 <cjb> yep, agree
18:38:21 <phw> i think we're done with reviews. anything else for today?
18:38:30 <arlolra> I started on #19026, hope that's ok
18:38:40 <cohosh> arlolra: sounds great!
18:38:43 <dcf1> yeah all good arlolra
18:38:50 <arlolra> ok, thanks
18:39:22 * phw waits for a minute before ending the meeting
18:39:25 <gaba> I can bring that review to the network team
18:39:26 <cjb> I have some questions about snowflake+android+golang!  It's fine for me to just ask cohosh outside the meeting if that's better.
18:39:30 * gaba is here reading
18:39:51 <cohosh> cjb: sounds good :)
18:40:17 <phw> thanks gaba
18:40:48 <cjb> cool, is DM or #tor-dev or somewhere else better?
18:41:09 <arma2> recommend saying it in the public channels (like #tor-dev) in case others want to jump in, or learn
18:41:46 <cohosh> yup #tor-dev is best
18:41:49 <cjb> sgtm, thanks
18:42:40 <phw> ok, let's end the meeting. thanks everyone for attending!
18:42:45 <phw> #endmeeting