17:59:13 #startmeeting anti-censorship meeting
17:59:13 Meeting started Thu Jan 30 17:59:13 2020 UTC. The chair is phw. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:59:13 Useful Commands: #action #agreed #help #info #idea #link #topic.
17:59:26 here's our meeting pad: https://pad.riseup.net/p/tor-anti-censorship-keep
17:59:29 sorry, still doing some IRC debugging
18:00:07 dcf1: let us know if you need help with the irc debugging :)
18:00:19 kyle_ must be the one we were trying to get connected
18:00:22 agix: feel free to add yourself to our meeting pad. we use it to keep track of what we did, what we plan to do, and what we need help with
18:00:31 hello
18:00:47 kyle_: here's the meeting agenda: https://pad.riseup.net/p/tor-anti-censorship-keep
18:01:12 alright thanks :)
18:01:15 Great! thanks
18:01:58 i would say let's get started by talking about kyle_'s thesis :)
18:02:07 sure thing
18:02:38 so for some background, im a senior undergrad at princeton working on an evaluation of snowflake
18:03:25 we're currently looking at the fingerprintability of snowflake with respect to other applications that use WebRTC to connect
18:04:14 right now im in the process of collecting enough data to then build a classifier
18:04:33 kyle_: who's your advisor at princeton?
18:04:38 good stuff kyle_ !
18:04:39 prateek mittal
18:04:45 ah, gotcha
18:05:19 that's interesting, if you have a project link or anything written up so far (or when you're ready), you can add a link to https://trac.torproject.org/projects/tor/wiki/doc/Snowflake/Fingerprinting, which you probably already have seen
18:05:39 kyle_: what can we do to help your research project?
18:06:07 so i was hoping to get your advice on a few things
18:06:50 right now ive been thinking about building the classifier using traffic from the snowflake connection and traffic from a Facebook Messenger/Google Hangouts call
18:07:24 initially to just create a binary classifier (Snowflake vs. not-Snowflake)
18:07:52 would you agree with this approach and do you have any suggestions?
18:08:19 it might be good to see if there are more WebRTC applications out there
18:08:32 maybe this is helpful: https://trac.torproject.org/projects/tor/ticket/30579#comment:15
18:08:47 it's a list of STUN servers, that may or may not be used for WebRTC or similar applications
18:08:56 I think it's a good place to start, it's good to have something implemented to compare against
18:09:20 +cohosh: thanks! ill take a look there
18:09:34 specifically, this is a video conferencing thing that might be used in China: http://www.gotye.com.cn/
18:09:36 One challenge of this kind of work is that it's not clear that Facebook Messenger/Google Hangouts is the "right" thing to compare against; i.e., we don't really know quantitatively how much of a fraction of WebRTC traffic those make up.
18:10:11 But it's fine to proceed with the assumption that those are the things you want to compare against, and build a classifier.
18:10:14 right. i would expect that snowflake will look different from google hangouts, in a number of obvious ways. i guess the question for the snowflake folks is: is that traffic profile ultimately what we're aiming to look like, or are we instead hoping that there are a variety of webrtc implementations and we'll hide in the long tail of the noise
18:10:54 it's fine if the answer is "either of those approaches would be great -- whichever one works"
18:10:57 google and facebook are also quite frequently blocked in a lot of places
18:10:58 some dpi vendors publish regular reports about what fraction of observed traffic a given protocol constitutes. if we're lucky, it can tell us the relative popularity of some webrtc-based applications.
18:11:03 right. because what the censor ultimately cares about is not "Snowflake or not-Snowflake", but "can I afford to block this connection, given my best guess about it."
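
The binary classifier discussed above (Snowflake vs. not-Snowflake) can be sketched as a nearest-centroid classifier over per-flow features. This is only a minimal illustration, not kyle_'s method: the two features, the 200-byte and 1500-byte constants, and every packet size below are invented for the example and are not measurements of Snowflake, Facebook Messenger, or Google Hangouts traffic.

```python
from statistics import mean

# Assumed constants, chosen only for illustration.
SMALL_PACKET_BYTES = 200   # hypothetical cutoff for a "small" packet
MTU_BYTES = 1500           # used to scale mean packet size into roughly 0..1

def flow_features(packet_sizes):
    """Summarize one flow as (scaled mean packet size, fraction of small packets)."""
    small = sum(1 for size in packet_sizes if size < SMALL_PACKET_BYTES)
    return (mean(packet_sizes) / MTU_BYTES, small / len(packet_sizes))

def centroid(flows):
    """Average the feature vectors of a list of flows."""
    feats = [flow_features(flow) for flow in flows]
    return tuple(mean(column) for column in zip(*feats))

def train(snowflake_flows, other_flows):
    """Build a two-class model: one centroid per label."""
    return {
        "snowflake": centroid(snowflake_flows),
        "not-snowflake": centroid(other_flows),
    }

def classify(model, packet_sizes):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    x = flow_features(packet_sizes)

    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(model, key=lambda label: dist2(model[label]))

# Synthetic packet-size traces (bytes); purely made-up toy data.
toy_snowflake = [[543, 543, 1100, 543], [543, 600, 543, 1050]]
toy_other = [[120, 160, 1200, 140], [100, 180, 150, 1300]]
model = train(toy_snowflake, toy_other)
```

With real captures, the feature set and the comparison applications would matter far more than the choice of classifier, which is the point raised in the discussion about what the "right" baseline is.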
18:11:55 it would be fun to have a pluggable transport that actually aims to look like an expected protocol. rather than one that is distinguishable but we argue that the protocol is so complicated, in real life deployment, that you can't afford to block weird implementations of it.
18:12:07 I wonder whether there's any other popular user of datachannels, and whether snowflake traffic looks like that
18:12:32 Steam supposedly uses data channel for game networking
18:12:33 right. kyle_, if you have an angle on those types of questions too, that would be a valuable research contribution.
18:13:23 https://github.com/ValveSoftware/GameNetworkingSockets
18:13:51 hmm maybe that's not webrtc though but something else that uses STUN
18:14:06 yeah these are all things that ill have to consider as i continue my work
18:14:13 kyle_: for another data point about the distant past: uproxy, the google ideas circumvention idea, originally planned to use webrtc as transport, but they wanted to use FTE to transform the webrtc traffic, "because what if somebody blocks webrtc?"
18:14:14 so it might be good to allow people to think about this and continue with some more discussion over email or something
18:14:51 kyle_: is it okay if we contact you (don't need to post contact details here and now, the channel is logged if that's a concern for you)
18:15:00 or can we start an email thread (public or private)?
18:15:05 yeah, absolutely!
18:15:21 or, this would be on-topic for public discussion forum like https://github.com/net4people/bbs or https://ntc.party/
18:16:10 i could post there as well
18:16:23 the anti-censorship-team@ list is also public and a fine venue, especially for tor (or snowflake) specific topics
18:16:42 depends what audience you want to reach :)
18:16:43 I wanted to give you a chance to meet the people involved with Snowflake. (kyle_ got a referral to me, but it's not just me working on it)
18:17:04 https://lists.torproject.org/cgi-bin/mailman/listinfo/anti-censorship-team
18:17:07 http://lists.torproject.org/pipermail/anti-censorship-team/
18:17:10 thanks for sharing kyle_, i'm excited to see what you come up with
18:17:21 thanks, it will definitely be helpful to bounce ideas and discuss things with everyone
18:17:27 thanks for having me
18:17:40 ok kyle_ I guess you and I can be in touch and discuss how you want to do communication as you work on the project
18:17:49 perfect sounds good!
18:17:55 thanks dcf1 and kyle_
18:17:56 kyle_: my guess is that you won't even need a fancy machine learning classifier to distinguish snowflake from google hangouts. though using the ml gadget might be faster and simpler.
18:18:43 ooh, one other idea quick, you could try comparing against https://webtorrent.io/, that's more likely to use DataChannels and I hear it's popular.
18:19:11 kyle_: yeah. one of the huge contributions you could make is trying to get a handle on the landscape of what uses webrtc in practice, and how, and where.
18:19:58 i think ill definitely have to start looking into webrtc more now
18:19:59 whereas the censor, in each situation, just needs to understand use of webrtc on *their* network
18:20:25 so it might be that snowflake stands out more on the internet connection from the refugee camp in sudan, than it does on china's gfw
18:20:43 i see what you mean
18:21:40 so in an ideal world, the research conclusion would not be black or white, "snowflake works or it doesn't", but it would be more nuanced: *where* in the world is snowflake more likely to stand out vs blend in?
18:22:16 right and even that conclusion will probably change as new applications adopt webrtc
18:22:17 and that requires the holy grail of censorship circumvention analysis, which is knowing what internet traffic looks like everywhere
18:22:24 haha
18:22:31 yep. temporal issues are key too, you're right.
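
The point that Snowflake may stand out on one network and blend in on another can be made concrete with Bayes' rule: for the same classifier, the share of flagged flows that really are Snowflake depends on how common Snowflake is among the WebRTC flows on that particular network. The sketch below is an illustration only; the function name and all rates are assumptions, not measured values.

```python
def flagged_precision(base_rate, true_positive_rate, false_positive_rate):
    """P(flow is Snowflake | classifier flags it), computed with Bayes' rule.

    base_rate: assumed fraction of WebRTC flows on the network that are Snowflake.
    true_positive_rate: assumed fraction of Snowflake flows the classifier flags.
    false_positive_rate: assumed fraction of other flows the classifier flags.
    """
    flagged_true = base_rate * true_positive_rate
    flagged_false = (1 - base_rate) * false_positive_rate
    return flagged_true / (flagged_true + flagged_false)

# Hypothetical classifier: 99% true positives, 1% false positives.
# Network where half of all WebRTC flows are Snowflake:
p_sparse_webrtc = flagged_precision(0.5, 0.99, 0.01)
# Network where only 1 in 1000 WebRTC flows is Snowflake:
p_busy_webrtc = flagged_precision(0.001, 0.99, 0.01)
```

Under these assumed rates, nearly every flagged flow is Snowflake on the first network, while on the second roughly nine in ten flagged flows are something else. That second case is the collateral damage behind the "can I afford to block this connection" question.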
18:23:51 ok, let's move on with our agenda. thanks for showing up kyle_!
18:24:46 next up is our default-cc list on trac. i pasted our current configuration to our meeting pad. basically, cohosh and i are copied on all circumvention-related tickets, and dcf and arlolra are copied on a subset
18:24:55 please let me know if you want me to add/remove you from any of these
18:24:56 of course, thanks for the discussion, ill try and come for all subsequent meetings
18:26:27 also, it's the end of the month again and i'll soon be working on our monthly report
18:26:40 #action please add your monthly highlights to our report: https://pad.riseup.net/p/qSHmkenLXX5yE0pW3_2y
18:27:32 finally, the ndss'20 proceedings are out, with a ton of censorship-related papers. also, the iclab paper got into oakland and is finally out
18:28:04 more things to read :)
18:28:23 * phw is looking forward to net4people/bbs reading groups
18:28:26 "Detecting Probe-resistant Proxies" is partially about an active-probing vulnerability in obfs4proxy (patched now)
18:28:33 amir's mass browser thing is in ndss too right?
18:29:03 arma2: it may be the thing that's called swarmproxy
18:29:14 ah ha. ok. i guess he needs a new name for each new paper. :)
18:29:30 if anybody here wanted to write a book report style summary of those papers, e.g. for the tor blog, the world would love you for it
18:29:46 (or a subset of them :)
18:30:15 For the Russia one there's https://github.com/net4people/bbs/issues/20
18:30:34 i wish rss was still a thing
18:30:42 I haven't read any of the others yet
18:31:43 let's move on to our 'needs help with' sections
18:31:51 I don't think github has, but ntc.party has RSS for topics: https://ntc.party/c/censorship-research-publications/22.rss
18:32:18 dcf1: ah nice
18:32:32 agix: is there anything we can help with related to your bridgedb work? or related to anything else, really?
18:34:17 not yet really. I have quite some reading up to do, but I am getting there :) I guess till the next meeting I will have some specific questions
18:34:20 cohosh: do you mind reviewing #31872 and #31427 for me?
18:34:44 phw: sure
18:34:46 agix: sounds good! you can reach all of us here on irc in the meanwhile, so you don't need to wait until next week's meeting
18:35:02 phw: can you review #33002
18:35:08 phw: cool thanks!
18:35:22 dcf1: i can also take a look at #33038
18:35:45 i usually try to get to those quickly, idk how that one slipped my attention
18:35:56 cohosh: thanks, no problem
18:36:02 cohosh: yup, added to my review pile
18:36:23 dcf1: don't worry about pinging me for reviews on #tor-dev if you want them more quickly for your work
18:37:23 cjb: for #31011, a network-team person would probably be best for a review
18:37:40 yep, agree
18:38:21 i think we're done with reviews. anything else for today?
18:38:30 I started on #19026, hope that's ok
18:38:40 arlolra: sounds great!
18:38:43 yeah all good arlolra
18:38:50 ok, thanks
18:39:22 * phw waits for a minute before ending the meeting
18:39:25 I can bring that review to the network team
18:39:26 I have some questions about snowflake+android+golang! It's fine for me to just ask cohosh outside the meeting if that's better.
18:39:30 * gaba is here reading
18:39:51 cjb: sounds good :)
18:40:17 thanks gaba
18:40:48 cool, is DM or #tor-dev or somewhere else better?
18:41:09 recommend saying it in the public channels (like #tor-dev) in case others want to jump in, or learn
18:41:46 yup #tor-dev is best
18:41:49 sgtm, thanks
18:42:40 ok, let's end the meeting. thanks everyone for attending!
18:42:45 #endmeeting