17:00:53 <mikeperry> #startmeeting Tor Congestion Control Research
17:00:53 <MeetBot> Meeting started Wed Aug 15 17:00:53 2018 UTC. The chair is mikeperry. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:53 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
17:01:08 <komlo> taking a sec to look over the agenda
17:01:38 <mikeperry> Yeah, reminder it is at https://pad.riseup.net/p/TorCongestionControl-keep
17:02:05 <mikeperry> and it is now reloading in my window so I can't see it :)
17:03:22 <mikeperry> so anyway the first thing that I wanted to do was try to consolidate some of the folklore attacks that have been floating around wrt datagram transports
17:04:39 <iang> Note that "congestion control" is not necessarily the same as "datagram transport"
17:04:43 <mikeperry> we've had a few tech reports about various approaches, and a couple research implementations, but there is not a real comprehensive treatment of security+privacy issues with this approach
17:04:52 <arma3> i am just barely here. am sending my two additions to the research question list via email.
17:05:54 <mikeperry> iang: yes, right. and we also need to figure out if any of the folklore generalizes to all congestion control, or if some of it is inherently a property of datagrams
17:06:24 <Jaym> Just to clarify: are the concerns and issues about circuit-wide datagram transport? Or hop-by-hop?
17:06:51 <iang> some (e.g. hopper) works better however Tor gets more responsive, regardless of what's happening at the network
17:07:10 <mikeperry> and many of these things actually apply to Tor as is.
but the channels are just more noisy
17:07:20 <iang> correct
17:07:43 <mikeperry> so after we all go through this list, a key thing for later discussion will be how to measure this so that it is comparable to current Tor
17:07:51 <iang> and I don't think the answer to that is "let's keep Tor performance bad so that these attacks don't get better"
17:08:19 <iang> we should address the attacks in a better way
17:09:01 <komlo> i think it is a good point to distinguish datagram transports from congestion control, and it changes how we address concerns
17:09:20 <iang> agree
17:10:07 <mikeperry> ok sounds good
17:10:23 <nickm> A lot of the problems listed on that list are datagram-specific
17:10:54 <mikeperry> So basically I see the Part I.B questions 3-4 being related to datagram vs non-datagram congestion control
17:11:11 <komlo> ideally we should look at both, but they could be analyzed separately (which hopefully will help clarify pros/cons on either side)
17:11:53 <mikeperry> I can speak to the datagram piece. datagram and drop-signaled congestion control allows us to put bounds on the memory consumption and queues of relays -- no more OOM/Sniper attacks. Those become congestion attacks instead.
17:12:35 <nickm> IMO the problem with queues is performance, not OOM.
17:12:37 <mikeperry> also, moving away from TCP onto a more secure connection model eliminates "Man-on-the-side" connection termination vectors.. which I think may be easier to pull off against Tor than we have considered..
17:12:44 <iang> mikeperry: that
17:12:53 <iang> s for the end-to-end datagram model
17:13:05 <iang> is that the only model on the table?
In Rome, we talked about both
17:13:06 <mikeperry> yes
17:13:11 <catalyst> i agree with nickm, based on experiences troubleshooting bufferbloat-related issues
17:13:15 <iang> and arma's email just now says both
17:14:02 <mikeperry> nickm: we have seen a couple research papers now that deliberately trigger the circuit OOM killer as part of their attacks
17:14:54 <nickm> hm, fair enough
17:15:17 <nickm> But I don't think that the reason to move to datagram is to avoid that: it creates far more sidechannels than it closes.
17:15:22 <nickm> At least, so it seems
17:15:33 <nickm> I think the big reason people have been talking about datagram is for performance, right?
17:15:45 <mikeperry> iang: if you want to enumerate alternatives on the pad, that would be useful
17:16:34 <mikeperry> nickm: yes. that's also part of why I want to have this meeting. because it's clear we need to study more than just performance
17:17:02 <nickm> right
17:17:53 <Jaym> We can assume passive traffic confirmation to be easier, but we're fine with that right?
17:17:53 <mikeperry> nickm: I also think that the types of "new" side channels in datagram are forms of side channels that already exist in Tor, but are more noisy.. like from an information-theoretic perspective, a drop is a less noisy form of "pause throughput for a while"
17:18:11 <nickm> I think of these as high-bandwidth
17:18:18 <nickm> *high-bandwidth sidechannels
17:18:25 <mikeperry> and I think we should find a way to measure this bandwidth
17:19:33 <mikeperry> because if we introduce padding that is not end-to-end and unacked, the channel becomes more noisy, which means less bandwidth is available. at what padding rate does that start to approach delay-based side channels?
17:19:40 <iang> at europoakland, george danezis had a neat paper closing some of these side channels. the downside is that it only protected you against 3rd parties, not the person you're communicating with
17:20:02 <nickm> could you link that in the pad?
17:21:36 <nickm> mikeperry: so I tried to think about it, and I think there are a few ways to measure the bandwidth, and we need to think about more than one...
17:22:02 <komlo> would it be fair to say that one of the first questions that should be answered is if datagram transports can be safely used?
17:22:02 <nickm> one important one, practically speaking, is the total number of bits you can send per circuit.
17:22:11 <nickm> another would be bits per cell
17:22:14 <mikeperry> iang: that still sounds interesting.. in the end-to-end model the partners would be client and exit, with the other nodes being "third parties".. I think in some ways side channels that the exit can induce are kinda equivalent to many ways they could muck with current TCP and sendme delivery already..
17:22:17 <nickm> another would be bits per second
17:23:08 <nickm> all of those matter, i think
17:24:08 <nickm> but we're still learning here: it would not be shocking if somebody comes up with a better way to advertise these in a year or two
17:24:30 <mikeperry> komlo: I am not sure that answer can be definitive without good metrics for this stuff that can be compared to Tor as it is today
17:24:53 <mikeperry> komlo: and even then, there are tweaks to all of these designs that may change things
17:25:06 <nickm> s/advertise/measure/
17:25:11 <nickm> sorry, my brain is messed up
17:25:26 <nickm> *but we're still learning here: it would not be shocking if somebody comes up with a better way to *measure these in a year or two
17:26:30 <sysrqb> right, so should this meeting concentrate on the merits of datagram transport or congestion control?
17:26:46 <mikeperry> nickm: I see us needing to resort to noisy-channel communication theory to get those values.. and they may be a function of other activity
17:27:07 <sysrqb> it seems like it started with the latter and moved on to the former
17:27:19 <nickm> I think it started with talking about QUIC
17:27:37 <nickm> what do you say, mikeperry ?
17:27:50 <iang> has anyone here been communicating with the qatar group working on quictor specifically?
17:28:20 <catalyst> i think we should think about whether we want to look at congestion/latency performance improvements as opposed to datagram transport specifically
17:28:34 <catalyst> i.e., I.B.4 in the agenda
17:28:46 <mikeperry> iang: I believe all they had was an abstract. I think both QUUX and qutor studied replacing TLS with QUIC
17:29:04 <nickm> mikeperry: end-to-end, or just on the link?
17:29:17 <mikeperry> nickm: just the link, unfortunately.
17:29:26 <nickm> much easier to analyze, though
17:29:28 <iang> iiuc, they have (working?) code at least.
17:29:29 <nickm> but yea
17:30:37 <mikeperry> yeah, the code I saw replaced connection_or.c with quic things
17:31:41 <nickm> Do we have a list of congestion-control ideas we want to look at?
17:32:06 <Jaym> Would you be interested in an end-to-end implem? I may have some funding which would allow me to do this
17:32:08 <nickm> I'm hearing "datagram" vs "other", but I bet we all have different stuff in mind for "other"
17:32:28 <iang> stef's on a plane right now (I think? maybe just at the airport?) but her CC paper is approaching ready
17:32:56 <nickm> Jaym: that's a good question, and one we're going to have to figure out.
17:32:58 <mikeperry> and I have some proposed hacks to datagram to mitigate drop+reorder side channels
17:33:22 <iang> it keeps the hop-by-hop tcp, but uses the queues to do rate-based CC
17:33:57 <komlo> mikeperry: maybe one action item from this meeting can be to identify the metrics needed to determine if datagram transports can be used safely? (this might fit under research frameworks)
17:33:58 <nickm> interesting
17:34:03 <mikeperry> iang: is that an explicit signal sent as a cell by each hop or inferred?
17:34:33 <iang> explicit
17:35:46 <mikeperry> komlo: yeah
17:36:29 <mikeperry> one thing we should consider wrt metrics is how they behave under lab conditions vs the tor network
17:37:04 <komlo> i started an action items section at the bottom of the pad, adding this
17:37:15 <nickm> so, what's next? :)
17:37:25 <mikeperry> like how do we model noise introduced by other traffic, as well as the law of large numbers/base rate issues
17:37:30 <iang> mikeperry: which is part of a much larger task of building measurement tools representative of the live network
17:38:50 <mikeperry> like I see naive bits-per-second metrics being incomparable to ones that properly take into account various forms of noise -- organic and deliberate
17:39:27 <mikeperry> for example: how much capacity do drop-based side channels have in the end-to-end model if the adversary cannot tell which packets are unacked padding vs end-to-end acked data
17:40:55 <nickm> komlo: sorry, I can't make my bullet points match yours :/
17:40:58 <mikeperry> then there is also circuit level encryption -- can we use order preserving encryption to middle hops such that they can correct reordering or convert it into drops
17:41:27 <mikeperry> and can we also detect tagging attacks early with that
17:41:32 <iang> I don't see OPE being of benefit there?
17:42:09 <mikeperry> iang: like if there was an OPE field that conveyed ordering to middle hops, they could prevent out-of-order packets from making it to the last hop
17:43:10 <iang> couldn't you just have a per-circuit sequence number in the clear (under the hop-by-hop encryption) in that case? If all middle nodes see it, isn't that the same?
17:43:25 <Jaym> Can that be done in the reverse path?
17:43:40 <Jaym> exit->client, with encryption
17:44:19 <iang> the cleartext version of course can. the ope version would be trickier, since the exit doesn't share keys with the middle or guard
17:44:24 <nickm> we're coming up on 60 minutes.
what else do we want to answer today?
17:45:05 <Jaym> iang: the question was for the OPE field onion-encrypted
17:45:37 <mikeperry> iang: I think that cleartext sequence numbers allow the endpoint to have a cleaner signal as to which of their drops were applied to padding vs which actually were done to end-to-end traffic
17:46:11 <nickm> I think we need to have this design written down to analyze it. There are a lot of moving parts
17:46:22 <iang> the endpoint here is the exit node, which *has* to know that, since it's the one shoving the result into a TCP stream.
17:46:25 <nickm> (same goes for most designs)
17:47:57 <komlo> nickm: +1
17:47:57 <mikeperry> so yeah, it seems that one action item is to enumerate the designs. naive QUIC; padded QUIC; padded+OPE QUIC; semi-reliable hop-by-hop (ian's I.A.5.a); and ECN-under-current-Tor
17:48:47 <iang> didn't sjmurdoch long ago have a tech report listing a whole lot of such options?
17:49:03 <iang> of course it's somewhat outdated
17:49:11 <nickm> I think that's from back in the svn days; it should still be around
17:49:33 <mikeperry> iang: but if the sequence numbers are visible to all hops, if I'm the guard and I deliberately drop packet N, the exit sees that packet N was a sequence number that made it to the end and was dropped.. but if N is not visible to me at the guard, I don't know if my drop survived at the exit, or was actually an organic drop
17:50:25 <mikeperry> iang: yah, it examined basically our options for datagram transports using then-existing userland congestion control, but did not have any of this side channel analysis
17:50:28 <iang> the guard still knows it's the nth packet on the circuit. the difference isn't that big, I'd think?
17:50:42 <iang> right.
the side-channel analysis is key here
17:51:19 <tvdm> murdoch report from November 2011, I think (can link in pad)
17:51:31 <nickm> yes please
17:52:34 <mikeperry> iang: if some percentage of packets are being dropped at the middle, there will be a probability that your attempted drop is a drop of a padding cell.. there will also be a probability of organic drops also being present
17:52:43 <arma3> both of steven's tech reports are on https://research.torproject.org/techreports.html
17:52:52 <mikeperry> so you basically lose bits in your side channel and have to perform some kind of error correction
17:53:06 <arma3> (and by both apparently i mean all five)
17:53:33 <iang> mikeperry: sure. I don't see that as hurting the attack, though, I guess.
17:54:18 <iang> since the attacker is basically trying to get ~1 bit through.
17:54:49 <mikeperry> iang: I think there is a point where the cost of doing that error correction starts to make the side channel comparable to existing delay-based side channels in Tor today
17:55:41 <mikeperry> which will depend on the padding rate and the organic drop rate
17:55:44 <iang> that's true. (Though it's not like we're really comfortable with that existing channel either.)
17:56:14 <nickm> +1
17:56:26 * iang needs to be at his next meeting. I'll keep the window open and scroll back after if this meeting continues.
17:57:16 <mikeperry> ok I think as far as action items, we have: 1. create a better sketch of design alternatives
17:57:51 <mikeperry> 2. Try to break our questions into research topics
17:59:14 <mikeperry> wrt the delay-based side channel -- is there anything we can do to mitigate it? and if we could, why would that not also apply to the end-to-end datagram case?
18:00:39 <nickm> good question.
18:00:52 <komlo> mikeperry: i think with each design, enumerating the assumptions and potential safety tradeoffs will help determine what mitigations/further work are necessary?
18:01:10 <nickm> I would be happy to try to write up what we've been calling the "side channel" issues here, though they extend far beyond side channels
18:01:47 <nickm> I'd also like to write up all the similar issues in current Tor; especially if mikeperry can help me enumerate them.
18:02:00 <Jaym> yes! more precision about what you consider side-channels and what you do not would be neat
18:02:14 <mikeperry> I am a bit confused by the difference between "side channel" and I.A.1'/extended issues
18:03:01 <nickm> IMO a side channel is when you can send more information in the protocol than is intended.
18:03:02 <mikeperry> komlo: +1
18:03:05 <arma3> bram, back in the day, really thought we should use libutp style delay-based congestion control, rather than drop-based, and rather than the weird thing we do now. steven tried to set up a real experiment, but couldn't get stuff working.
18:03:17 <mikeperry> nickm: +1 on writing up/enumerating similar issues in Tor
18:03:22 <komlo> +1 to doing a cross-comparison between designs. also addressing which threats should be mitigated/are outside of scope
18:03:46 <nickm> I.A.1' is something different: we aren't just sending information, we're provoking different behavior by a communications partner
18:04:06 <arma3> "active side channel"? maybe there is a better name
18:05:32 <mikeperry> my preference is for us to arrive at a metrics framework under which these things are all information leaks that can be comparable in terms of the amount of information provided to the adversary
18:06:11 <nickm> That's desirable, but only to the extent that it reflects some ground truth
18:07:41 <nickm> I can try to do the writeup on attacks next week, if that's worthwhile
18:08:22 <Jaym> That seems useful to me
18:08:31 <tvdm> Agreed..
18:08:57 <komlo> (signing off but will read scrollback later)
18:08:58 <mikeperry> yeah
18:09:02 <nickm> can somebody else take on coming up with a list (and breakdown) of existing designs in this field?
18:09:15 <mikeperry> I can do this
18:09:29 <nickm> mikeperry: also can you send me a list of the attacks in current tor that you're thinking of some time this week?
18:09:38 <nickm> or before monday?
18:09:38 <mikeperry> I'll also try to enumerate some research-paper sized topics that we have so far
18:11:03 <mikeperry> ok
18:11:46 <mikeperry> so I will take a stab at C, F, and G on the action items list
18:12:01 <mikeperry> and then send that around, perhaps on a fresh pad.
18:12:52 <mikeperry> and then from there we can ponder if we want to discuss metrics, research topics, or frameworks in the next meeting in a few weeks
18:13:20 <mikeperry> maybe waiting for a preprint of stephanie and ian's paper for further comparison
18:14:18 <mikeperry> I think we're good for now, then?
18:14:24 * Samdney is watching and reading the backlog ...
18:14:26 <nickm> I think!
18:14:45 <nickm> seems like a good start
18:14:51 <nickm> ready to #endmeeting?
18:15:04 <mikeperry> yep!
18:15:08 <mikeperry> #endmeeting