17:00:53 <mikeperry> #startmeeting Tor Congestion Control Research
17:00:53 <MeetBot> Meeting started Wed Aug 15 17:00:53 2018 UTC.  The chair is mikeperry. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:53 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
17:01:08 <komlo> taking a sec to look over the agenda
17:01:38 <mikeperry> Yeah, reminder it is at https://pad.riseup.net/p/TorCongestionControl-keep
17:02:05 <mikeperry> and it is now reloading in my window so I can't see it :)
17:03:22 <mikeperry> so anyway the first thing that I wanted to do was try to consolidate some of the folklore attacks that have been floating around wrt datagram transports
17:04:39 <iang> Note that "congestion control" is not necessarily the same as "datagram transport"
17:04:43 <mikeperry> we've had a few tech reports about various approaches, and a couple research implementations, but there is not a real comprehensive treatment of security+privacy issues with this approach
17:04:52 <arma3> i am just barely here. am sending my two additions to the research question list via email.
17:05:54 <mikeperry> iang: yes, right. and we also need to figure out if any of the folklore generalizes to all congestion control, or if some of it is inherently a property of datagrams
17:06:24 <Jaym> Just to clarify: are the concerns and issues about circuit-wide datagram transport? Or hop-by-hop?
17:06:51 <iang> some (e.g. hopper) work better no matter how Tor gets more responsive, regardless of what's happening at the network
17:07:10 <mikeperry> and many of these things actually apply to Tor as is. but the channels are just more noisy
17:07:20 <iang> correct
17:07:43 <mikeperry> so after we all go through this list, a key thing for later discussion will be how to measure this so that it is comparable to current Tor
17:07:51 <iang> and I don't think the answer to that is "let's keep Tor performance bad so that these attacks don't get better"
17:08:19 <iang> we should address the attacks in a better way
17:09:01 <komlo> i think it is a good point to distinguish datagram transports from congestion control, and it changes how we address concerns
17:09:20 <iang> agree
17:10:07 <mikeperry> ok sounds good
17:10:23 <nickm> A lot of the problems listed on that list are datagram-specific
17:10:54 <mikeperry> So basically I see the Part I.B questions 3-4 being related to datagram vs non-datagram congestion control
17:11:11 <komlo> ideally we should look at both, but they could be analyzed separately (which hopefully will help clarify pros/cons on either side)
17:11:53 <mikeperry> I can speak to the datagram piece. datagram and drop-signaled congestion control allow us to put bounds on the memory consumption and queues of relays -- no more OOM/Sniper attacks. Those become congestion attacks instead.
17:12:35 <nickm> IMO the problem with queues is performance, not OOM.
17:12:37 <mikeperry> also, moving away from TCP onto a more secure connection model eliminates "Man-on-the-side" connection termination vectors.. which I think may be easier to pull off against Tor than we have considered..
17:12:44 <iang> mikeperry: that's for the end-to-end datagram model
17:13:05 <iang> is that the only model on the table?  In Rome, we talked about both
17:13:06 <mikeperry> yes
17:13:11 <catalyst> i agree with nickm, based on experiences troubleshooting bufferbloat-related issues
17:13:15 <iang> and arma's email just now says both
17:14:02 <mikeperry> nickm: we have seen a couple research papers now that deliberately trigger the circuit OOM killer as part of their attacks
17:14:54 <nickm> hm, fair enough
17:15:17 <nickm> But I don't think that the reason to move to datagram is to avoid that: it creates far more sidechannels than it closes.
17:15:22 <nickm> At least, so it seems
17:15:33 <nickm> I think the big reason people have been talking about datagram is for performance, right?
17:15:45 <mikeperry> iang: if you want to enumerate alternatives on the pad, that would be useful
17:16:34 <mikeperry> nickm: yes. that's also part of why I want to have this meeting. because it's clear we need to study more than just performance
17:17:02 <nickm> right
17:17:53 <Jaym> We can assume passive traffic confirmation to be easier, but we're fine with that right?
17:17:53 <mikeperry> nickm: I also think that the types of "new" side channels in datagram are forms of side channels that already exist in Tor, but are more noisy.. like from an information-theoretic perspective, a drop is a less noisy form of "pause throughput for a while"
17:18:11 <nickm> I think of these as high-bandwidth
17:18:18 <nickm> *high-bandwidth sidechannels
17:18:25 <mikeperry> and I think we should find a way to measure this bandwidth
17:19:33 <mikeperry> because if we introduce padding that is not end-to-end and unacked, the channel becomes more noisy, which means less bandwidth is available. at what padding rate does that start to approach delay-based side channels?
17:19:40 <iang> at europoakland, george danezis had a neat paper closing some of these side channels.  the downside is that it only protected you against 3rd parties, not the person you're communicating with
17:20:02 <nickm> could you link that in the pad?
17:21:36 <nickm> mikeperry: so I tried to think about it, and I think there are a few ways to measure the bandwidth, and we need to think about more than one...
17:22:02 <komlo> would it be fair to say that one of the first questions that should be answered is if datagram transports can be safely used?
17:22:02 <nickm> one important one, practically speaking, is the total number of bits you can send per circuit.
17:22:11 <nickm> another would be bits per cell
17:22:14 <mikeperry> iang: that still sounds interesting.. in the end-to-end model the partners would be client and exit, with the other nodes being "third parties".. I think in some ways side channels that the exit can induce are kinda equivalent to many ways they could muck with current TCP and sendme delivery already..
17:22:17 <nickm> another would be bits per second
17:23:08 <nickm> all of those matter, i think
17:24:08 <nickm> but we're still learning here: it would not be shocking if somebody comes up with a better way to advertise these in a year or two
17:24:30 <mikeperry> komlo: I am not sure that answer can be definitive without good metrics for this stuff that can be compared to Tor as it is today
17:24:53 <mikeperry> komlo: and even then, there are tweaks to all of these designs that may change things
17:25:06 <nickm> s/advertise/measure/
17:25:11 <nickm> sorry, my brain is messed up
17:25:26 <nickm> *but we're still learning here: it would not be shocking if somebody comes up with a better way to *measure these in a year or two
17:26:30 <sysrqb> right, so should this meeting concentrate on the merits of datagram transport or congestion control?
17:26:46 <mikeperry> nickm: I see us needing to resort to noisy-channel communication theory to get those values.. and they may be a function of other activity
17:27:07 <sysrqb> it seems like it started with the latter and moved on to the former
17:27:19 <nickm> I think it started with talking about QUIC
17:27:37 <nickm> what do you say, mikeperry ?
17:27:50 <iang> has anyone here been communicating with the qatar group working on quictor specifically?
17:28:20 <catalyst> i think we should think about whether we want to look at congestion/latency performance improvements as opposed to datagram transport specifically
17:28:34 <catalyst> i.e., I.B.4 in the agenda
17:28:46 <mikeperry> iang: I believe all they had was an abstract. I think both QUUX and qutor studied replacing TLS with QUIC
17:29:04 <nickm> mikeperry: end-to-end, or just on the link?
17:29:17 <mikeperry> nickm: just the link, unfortunately.
17:29:26 <nickm> much easier to analyze, though
17:29:28 <iang> iiuc, they have (working?) code at least.
17:29:29 <nickm> but yea
17:30:37 <mikeperry> yeah, the code I saw replaced connection_or.c with quic things
17:31:41 <nickm> Do we have a list of congestion-control ideas we want to look at?
17:32:06 <Jaym> Would you be interested in an end-to-end implem? I may have some funding which would allow me to do this
17:32:08 <nickm> I'm hearing "datagram" vs "other", but I bet we all have different stuff in mind for "other"
17:32:28 <iang> stef's on a plane right now (I think? maybe just at the airport?) but her CC paper is approaching ready
17:32:56 <nickm> Jaym: that's a good question, and one we're going to have to figure out.
17:32:58 <mikeperry> and I have some proposed hacks to datagram to mitigate drop+reorder side channels
17:33:22 <iang> it keeps the hop-by-hop tcp, but uses the queues to do rate-based CC
17:33:57 <komlo> mikeperry: maybe one action item from this meeting can be to identify the metrics needed to determine if datagram transports can be used safely? (this might fit under research frameworks)
17:33:58 <nickm> interesting
17:34:03 <mikeperry> iang: is that an explicit signal sent as a cell by each hop or inferred?
17:34:33 <iang> explicit
17:35:46 <mikeperry> komlo: yeah
17:36:29 <mikeperry> one thing we should consider wrt metrics is how they behave under lab conditions vs the tor network
17:37:04 <komlo> i started an action items section at the bottom of the pad, adding this
17:37:15 <nickm> so, what's next? :)
17:37:25 <mikeperry> like how do we model noise introduced by other traffic, as well as the law of large numbers/base rate issues
17:37:30 <iang> mikeperry: which is part of a much larger task of building measurement tools representative of the live network
17:38:50 <mikeperry> like I see naive bits-per-second metrics being incomparable to ones that properly take into account various forms of noise -- organic and deliberate
17:39:27 <mikeperry> for example: how much capacity do drop-based side channels have in the end-to-end model if the adversary cannot tell which packets are unacked padding vs end-to-end acked data
17:40:55 <nickm> komlo: sorry, I can't make my bullet points match yours :/
17:40:58 <mikeperry> then there is also circuit level encryption -- can we use order preserving encryption to middle hops such that they can correct reordering or convert it into drops
17:41:27 <mikeperry> and can we also detect tagging attacks early with that
17:41:32 <iang> I don't see OPE being of benefit there?
17:42:09 <mikeperry> iang: like if there was an OPE field that conveyed ordering to middle hops, they could prevent out-of-order packets from making it to the last hop
17:43:10 <iang> couldn't you just have a per-circuit sequence number in the clear (under the hop-by-hop encryption) in that case?  If all middle nodes see it, isn't that the same?
17:43:25 <Jaym> Can that be done in the reverse path?
17:43:40 <Jaym> exit->client, with encryption
17:44:19 <iang> the cleartext version of course can.  the ope version would be trickier, since the exit doesn't share keys with the middle or guard
17:44:24 <nickm> we're coming up on 60 minutes. what else do we want to answer today?
17:45:05 <Jaym> iang: the question was for the OPE field onion-encrypted
17:45:37 <mikeperry> iang: I think that cleartext sequence numbers allow the endpoint to have a cleaner signal as to which of their drops were applied to padding vs which actually were done to end-to-end traffic
17:46:11 <nickm> I think we need to have this design written down to analyze it. There are a lot of moving parts
17:46:22 <iang> the endpoint here is the exit node, which *has* to know that, since it's the one shoving the result into a TCP stream.
17:46:25 <nickm> (same goes for most designs)
17:47:57 <komlo> nickm: +1
17:47:57 <mikeperry> so yeah, it seems that one action item is to enumerate the designs. naive QUIC; padded QUIC; padded+OPE QUIC; semi-reliable hop-by-hop (ian's I.A.5.a); and ECN-under-current-Tor
17:48:47 <iang> didn't sjmurdoch long ago have a tech report listing a whole lot of such options?
17:49:03 <iang> of course it's somewhat outdated
17:49:11 <nickm> I think that's from back in the svn days; it should still be around
17:49:33 <mikeperry> iang: but if the sequence numbers are visible to all hops, if I'm the guard and I deliberately drop packet N, the exit sees that packet N was a sequence number that made it to the end and was dropped.. but if N is not visible to me at the guard, I don't know if my drop survived at the exit, or was actually an organic drop
17:50:25 <mikeperry> iang: yah, it examined basically our options for datagram transports using then-existing userland congestion control, but did not have any of this side channel analysis
17:50:28 <iang> the guard still knows it's the nth packet on the circuit.  the difference isn't that big, I'd think?
17:50:42 <iang> right.  the side-channel analysis is key here
17:51:19 <tvdm> murdoch report from November 2011, I think (can link in pad)
17:51:31 <nickm> yes please
17:52:34 <mikeperry> iang: if some percentage of packets are being dropped at the middle, there will be a probability that your attempted drop is a drop of a padding cell.. there will also be a probability of organic drops also being present
17:52:43 <arma3> both of steven's tech reports are on https://research.torproject.org/techreports.html
17:52:52 <mikeperry> so you basically lose bits in your side channel and have to perform some kind of error correction
17:53:06 <arma3> (and by both apparently i mean all five)
17:53:33 <iang> mikeperry: sure.  I don't see that as hurting the attack, though, I guess.
17:54:18 <iang> since the attacker is basically trying to get ~1 bit through.
17:54:49 <mikeperry> iang: I think there is a point where the cost of doing that error correction starts to make the side channel comparable to existing delay-based side channels in Tor today
17:55:41 <mikeperry> which will depend on the padding rate and the organic drop rate
17:55:44 <iang> that's true.  (Though it's not like we're really comfortable with that existing channel either.)
17:56:14 <nickm> +1
17:56:26 * iang needs to be at his next meeting.  I'll keep the window open and scroll back after if this meeting continues.
17:57:16 <mikeperry> ok I think as far as action items, we have: 1. create a better sketch of design alternatives
17:57:51 <mikeperry> 2. Try to break our questions into research topics
17:59:14 <mikeperry> wrt the delay-based side channel -- is there anything we can do to mitigate it? and if we could, why would that not also apply to the end-to-end datagram case?
18:00:39 <nickm> good question.
18:00:52 <komlo> mikeperry: i think with each design, enumerating the assumptions and potential safety tradeoffs will help determine what mitigations/further work are necessary?
18:01:10 <nickm> I would be happy to try to write up what we've been calling the "side channel" issues here, though they extend far beyond side channels
18:01:47 <nickm> I'd also like to write up all the similar issues in current Tor; especially if mikeperry can help me enumerate them.
18:02:00 <Jaym> yes! more precision about what you consider side channels and what you do not would be neat
18:02:14 <mikeperry> I am a bit confused by the difference between "side channel" and I.A.1'/extended issues
18:03:01 <nickm> IMO a side channel is when you can send more information in the protocol than is intended.
18:03:02 <mikeperry> komlo: +1
18:03:05 <arma3> bram, back in the day, really thought we should use libutp style delay-based congestion control, rather than drop-based, and rather than the weird thing we do now. steven tried to set up a real experiment, but couldn't get stuff working.
18:03:17 <mikeperry> nickm: +1 on writing up/enumerating similar issues in Tor
18:03:22 <komlo> +1 to doing a cross-comparison between designs. also addressing which threats should be mitigated/are outside of scope
18:03:46 <nickm> I.A.1' is something different: we aren't just sending information, we're provoking different behavior by a communications partner
18:04:06 <arma3> "active side channel"? maybe there is a better name
18:05:32 <mikeperry> my preference is for us to arrive at a metrics framework under which these things are all information leaks that are comparable in terms of the amount of information provided to the adversary
18:06:11 <nickm> That's desirable, but only to the extent that it reflects some ground truth
18:07:41 <nickm> I can try to do the writeup on attacks next week, if that's worthwhile
18:08:22 <Jaym> That seems useful to me
18:08:31 <tvdm> Agreed..
18:08:57 <komlo> (signing off but will read scrollback later)
18:08:58 <mikeperry> yeah
18:09:02 <nickm> can somebody else take on coming up with a list (and breakdown) of existing designs in this field?
18:09:15 <mikeperry> I can do this
18:09:29 <nickm> mikeperry: also can you send me a list of the attacks in current tor that you're thinking of some time this week?
18:09:38 <nickm> or before monday?
18:09:38 <mikeperry> I'll also try to enumerate some research-paper sized topics that we have so far
18:11:03 <mikeperry> ok
18:11:46 <mikeperry> so I will take a stab at C, F, and G on the action items list
18:12:01 <mikeperry> and then send that around, perhaps on a fresh pad.
18:12:52 <mikeperry> and then from there we can ponder if we want to discuss metrics, research topics, or frameworks in the next meeting in a few weeks
18:13:20 <mikeperry> maybe waiting for a preprint of stephanie and ian's paper for further comparison
18:14:18 <mikeperry> I think we're good for now, then?
18:14:24 * Samdney is watching and reading the backlog ...
18:14:26 <nickm> I think!
18:14:45 <nickm> seems like a good start
18:14:51 <nickm> ready to #endmeeting?
18:15:04 <mikeperry> yep!
18:15:08 <mikeperry> #endmeeting
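
Editor's sketch (not part of the meeting log): the discussion of drop-based side channels, padding, and organic drops can be illustrated with a toy calculation. This is only a rough model under assumptions that were not stated in the meeting: the adversary either drops or does not drop one cell per slot, an attempted drop lands on an unacked padding cell with probability `padding_rate` (and is then invisible to the exit), and an organic drop shows up independently with probability `organic_drop_rate`. The function names and parameters are hypothetical. The capacity of the resulting binary asymmetric channel is found by a simple grid search, and `channel_rates` converts it into the bits-per-second and bits-per-circuit measures nickm mentioned.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a coin with bias p; defined as 0 at the endpoints."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def drop_channel_capacity(padding_rate: float, organic_drop_rate: float,
                          steps: int = 1000) -> float:
    """Bits per slot in the toy model, by grid search over the input bias."""
    # P(exit sees a drop | adversary dropped): the drop hit data, or it hit
    # padding and an organic drop happened anyway. (Assumed model.)
    p_seen_if_drop = (1.0 - padding_rate) + padding_rate * organic_drop_rate
    # P(exit sees a drop | adversary did nothing): organic drops only.
    p_seen_if_idle = organic_drop_rate
    best = 0.0
    for i in range(steps + 1):
        q = i / steps  # probability that the adversary drops in a given slot
        p_seen = q * p_seen_if_drop + (1.0 - q) * p_seen_if_idle
        # Mutual information I(X;Y) = H(Y) - H(Y|X) for this input bias.
        mutual_info = binary_entropy(p_seen) - (
            q * binary_entropy(p_seen_if_drop)
            + (1.0 - q) * binary_entropy(p_seen_if_idle))
        best = max(best, mutual_info)
    return best


def channel_rates(bits_per_slot: float, slots_per_second: float,
                  circuit_lifetime_seconds: float) -> tuple:
    """Convert bits per slot into (bits per second, bits per circuit)."""
    bits_per_second = bits_per_slot * slots_per_second
    return bits_per_second, bits_per_second * circuit_lifetime_seconds


if __name__ == "__main__":
    # Illustrative parameter values only; none come from measurements.
    for padding in (0.0, 0.25, 0.5, 0.75):
        capacity = drop_channel_capacity(padding, organic_drop_rate=0.01)
        print(padding, capacity, channel_rates(capacity, 10.0, 600.0))
```

In this toy model the capacity falls as the padding rate rises, which is the trend mikeperry describes; whether it ever approaches the delay-based channel in current Tor would need a comparable model of that channel, which this sketch does not attempt.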