#tor-dev log

16:00:29 <nickm> #startmeeting
16:00:29 <MeetBot> Meeting started Fri Jan 22 16:00:29 2016 UTC.  The chair is nickm. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:29 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:59 <dgoulet> yellow!
16:01:03 <mikeperry> I have coffee, a breakfast burrito, and a huge pile of nice warm blankets. I'm ready to rock and roll! ;)
16:01:23 <athena> greetings, meeting!
16:01:24 <mikeperry> or stop, drop, and roll. ready for that too!
16:01:25 <nickm> I have tea brewing, and a small pile of roasted edamame
16:01:27 <nickm> hi all
16:02:08 <nickm> today we talk about prop#251 and maybe a bit of prop#254 too.
16:02:27 <nickm> prop#251 seems pretty straightforward.  Let me summarize and I'll see if Mike thinks I'm summarizing right.
16:03:49 <nickm> "Many routers record flow information that could be useful for low-budget traffic analysis.  We can make this flow information far less useful by padding some of our connections. This padding can be done cheaply: All we need to do is make sure that client<->guard connections don't stay silent for too long in either direction."
16:03:54 <nickm> how did I do?
16:05:14 <mikeperry> that's about it. there are some additional details about what to do about connection lifespan, connection usage, and mobile clients
16:05:25 <mikeperry> and a couple of details that have changed slightly since the proposal
16:05:59 <mikeperry> (esp related to mobile clients and negotiation from 254)
16:06:40 <nickm> Anybody have questions and/or stuff to add and/or discuss and/or debate?
16:06:49 <nickm> In a minute, I'd like to try arguing against it a little, to make sure that the arguments against it are bad.
16:07:00 <nickm> Or at least that they have good counterarguments
16:08:05 <dgoulet> I honestly have nothing to say much about it, read twice today and I find only positive with it
16:08:40 <mikeperry> one question was: should we try to also use connection-level padding to obscure circuit setup timing, also? or is that mission-creep?
16:09:08 <nickm> Do we think we can do that in a way that would work?
16:09:11 <dgoulet> mikeperry: weren't you thinking of padding for HS also to mitigate some potential side channel attack?
16:09:17 <mikeperry> my thinking right now is that we should use multihop padding for circuit setup, so we can conceal it from the guard, also (which I think is the most dangerous form of circuit setup fingerprinting)
16:09:46 <nickm> Probably that's a separate proposal?
16:09:51 <mikeperry> this netflow stuff is only channel/TLS level padding, so the guard node would know what is padding and what is not
16:10:19 <mikeperry> which makes it less useful for trying to extend into stuff that protects against the guard
16:10:47 <isis> mikeperry: i recall that i/we came up with some number somewhere for the average monthly bandwidth overhead… and it was an amount too high for mobile clients without unlimited data plans…
16:11:04 <mikeperry> also it is questionable how well circuit setup fingerprinting works against the multiplexed TLS connections anyway
16:11:25 <mikeperry> isis: yes. the proposal is missing the results of that discussion, but the implementation is not
16:11:42 <nickm> is there an updated document someplace?
16:12:06 <mikeperry> the implementation allows clients to fully disable padding via torrc, or reduce it to about 25% of the default with a different option
16:12:30 <isis> mikeperry: okay, so there are now some options which Orbot/etc. should set by default?
16:12:40 <athena> concur with both dgoulet and mikeperry's opinions: this is an easy defense against ISP stats-gathering between client and guard, but padding outside the circuits like this is useless against potentially malicious relays
16:12:41 <isis> ah
16:13:29 <nickm> mikeperry: btw, so you're not surprised: I'm going to want a torspec patch to merge along with the implementation. :)
16:13:40 <mikeperry> nickm: no. I am also wondering if we should put the final version of this stuff in tor-spec.txt, or just describe the new protocol stuff there, and put the behavior in traffic-analysis-spec.txt or something
16:14:00 <nickm> that would be fine; I'd suggest padding-spec.txt
16:14:56 <mikeperry> ok. great. esp since if/when we add multi-hop padding, its usage will be quite more in-depth and tunable than just the wire format specification
16:15:22 <mikeperry> and so all that material shouldn't clutter tor-spec.txt IMO
16:15:41 <nickm> right
16:15:59 <nickm> So, let's try to figure out the counterarguments.
16:16:30 <nickm> This defense matters if there are significant adversaries who have access to router flow information, or something like it, but who don't have a better traffic-analysis surveillance mechanism.
16:16:41 <nickm> Do we believe they exist?
16:17:43 <mikeperry> I think that is exactly what EU and .au data retention will look like. ISPs mandated to retain connection information
16:18:28 <athena> plausibly they could; a lot of the stuff we've heard about with TLA surveillance is tapping further upstream than that rather than close to the user inside big ISP ASes
16:19:11 <nickm> I can also totally believe that ISPs would share their flow information with most anybody who asks for it, like kids trading basketball cards.
16:19:12 <athena> (which makes sense in terms of minimizing the number of places that have to be tapped from the TLA's perspective)
16:19:36 <athena> for some (user, guard) pairs, there won't be another opportunity to see that particular traffic flow
16:23:17 <nickm> Okay, seems believable.
16:23:22 <nickm> How do we know this won't kill the network?
16:23:46 <mikeperry> we can throttle it or turn it off from the consensus
16:24:08 <nickm> how do we know the default settings won't kill the network? :)
16:24:21 <mikeperry> we also will have stats in extra-info to monitor what happens
16:24:50 <mikeperry> section 2.2 has estimates on upper bounds of overhead
16:25:07 <mikeperry> those are still valid upper bounds. the actual amounts should be much lower
16:25:25 <mikeperry> even those bounds indicate that the overhead will be on par with the current directory traffic overhead
16:27:45 <nickm> and even less for very-idle or very-busy clients?
16:28:19 <mikeperry> section 2.2 assumes 2.5M clients idle all the time, 24x7. so worst case in terms of overhead
16:29:55 <nickm> I think idle clients eventually stop predicting circuits, right?
16:30:02 <nickm> do they eventually close their guard connections?
16:30:10 <mikeperry> (the defense does not transmit padding if clients are idle for ~30min, or when they are sending non-padding data)
16:30:35 <mikeperry> yes. the later patches on that branch give us better control over circuit prediction and orconn lifespan, both client side and in the consensus
16:32:09 <nickm> this is seeming pretty good to me. anybody else want to jump in with anything?
16:32:35 <mikeperry> CircuitIdleTimeout and PredictedPortsRelevanceTime were unified into CircuitsAvailableTimeout, which governs orconn lifespan indirectly as well (since orconns shut down a few minutes after being without circuits)
16:33:37 <mikeperry> ReducedConnectionPadding basically halves orconn lifespan by halving those values (in addition to making padding only unidirectional, and only at least every 15s instead of every 9s)
16:34:28 <mikeperry> 15s is the default inactive record timeout for all routers I could find. so it seemed like a good option for people who want reduced padding
16:35:01 <mikeperry> the default is at least every 9s, because COTS routers can be set to have an inactive record timeout as low as 10s
16:35:46 <mikeperry> err, 9s is our default maximum delay before sending a packet. 15s is the "reduced" maximum delay
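[Note: the timing rules discussed above can be sketched roughly as follows. This is a minimal illustration, not the code on the branch. The function and constant names are hypothetical; only the 9s and 15s maximum delays and the ~30min idle cutoff come from the discussion. A real implementation would presumably randomize the delay below the maximum, which this sketch leaves out.]

```python
# Hypothetical names; the numeric values are the ones quoted in the meeting.
DEFAULT_MAX_DELAY_S = 9.0    # default maximum silence before a padding packet
REDUCED_MAX_DELAY_S = 15.0   # maximum silence with reduced padding enabled
IDLE_CUTOFF_S = 30 * 60      # padding stops once the client has been idle ~30min


def max_padding_delay(reduced: bool) -> float:
    """Longest time a connection may stay silent before padding is due."""
    return REDUCED_MAX_DELAY_S if reduced else DEFAULT_MAX_DELAY_S


def should_send_padding(silent_for_s: float,
                        client_idle_for_s: float,
                        reduced: bool = False) -> bool:
    """Decide whether a padding packet is due on one connection.

    silent_for_s resets to zero whenever non-padding data is sent, so an
    active connection never reaches the threshold and is never padded.
    """
    if client_idle_for_s >= IDLE_CUTOFF_S:
        return False  # idle clients stop padding entirely
    return silent_for_s >= max_padding_delay(reduced)
```

[With these values, a connection silent for 9s is padded under the default setting but not under the reduced one, which waits until 15s.]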
16:36:37 <mikeperry> so, when creating the new documentation since the proposal, is it OK just to put the new stuff in padding-spec.txt, or do I also have to update the proposal?
16:37:19 <nickm> I think that if you make a patch for the repo, that's fine.
16:37:38 <nickm> A couple of protocol questions:
16:38:18 <nickm> 1) The detection for "should I pad this incoming connection" seemed a little weird in your code.  I wonder if we could just say something like "If they didn't authenticate to me, then they are a client or a bridge, so pad."
16:38:27 <nickm> would that make more sense?
16:38:31 <nickm> Any problems there?
16:39:21 <mikeperry> I have been wondering about trying to use AUTHENTICATE to also verify TLS in crazy ways as a client..
16:40:05 <mikeperry> after the create_fast debacles, I am also kinda wary of making the client-vs-relay distinction based only on behavior in general
16:41:36 <nickm> well, what could an attacker do with this?
16:41:47 <nickm> at worst you can make a relay send you padding
16:41:52 <mikeperry> (ie moxie's tortunnel.. and I also saw some old cruft in this code where it was still making client-vs-relay assumptions on CREATE_FAST usage as well in some branches)
16:41:54 <nickm> when you are really a relay
16:43:14 <mikeperry> well, I'd guess you'd make them not pad to you in the event of a future behavior change... which may also be hard to catch in unit tests
16:43:40 <nickm> I'm not sure that's so bad.
16:43:58 <nickm> You already have control over what you're sent; if you want to expose information, you can just publish what's padding and what isn't.
16:44:08 <nickm> Turning off padding to/from yourself is a power you inherently have, I think
16:44:50 <mikeperry> yes, but the accidental/side-effect nature of it is what bothers me.. what is wrong with checking the consensus and caching that decision? that is what I tried to do. it seems way more definite
16:46:10 <nickm> well, you can be a relay without being in the consensus yet.
16:46:28 <nickm> it's not "definite"
16:46:33 <nickm> or rather, if you're in, then you're definitely a relay.
16:46:41 <nickm> but if you're not, you still might be a relay
16:47:49 <mikeperry> yeah, one of your questions was about that. I tried to handle that case while processing create cells, where we were already doing a consensus check
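[Note: the two detection rules being debated above can be compared with a small sketch. All names here are hypothetical, and both functions are simplifications of what was said in the meeting rather than the actual code. The example assumes a new relay that has authenticated but is not yet listed in the consensus, which is the case nickm raises.]

```python
def pad_by_authentication(peer_authenticated: bool) -> bool:
    """nickm's suggestion: a peer that did not authenticate is a client
    or a bridge, so pad the connection."""
    return not peer_authenticated


def pad_by_consensus(peer_id: str, consensus_relay_ids: set) -> bool:
    """mikeperry's approach: pad unless the peer is listed as a relay in
    the consensus. A relay that is not listed yet is treated as a client."""
    return peer_id not in consensus_relay_ids


# Hypothetical identities, for illustration only.
consensus = {"RELAY_A"}

# A brand-new relay: authenticated, but not in the consensus yet.
new_relay_by_auth = pad_by_authentication(peer_authenticated=True)
new_relay_by_consensus = pad_by_consensus("RELAY_NEW", consensus)
```

[The two rules disagree on the new relay: the authentication rule does not pad it, while the consensus rule pads it until it is listed.]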
16:48:52 <nickm> another question:
16:49:27 <nickm> The negotiation approach for this padding.  It doesn't seem like it could easily switch to a different strategy in the future.
16:49:49 <nickm> Like, if we decide that instead of min/max we want to tune a poisson distribution or something, do we bump the version byte, or what?
16:49:58 <mikeperry> I mean, from a spec point of view, it would mean I also have to update tor-spec.txt with a weird statement like "Omg clients should never send AUTHENTICATE or else they will disable padding". esp if we want to ensure that alternate tor implementations don't try to add extra security and then shoot themselves
16:50:03 <mikeperry> that just seems weird to me
16:51:49 <mikeperry> nickm: yes. that's why the version is in there. though I suppose it may also require a link protocol update, too, to negotiate the supported subversions.. so maybe it is redundant?
16:53:12 <mikeperry> (the version in channelpadding_negotiate)
16:54:04 <nickm> I guess it might be redundant for the reason you give, but best to keep the version byte.
16:54:09 <mikeperry> I guess that field will avoid the need for us to make a new cell command in future updates. so yeah, it is still useful for agility/forward compat
16:54:34 <nickm> It would be good to have a paragraph about what to do if we change how the negotiation works (eg "new link protocol, new cell version, ...")
16:55:01 <mikeperry> ok. yeah, that sounds like tor-spec material. yes
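[Note: a rough sketch of how a version byte keeps the negotiation cell extensible. The field layout used here (version, command, min, max) is an assumption made for illustration and is not taken from the discussion, which only mentions a version field and min/max values. The function name and the set of supported versions are also hypothetical.]

```python
import struct

# Hypothetical: versions of the negotiation payload this sketch understands.
SUPPORTED_VERSIONS = {0}


def parse_negotiate(payload: bytes):
    """Parse a padding negotiation payload, or return None if it cannot
    be understood.

    Assumed layout (illustrative only): 1-byte version, 1-byte command,
    2-byte min timeout (ms), 2-byte max timeout (ms), big-endian.
    """
    if len(payload) < 1:
        return None
    version = payload[0]
    if version not in SUPPORTED_VERSIONS:
        # A future version could carry different parameters (for example a
        # distribution instead of min/max), so do not guess at its body.
        return None
    if len(payload) < 6:
        return None
    _, command, min_ms, max_ms = struct.unpack(">BBHH", payload[:6])
    if min_ms > max_ms:
        return None
    return {"version": version, "command": command,
            "min_ms": min_ms, "max_ms": max_ms}
```

[Because the version is checked before any other field is read, a later version can change the rest of the payload without needing a new cell command.]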
16:57:53 <nickm> any other topics on this one?
16:58:04 <nickm> Maybe it's okay to have short proposal meetings!
17:00:22 <mikeperry> it was a full hour. time flies
17:00:27 <nickm> #endmeeting