14:58:21 <nickm> #startmeeting prop267
14:58:21 <MeetBot> Meeting started Thu Mar 17 14:58:21 2016 UTC.  The chair is nickm. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:58:21 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:58:29 <nickm> good morning! Or whatever time it may be.
14:59:03 <nickm> The proposal today is https://gitweb.torproject.org/torspec.git/plain/proposals/267-tor-consensus-transparency.txt
14:59:44 <nickm> I see me and ln5 are here.  Anyone else joining in?
15:00:05 <tjr> Me =)
15:00:07 <mikeperry> I am lurking sleepily
15:00:45 <Sebastian> I'm here
15:00:50 <ln5> yay
15:01:33 * karsten is also here, though unprepared.
15:01:34 <nickm> would somebody other than ln5 like to summarize so ln5 can check for accuracy?
15:02:28 <Sebastian> I can try.
15:02:45 <nickm> go for it!
15:03:44 <Sebastian> The super-highlevel picture is that it would be neat if there was a way to verify that dirauths don't treat some clients specially, provide more ways to archive historical data in a tamperproof fashion and allow this all to be externally verified
15:04:17 <Sebastian> the used technique comes from certificate transparency, where they do the same thing for website certificates
15:04:51 <Sebastian> the way this is achieved is by using a special append-only datastructure that can be cheaply verified
15:05:32 <Sebastian> What happens is that interested parties (for example, dirauths) submit their documents to one or more logs, and get back a proof of inclusion
15:06:16 <Sebastian> So when a relay/client/anyone wants to ensure that a consensus they have was actually produced and distributed by a dirauth, it can check that the log contains that consensus
15:07:20 <Sebastian> Interested parties can independently verify that a log is kept correctly by auditing its contents and ensuring that just one consensus is added per time slot, for example
15:08:16 <Sebastian> The former is called an auditor, the latter a monitor
15:08:20 <Sebastian> they can be independent
15:08:36 <Sebastian> The proposal is about adding this to Tor
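
The monitor check Sebastian describes (just one consensus per time slot) could be sketched roughly as below. This is only an illustration: the `(valid_after, digest)` entry shape and the function name `conflicting_slots` are assumptions made for the example, not anything defined in proposal 267.

```python
from collections import defaultdict

def conflicting_slots(entries):
    """Return the time slots for which a log holds more than one consensus.

    `entries` is an iterable of (valid_after, consensus_digest) pairs, a
    hypothetical view of what a monitor has read from a log.
    """
    by_slot = defaultdict(set)
    for valid_after, digest in entries:
        by_slot[valid_after].add(digest)
    # A slot with two or more distinct digests is what a monitor would flag.
    return {slot: digests for slot, digests in by_slot.items() if len(digests) > 1}
```

Submitting the same consensus twice is harmless in this sketch, since only distinct digests per slot are counted.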
15:09:25 <Sebastian> dirauths submit, users of a consensus keep three states: unknown, consensus was logged, consensus was logged by a good log
15:09:40 <nickm> (so, no bad states?)
15:10:10 <Sebastian> not in the proposal I think but maybe I misremember. I think I'm done remembering stuff from the proposal anyway.
15:10:19 <nickm> ln5: is that more or less right?
15:10:30 <ln5> yes, a couple minor things:
15:10:38 <ln5> - re who treats some clients differently: s/dirauths/any actor with five keys/1
15:10:51 <ln5> - clients and relays can submit consensus documents which proper
15:10:52 <ln5> dirauths (acting as monitors) can detect as bogus if they are
15:11:23 <ln5> and no, there is no state atm which says that a consensus is bad
15:12:08 <ln5> and the last (best) state is not about the log being good but rather that the consensus user has seen enough proof to think it's properly logged, i.e. it's been verified
15:12:15 <ln5> eof
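
The three states, with ln5's correction applied, might be modelled as in the sketch below. The names `LogStatus` and `log_status`, and the reading of the middle state as "has a proof that has not been checked yet", are assumptions made for illustration. As noted above, there is deliberately no "bad" state.

```python
from enum import Enum

class LogStatus(Enum):
    UNKNOWN = "unknown"    # nothing known about logging
    LOGGED = "logged"      # consensus came with a claim of being logged
    VERIFIED = "verified"  # the user has seen enough proof to believe it

def log_status(has_proof: bool, proof_verified: bool) -> LogStatus:
    """Map what a consensus user knows onto one of the three states."""
    if has_proof and proof_verified:
        return LogStatus.VERIFIED
    if has_proof:
        return LogStatus.LOGGED
    return LogStatus.UNKNOWN
```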
15:12:37 <Sebastian> thanks for the corrections :) I treat your first point as equivalent.
15:12:52 <ln5> Sebastian: thanks for the summary. very good!
15:12:57 <nickm> Is the idea that every tor client should be a consensus user?  How does bootstrap work?
15:13:06 <ln5> yes, i'm sure _you_ do. but for all the others :)
15:13:27 <ln5> nickm: "consensus user" is my term for clients and relays, yes
15:13:44 <ln5> i hadn't thought about bootstrap before weasel mentioned it in valencia :)
15:13:51 <ln5> (still have no answer)
15:14:00 <ln5> (adding it to open issues)
15:14:45 <nickm> How much of this fetching should be anonymized and how much direct?
15:14:52 <nickm> (and publishing)
15:15:35 <ln5> as for fetching, the idea is that when consensuses are fetched, as usual, we also include a proof of inclusion
15:16:19 <ln5> this proof of inclusion can be verified by going to the log and asking for the current tree head and a consistency proof between "the old tree head" and the current tree head
15:16:36 <ln5> i think this communication with the log should be done over tor
15:17:02 <ln5> nothing here is super sensitive afaict but (!) a log might be able to track a user if it's not
15:17:19 <ln5> (by issuing special tree heads and/or proofs for a certain client)
15:18:04 <tjr> So in Certificate Transparency (CT) you get a SCT which is a 'promise of inclusion from the log'.  They are issued instantly.  But here you propose waiting for inclusion in the log, which will take minutes if not an hour.
15:18:11 <ln5> oh, when i say "tree head" i mean a signed document with the head of a merkle tree in which all consensus documents are stored
15:18:56 <nickm> #item let's document what needs to be/should be anonymized.
15:19:06 <nickm> also, minor suggestion:
15:19:17 <ln5> tjr: i suggest we 1) require logs to produce a new tree within, say, 10 minutes, and 2) use something like an SCT as a "cookie" in order to not have to wait for an HTTP response for 10 minutes
15:19:37 <nickm> if you're worried about attackers who can break RSA2048, why not replace SHA2-256 with SHA3-512, or a concatenation of SHA2-512 and SHA3-512?
15:19:53 <ln5> unless someone thinks it's crazy to postpone consensuses by 10 minutes
15:20:21 <tjr> Requiring logs to produce a new tree head every N minutes worries me.  We explicitly constrained them from being able to do that for CT Gossip...
15:20:38 <ln5> nickm: i'd be happy to have stronger hash algorithms
15:20:48 <tjr> For tracking purposes.  If these are going to be general purpose logs, instead of Tor logs, we might want a similar constraint on them.
15:21:23 <ln5> tjr: i expect less trouble with tracking thanks to being able to mandate using tor for communicating with logs
15:21:38 <ln5> but i worry about requiring new tree in N minutes bc of operational issues
15:22:29 <nickm> bootstrapping makes it hard to mandate tor though.
15:22:46 <ln5> so maaaaybe we need SCT's in their original form. i'm not sure. (they're a promise of inclusion within N hours, typically 24, but they need to be "watched" by auditors.)
15:22:51 <ln5> ah, right
15:22:56 <nickm> [like, either you allow people to use an unchecked consensus to talk to the logs, or you have a bootstrapping issue.]
15:22:58 <tjr> Yea, logs have an MMD for operation needs.  I do think I like the SCT model better
15:23:08 <tjr> A SCT is also less data to carry around, and less data to validate. You check a signature instead of checking a signature and confirming a merkle path.
15:23:15 <ln5> (MMD == maximum merge delay)
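
The "promise" model under discussion can be illustrated with a small sketch: an SCT-like promise carries an issue time, and the log has until the MMD has elapsed to include the entry. The `Promise` class, its fields, and the three status strings are hypothetical names chosen for this example. Signature checking is left out entirely.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Promise:
    issued_at: int  # seconds since the epoch, as stated by the log
    mmd: int        # maximum merge delay in seconds

def promise_status(promise: Promise, now: int, included: bool) -> str:
    """Classify a promise as 'honored', 'pending' or 'overdue'."""
    if included:
        return "honored"
    if now <= promise.issued_at + promise.mmd:
        return "pending"  # the log still has time to merge the entry
    # Past the MMD and still missing: someone has to "watch" for this case.
    return "overdue"
```

With a 24 hour MMD, a promise only becomes checkable a day later, which is the cost ln5 is weighing against requiring a fresh tree head within minutes.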
15:23:34 <ln5> tjr: i dislike SCT's because they're just a promise
15:23:55 <ln5> but yes, you can "turn them into" something verifiable
15:24:45 <ln5> i'm not worried about the amount of data to carry around -- the consensus document is on the order of a MB and a proof is on the order of hundreds of bytes
15:25:12 <ln5> (a proof is a list of hashes, max length for a tree of size N being log2(N))
15:25:30 <ln5> (at least true for an inclusion proof)
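
To make "a proof is a list of hashes" concrete, here is a minimal sketch of a Merkle inclusion proof in the style Certificate Transparency uses (leaf hashes prefixed with 0x00, interior nodes with 0x01, SHA-256). It is an illustration under the assumption that a consensus log would use the same tree construction as CT. The helper names are made up for the example.

```python
import hashlib

def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def _split(n: int) -> int:
    """Largest power of two strictly smaller than n (n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k

def tree_root(leaves: list) -> bytes:
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = _split(len(leaves))
    return node_hash(tree_root(leaves[:k]), tree_root(leaves[k:]))

def inclusion_proof(index: int, leaves: list) -> list:
    """Sibling hashes from the leaf at `index` up to the root."""
    if len(leaves) == 1:
        return []
    k = _split(len(leaves))
    if index < k:
        return inclusion_proof(index, leaves[:k]) + [tree_root(leaves[k:])]
    return inclusion_proof(index - k, leaves[k:]) + [tree_root(leaves[:k])]

def verify_inclusion(leaf: bytes, index: int, size: int,
                     proof: list, root: bytes) -> bool:
    """Recompute the root from a leaf and its proof; compare to `root`."""
    if index >= size:
        return False
    fn, sn = index, size - 1
    r = leaf_hash(leaf)
    for p in proof:
        if sn == 0:
            return False
        if fn % 2 == 1 or fn == sn:
            r = node_hash(p, r)
            if fn % 2 == 0:
                # Skip levels where this node has no sibling.
                while fn % 2 == 0 and fn != 0:
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root
```

For a tree of 7 entries the proof for any leaf holds at most 3 hashes (96 bytes with SHA-256), which matches the "hundreds of bytes" estimate above. The root would come from a signed tree head, whose signature check is not shown here.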
15:26:05 <nickm> (remember, the consensus size doesn't matter as much as the compressed consensus size: disk is cheap, bw is expensive.)
15:26:14 <ln5> nickm: i'm not convinced we should ever stop a consensus user from using a consensus that is properly signed but misses "log proofs"
15:26:15 <tjr> I guess another thing I'm worried about is the amount of (new) code that is run for every relay on the network.  Breaking the responsibilities apart would modularize it more, and mean not every relay runs the core (complicated) auditing code
15:26:28 <nickm> ln5: ack. I couldn't tell whether that was the idea or not.
15:26:39 <ln5> nickm: i agree that's not very clear
15:27:11 <ln5> what i do like is if consensus users would submit documents they're unsure about, in order to catch a "bad consensus"
15:27:18 <nickm> (also, consensus diffs will make bw even less.)
15:27:27 <atagar> good morning world
15:27:55 <Sebastian> hi atagar, here for consensus transparency meeting?
15:27:58 <ln5> tjr: valid concern
15:28:09 <atagar> s4chin: 'Is there any Tor project, preferably in Python, which has a mentor?' => Quite a few. See https://www.torproject.org/getinvolved/volunteer.html.en#Coding
15:28:21 <tjr> ln5: If SCTs are used, it seems easier to just require every consensus be accompanied by some number of SCTs.
15:28:24 <atagar> Sebastian: nope
15:28:31 <tjr> Although perhaps that's placing an outside dependency we don't want to do
15:28:59 <Sebastian> atagar: please hang on while we're meeting then :)
15:29:20 <ln5> tjr: that would mimic CT a lot, yes. my work with CT has made me dislike the fact that there's no real proof included though.
15:29:58 <s4chin> atagar: I'll read through Nyx in the next 2 days, and contact you later. Will that be fine?
15:30:53 <ln5> or, i'd really like it if we could deliver the real proofs together with consensuses. but maybe that won't fly. one doesn't exclude the other though, at the cost of complexity i guess.
15:31:42 <atagar> s4chin: certainly
15:31:59 <tjr> I guess I don't really mind the promise vs 'real proof' thing.  Having a 'real proof' isn't evidence you weren't attacked. Verifying a promise would just be an additional step for the auditor, who is already doing complicated stuff.  /shrug
15:32:00 <nickm> I'm okay with not all relays doing the same thing to audit the log, but it's important that all relays and clients do the same thing to check the consensuses that they use.
15:32:01 <ln5> i wonder if there are more issues with bootstrapping. i mean, if we're not rejecting a consensus just bc it lacks log proofs of any kind, are we in the clear?
15:32:15 <tjr> nickm: Agree
15:32:32 <ln5> nickm: agree
15:32:33 <nickm> ln5: well, we need to get worried if we can't find log proofs for it after a while.
15:33:41 <ln5> nickm: or we decide to just submit it and keep using it -- first victims are going to be victims, the rest will be saved by monitors picking this up and Dealing With It Somehow
15:33:57 <nickm> well, if you submit it you need to know it's getting in, yeah?
15:34:26 <nickm> if your consensus is hostile, you cannot trust that tor is making real connections, unless you have some non-tor reason to think it is.
15:34:32 <ln5> nickm: it'd be great if it went in, yes
15:34:35 <nickm> (eg signed "yes i'm including that" receipt.)
15:35:37 <tjr> You don't actually need to know it's gotten in.  If you have a SCT ( a signed promise) and can get that promise 'out to the world' - if it never got in, that's enough to show log misbehavior and shut the log down.
15:35:58 <ln5> tjr: that's separate from what nickm is talking about i think
15:36:03 <ln5> but important
15:36:06 <nickm> What i'm talking about is this:
15:36:08 <tjr> That's the part where the gossip comes in.  Gossip only works if you assume an attacker client eventually makes a non-hostile connection
15:36:19 <tjr> *an attacked client
15:36:24 <nickm> Let's say I start with an attacker-controlled consensus and I only use tor.
15:36:47 <nickm> The attacker controls the consensus, so they can MITM the whole network, and prevent any connections they don't want me to make.
15:37:11 <nickm> The only way I can know I'm getting to any log is if I find something authenticated with a key the attacker doesn't control
15:37:37 <nickm> If our plan involves "Eventually you escape" we should probably get some ideas on how that happens :)
15:37:45 <tjr> (When, of course, you don't know what the attacker controls.)
15:38:00 <nickm> Also, dumb question: These proofs are signed with public keys, yeah?
15:38:11 <tjr> Yup
15:38:23 <ln5> nickm: i've been working under the assumption that we won't be able to deal with perfect mitm forever
15:38:33 <ln5> no, the proofs are not signed
15:38:38 <ln5> the tree heads are
15:38:49 <tjr> Okay, slight difference :)
15:38:50 <ln5> the promises are too
15:38:57 <ln5> (SCT)
15:39:14 <nickm> Okay.  The tree heads are.  And the promises are.  If we believe that the attacker can maybe factor or steal all the RSA keys we're using, why do we believe the signatures on the tree heads or promises?
15:39:44 <ln5> nickm: they're vulnerable, but we will detect when someone does it
15:39:56 <ln5> that's the key difference and what makes it worth it imo
15:40:03 <tjr> Well for all of Transparency, I think we hand-wave over the 'eventually you escape' problem. We assume you eventually escape. This is because non-Tor network MITMs eventually come to an end.  The day Iran MITMs the whole internet and never stops, and never lets any leakage occur is the day they undetectably hack a log and get away with it.
15:40:44 <nickm> I think we might want to try to clear up some of the handwaving here though.
15:40:58 <ln5> that'd be great :)
15:40:59 <nickm> Because the tor situation is different from the non-tor situation.
15:41:39 <nickm> If you compromise a client's view of the tor network on the level of impersonating all the directory authorities, and that client believes a fake consensus that you control...
15:42:01 <nickm> and the client is running TAILS or something and they only make connections over tor...
15:42:23 <nickm> then there isn't a "real internet" for you to "always" MITM.
15:42:29 <nickm> it's only tor.
15:44:16 <ln5> well, if tor is your network, there's indeed a network to MITM by controlling the consensus. or am i misunderstanding?
15:44:42 <ln5> maybe i think MITM is something else than what others think
15:45:10 <Sebastian> tor mitm means you control the keys
15:45:18 <Sebastian> if you do that once there's little reason for you to suddenly lose that power
15:45:49 <nickm> I mean that you need to tell a story about how that TAILS user ever gets back onto the real Tor network, or finds a real log in any other way.
15:46:05 <Sebastian> Currently the only way this can happen is if you disconnect for a few days
15:46:08 <ln5> Sebastian: that's a big difference from ip network level mitm, true
15:46:09 <Sebastian> and Tor goes back to the dirauths
15:46:32 <Sebastian> once you do that over a "secure" connection, you would get the real consensus. Unless the dirauths are hacked and target you
15:46:51 <Sebastian> but yeah.
15:47:11 <Sebastian> I think there's a fundamental difference compared to https certificates
15:47:20 <nickm> we should think about the persistence story here too.
15:47:29 <boklm> GeKo: ok. 6.0a4-hardened-build1 is matching too.
15:47:31 <Sebastian> persistence?
15:47:46 <ln5> nickm: what we store on disk on client machines?
15:47:55 <ln5> or the persistence in an attack?
15:47:55 <nickm> yes, that, and also:
15:48:02 <tjr> So.  Is there any possibility of using the (future?) hardcoded DirCaches here?  Every N days you talk to one of those and gossip a SCT or a proof?
15:48:09 <nickm> * What if the attacker can tell whether the client is bootstrapping.
15:48:35 <nickm> * What if the attacker can predict that the client is not keeping persistent Tor state.
15:49:00 <nickm> I think in the second case we're in bad shape, but we should figure out how bad
15:49:14 <meejah> anadahz: looking...
15:49:45 <ln5> (point of order: i need to leave in 10 minutes, i scheduled only 1h for this)
15:49:54 <nickm> me too; I have another meeting :)
15:50:11 <nickm> anyway I think we have got a lot of forward movement here, so that's good.
15:50:30 <nickm> I think next step is to revise&expand the draft, and try to analyze bootstrapping/TAILS/etc cases?
15:50:42 <nickm> any other next steps?
15:51:13 <ln5> nickm: those are good next steps. i'm also interested in figuring out what more than consensus docs we could/should log.
15:51:29 <nickm> maybe votes.
15:51:30 <ln5> but the things discussed today are probably more pressing
15:51:41 <nickm> Also, the set of authorities can change over time.  We need an answer for that.
15:51:56 <meejah> anadahz: https://github.com/meejah/txtorcon/issues/146 is the bug, and ref's the fix
15:51:57 <ln5> we need a question first :)
15:52:12 <meejah> not sure why there aren't 0.14.2 release-notes :(
15:52:41 <nickm> ln5: Can a client get any use from a log that believes in a newer or older set of authorities?
15:53:38 <ln5> nickm: the log doesn't put any judgement on its contents, or vet for it. it just makes it public and verifiably so. so yes.
15:53:52 <nickm> well, in that case i can spam the log, yeah?
15:53:59 <nickm> like, upload infinite stuff
15:54:09 <ln5> the log only _accepts_ stuff from a set of authorities, that's true
15:54:23 <ln5> but the list of those can be changed any time
15:54:31 <ln5> without any other side effects afaict
15:54:52 <nickm> And if the client's list is not the log's list, but instead an older or newer list, what happens?
15:55:50 <Sebastian> I don't really see the problem here
15:56:05 <Sebastian> it won't find the consensus/can't submit it because invalid
15:56:12 <ln5> (this is only for submission, ofc.) if the client's document is not signed by any of (or a majority of) the dirauths that the log knows of, it won't be accepted.
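
The submission rule ln5 describes might look like the sketch below. The function name, the use of fingerprint strings, and the `require_majority` switch (covering both the "any of" and "a majority of" variants mentioned) are assumptions for illustration. Signature verification itself is assumed to have happened already, so `signers` means dirauths whose signatures on the document checked out.

```python
def log_accepts(signers: set, known_dirauths: set,
                require_majority: bool = True) -> bool:
    """Decide whether a log would accept a submitted consensus."""
    recognised = signers & known_dirauths
    if require_majority:
        # Strict majority of the dirauths the log knows about.
        return len(recognised) > len(known_dirauths) // 2
    return len(recognised) >= 1
```

A log whose list has fallen far behind would reject a perfectly good consensus under the majority rule, which is the failure case being discussed here.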
15:56:27 <Sebastian> the only problem I see is if the log believes in a non-canonical set of dirauths
15:56:39 <ln5> Sebastian: unless the log is very behind on updating its list of known dirauths
15:56:50 <Sebastian> see my last line? :)
15:56:53 <ln5> (and they have changed substantially)
15:57:11 <ln5> Sebastian: i don't understand "non-canonical" i'm afraid
15:57:46 <ln5> nickm: are you going to summarise or should i be doing that?
15:57:51 <Sebastian> I understand canonical to mean "intended set of dirauths as proclaimed by the currently released Tor version"
15:58:11 <ln5> Sebastian: i see. but there are older clients too.
15:58:28 <Sebastian> but by definition they are either Too Old or they won't have a problem.
15:58:31 <ln5> anyway, the list of dirauths known by the log should probably just grow
15:58:39 <nickm> ln5: if you could summarize and collect the issues, that would be great.
15:58:49 <ln5> nickm: will do. thanks for running the meeting!
15:58:53 <nickm> also if you can edit the NetworkTeam/Meetings wiki page to link to the meetbot log.
15:59:02 <nickm> thanks for working on transparency, ln5!
15:59:03 <ln5> ok
15:59:06 <nickm> have a lovely weekend!
15:59:14 <Sebastian> ah, I see the issue. you want to identify when an old client is attacked using now-invalid dirauths
15:59:15 <nickm> and now we both have to go, so let's
15:59:17 <nickm> #endmeeting