16:01:13 <nickm> #startmeeting prop259 can be fun
16:01:13 <MeetBot> Meeting started Mon Jun 20 16:01:13 2016 UTC.  The chair is nickm. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:13 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:01:30 <asn> OK :)
16:01:35 <asn> how should we start you think?
16:02:06 <asn> i tried to provide a status report of the current situation here: https://trac.torproject.org/projects/tor/ticket/12595?cversion=0&cnum_hist=9#comment:37
16:02:35 <nickm> so, has everybody had a chance to read that summary?
16:02:51 <asn> tl;dr more work needs to be done on the prop259 design and of course on the implementation (yes there is a PoC implementation by thoughtworks)
16:04:25 <nickm> so, I wonder how much the C poc, the python simulator, and the proposal currently match each other
16:04:55 <asn> good question.
16:05:04 <asn> i'm also not sure.
16:05:09 <asn> i think the python simulation is a bit behind.
16:05:30 <asn> (e.g. i think it still includes the dystopic/utopic heuristic that we decided to remove)
16:06:18 <asn> the proposal and the C PoC should be kind of close, but I actually have not read the C code deep enough.
16:07:49 <asn> (i think the python simulation stopped being developed because the thoughtworks crew realized how complex Tor networking/guard logic is, and they tried to speed-implement the algo in little-t-tor)
16:08:27 <Lambda> HI!
16:08:32 <asn> hey
16:08:33 <nickm> hi, and welcome
16:08:42 <nickm> we're talking about guards, and meetbot is keeping a log. :)
16:08:47 <Lambda> thanks
16:09:37 <nickm> asn: so, the problem that you mentioned on the ticket is that the proposal, as written, assumes that first you will try one guard, and it will succeed or fail, and you will not try another circuit until the first one has succeeded or failed.
16:09:40 <nickm> Is that right?
16:09:53 <asn> yes indeed
16:10:06 <asn> that was an assumption of the original design I think, that is not true at all in real Tor.
16:10:21 <nickm> right
16:11:07 <nickm> so we need a set (or a state) that has more or less the meaning of "currently trying this one; don't know yet"
16:11:17 <nickm> ?
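A minimal sketch of the "currently trying this one; don't know yet" state raised at 16:11:07, assuming each guard carries a single reachability value. The names `Reach`, `Guard`, `begin_attempt`, and `finish_attempt` are hypothetical and come from neither prop259 nor the Tor source. It only illustrates that an in-flight attempt is distinct from both success and failure, so a second circuit launched in the meantime does not have to guess.

```python
from enum import Enum


class Reach(Enum):
    """Hypothetical per-guard reachability states."""
    UNKNOWN = "unknown"   # never tried
    PENDING = "pending"   # an attempt is in flight; outcome not known yet
    UP = "up"             # last attempt succeeded
    DOWN = "down"         # last attempt failed


class Guard:
    def __init__(self, name):
        self.name = name
        self.reach = Reach.UNKNOWN


def begin_attempt(guard):
    """Record that a circuit through this guard has been launched."""
    guard.reach = Reach.PENDING


def finish_attempt(guard, succeeded):
    """Record the outcome once the circuit succeeds or fails."""
    guard.reach = Reach.UP if succeeded else Reach.DOWN
```

A selection routine built on this could decide separately how to treat PENDING guards, for example by reusing them for more circuits of the same kind or by skipping them while exploring.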
16:11:20 <asn> hm plausible
16:11:28 <asn> or make the data structures be able to facilitate two parallel runs of the algo
16:11:53 <asn> or a combination of the two
16:12:15 <nickm> well, why do we want parallel circuits?  For at least two reasons, I think:
16:12:25 <nickm> 1) there are multiple things we want to do and they need separate circuits.
16:12:40 <nickm> 2) we want to try out multiple circuits to see which of them will work.
16:12:45 <asn> aha
16:12:58 <nickm> I think that maybe we should have the algorithm treat those cases differently?
16:13:05 <asn> (example: consider an HS with many clients asking for rendezvous at once. lots of circuits!)
16:13:16 <nickm> right.  That's case 1.
16:13:21 <asn> yep case 1.
16:13:39 <Yawning> if you're running guard selection at that time
16:13:44 <Yawning> "too fucking bad, it gets serialized"
16:13:46 <Yawning> imo
16:13:50 <Yawning> >.>
16:13:53 <asn> case 2 mainly happens inside the guard picking algorithm
16:14:28 <athena> hmm, we need to do something like define an equivalence relation on possible circuits that means "these are for the same thing", perhaps
16:15:07 <asn> athena: indeed. different circuits need different guards. although we are doing a good job at squashing these different types of guards.
16:15:31 <asn> athena: specifically, all current guards are Stable. and almost all guards are Fast.
16:15:49 <asn> and in the future all guards will be V2Dir.
16:16:18 <asn> (The fact that we have all these different types of circuit requirements complicated the design considerably as well.)
16:16:44 <asn> (we recently decided that the fewer options there are for a guard, the easier it is for the guard algorithm)
16:17:24 <nickm> So, right now the algorithm knows about two states.
16:17:32 <nickm> they seem like "stuff is working" and "stuff isn't working" TO ME>
16:17:37 <nickm> ug, shift.
16:18:02 <asn> (you mean in prop259, or in the current guard algo
16:18:04 <asn> ?
16:18:05 <nickm> maybe in the first one, we try the same guard till it fails, and in the second we try new ones till one works?
16:18:06 <asn> )
16:18:07 <nickm> prop#259
16:18:53 <asn> STATE_PRIMARY_GUARDS is "stuff is working", and the TRY_REMAINING is the "stuff isn't working"?
16:19:10 <nickm> I think, yes.
16:19:21 <asn> i was not thinking about it like this, but yes you can think about it kind of like that.
16:19:22 <nickm> we'd need to specify this clearly, but it isn't crazy
16:19:51 <nickm> alternatively, PRIMARY_GUARDS is "we think we can be conservative and still talk to the network" --
16:20:07 <nickm> and TRY_REMAINING is "we need to explore to find primary guards that would work"
16:20:17 <nickm> I wonder if we can get that stuff written up
16:21:45 <nickm> (I find the current proposal a little hard to read, fwiw.)
16:22:02 <asn> i agree. it got a bit dirty.
16:23:03 <asn> i'd like to clean it up a bit, but i don't want to be the person responsible for it.
16:23:19 <asn> that is, i would like more people joining and/or leading the project.
16:23:49 <nickm> I wonder if I'm competent to make a difference here.  I can join in, I think, but I'm not sure I understand the current thing well enough to lead changes to it.
16:25:38 <nickm> hmmm
16:26:03 <nickm> btw, a few questions on the current proposal that I ran into.
16:26:04 <asn> i think you would definitely be a good person to join in, but you are already doing a million things.
16:26:21 <nickm> 1: What _is_ "the state machine depicted in 2.2" ? I see no depiction.
16:26:53 <nickm> 2: in 2.2, it says "we should save the previous state and set the state to STATE_PRIMARY_GUARDS".  Where, if anywhere, do we look at or restore the previous state?
16:27:29 <nickm> I'm also confused by language like "return each entry in PRIMARY_GUARDS" in turn.
16:27:45 <nickm> like, that assumes additional unspecified state, like "What is the last PRIMARY_GUARD we returned"
16:27:49 <nickm> 3: ^
16:27:54 <nickm> do those questions make sense?
16:28:38 <asn> i think they do
16:29:21 <asn> I'm not sure what's up with (1). Maybe by state machine they mean the transition between states STATE_PRIMARY_GUARDS and STATE_TRY_REMAINING.
16:29:44 <nickm> maybe. or maybe there were more states in a previous revision of the document? :)
16:29:50 <asn> section 2.2 specifies some sort of state machine indeed.
16:30:27 <asn> i agree that this can be made much more clear.
16:31:17 <asn> and yes, it does feel that there are remnants of more states from a previous revision of the doc.
16:31:36 <nickm> I think that is probably also the answer to my question 2.
16:31:41 <asn> yeah
16:32:12 <asn> I think STATE_TRY_REMAINING used to be two states in the past. One state that goes through the used guards, and another state that goes through the list of sampled guards.
16:33:18 <asn> ---
16:33:31 <asn> wrt "Return each entry in PRIMARY_GUARDS in turn."
16:34:11 <asn> i think it indeed assumes that each time we return a guard, there will be side effects that might mark that guard as unsuitable for use.
16:34:35 <asn> hence next time we hit the algorithm that unsuitable guard will be skipped
16:34:44 <nickm> but it doesn't say that
16:34:53 <nickm> it doesn't say "return the first usable entry".
16:34:56 <nickm> it says "return each entry"
16:35:03 <asn> it says this:
16:35:06 <asn> For each entry, if the
16:35:08 <asn> guard should be retried and considered suitable use it. A guard is
16:35:11 <asn> considered to eligible to retry if is marked for retry or is live
16:35:12 <asn> and id not bad. Also, a guard is considered to be suitable if is
16:35:14 <asn> live and, if is a directory it should not be a cache.
16:35:23 <asn> this "or is live" thing
16:35:38 <asn> is I guess what they relied on
16:35:51 <nickm> hm.  I still claim that this is written with the assumption that we're using coroutines. :)
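A minimal sketch of the "return the first usable entry" reading, assuming guards are kept in an ordered list and usability is decided by a caller-supplied predicate. `first_usable` and `is_usable` are hypothetical names, not taken from prop259 or Tor. Because the function keeps no cursor between calls, it needs no hidden "last PRIMARY_GUARD we returned" state, and concurrent callers get the same answer until some guard's status changes.

```python
def first_usable(primary_guards, is_usable):
    """Return the first guard, in list order, that the predicate accepts.

    Returns None when no guard qualifies. No position is remembered
    between calls, unlike a coroutine that yields each entry in turn.
    """
    for guard in primary_guards:
        if is_usable(guard):
            return guard
    return None
```

For example, with guards `["a", "b", "c"]` and `"a"` marked unusable, every call returns `"b"` until `"b"` itself is marked unusable.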
16:35:56 <asn> there is even an appendix section on what IS_LIVE does (!)
16:36:43 <nickm> like, I could try to kludge up a version of this document that behaves sorta-reasonably for concurrent circuits....
16:37:03 <nickm> but I'm not sure that I can make sure it is really "a version of this document"
16:37:20 <asn> well that's fine.
16:38:06 <asn> but i was also kind of hoping that this task would not get pushed to you.
16:38:50 <nickm> If I get the design advanced, will others help with the implementation?
16:39:07 <nickm> Because IMO it does kinda make sense for me to spend a day or two on design here if I can make progress
16:39:31 <nickm> Yawning: athena : ^  ?
16:40:34 <athena> yeah, certainly happy to help with implementation
16:41:36 <Lambda> if I can help, I’ll be happy to :)
16:41:45 <nickm> 'kay.
16:41:51 <nickm> asn: is there a "fix the proposal" ticket?
16:42:09 <asn> [ nickm: another badly explained part of the spec, is this SHOULD_CONTINUE thing. This is basically the current heuristic of Tor that says "Hm, I looped over my guard list and I managed to connect? The network must have gone up! Mark all guards as up and retry."]
16:42:22 <asn> nickm: i guess that's still #12595
16:42:39 <Yawning> hm
16:43:18 <Yawning> yeah I can help with the implementation
16:43:55 <nickm> asn: should I take ownership on #12595, or would you like to create a subticket for me to own to represent the proposal revision?
16:44:04 <asn> nickm: what do you prefer?
16:44:10 <asn> i can create a subticket if that helps you
16:44:15 <nickm> subticket would be great
16:44:20 <asn> ack will do
16:44:23 <nickm> thanks
16:44:34 <nickm> any more to figure out at this meeting, or do we get a 15 minute break before the next? :)
16:45:23 <nickm> #endmeeting