16:03:52 <asn> #startmeeting
16:03:52 <MeetBot> Meeting started Tue Dec 23 16:03:52 2014 UTC. The chair is asn. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:03:52 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:03:59 <asn> (a bit late)
16:04:00 <ohmygodel> first, i did add a small section to the tech report
16:04:13 <ohmygodel> karsten, um, how do i do a pull request ?
16:04:29 <asn> ohmygodel: push your branch to the internet, and pass him the repo url and branch name.
16:04:29 <karsten> ohmygodel: post a link to your repo on the ticket, and I'll pull.
16:04:52 <ohmygodel> can you link the ticket
16:05:00 * karsten finds it
16:05:10 <asn> syverson: hello
16:05:14 <ohmygodel> i i don see it obviously on https://trac.torproject.org/projects/tor/wiki/org/sponsors/SponsorR
16:05:26 <syverson> asn: hi
16:05:37 <ohmygodel> syverson im doing my status update
16:05:51 <ohmygodel> im asking karsten how to request my changes to the tech report be pulled in
16:05:54 <syverson> OK, sorry if I missed anything
16:06:31 <asn> karsten: is there a ticket for the tech report itself?
16:06:38 <asn> i know of #13509 which is for the proposal
16:06:39 <ohmygodel> asn: push my branch to the internet ?
16:06:43 <karsten> looking. please carry on with reports.
16:06:49 <ohmygodel> sorry if im dumb at this
16:06:50 <asn> ohmygodel: yeah, push it on github or whatever you use.
16:07:22 <asn> ohmygodel: since the upstream repo of the tech report is on torproject.org, you can't really use github's pull request (tm) thing.
16:07:30 <ohmygodel> ok got it
16:07:55 <asn> so you just need to push your branch to your public git repo, and post its url to karsten.
16:08:02 <asn> ohmygodel: please proceed with the status report :)
16:08:09 <ohmygodel> hm yeah so about that
16:08:17 <robgjansen> ha!
16:08:21 <karsten> hmm, okay, maybe there's no ticket.
16:08:44 <ohmygodel> i probably shouldnt publish anything under my name publicly
16:09:13 <ohmygodel> so can i push it to a private repo
16:09:16 <ohmygodel> i use bitbucket
16:09:23 <asn> why not publish with your name?
16:09:23 <ohmygodel> you can merge it if you desire, out of my hands
16:09:27 <asn> sure
16:09:33 <syverson> ohmygodel: at least not without a long lead time
16:09:33 <karsten> ohmygodel: or send me `git format-patch` files.
16:09:34 <ohmygodel> and dont put my name on until i get approval
16:09:45 <ohmygodel> i have to get approval to publish publicly under my name
16:09:50 <robgjansen> send the patch
16:09:53 <ohmygodel> yes it sucks. welcome to the government
16:10:11 <ohmygodel> ok cool ill do either the private repo link or the patch
16:10:14 <ohmygodel> probably the patch actually
16:10:20 <ohmygodel> would be easier
16:10:22 <asn> reat
16:10:22 <karsten> sure
16:10:23 <asn> great
16:10:23 <robgjansen> the patch that karsten can look over and take inspiration from, and then create his own commit
16:10:33 <ohmygodel> ok next for me
16:10:33 <karsten> a very similar one, yes.
16:10:41 <robgjansen> ;)
16:10:46 <ohmygodel> i thought a bunch about the stats and how to report them
16:10:56 <ohmygodel> and id like to discuss options for them during discussion
16:11:06 <asn> #topic stats and how to report them
16:11:07 <ohmygodel> ok thats it for me
16:11:17 * asn shrugs at MeetBot
16:11:19 <asn> great
16:11:20 <asn> next ?
16:11:32 <karsten> I can go next.
16:11:38 <asn> karsten: go for it!
16:11:39 <karsten> - Helped a bit with merging code and preparing announcement.
16:11:42 <karsten> - Looked at David’s tracepoints code and logs and tried to explain strange clusters.
16:11:45 <karsten> - Discussed detecting hidden-service crawlers with Donncha.
16:11:48 <karsten> - Helped Paul set up a mailing list.
16:11:50 <karsten> done
16:11:58 <asn> nice
16:12:08 <asn> the donncha discussion is those two mails in tor-assistants, right?
16:12:13 <karsten> yes.
16:12:18 <asn> or is it in an ml somewhere?
16:12:19 <asn> ok great
16:12:24 <asn> what about the clusters?
16:12:26 <asn> anything found?
16:12:44 <dgoulet> I need to do some more tracing with new tracepoints that karsten suggested
16:12:44 <karsten> we added another tracepoint for *sending* cells.
16:12:47 <syverson> is the donncha discussion only on tor-assistants?
16:12:56 <karsten> the one we had was for *receiving* cells.
16:12:57 <asn> syverson: unfortunately it is.
16:13:15 <karsten> turns out the delay doesn't happen at the introduction point but on the way back to the service.
16:13:22 <asn> karsten: o_o
16:13:26 <karsten> next step is to add another tracepoint to see where on that way.
16:13:32 <asn> karsten: you mean in one of the hops of the circuit?
16:13:34 <karsten> tracepoint for relaying a cell, that is.
16:13:37 <DonnchaC> O
16:13:38 <karsten> yes.
16:13:43 <asn> curazy
16:13:47 <asn> ok.
16:13:48 <DonnchaC> *I'm here
16:13:49 <dgoulet> pretty weird yah
16:13:51 <robgjansen> nice
16:13:57 <asn> next?
16:14:06 * dgoulet can go
16:14:09 <karsten> oh, and about stats:
16:14:11 <karsten> very quickly
16:14:13 <asn> dgoulet: sec
16:14:14 <karsten> hidserv-dir-onions-seen: -19.00 26.50 71.00 92.75 165.00
16:14:14 <karsten> hidserv-rend-relayed-cells: -4709 44534 129374 4021317 14015109
16:14:14 <asn> karsten: yes
16:14:22 <karsten> that's min, q1, median, q3, max.
16:14:26 <asn> karsten: where is that from?
16:14:28 <karsten> more in the discussion part maybe.
16:14:36 <karsten> that's the 42 stats we have by now.
16:14:38 <asn> karsten: great.
16:14:44 <karsten> dgoulet: go.
16:14:46 <dgoulet> karsten: hrm... you don't ahve mine?
16:14:48 <asn> dgoulet: go for it
16:15:01 <dgoulet> my relay is at 7mil cells right now and last one was at 8mil
16:15:01 <karsten> dgoulet: should be in there.
16:15:16 <karsten> what's the fingerprint again?
16:15:22 <dgoulet> karsten: ah top one is 14mil sorry
16:15:27 <karsten> ah
16:16:04 <asn> karsten: a question I had is, should non-HSDirs publish onion-seen?
16:16:14 <dgoulet> ok so quick, most of my work on sponsor I did was on some tickets here https://trac.torproject.org/projects/tor/wiki/org/sponsors/SponsorR
16:16:22 <asn> karsten: howver, the relays themselves dont know if they are HSDirs...
16:16:26 <asn> karsten: lets keep it for discussion.
16:16:28 <asn> dgoulet: yep
16:16:29 <karsten> ok
16:16:30 <dgoulet> new tracepoints also from karsten, published all the info about it
16:16:56 <dgoulet> my next step now is to run the intro. point experiment and graph the new times from the cells in/out
16:16:59 <dgoulet> done
16:17:24 <robgjansen> ok me
16:17:41 <robgjansen> last week i worked with dgoulet to try to get the tracepoint code running in shadow
16:18:11 <robgjansen> we made progress, lttng was running correctly in shadow but we were not able to get any of the tracepoints printed
16:18:41 <robgjansen> so still no useful data on the shadow front
16:18:51 <dgoulet> robgjansen: on that, I completely failed to build shadow here ... :( (but we can talk later about that)
16:18:52 <karsten> but getting closer!
16:19:16 <robgjansen> other things came up for me, but i will have more time to spend in the next few days
16:19:18 <robgjansen> done
16:19:26 <syverson> me now
16:19:29 <asn> great
16:19:30 <asn> syverson: go
16:19:54 <syverson> talked to ohmygodel about stats, which he will tell more about in discussion
16:20:11 <syverson> set up mailing list with Karsten, will be sending out invites after this meeting
16:20:13 <syverson> done
16:20:19 <asn> ack
16:20:29 <asn> and we are done?
16:20:36 <asn> move on the phase 2?
16:20:40 <asn> or did i forget someone
16:20:42 <asn> ?
16:21:06 <asn> ok
16:21:12 <asn> i guess we can go to discussion
16:21:13 <asn> some topics:
16:21:23 <asn> - aajohnson talks about how to gather statistics
16:21:34 <asn> - karsten and me should discuss how to extrpolate from these statistics to network totals
16:21:52 <asn> - we should discuss what to do about HS count when a relay is not HSDir.
16:22:15 <asn> - we should talk maybe a bit about tech report?
16:22:16 <asn> anything else?
16:22:33 <karsten> sounds like fine topics.
16:22:36 <asn> ok
16:22:42 <asn> ohmygodel: wanna start?
16:22:51 <ohmygodel> so i looked over the tech report this week
16:23:07 <ohmygodel> and a challenge i see
16:23:26 <ohmygodel> is the statistics for which knowing the reporting relay causes a privacy problem
16:23:28 <ohmygodel> for example
16:23:49 <ohmygodel> if relays publish the number of HS descriptor fetches they receive
16:24:14 <ohmygodel> then if you know the .onion, then you know the responsible HSDirs, and then you can see how much larger or smalle
16:24:23 <ohmygodel> r the number of fetches tend to be from the responible HSDirs over time
16:24:43 <ohmygodel> and then you can infer the number of clients typically connecting to that HSDir
16:24:58 <ohmygodel> another good example is introduction points
16:25:15 <ohmygodel> if a relay reports the number of INTRODUCE1 cells it receives
16:25:43 <ohmygodel> then if you know the .onion, you can get the IPs from the descriptor, and then you can see how much larger or smaller the number of INTRODUCE1 cells its IPs tend to see
16:25:53 <ohmygodel> again revealing the number of connections to that specific HS
16:25:57 <asn> yes
16:26:15 <ohmygodel> but there are some ways around this
16:26:33 <ohmygodel> ideal, of course, would be an aggregation procedure where the total number across all relays just pops out
16:26:41 <ohmygodel> e.g. secure multi party computation
16:26:47 <ohmygodel> but i dont think we necessarily need that
16:26:58 <ohmygodel> we can anonymize the reports to hide the reporting relay
16:27:19 <ohmygodel> of course, this brings up the issue of authenticating that the report is from a valid relay
16:27:34 <asn> oh anomyzing reports before they even reach us.
16:27:39 <ohmygodel> yes exactly
16:28:04 <ohmygodel> you can deal with the authentication issue by providing blind signatures from DirAuths to relays that allow them to report a single stat
16:28:22 <asn> hah plausible
16:28:25 <ohmygodel> you can report anonymously via Tor, of course
16:28:35 <ohmygodel> or you could do something simpler and more efficient
16:28:41 <ohmygodel> and run a shuffle over the DirAuths
16:28:58 <asn> for some stats, the anonymity set can be reduced by looking at the path selection probabilities though.
16:29:09 <syverson> ohmygodel: you'll come to delaying reporting I assume?
16:29:28 <ohmygodel> so my point is, there is a solution that seems very implementable in the near term, and would yield numbers that we really want but do not yield themselves to the model of each relay individually and identifiably reporting their stats
16:29:39 <ohmygodel> yes syverson its on my list
16:29:52 <asn> aha. which solution is that?
16:30:03 <ohmygodel> uh, the one i just outlined :-)
16:30:10 <asn> the blind sigs?
16:30:16 <ohmygodel> yse
16:30:18 <ohmygodel> *yes
16:30:31 <asn> hm. that does not immediately seem very implementable to me.
16:30:37 <ohmygodel> why not
16:30:43 <karsten> but better than multi-party things.
16:31:12 <asn> it seems easier than the multiparty thing indeed
16:31:15 <asn> but still tricky.
16:31:18 <asn> and needs security analysis.
16:31:25 <asn> i don't think the cell stats can be anonymized like that for example.
16:31:30 <asn> only big relays have big number of cells.
16:31:31 * nickm perks up at the mention of adding more crypto primitives
16:31:37 <asn> also this.
16:31:58 <asn> also, more types of directory documents.
16:32:06 <asn> and more authority crypto.
16:32:10 <syverson> shuffling through authorities is simpler still, with a little shifting around of trust going on.
16:32:31 <asn> syverson: yes. that's even simpler but a bit awkward.
16:32:32 <ohmygodel> asn that is a good point
16:32:44 <asn> "ehm now the DAs also know the popularity of HSes"
16:33:08 <syverson> threshold shuffle ;)
16:33:18 <ohmygodel> although you could avoid in the shuffle model that by having each relay submit a fixed number of votes
16:33:22 <ohmygodel> each vote is a zero or one
16:33:33 <ohmygodel> they submit the number of ones that is the count they wish to report
16:33:41 <ohmygodel> this applies to counts and to histograms
16:33:52 <asn> hah
16:34:07 <asn> like publish it in batches?
16:34:17 <ohmygodel> btw shuffles have been implemented n+1 times, moreover, it is a separate sytems whose security and insecurity would be fairly orthogonal to the rest of Tor
16:34:20 <asn> "i saw 800 hits. i send 200 to this DA, 100 to that DA, and 500 to that DA?"
16:34:36 <ohmygodel> perhaps but not exactly i had in mind
16:34:54 <ohmygodel> everybody submits 100K votes to DirAuth1
16:34:56 <asn> it still puts more trust to the DAs. i don't really like that.
16:35:03 <ohmygodel> DA1 shuffle, sends to DA2, who repeats, etc.
16:35:09 <ohmygodel> finally all votes are revealed and tallies
16:35:16 <ohmygodel> this is 90s technology, :-P
16:35:29 <ohmygodel> it is completely distributed trust
16:35:34 <ohmygodel> only one DA needs to be trustworthy
16:36:26 <syverson> ohmygodel: say more about trust assumptions you're making
16:36:41 <ohmygodel> what more can i say
16:36:42 <ohmygodel> ohmygodel: only one DA needs to be trustworthy
16:37:16 <karsten> whichever model we use, it needs to handle single failing DAs.
16:37:25 <karsten> this happens much more often than one would think.
16:37:27 <asn> karsten: good point.
16:37:52 <syverson> ohmygodel: are you saying all aggregation happens at DirAuth1 and that's it?
16:37:56 <syverson> I don't like that.
16:38:03 <ohmygodel> no
16:38:14 <syverson> so please say more.
16:38:26 <ohmygodel> everybody sends k votes, each representing a 0 or a 1
16:38:32 <ohmygodel> each dirauth shuffles in turn
16:38:37 <syverson> to DirAuth1?
16:38:56 <ohmygodel> dirauth1 is the first to shuffle
16:39:02 <asn> what is shuffle in this context?
16:39:05 * asn does not know things
16:39:07 <ohmygodel> but all relays can verify that they received all votes DA1 received
16:39:29 <asn> all DAs? or all relays?
16:39:33 <syverson> is this provable shuffles?
16:39:34 <ohmygodel> e.g. via commits to all DAs, or signatures
16:39:47 <ohmygodel> not necessarily, you could use a Dissent v1-style accountability mechanism
16:40:03 <syverson> OK good, that's what I was hoping.
16:40:12 <ohmygodel> but why not
16:40:21 <ohmygodel> provable shuffles are good too
16:40:33 <ohmygodel> anyway, i dont have all details worked out, obviously
16:40:34 <syverson> Grab some implementation from a voting application then?
16:40:39 <asn> i think maybe
16:40:40 <asn> ohmygodel:
16:40:48 <asn> the best way would be to make a tor-dev post?
16:40:54 <asn> you don't need to have all details
16:41:12 <asn> but at least a brief outline, and a brief security analysis would be nice as a mailing list post
16:41:20 <ohmygodel> but it seems to me that there is a solution in the near term, and that doing so is extremely important because the most useful statistics we are not getting have this privacy issue
16:41:22 <karsten> I'd also want to learn more about the other options.
16:41:32 <karsten> like blind signatures, and whatever else comes to mind.
16:41:43 <asn> from what I see, and I don't really understand the whole thing, this does not look like something that can be implemented in say the next month.
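
The vote-and-shuffle idea ohmygodel outlines above can be sketched roughly as follows. This only illustrates the counting logic: each relay encodes its count as k votes of 0 or 1, the pooled votes are shuffled once per directory authority, and the tally is the number of ones. The names `encode_votes` and `shuffle_and_tally` and all parameter values are assumptions made for this sketch. It leaves out the encryption, commitments, and accountability that a real shuffle needs, so it provides no anonymity by itself.

```python
import random


def encode_votes(count, k):
    """Encode a count as exactly k votes: `count` ones, then zeros."""
    if not 0 <= count <= k:
        raise ValueError("count must be between 0 and k")
    return [1] * count + [0] * (k - count)


def shuffle_and_tally(reports, k, num_shufflers=3, rng=None):
    """Pool every relay's k votes, shuffle once per (hypothetical)
    directory authority, and return the number of ones.

    Every relay submits the same number of votes, so the size of a
    submission does not reveal the reported count.
    """
    rng = rng or random.Random()
    pool = [vote for count in reports for vote in encode_votes(count, k)]
    for _ in range(num_shufflers):
        # Stand-in for one authority's shuffle; a real mix would also
        # strip a layer of encryption and prove it dropped nothing.
        rng.shuffle(pool)
    return sum(pool)
```

For example, `shuffle_and_tally([3, 0, 5], k=10)` tallies 8 from 30 pooled votes. Only the aggregate survives the shuffle, which is also why a poisoned report can no longer be traced to its sender.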
16:41:53 <ohmygodel> yeah i can think of at least a couple of different ways you could do it
16:41:56 <asn> it looks like something that can maybe be implemented before summer.
16:42:01 <asn> maybe
16:42:01 <ohmygodel> there is also the issue of poising the aggregate stats
16:42:12 <ohmygodel> because your stats are no longer segregrated to yourself
16:42:40 <ohmygodel> which is an important issues, although perhaps one that need not be solve immediately
16:42:50 <karsten> I could imagine that you publish anonymously, check at a later time that your stat is contained, and then publish non-anonymously that this is the case.
16:42:50 <ohmygodel> my suggestion there would be to use robust stats
16:43:10 <ohmygodel> for example, median instead of average
16:43:25 <karsten> hmm, no, that won't work as I first thought.
16:43:31 * nickm volunteers to implement the blind-signature tweaks on top of ed25519, if somebody tells me what those tweaks are.
16:43:40 <ohmygodel> *poisoning the aggregate stats, sorry
16:43:41 * nickm has already gotten waist-deep in ed25519-ref
16:43:47 <syverson> nickm: cool
16:44:23 <ohmygodel> ok asn so i think a tor-dev post is in order
16:44:28 <asn> ohmygodel: yes please
16:44:29 <karsten> cool.
16:44:30 <ohmygodel> thanks for the suggestion
16:44:35 <asn> ok that's good.
16:44:38 <ohmygodel> i would also like to bring up some safety issues that
16:44:41 <ohmygodel> syverson and i discussed
16:44:42 <asn> shall we move tto next topic?
16:44:47 <asn> since we are approaching one hour.
16:44:56 <ohmygodel> ok i can go to the end of the line
16:45:06 <asn> ohmygodel: go for it.
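
A small illustration of the "median instead of average" suggestion above, assuming a handful of made-up anonymous reports plus one poisoned value. The numbers and the helper name `summarize` are invented for this sketch and are not taken from real relay data.

```python
from statistics import mean, median


def summarize(values):
    """Return (mean, median) of the reported values."""
    return mean(values), median(values)


# Five honest reports (made-up numbers) and one poisoned report.
honest = [90, 95, 100, 105, 110]
poisoned = honest + [1_000_000]

honest_mean, honest_median = summarize(honest)
poisoned_mean, poisoned_median = summarize(poisoned)
```

Here the single bad report drags the mean from 100 to over 160000, while the median only moves from 100 to 102.5. The median stays robust only while poisoned reports are a minority of all reports.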
16:45:08 <asn> let that be the next topic
16:45:38 <ohmygodel> ok great
16:45:45 <ohmygodel> so one issue is that of time delay
16:46:02 <ohmygodel> some stats may refer to relays that continue to act in the capacity upon which they were reported
16:46:05 <ohmygodel> for example
16:46:20 <ohmygodel> if a relay reports on the number of ESTABLISH_INTRO cells it receives
16:46:27 <ohmygodel> it may still be serving as that intro point
16:47:01 <ohmygodel> it seems harmless to utility
16:47:12 <ohmygodel> to add a time delay to many statistics
16:47:32 <ohmygodel> so they wont be reported (or released) until relays will no longer be used in certain roles
16:47:36 <karsten> yep.
16:47:37 <asn> i agree
16:47:50 <ohmygodel> the risk is small, but the utility loss is negligible, so why not
16:47:50 <asn> ohmygodel: but what's the danger to the example you described?
16:47:56 <asn> sure.
16:48:11 <ohmygodel> ok so im discussing that in the section i added to the tech report
16:48:21 <ohmygodel> which is titled “obfuscation techniques"
16:48:34 <ohmygodel> for a similar reason
16:48:37 <asn> ok.
16:48:46 <ohmygodel> statistics about circuits shouldnt be reported
16:48:50 <ohmygodel> until after the circuit has been destroyed
16:49:28 <ohmygodel> e.g. dont report cells on a circuit that still exists
16:49:37 <asn> we do this currently.
16:49:53 <ohmygodel> ok great, that wasn’t clear to me in the tech reoprt
16:50:09 <asn> but the stats we do now are innocuous enough
16:50:14 <asn> that this danger is not very real.
16:50:28 <ohmygodel> i agree
16:50:44 <asn> and tbh i'm hoping that the stats we will do in the future will also be equally innocuous .
16:50:45 <ohmygodel> its just a suggested refinement
16:50:50 <karsten> are we sure this report will be ready by jan 12?
16:51:03 <karsten> is it a problem if it's not?
16:51:04 <ohmygodel> there will be something ready by jan 12 :-)
16:51:08 <asn> karsten: some sort of draft yes.
16:51:11 <asn> karsten: i don't think so.
16:51:12 <karsten> ok.
16:51:15 <syverson> define ready.
16:51:26 <karsten> because it seems it will become even better with more time.
16:51:28 <ohmygodel> yeah i say we give them what we have at that point
16:51:32 <karsten> ready as in we won't change it ever again.
16:51:38 <karsten> (but instead write a new one....)
16:51:41 <asn> ah
16:51:42 <syverson> Oh, then no.
16:51:45 <asn> probably no.
16:51:57 <karsten> sounds good.
16:52:05 <asn> i still haven't thought of a nice format for the tech report :(
16:52:14 <karsten> pdf?
16:52:20 <karsten> ;)
16:52:26 <asn> sorry. i meant that details/risk/benefits format is a bit misleading.
16:52:40 <ohmygodel> asnt i actually liked that
16:52:42 <karsten> ah that, true.
16:52:47 <asn> it just ends up being a big paragraph on weird benefits that are not really benefits
16:52:52 <asn> and with risks "no real risk"
16:52:57 <asn> which makes it look like the stat is actually useful
16:53:08 <ohmygodel> imo that will just take some effort
16:53:12 <asn> but in reality it's something random like "time from INTRODUCE1 to INTRODUCE_ACK" or something
16:53:32 <asn> which is pretty useless imo
16:53:40 <ohmygodel> for example syverson and i went through the first three stats in Sec. 4.2
16:53:49 <karsten> we'll have to define common criteria for what we think is useful or harmful.
16:53:51 <ohmygodel> and discussed several new risks for each
16:53:56 <karsten> and then evaluate all stats using those criteria.
16:54:11 <asn> ohmygodel: true
16:54:13 <ohmygodel> i could imagine treating various timing stats all at once though
16:54:38 <asn> it's also not clear how these stats are going to be reported?
16:54:51 <ohmygodel> e.g. some subset of times between the following sequence of events...
16:54:52 <asn> is it "all the times from INTRODUCE to INTRODUCE_ACK" or "average time" or "median time" or...
16:54:59 <asn> every decision has very different risks.
16:55:03 <karsten> asn: agreed.
16:55:08 <ohmygodel> yes that also needs to be detailed
16:55:18 <ohmygodel> i have listed all stats that i think should be reported as distributions
16:55:21 <ohmygodel> in the section i added
16:55:26 <ohmygodel> it includes all time-based stats
16:56:01 <ohmygodel> btw i have an outline of how to report such distributions safely
16:56:13 <ohmygodel> tl;dr use a noisy histogram
16:56:44 <asn> curious to read about it
16:57:04 <ohmygodel> ok yes perhaps we can discuss next week when you have had a chance
16:57:12 <asn> sure
16:57:15 <asn> so pleaese next topic?
16:57:29 <asn> karsten: we now need to work on how to extrpolate from those stats
16:57:35 <karsten> right.
16:57:39 <ohmygodel> oh on this topic
16:57:43 <asn> and the part I'm very curious about: how to understand how much noise we added to the strats.
16:57:47 <karsten> I only started looking at stats half an hour before the meeting.
16:57:47 <ohmygodel> syverson and i wrote up how to do this for a bunch of stats
16:58:04 <karsten> including the two we just implemented?
16:58:05 <asn> ideally in the end, we should be able to precisely upper and lower bound the stats we got.
16:58:08 <ohmygodel> can i send the writeup somewhere
16:58:12 <ohmygodel> attached to a ticket perhaps?
16:58:26 <asn> ohmygodel: ticket or mailing list all work fine.
16:58:55 <karsten> so, I think I'd start exploring the data we got by ignoring noise.
16:59:03 <asn> karsten: so we should be able to say we have 50k to 85k HSes. not "we have somewhere earound 70k HSes"
16:59:09 <ohmygodel> not including the number of relay cells
16:59:17 <karsten> I'd like to look what fraction of observations we'd expect a certain relay to see.
16:59:30 <asn> karsten: yep. that's going to be a bit tricky.
16:59:35 <asn> karsten: please look into it!
17:00:03 <karsten> for .onions, we should look at the time since the relay first got the HSDir flag,
17:00:11 <karsten> "distance" to other HSDirs, etc.
17:00:23 <karsten> for cells, it's consensus weight fraction during the stats interval,
17:00:33 <asn> karsten: hah even distance from other HSDirs?
17:00:35 <karsten> relevant flags that clients consider when selecting rendezvous points, etc.
17:00:41 <karsten> asn: well, why not.
17:00:45 <karsten> asn: worth a try.
17:00:46 <asn> karsten: i would just consider it uniform by assumption. but distance is more good.
17:01:03 <asn> the whole process will include reading old consensuses.
17:01:12 <asn> and also probably include being able to calculate the RP selection probability.
17:01:18 <karsten> right.
17:01:26 <asn> which might be different from all the other probs we already calculate.
17:01:30 <karsten> do you know how clients select RPs?
17:01:47 <karsten> ohmygodel: do you remember from writing TorPS?
17:01:49 <asn> karsten: i have some notes
17:02:04 <ohmygodel> karsten: i didnt pay attention to HS code
17:02:08 <karsten> ok.
17:02:10 <ohmygodel> i thought they were selected as middle relays are
17:02:29 <asn> ***** Sep 04 18:51:25.000 [warn] RP circuit flags:
17:02:29 <asn> need_uptime = 1 need_capacity = 1 need_guard = 0 allow_invalid = 1 weight_for_exit = 0 need_desc = 1
17:02:46 <asn> i think that's very similar to middle relays.
17:03:06 <asn> i think invalid relays are not considered for middle relays. though.
17:03:43 <ohmygodel> i thought they were
17:03:55 <karsten> are there invalid relays in the consensus?
17:03:58 <asn> the magic is in rend_services_introduce()
17:04:08 <asn> karsten: i think that's non-Valid?
17:04:12 <asn> karsten: don't remember.
17:04:24 <asn> router_crn_flags_t flags = CRN_NEED_UPTIME|CRN_NEED_DESC;
17:04:24 <asn> if (get_options()->AllowInvalid_ & ALLOW_INVALID_INTRODUCTION)
17:04:24 <asn> flags |= CRN_ALLOW_INVALID;
17:04:24 <asn> node = router_choose_random_node(exclude_nodes,
17:04:24 <asn> options->ExcludeNodes, flags);
17:04:33 <karsten> ok. step 1: consensus weight fraction, step 2: worry about flags.
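
A minimal sketch of karsten's "step 1", under the simplifying assumption that a relay observes a share of rendezvous cells equal to its consensus weight fraction during the stats interval. The function name `extrapolate_network_total` and the example numbers are hypothetical. The sketch ignores flags, the Wxx values, and the added noise discussed elsewhere in the meeting.

```python
def extrapolate_network_total(reports):
    """Estimate a network-wide total from per-relay reports.

    `reports` is a list of (reported_value, fraction) pairs, where
    `fraction` is the share of all observations the relay is expected
    to see (here assumed to be its consensus weight fraction).
    """
    total_value = sum(value for value, _ in reports)
    total_fraction = sum(fraction for _, fraction in reports)
    if not 0 < total_fraction <= 1:
        raise ValueError("combined fraction must be in (0, 1]")
    # If the reporting relays together see `total_fraction` of all
    # observations, scale their combined count up by its inverse.
    return total_value / total_fraction


# Two hypothetical relays with 1% and 3% of the consensus weight.
estimate = extrapolate_network_total([(1000, 0.01), (3000, 0.03)])
```

Pooling the reports before dividing lets relays with larger fractions dominate the estimate, which tends to be less sensitive to noise than averaging per-relay extrapolations from very small relays.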
17:04:39 <asn> yes
17:04:43 <karsten> and Wxx values.
17:04:47 <asn> in any case, we don't really need to find out the xact process now.
17:04:48 <dgoulet> asn: that's for IP no?
17:04:53 <asn> dgoulet: yes, i'm stupid.
17:04:55 <karsten> yup
17:04:55 <asn> dgoulet: thanks.
17:05:09 <karsten> we can figure that out.
17:05:16 <dgoulet> circuit_get_best()
17:05:17 <ohmygodel> i checked in TorPS, and the Valid flag is only checked for guard and exit nodes, not middles. That was intentional, although I could have gotten that wrong.
17:05:23 <asn> karsten: we will need to work on this over the next week or so.
17:05:41 <asn> karsten: so that we are not very siurprised on the begininng of jan when we need to do this for real.
17:05:51 <karsten> asn: right.
17:06:05 <asn> ok
17:06:08 <asn> so.
17:06:11 <asn> we are almost done?
17:06:16 * karsten puts that on the high-priority list.
17:06:26 <asn> karsten: do you want to work on that?
17:06:36 <asn> karsten: and i work on how to remove the noise to get the extrpolation error rate?
17:06:38 <ohmygodel> btw i attached that writeup to https://trac.torproject.org/projects/tor/ticket/13509
17:07:08 <ohmygodel> asn you are correct in that you should really output a distribution, not a single value
17:07:12 <karsten> asn: sounds good. I don't have good plans for handling the noise.
17:07:47 <asn> ohmygodel: ye.. we can output a single value for non-technical people. but we should be able to present a range for technical people.
17:08:00 <karsten> we know the distribution, right?
17:08:22 <ohmygodel> there actually is a good general technique to do that: bayesian inferenec via metropolis-hastings sampling
17:08:25 <karsten> so we only need to output variance or something.
17:08:30 <asn> ohmygodel: o.o
17:08:34 <ohmygodel> that way you dont need to attempt to do an explicit calculation
17:09:02 <karsten> and variance depends on the factor we applied to the originally reported value.
17:09:03 <ohmygodel> the question is, given the output, how likely is each input to have yielded that value
17:09:08 <ohmygodel> you generally assume a uniform prior
17:10:25 <ohmygodel> i dont expect you to necessarily do that, but maybe its easier than you think
17:11:02 <ohmygodel> a relevant paper on doing this for differentially-private statistics: “Probabilistic Inference and Differential Privacy” by
17:11:03 <ohmygodel> Oliver Williams and Frank McSherry, NIPS 2010, http://research.microsoft.com/apps/pubs/default.aspx?id=142363
17:11:18 <asn> ok i will look into it
17:11:38 <asn> just doing the noise removal for binning is easy
17:11:44 <asn> that is finding the upper and lower bound.
17:11:57 <asn> i need to think a bit more what happens when it's combined with the additive noise.
17:12:02 <asn> thx for the links btw
17:12:28 <asn> ok guys.
17:12:30 <karsten> ok. I'll explore the stats we received and let you know in a few days.
17:12:36 <asn> i think we should call it a day for today.
17:12:38 <ohmygodel> ill email out a patch
17:12:40 <ohmygodel> to karsten right ?
17:12:46 <karsten> yes
17:12:53 <ohmygodel> great thx
17:13:03 <asn> karsten: and we can be in contact over IRC.
17:13:10 <asn> for the extrapolation activity.
17:13:10 <karsten> be sure to edit the git author to whatever you want to see there.
17:13:18 <asn> and dgoulet, you do ... ?
17:13:18 <karsten> asn: sure.
17:13:20 <ohmygodel> ah ic thx
17:13:21 <asn> it's somewhere up in the backlog.
17:13:27 <dgoulet> asn: I do .. ?
17:13:32 <asn> dgoulet: i don't remember.
17:13:36 <asn> but it's somewhere up in the backlog.
17:13:47 <asn> dgoulet: what will you be doing next week? :)
17:14:17 <dgoulet> asn: more analysis on in/out cells, also start preparing a compact summary that Roger will need for Jan. meeting for his 15 min talk :P
17:14:25 <asn> there are also some christianity festivites going on this week.
17:14:27 <dgoulet> (also will be a bit afk for holidays)
17:14:30 <asn> dgoulet: great
17:14:51 <asn> ok. i think we have some sort of plan laid down for hte next week.
17:14:54 <karsten> asn: we can also talk more at 31c3.
17:14:57 <robgjansen> hopefully getting dgoulet's tracing running correctly in shadow
17:14:58 <nickm> dgoulet: I just had a question on your #13667
17:14:59 <asn> karsten: you are coming?
17:15:03 <asn> karsten: that's great!
17:15:08 <karsten> asn: yes, last two days.
17:15:08 <ohmygodel> ok sayonara amigos
17:15:10 <dgoulet> nickm: ack
17:15:20 <asn> karsten: fantastic. i'm not going to be there the last day.
17:15:23 <dgoulet> thanks all!
17:15:25 <asn> karsten: but we meed the second to last.
17:15:28 <asn> thanks!
17:15:32 * asn relocates
17:15:33 <karsten> thanks!
17:15:34 <asn> bbl
17:15:37 <karsten> asn: sounds good.
17:15:38 <asn> #endmeeting
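
The inference ohmygodel describes in the meeting ("given the output, how likely is each input", with a uniform prior) might look roughly like the Metropolis-Hastings sketch below. It assumes a count is rounded up to a multiple of `bin_size` and then has Laplace noise of a known `scale` added. That obfuscation model, the function names, and every parameter value are assumptions of this sketch, not details confirmed in the log. It is not the method from the paper cited in the meeting.

```python
import math
import random


def log_likelihood(observed, true_value, bin_size, scale):
    """Log-density of the noisy report given a candidate true count."""
    # Assumed model: round up to a multiple of bin_size, then add
    # Laplace noise with the given scale.
    binned = math.ceil(true_value / bin_size) * bin_size
    return -abs(observed - binned) / scale - math.log(2 * scale)


def sample_posterior(observed, bin_size, scale, max_value,
                     steps=20000, step=None, rng=None):
    """Metropolis-Hastings samples of the true count, using a uniform
    prior on the integers 0..max_value."""
    rng = rng or random.Random()
    step = step or max(bin_size, int(scale))

    def log_post(value):
        if not 0 <= value <= max_value:
            return -math.inf  # outside the prior's support
        return log_likelihood(observed, value, bin_size, scale)

    # Start at the observation, clamped into the prior's support.
    current = min(max(round(observed), 0), max_value)
    current_lp = log_post(current)
    samples = []
    for _ in range(steps):
        proposal = current + rng.randint(-step, step)  # symmetric proposal
        proposal_lp = log_post(proposal)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


def credible_interval(samples, mass=0.95):
    """Central interval containing roughly `mass` of the samples."""
    ordered = sorted(samples)
    tail = (1.0 - mass) / 2.0
    last = len(ordered) - 1
    return ordered[int(tail * last)], ordered[int((1.0 - tail) * last)]
```

A call such as `credible_interval(sample_posterior(71.0, bin_size=8, scale=8 / 0.3, max_value=1000))` yields a range rather than a single value, which is the kind of output asn asks for. The chain has no burn-in or convergence check, so treat it as a starting point only.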