16:03:03 #startmeeting SponsorR
16:03:03 Meeting started Tue Jan 27 16:03:03 2015 UTC. The chair is asn. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:03:03 Useful Commands: #action #agreed #help #info #idea #link #topic.
16:03:05 robgjansen might be interested in this too
16:03:12 who is here for this meeting?
16:03:16 icodemachine: Stem simply runs tor with a given configuration so if running tor normally works then it's a difference in the configurations. I'd suggest taking a look at your torrc and comparing it with what you're spawning with stem.
16:03:22 And please don't send me pings.
16:03:23 ah, ok nickm. email works great too
16:03:23 hello meeting.
16:03:24 ohmygodel: any chance you'll be around about 2:00pm est ish?
16:03:32 nickm: sure
16:03:41 ohmygodel: great; I'll make a note
16:03:44 syverson: i wont be at nrl today haha
16:03:54 I'm at a meeting.
16:04:01 great.
16:04:10 so I guess the main topic of our meeting today is roadmapping again.
16:04:15 but let's start with status reports, just in case.
16:04:22 who wants to go?
16:04:24 first?
16:04:35 ok i can go first.
16:04:42 during last week, I mainly focused on uni.
16:04:55 I commented a bit on the extrapolation pdf of Karsten, but didn't have time to think of the hard problems.
16:05:05 I also did some SponsorR bureaucracy.
16:05:18 And also filed some tasks under the "Security" section of the SponsorR roadmap.
16:05:22 which you can find here btw: https://etherpad.mozilla.org/LDiWZpI1sz-roadmap
16:05:27 and that's my contributions this week.
16:05:30 who wants to go next?
16:05:44 * asn pokes the mic. is this thing on?
16:05:47 i can
16:05:52 ohmygodel: yes go!
16:06:04 i added some tasks to the statistics section of the etherpad roadmap
16:06:19 i gave karsten some comments on extrapolation as well
16:06:24 very useful comments!
16:06:29 yes
16:06:39 asn: didn't look at yours in detail yet.
16:06:42 fyi: im working on peerflow mostly until 2/23
16:06:49 GeKo: I think my patches for #14420 should fix the OpenSSL mirror issue once and for all
16:06:54 ohmygodel: ack
16:07:05 oh, also i sent some stuff to aaron gibson about secure bandwidth measurement
16:07:17 he asked for what we’re doing, he’s interested in working on it
16:07:22 ok
16:07:23 i suggested that he consider joining sponsor r work
16:07:30 well, he has funding for fixing up torflow i think
16:07:32 hes also applying to an RFA fellowship on that topic
16:07:37 well he has expressed interest, but hasn't followed up.
16:07:44 in the meantime of developing peerflow further
16:07:49 ah nice
16:07:51 ok, maybe we can encourage him more :-)
16:07:52 that’s it for me
16:07:58 great thanks
16:08:01 who wants next?
16:08:04 me
16:08:05 * karsten can go next
16:08:07 ah, go
16:08:07 robgjansen: please
16:08:20 i added some perf entries to the etherpad
16:08:31 then working on peerflow
16:08:57 read into capacity and available bandwidth estimation
16:09:04 btw, can you guys also send me a draft of the peerflow paper. just to see the general idea?
16:09:04 from the networking literature
16:09:22 and may try to apply those to peerflow if time allows
16:09:28 robgjansen: oh wow capacity. mutual information.
16:09:39 ack
16:09:51 asn: if you can wait, ill send you one by the end of the week. the design has changed, and so you might as well get the updated version once its written up. cool?
16:10:01 atagar: Checked my torrc file, it just contains Data Directory, GeoIP locations
16:10:04 ohmygodel: yes, seems like a good idea.
16:10:05 i'm obviously working with ohmygodel on this
16:10:38 also started prototype development and scoping
16:10:42 oh
16:10:56 thats all for me
16:10:59 thanks
16:11:01 karsten: you next!
16:11:05 ok
16:11:19 I looked into aaron's suggestions to add some sort of confidence interval to extrapolated stats.
16:11:30 as part of that I implemented the weighted median idea.
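A rough sketch of a weighted median for extrapolated stats, assuming each relay's report is extrapolated to a network-wide total and weighted by the fraction of activity that relay saw. The function name, argument names, and sample numbers are hypothetical and are not taken from the actual metrics code:

```python
def weighted_median(values, weights):
    """Return the value at which the cumulative weight first reaches half the total."""
    if len(values) != len(weights) or not values:
        raise ValueError("need equally long, non-empty values and weights")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    cumulative = 0.0
    # Order the extrapolated values by size, carrying each weight along.
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= total / 2:
            return value


# Hypothetical per-relay reports: (observed count, fraction of activity seen).
reports = [(40, 0.4), (55, 0.5), (5, 0.001)]
extrapolated = [count / fraction for count, fraction in reports]
fractions = [fraction for _, fraction in reports]
print(weighted_median(extrapolated, fractions))
```

In this made-up sample the tiny relay extrapolates to a far larger total than the other two, but its small weight keeps it from deciding the result.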
16:11:34 icodemachine: Ok. When you run stem.process.launch_tor() provide the torrc_path for that path. That should make it pretty much identical to just running 'tor -f [that torrc path]'.
16:11:52 and as part of that I was deeply confused for half a day why our simulation returns biased results for the cells.
16:12:09 turns out we were distributing cells in too large chunks to relays.
16:12:10 hm. what is the weighted median idea very roughly?
16:12:13 awesome, karsten! im interested in how accurate the weighted median is compared to the simple sum
16:12:41 asn: extrapolate from stats reported by single relays, order them by size,
16:13:00 asn: pick the median, but not the regular median, but weighted by the fraction of stats that relays saw.
16:13:10 that's to avoid putting too much weight on tiny relays.
16:13:16 ah. so that relays that have greater probs, get greater attention.
16:13:19 good.
16:13:22 yes i remember that.
16:13:24 ok
16:13:24 ohmygodel: it's not as accurate, but it has another big advantage:
16:13:30 single outliers are not a problem.
16:13:45 gotta love medians
16:14:01 16:12 < karsten> turns out we were distributing cells in too large chunks to relays.
16:14:03 so, that half day wondering about simulations makes me wonder if we should really run simulations for obtaining CIs.
16:14:04 what do you mean by that?
16:14:06 btw karsten, a reasonable compromise is the alpha-trimmed mean
16:14:21 (lol alpha-trimmed mean.)
16:14:34 you cut off the top alpha/2 and bottom alpha/2, and take the mean of the remaining
16:14:35 asn: in the simulated network, we distribute 10^9 (or 10^10) cells to relays.
16:14:48 ah i see
16:14:53 this area is called “robust statistics”, if you are interested in the idea more generally
16:15:01 ohmygodel: ah, that's also a useful idea for this case.
16:15:02 asn: and we do that in chunks to simulate circuits.
16:15:08 atagar: By the way, one other find, not sure if it is bug or not. I have Tor path and also tried adding the whole path containing the tor.exe file in my System ENV Variable. But still it gives me tor isn't available unless i explicitly specify the cmd variable in launchtorwithconfig method
16:15:23 i guess it's what karsten was saying when he wanted to ignore the outliers.
16:15:24 asn: not going into details, but there are some assumptions in the simulation that need further work.
16:15:28 ack
16:15:44 ohmygodel: sounds interesting.
16:15:52 ok
16:15:52 ohmygodel: will look at that.
16:15:59 so, status report
16:16:00 karsten: please proceed
16:16:19 I also started reading the anonstats thread and have some vague feedback,
16:16:29 but I need to sort that a bit before replying.
16:16:42 (got delayed from CIs..)
16:16:45 i have not had time to read the last two replies on that thread :(
16:16:46 would love to hear thoughts on AnonStats3 if anybody has them :-)
16:16:57 and last, I added some tasks to the etherpad.
16:17:13 ohmygodel: AnonStats3 was hardest to understand,
16:17:19 but I should give it another try.
16:17:25 that's all from me.
16:17:28 thanks
16:17:32 dgoulet: syverson: ?
16:17:33 karsten: i am interested in the bias problem also
16:17:42 I'll go.
16:17:45 syverson: ok
16:17:49 in case you are still figuring it out or just want to share later
16:17:54 ohmygodel: I'll reply to the thread and explain it better.
16:18:01 cool thx
16:18:14 icodemachine: I don't follow how your 'System ENV Variable' is related to stem. If you provide a tor_cmd argument stem will use that. If not then we simply call 'tor' and it'll only work if tor is in your path.
16:18:22 Other than chatting with ohmygodel some, I spent the week writing a paper on using onionsites for authentication w/ Griffin. Done.
16:18:32 great
16:18:55 I.e. didn't do much relevant really.
16:18:56 i would like to read that :)
16:19:01 ok
16:19:09 atagar: Sorry, I specified it wrong. I meant my Path Variable
16:19:24 I guess it's just me now :)
16:19:27 dgoulet: yes
16:19:36 asn: I can send you a copy. I assume saint is OK w/ that.
16:19:46 I organized my thoughts for an HS health measurement tool, arma's idea, for which I've identified data we need to collect in the HS subsystem (testing network) to achieve it.
16:20:19 right now working on identifying what the control protocol can offer us and extending it right now with an option that will allow us to dump the HS descriptor
16:20:22 content*
16:20:27 icodemachine: Ok, then I'm not sure. You can take a look at what stem's doing under the covers at: https://gitweb.torproject.org/stem.git/tree/stem/process.py#n34
16:20:32 it's not terribly complicated
16:21:01 so going in that direction and start the thinking on how to run scenarios in a testing network to collect that data. done
16:21:12 dgoulet: is it HS health measurement? or HSDir health measurement?
16:21:17 dgoulet: is the health thing in the roadmap ?
16:21:19 atagar: Yeah, sorry i just checked the source. So, the Tor browser bundle must be present in the directory of the python script.. I thought it had to exist in my Path Variable..
16:21:20 the former seems super big. the latter seems approachable.
16:21:31 asn: HSDir for now yes sorry
16:21:32 ohmygodel: yes
16:21:35 icodemachine: I gotta focus on other things now. If you have specific questions though then let me know.
16:21:42 ohmygodel: as part of the "performance / correctness /testing" section
16:21:50 dgoulet: any update on shadow+lttng
16:21:54 asn: you mean the wiki tasklist
16:21:59 i meant the etherpad
16:22:05 dgoulet: from lttng devs
16:22:10 although the relationship between the two isnt totally clear to me :-)
16:22:13 ohmygodel: ah i thought it was also on the etherpad.
16:22:13 robgjansen: I have a meeting with them this week! ;)
16:22:14 ohmygodel: :)
16:22:32 dgoulet: great! i'm interested in the result of that meeting :)
16:22:55 ohmygodel: not yet added it to the pad, I should tho
16:23:20 one of the reasons we want that is to be able to measure the relation between relay churn in the consensus and reachability of HS
16:23:31 dgoulet: great, please do
16:23:41 great.
16:23:46 so status report phase done.
16:23:49 let's move to the next phase.
16:24:04 does anyone have explicit discussion topics apart from "Wtf we do next 3 months?"
16:24:12 atagar: Thanks, I will go deeper and let you know if i find any bugs.
16:24:13 also, _when_ is the next SponsorR meeting in April?
16:24:19 is it start of april or end of april?
16:24:32 we are talking about Feb, March, and how much of April?
16:24:32 indeed, good question.
16:24:34 April 13-17
16:24:46 ok. so and a little bit of April.
16:24:59 so basically it's a little bit over 2 months.
16:25:21 (if we want to finalize early, instead of rushing the last 3 days like this time)
16:25:28 yes.
16:25:31 let's avoid that.
16:25:34 ok, now let's see https://etherpad.mozilla.org/LDiWZpI1sz-roadmap
16:25:41 I lied when I said I didn't do anything relevant. I also (along w/ Aaron and Rob) crafted a message to the PM asking for prioritization guidance on our tasks. He hasn't responded yet.
16:26:05 on NRLs tasks? or on Tor tasks?
16:26:19 (yeah FYI, round 2 of interviews for the PM are this week with the first one in 30 min ;)
16:26:20 NRL, but it's relevant for Tor.
16:26:33 ok
16:26:42 but different PMs here
16:26:44 sounds like two different PMs.
16:26:45 ;)
16:26:52 ok
16:26:55 yeah sorry, syverson made me think about it hehe
16:27:08 please CC us on anything relevant to Tor deliverables.
16:27:15 We expect(?) him to say to spend a fair effort on Peerflow. That will involve TPI.
16:27:30 so, oh wow the etherpad has many lines on statistics.
16:27:52 most of them are the specific statistics that we should try and gather
16:27:53 not all of them are on the first level.
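For the side thread on stem failing to find tor, a minimal sketch of resolving the tor binary before launching it. `resolve_tor_cmd` and `launch` are hypothetical helpers, and the PATH check with `shutil.which` is a stdlib approximation rather than stem's own lookup; only `stem.process.launch_tor` and its `tor_cmd` and `torrc_path` arguments come from stem:

```python
import os
import shutil


def resolve_tor_cmd(tor_cmd="tor"):
    """Return the executable that would be run, or None if it cannot be found."""
    if os.path.sep in tor_cmd:
        # An explicit path is used as given, so it only has to exist.
        return tor_cmd if os.path.isfile(tor_cmd) else None
    # A bare name such as 'tor' is only found if its directory is on PATH.
    return shutil.which(tor_cmd)


def launch(torrc_path, tor_cmd="tor"):
    """Launch tor through stem, failing early with a clear message."""
    import stem.process  # third-party; imported here so the lookup works without it

    resolved = resolve_tor_cmd(tor_cmd)
    if resolved is None:
        raise OSError("%r was not found; pass a full path as tor_cmd" % tor_cmd)
    return stem.process.launch_tor(tor_cmd=resolved, torrc_path=torrc_path)
```

Calling `resolve_tor_cmd()` on its own is a quick way to see whether the interpreter's PATH really contains the directory holding the tor binary.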
16:28:10 i wrote down that list when dgoulet, syverson, arma, and i were talking at the sponsor r meeting
16:28:19 ok
16:28:20 ohmygodel: btw, see my comments.
16:28:48 maybe we can kill two of them right away. :)
16:28:55 so, just to make sure, this is our roadmap right? not the one we send to sponsor.
16:29:03 this is for our own consumption.
16:29:06 yeah karsten, i dont recall why those were of interest
16:29:13 are we expected to send anything now?
16:29:15 i believe roger mentioned them for some reason
16:29:21 i don't think so.
16:29:31 karsten: only nick has asked for sponsor deliverables so far.
16:29:51 i will remove colors from the pad. don't hate me.
16:30:00 ohmygodel: I don't understand how we should implement them.
16:30:09 but also they should be very closely related to other IP and RP statistics, and so assuming that Tor is behaving how we expect, they may not have much marginal value
16:30:30 - number of relays that served as IPs # how's this useful? -KL
16:30:35 - number of relays that served as RPs # how's this useful? -KL
16:30:44 so for # of IPs
16:30:45 how would we even measure that?
16:30:47 maybe let's not discuss specific stats now?
16:30:53 ok
16:30:54 sorry
16:30:58 no it's alright.
16:30:59 e.g. establish_introduction_point cells
16:31:12 it's just that there are many in there, and I'm not sure if it's worth our time right now.
16:31:15 ohmygodel: ahha
16:31:23 asn: D'oh I was just looking at which were karsten's comments.
16:31:26 asn: no, I agree with you.
16:31:33 agreed, lets focus on things for the entire group
16:31:43 ok. so the top level stuff of the statistics section.
16:31:46 is basically
16:31:58 wrapping up what we have so far. making the report better. fixing the rushed code. etc.
16:32:01 writing the blog post. etc.
16:32:10 and also incorporating them on metrics.
16:32:23 so that's alright-ish.
16:32:29 so now, the new stuff that needs to be done are:
16:32:32 I'd like to have some coding help on the metrics part.
16:32:36 just saying.
16:32:39 karsten: ack.
16:32:48 - decide whether to enable stats by default.
16:33:15 - think about more stats and implement them. accompanied by a list of most useful stats.
16:33:26 - and R&D on stats aggregation.
16:33:40 - and also R&D on peerflow (which arma said that it can be considered stats work)
16:34:02 i think these are way enough for 2 months.
16:34:23 we should probably prioritize them somehow.
16:34:36 yes
16:34:39 not necessarily now, but just working off the list might be bad.
16:35:02 i think the list above is a bit prioritized maybe. not sure where peerflow goes in the list.
16:35:49 and remember, that more stats. apart from the overhead of obfuscation and security analysis, also come with a tech report.
16:35:52 where's peerflow in the list?
16:35:53 so it's extra time.
16:35:54 asn: i agree that the order you gave is an ideal prioritization
16:36:06 in the list on the pad.
16:36:14 it's in the security section.
16:36:30 but it can also be moved to the stats section, if we think it matters.
16:36:32 IIUC.
16:36:36 ah, found it.
16:37:13 i kind of think that *I* _don't want_ to decide whether to enable stats by default.
16:37:17 but that's a discussion for another day.
16:37:30 so should we go deeper in this category?
16:37:35 or should we move to next category?
16:37:44 what's the goal?
16:37:48 of this meeting?
16:37:51 good point.
16:37:54 asn: im surprised there is a debate about enabling stats by default
16:37:56 yes, with regard to the etherpad.
16:38:01 i thought that was always the plan
16:38:10 it won't stay on the etherpad forever, I guess.
16:38:15 what do we make out of it?
16:38:18 tickets?
16:38:19 it also seems like something arma could render a quick decision
16:38:24 on
16:38:27 karsten: tickets yes.
16:38:36 karsten: and maybe [tor-dev] post? and maybe update the wiki?
16:38:46 with priorities assigned somehow?
16:38:51 karsten: i think it's also for personal consumption. so that we each put stuff in our TODO list, and be done for the next months with roadmapping.
16:38:54 karsten: ye
16:39:16 ohmygodel: i agree, that when I started working on this I assumed that enabling the stats by default was the plan.
16:39:26 ohmygodel: but now, I don't want to take this decision.
16:39:40 ohmygodel: a few people from the community have expressed concern about stats, and myself am statsophobic.
16:39:58 asn: ok, makes sense!
16:40:12 this seems like something that roger might say "what's the worst thing that can happen. enable them."
16:40:23 and if he does say so, I'm OK with it.
16:40:50 but anyway, this doesn't seem like a big project time-wise.
16:40:56 Yet another instance of major Tor direction by status of arma-whimsy.
16:41:17 I wonder how he feels about that, but that's a topic for another time too.
16:41:46 hey syverson, i would take arma as benevolent dictator over many other models of decision-making :-)
16:42:32 also, I'm not sure if deciding on these statistics to be enabled by default, should mean that the next statistics should be enabled by default.
16:42:41 but whatever.
16:42:41 asn: good point.
16:43:02 It's not his control. It's responses to quickly considered offhand remarks I'm concerned about, but as I said, another time.
16:43:04 (in the code we only have the HidServStatistics torrc option. not the NewHidServStatistics torrc option ;))
16:43:24 so OK. let's move from the statistics category.
16:43:29 i think it has all the deliverables we want.
16:43:39 yup.
16:43:43 but I'm certain that we won't be able to cover all of them in 2 months.
16:43:51 except if me, karsten and dgoulet work just on this section.
16:44:02 yes, it's a lot.
16:44:22 so let's move to the performance section.
16:44:25 ping dgoulet
16:44:33 yello
16:45:01 so the performance section includes various testing projects. like lttng and shadow and chutney tests.
16:45:05 half of them are robgjansen in there right now, basically using shadow for that
16:45:20 these are all nice and i think dgoulet has a hang of them.
16:45:27 the protocol changes stuff is for the future
16:45:29 other part is HS testing in a network, but that's OK if we slip for April since this is also SponsorS
16:45:38 ok
16:45:49 and finally, the encrypted services part
16:45:55 then we have encrypted services and tor2web mode.
16:46:02 protocol changes sounds time consuming.
16:46:07 which I have no idea if that fits in SponsorR even...
16:46:08 i would like to do some work on encrypted services.
16:46:11 i wasnt and still am not quite sure why these things are listed here instead of the wiki
16:46:23 but i added the things i thought were relevant to sponsor r
16:46:32 and various protocol changes are
16:46:35 robgjansen: it's a staging area.
16:46:41 i have not yet looked at the 5 hop proposal
16:46:56 asn: great, that clarifies things. so we’ll move these things to the wiki after this meeting, right?
16:46:57 the "shorter path length" thing?
16:46:57 in my copious free time i truly want to
16:46:59 yes
16:47:04 ohmygodel: after this or the next meeting, I think.
16:47:09 ok
16:47:19 yes I also want research on shorter path length.
16:47:21 so i think those things could be moved to the next phase
16:47:24 not sure if we can fit it till April.
16:47:28 yea
16:47:37 maybe we can fit a proposal on encrypted services.
16:47:38 (should we add names to tasks?)
16:47:40 asn: i would be happy to work on this with you after april!
16:47:58 not sure about encrypted services
16:47:59 that is, take roger's proposal and make it into a proper proposal. maybe even with implementation plan.
16:48:00 i lean to no
16:48:05 oh
16:48:10 as a project you mean for april?
16:48:13 you don't like?
16:48:22 no sorry, i mean i dont think we will have enough time
16:48:29 i like the idea
16:48:30 ah ok
16:48:38 i think that dgoulet will be busy with benchmarking
16:48:43 true
16:48:49 i will be busy with peerflow until march for sure
16:49:02 and we should write up the benchmark results in a tech report
16:49:09 yes true.
16:49:18 hopefully we have shadow sims to include in the benchmark result
16:49:19 i could work on it, but I imagine that I will have to work on other things.
16:49:32 i commented out the "shorter path length" and the "tor2web mode" part.
16:49:32 and you are involved with stats tasks, no?
16:49:38 robgjansen: i guess so
16:49:47 asn is really everywhere ;)
16:49:59 also, dgoulet is doing the health thing now
16:50:25 so yeah, i think encrypted services and path length changes could wait until after april
16:50:28 the "Reachability" section has a weird name.
16:50:34 yeah I can help in some part, as long as we prioritize our deliverables
16:50:41 is there a place for future?
16:50:49 asn: feel free to fix it
16:50:54 maybe put it at the end of the list? or just remove it?
16:51:05 robgjansen: future is infinite. there is this list: https://trac.torproject.org/projects/tor/wiki/org/sponsors/SponsorRtasklist
16:51:25 ok, well structure the pad as you wish
16:51:38 it can be things that we absolutely want to deliver in april
16:51:47 then future tasks can go on the wiki page
16:51:49 yes, the pad should only include April tasks in the end IMO.
16:51:52 i leave it up to you
16:51:55 ok
16:51:59 :)
16:52:15 so i thought this was all getting migrated to the wiki :-?
16:52:23 I removed the reachability section entirely.
16:52:30 the hsdir health measurement thing exists on the correctness section already.
16:53:17 oh my pad was stalled... that explains it
16:53:34 ok. move on section.
16:53:37 i think peerflow should stay in the security section, especially since i dont think this is being shown to the PM or anything
16:53:42 robgjansen: yes
16:53:59 so correctness includes bugfixing and the hsdir health thing.
16:54:01 that's alright.
16:54:10 and now we have "security" section.
16:54:19 oh sorry i jumped ahead (non-bold titles!)
16:54:31 I added #8244 in there, because armadev implied that it can be in scope for SponsorR.
16:54:40 it's giga-project but we can do some progress on it.
16:54:53 asn: mostly my email I sent yesterday is about Correctness so that section is well separated in tasks
16:55:06 then there is, #8243 which is small project but requires a few days of thinking.
16:55:16 all of these are basically HSDir security.
16:55:30 then there is #13989 that nickm is working on, and I plan to help him with.
16:55:38 #13989 came out of our NSF proposal
16:55:48 is that good or bad?
16:56:05 i mean the one we just submitted but is not yet funded
16:56:15 not sure how that is a sponsorR thing?
16:56:29 asn: i personally think that guard security isnt helpful to show progress for sponsor R. roger may have some argument otherwise.
16:56:40 ok. i added that to the list because I plan to work on it anyway.
16:56:43 i mean, to the extent that it is important anywhere, it is important here
16:56:49 ok.
16:56:52 then i remove it from the list.
16:56:53 whatever.
16:57:09 and then we have peerflow
16:57:19 sounds good (im definitely working on non-sponsor-r things also :-)
16:57:36 there is a general, make HSes more robust and secure for crawling.
16:57:45 guard security fits in there.
16:57:46 yeah somehow i am working on 200% man years
16:57:58 with peerflow, we still have *at least* another month of research
16:58:01 ok. and that settles the security section.
16:58:11 What unit is % ? ;)
16:58:20 (we at nrl)
16:58:36 and then there is the last section about the Opt-in HS publishing thing.
16:58:43 syverson: i dont get your joke
16:58:45 and i wonder who would begin work on transitioning to tor
16:59:02 ohmygodel: for peerflow?
16:59:07 yes
16:59:10 i spend 2 man years of effort for every man year... isn't that 200% effort
16:59:19 ugg, n/m
16:59:21 well we at tor should look at the paper, and figure out the deployment/implementation details
16:59:24 and whether it's viable.
16:59:38 Please ignore. Don't want to waste time on aside joke explanation.
16:59:41 right, so i think that should be considered low priority atm, and TBD in a month or so
16:59:46 yes
16:59:48 ok
16:59:54 hrm wait
16:59:57 dgoulet: yes
17:00:06 didn't the Opt-in thing, we agreed on having it for April meeting?
17:00:15 hm
17:00:19 we agreed on having it what?
17:00:20 to show that to our sponsor
17:00:23 having it deployed?
17:00:26 having it specified?
17:00:33 implemented/deployed yeah
17:00:34 i don't think I agreed on having it deployed by April.
17:00:51 I don't think that's realistic even.
17:00:55 there are lots of details to iron out.
17:00:59 keep release cycles in mind.
17:01:06 0.2.6.x freeze, etc.
17:01:07 Hrm I remember vaguely something about agreeing at the SponsorR meeting that we should do it for the next quarterly meeting
17:01:18 ohmygodel, robgjansen, syverson: you don't recall?
17:01:28 dgoulet: yeah but by "do it", I think we meant "R&D" not "final product".
17:01:29 well we can change that also...
17:01:40 yeah i recall something being said by somebody from tor, dont recall specifics
17:01:43 asn: yeah doesn't seem doable anyway in the timeframe
17:01:46 well, let's start assigning some names :)
17:01:53 that leads me to believe that the specifics are flexible
17:02:07 what is "the opt-in thing"?
17:02:08 should we assign some names to the etherpad?
17:02:10 robgjansen: ah.
17:02:11 asn: how about we all add our name to all things we'd like to do?
17:02:25 my opinion is that a proposal would be more than adequate on that task
17:02:26 robgjansen: it's the one where HSes will submit their onion address to a central directory, if they opt-in to do so.
17:02:27 asn: regardless of whether that means 100%, 200%, or 300% man years? :)
17:02:36 ohmygodel: +1
17:02:42 karsten: +1
17:02:43 ohmygodel: yes
17:02:44 and then we prioritize? or what's the process?
17:02:53 karsten: :D
17:03:18 ok.
17:03:23 let's look at our sections
17:03:29 and if you don't agree with the prioritization
17:03:36 (that is, with the order they appear on the screen)
17:03:41 (vertical order)
17:03:44 change it.
17:03:45 or say it here.
17:04:35 so my prioritization (and perhaps that of robgjansen and syverson), will be tentative contingent on feedback from chris (the PM), assuming that is forthcoming…
17:05:21 or if not we go with our recommendation
17:05:26 which is peerflow
17:05:39 and i forgot the rest tbh
17:05:44 fuck. i'm again all over the statistics session.
17:06:19 dgoulet how do you feel about your sections?
17:06:22 where would you want help?
17:06:38 ohmygodel: basically the stuff we are already doing now we will continue, unless chris slaps us
17:07:05 this is almost like a real dev meeting, with colored stickers and all that.
17:07:05 robgjansen: yes
17:07:06 fwiw guys, no way I can do all these things in 2 months. if I want to continue being a uni student and write a thesis.
17:07:17 i will have to load balance adaptively.
17:07:18 asn: !
17:07:24 asn: I added my name on what I can/want to do, I also want to help anyone that needs help with the code side of little-t tor
17:07:25 asn: is there a possibility to bring on somebody else from tor?
17:07:46 maybe aagbsn?
17:07:56 roger said there is no funding for additional people.
17:08:01 i don't even know if aagbsn has funding from Tor.
17:08:13 tor is getting $700K+ from Sponsor R, not to mention funding from Sponsors S and Q
17:08:24 we have plenty of funds for this stuff, *especially* the stats stuff
17:08:34 karsten: what help do you need with the metrics.tpo part?
17:08:34 the budget is currently being worked on for 2015 so we simply have no idea right now
17:08:44 arma has already said that we should ask nickm and andrea for help if we need it
17:08:48 It doesn't have to be new funding. It could be a recognition that allocation of people/man-hours needs adjustment for existing funding.
17:08:52 ohmygodel: good point.
17:08:57 i suggest we start asking for help now
17:09:09 asn: put all new hidserv-stats and consensuses in a database, run some aggregation script, extract daily averages that we can graph.
17:09:41 asn: so, excluding the website part. I can handle that. but from processing descriptors to writing a .csv file.
17:09:58 ok
17:10:21 asn: java, python, whatever. postgres, mongodb, whatever.
17:10:42 karsten: what do you mean "from processing descriptors to writing a .csv file" ? basically parsing the consensus and put it in a csv format?
17:10:49 extra info descs
17:10:50 probably
17:11:15 dgoulet: pretty much what my java program does,
17:11:37 so the "add more stats" section is the hard one.
17:11:38 but in a way that can handle newly published descriptors without re-processing them all again.
17:12:00 because it requires some research time, and some development time, and then lots of extra time for crunching stats and writing reports.
17:12:13 and i'm not sure who can help with this.
17:12:13 dgoulet: https://gitweb.torproject.org/karsten/metrics-tasks.git/tree/task-13192/src/java/ExtrapolateHidServStats.java?h=task-13192
17:12:22 the issue with funding is there needs a PM to manage it with deliverables
17:12:25 me and karsten are a good pair here. but there are all the other things that need to be done with us too.
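A minimal sketch of the incremental pipeline karsten describes (store new stats in a database, aggregate, write daily averages as CSV), assuming SQLite and a made-up row layout. The table, the column names, and the plain per-day average are illustrative assumptions, not the format of real extra-info descriptors or the extrapolation done in ExtrapolateHidServStats.java:

```python
import csv
import io
import sqlite3


def open_store(path=":memory:"):
    """Open the stats database, creating the (hypothetical) table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS hidserv_stats ("
        " digest TEXT PRIMARY KEY,"  # unique id of the descriptor
        " day TEXT NOT NULL,"        # date the stats refer to, YYYY-MM-DD
        " fingerprint TEXT NOT NULL,"
        " rend_cells REAL NOT NULL)"
    )
    return db


def add_descriptors(db, rows):
    """Store only descriptors not seen before; return how many were new."""
    count = "SELECT COUNT(*) FROM hidserv_stats"
    before = db.execute(count).fetchone()[0]
    # The primary key makes re-submitted descriptors a no-op.
    db.executemany("INSERT OR IGNORE INTO hidserv_stats VALUES (?, ?, ?, ?)", rows)
    db.commit()
    return db.execute(count).fetchone()[0] - before


def daily_averages_csv(db):
    """Aggregate per day and return the result as CSV text."""
    rows = db.execute(
        "SELECT day, AVG(rend_cells) FROM hidserv_stats GROUP BY day ORDER BY day"
    )
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["date", "avg_rend_cells"])
    writer.writerows(rows)
    return out.getvalue()
```

Keying on the descriptor digest is one way to make repeated runs cheap: feeding the same batch twice adds nothing, so only newly published descriptors cost any work.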
17:12:38 the he said she said game with funding wont end happy i think
17:13:04 robgjansen: ye
17:13:08 robgjansen: we are working hard at getting one :)
17:13:13 robgjansen: we don't have PM so this process is hella ad-hoc
17:13:21 it doesnt help that the budget isnt done
17:13:22 asn: how about we try and interest a new person in the last task (“for statistics that currently can't be collected safely…”)
17:13:23 right
17:13:36 that is relatively self-contained and separate from the other stats tasks
17:13:39 someone who will analyze the AnonStats things?
17:13:45 yes
17:13:51 i remember we get this for free, by analyzing peerflow?
17:13:58 or not?
17:14:11 because peerflow is a superset of anonstats.
17:14:25 asn: we are now just using noisy stats in “basic” peerflow, to make it not depend on such a solution
17:14:37 (one of the changes i referred to earlier)
17:14:48 ohmygodel: who could this new person be?
17:14:54 nickm
17:14:59 aaron gibson
17:15:03 ohmygodel: i know george danezis is into this problem. but he is like a professor or something.
17:15:18 so i don't think we can just interest him all of a sudden.
17:15:25 asn: i have exchanged several emails with danezis recently (like, up to yesterday)
17:15:46 he has his own crazy AnonStats thing.
17:15:53 which doesn't require an anonymity channel, IIRC.
17:15:57 or something.
17:15:59 You mean Privex?
17:16:02 no.
17:16:03 he is a good collaborator for longer-term research, but not so much for something by april IMO
17:16:05 another one.
17:16:10 ohmygodel: yeah exactly.
17:17:03 ok
17:17:04 so
17:17:09 i don't think we can solve this problem today.
17:17:11 instead.
17:17:21 i think we have processed this list a good amount today.
17:17:26 and we know approximately what we are going to be working on.
17:17:38 for the next week, let's prioritize. and figure out who is going to be doing what.
17:17:43 and also try to interest additional people.
17:17:47 so that by next week we can start work.
17:17:48 asn: have we not already prioritized ?
17:17:51 yes
17:18:01 we pretty much have
17:18:02 and figured out who will be doing what ?
17:18:05 yes
17:18:10 but there are too many things to be done
17:18:34 well lets start on the highest priority tasks
17:18:35 keep them in scope
17:18:46 yes
17:18:55 and ask around for help
17:19:19 we can also start working on things now.
17:19:21 for example, i dont see why we cant have reports and blog posts done in a week
17:19:49 yes that's a good start.
17:19:50 well,
17:20:00 also the extra statistics discussion needs to continue.
17:20:12 (btw, this week i'm extra hosed. i will be better after tuesday next week.)
17:20:24 we can publish reports and maybe work on better reports later.
17:20:36 for example, we might just leave out confidence intervals.
17:20:37 i can draft up a blog post fast, I think.
17:20:45 this "discuss getting the current stats enabled by default in tor 0.2.6", should we start an email thread about this or .. ?
17:20:51 like in half a day or so.
17:20:58 dgoulet: great idea
17:21:03 dgoulet: yes let's do it.
17:21:10 ok I can start it
17:21:13 thx
17:21:36 is the performance section prioritized?
17:22:08 asn: I would say yes, they are all at the same level but the most important is the first one
17:22:17 getting large scale experiment running
17:22:28 karsten: “we might just leave out confidence intervals” - agreed, this might just have to be postponed or left experimental for now
17:23:05 robgjansen: btw, if you have a good idea for the Opt-in project thing.
17:23:07 ohmygodel: yeah. I just want to wrap that up somehow, so that I don't start at zero next time. I feel we discussed a few important things there.
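One idea from the earlier confidence-interval discussion, the alpha-trimmed mean (cut off the top alpha/2 and bottom alpha/2, then take the mean of what remains), can be sketched as follows. The function name and the rounding down of the number of dropped values are assumptions made for illustration:

```python
def alpha_trimmed_mean(values, alpha):
    """Mean of the values left after dropping alpha/2 of them from each end."""
    if not 0 <= alpha < 1:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(values)
    # Number of values dropped at each end, rounded down (an assumption).
    k = int(len(ordered) * alpha / 2)
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("no values to average")
    return sum(kept) / len(kept)
```

With alpha set to 0 this is the plain mean, and as alpha approaches 1 it behaves more and more like a median, which is why it was offered as a compromise between the two.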
17:23:08 for "discuss collecting additional statistics and implement if considered safe, including especially the following", we can rely on the tech report and deepen the analysis on the ones that are listed in the pad right now, I can start on that also this week 17:23:14 robgjansen: sorry, i meant a good name. 17:23:35 dgoulet: yes. it's a good idea to start this early. 17:23:44 i will also join you very soon on this. 17:24:04 karsten: you want me to start writing a blog post on our findings? 17:24:10 karsten: i don't think it should be very big. 17:24:15 karsten: or you can do it if you want, really. 17:24:17 asn: sure, I might resurrect a pad for this because I don't see another better way to brainstorm each of them... will think about it 17:24:24 Could Juha work on the opt-in stuff? 17:24:27 asn: not atm, maybe syverson does as i think he did some work on this? 17:24:36 syverson: hm. juha will have to work on the ahmia side. 17:24:37 asn: if you could start that, please do. I'd like to work more on the report. 17:24:42 syverson: i'm not sure if he would do very well on the little-t-tor side. 17:24:49 karsten: ok. 17:24:54 asn: thanks! 17:25:09 karsten: this week I'm going to FOSDEM and I also have to fix the guardfraction stuff, but I will try to have something ready by next Monday or so. 17:25:14 i think it's doable. 17:25:17 dgoulet: is fixing the #13192 unit tests something you might enjoy? 17:25:23 dgoulet: well, not fixing. extending. 17:25:29 asn: Ah, in that sense I agree (and don't have a good suggestion for working on little t-tor). 17:25:33 dgoulet: i think we already have good evaluation of these stats. most are unfortunately not possible to collect without additional infrastructure. but we should be able to decide on this very quickly. 17:25:36 dgoulet: there's a pending patch from teor (I think) that needs attention. 17:25:42 karsten: I sure can offload you!
:) 17:25:47 syverson: but yeah, after we figure out the architecture (like, _where_ do the HSes announce themselves), 17:25:56 some work will need to be done on the ahmia side to incorporate them into their database 17:26:00 dgoulet: neat! 17:26:10 juha is going to be at sri after may so that's great news. 17:26:13 ohmygodel: right I thought so thus we should straight away work on that and fix it since it should be quick 17:26:42 Right. This also connects with SRI having set up crawling detectors already. 17:27:04 dgoulet: sounds good. you are setting up an etherpad to sort this out over the next week or two. is that correct? 17:27:08 on that, the hs health tool apparently should help SRI with their crawler 17:27:25 ohmygodel: I will probably do a pad yes, easiest way for that I think 17:28:01 so, this meeting is adjourned? 17:28:03 dgoulet: great. send me a link when it’s up please. i’ll be able to quickly add in my thoughts. 17:28:13 yes, I think we are approaching the end of this meeting. 17:28:16 robgjansen: one more question 17:28:19 ohmygodel: ack 17:28:30 how are we asking around for additional help? 17:28:53 i admit that i neglected to invite aaron gibson to this meeting. sorry! somebody should at least do that. 17:28:57 SRI set up the detectors at my prompting. Maybe I could get them interested in setting up the voluntary sign-up too. I have talked to Phil about it in the past. 17:29:15 and i can certainly bring it up with nickm when we talk this afternoon 17:29:38 great, there's the plan 17:30:01 (by “it”, i mean getting some help from him or someone else he might recommend, especially for creating a stats aggregation system) 17:30:28 ohmygodel: yep, i think he may have a good recommendation 17:30:28 but i'm not a tor person so it seems a bit odd 17:30:36 i will also think of more people. 17:31:03 but it's hard to do this, when you don't know if there is funding for them. 17:31:08 and when all tor people are overloaded by default.
17:31:43 ok anyway 17:31:44 right, the only other idea i have is andrea 17:31:45 90 minutes closing in. 17:31:53 arma explicitly said that she is being funded by sponsor r 17:32:09 true 17:32:15 i don't really know her areas of interest and expertise 17:32:31 maybe someone else can figure out how she might want to help out? 17:32:38 yes that's a good idea. 17:32:44 That is an example of what I meant about re-prioritizing people w/in existing funding. 17:32:47 athena might be helpful here. 17:32:50 syverson: ye 17:32:59 ok 17:33:07 ok so meeting over. 17:33:10 as long as we have concrete tasks for her, yeah doable 17:33:11 over next week, walk over the list. 17:33:18 figure out whether you have enough time to do whatever. 17:33:22 plan over your deliverables. 17:33:25 find more people to work on them. 17:33:32 or even start working on them, if you feel like it. 17:33:42 so that by next week we have a precise list that we can copy to the wiki. 17:33:50 i hope this is a reasonable plan. 17:33:54 till then. 17:33:55 sounds good asn 17:33:56 #endmeeting