14:29:41 <karsten> #startmeeting metrics team
14:29:41 <MeetBot> Meeting started Thu Aug 17 14:29:41 2017 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:29:41 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:29:47 <iwakeh> hi!
14:29:49 <karsten> iwakeh: hi!
14:30:20 <karsten> https://storm.torproject.org/shared/Ou-1QRctynWbF4yedi-MfDsjImFMFSIEP20fbVGCPRa <- agenda pad
14:30:37 <karsten> lots of topics.
14:30:48 <iwakeh> true
14:30:58 <iwakeh> should we re-order, prioritize?
14:30:59 <karsten> shall we start before more topics appear? ;)
14:31:06 <karsten> sure!
14:31:06 <iwakeh> better is,
14:31:14 <iwakeh> but they could appear anyway :-)
14:31:33 <karsten> want to start with whatever is highest priority for you?
14:31:51 <karsten> (some can be really fast)
14:32:14 <iwakeh> CollecTor 1.2.1
14:32:22 <karsten> ok.
14:32:23 <iwakeh> should it be released?
14:32:41 <karsten> no objections.
14:32:48 <karsten> it's running fine.
14:32:53 <iwakeh> tomorrow?
14:32:55 <karsten> it's a tiny change, but that's what patch releases are for.
14:32:57 <karsten> yep.
14:33:07 <iwakeh> ok, topic solved ;-)
14:33:28 <karsten> yep.
14:33:39 <iwakeh> I reordered
14:33:49 <karsten> okay, next is exonerator?
14:34:04 * iwakeh reviewing #16596
14:34:13 <karsten> there's the branch with four commits.
14:34:18 <karsten> that ticket, yes.
14:34:28 <iwakeh> I'm at it.
14:34:28 <karsten> there will be another commit with a move to JSP.
14:34:31 <karsten> cool!
14:34:44 <karsten> the JSP move will be necessary in order to move the page over to metrics-web.
14:34:58 <karsten> (or is there an easier way that uses our existing top.jsp and bottom.jsp?)
14:35:01 <iwakeh> ok
14:35:12 <iwakeh> I can look when reviewing.
14:35:23 <karsten> ok. that fifth commit does not exist yet.
14:35:28 <iwakeh> There should be some 'magic' available.
14:35:30 <karsten> I started rewriting the servlet as JSP.
14:35:37 <karsten> okay, I'll wait for you then.
14:36:21 <iwakeh> the db could be separated from the move to metrics-web, if time is an issue.
14:36:36 <karsten> yes.
14:36:59 <iwakeh> I figure that the db has the higher prio?
14:37:17 <karsten> not necessarily. it's the bigger task.
14:37:37 <karsten> that could mean it's okay to finish the smaller task and then do the bigger one.
14:37:37 <iwakeh> ok
14:38:07 <karsten> okay. I didn't start with the db parts yet, so there's nothing to discuss today.
14:38:16 <karsten> next topic?
14:38:21 <iwakeh> yep.
14:38:26 <karsten> collector mirrors.
14:38:39 <karsten> I'm not sure what to do with mirrors.
14:38:44 <karsten> unofficial mirrors, that is.
14:38:45 <iwakeh> I removed the not function notice.
14:38:51 <karsten> yes, that's fine.
14:39:00 <iwakeh> Yeah, maybe just a collection to list?
14:39:13 <iwakeh> collection of unofficial mirrors.
14:39:28 <karsten> it doesn't hurt, but maybe it doesn't help, either.
14:39:38 <iwakeh> hmm true
14:39:46 <karsten> and we should think about whether we indicate to mirror operators that they're contributing something useful or not.
14:40:04 <karsten> I also don't want to take something away from them.
14:40:24 <karsten> (the ability to contribute)
14:40:25 <iwakeh> That would need a ruleset
14:40:31 <iwakeh> for good mirrors.
14:41:04 <iwakeh> Or, we just don't list them?
14:41:13 <karsten> possibly. not certain yet.
14:41:26 <karsten> we can still provide the sources and good installation instructions.
14:41:35 <iwakeh> so far, there is no queue for mirror operation.
14:41:42 <karsten> but if we're not doing anything with the mirrors, there's no real point in having them.
14:41:55 <karsten> true.
14:41:59 <iwakeh> yes, we could sync?
14:42:15 <karsten> not sure if we want that.
14:42:21 <karsten> we haven't done that in the past.
14:42:25 <karsten> we considered doing it.
14:42:29 <karsten> but then we didn't.
14:42:43 <karsten> I think we're in a good situation with two hosts that we control.
14:42:47 <iwakeh> Maybe we should investigate how much that improves?
14:42:54 <karsten> there might be even three of them in the future. but we should run them.
14:43:03 <iwakeh> I mean, how much another mirror contributes?
14:43:13 <iwakeh> yes, we should run them.
14:43:21 <iwakeh> There could be a
14:43:33 <iwakeh> non-Tor network of sync'ing mirrors.
14:43:51 <karsten> but what's the goal?
14:44:03 <karsten> traffic is not really scarce.
14:44:08 <iwakeh> Test and use CollecTor at most.
14:44:23 <iwakeh> yes, not really a goal.
14:44:44 <karsten> okay, no need to decide anything today. but maybe worth thinking about.
14:44:50 <iwakeh> and the traffic might be too much. But, we did call for operators a while ago.
14:45:24 <iwakeh> Could be toned down to test network and CollecTors in there.
14:46:08 <karsten> ok.
14:46:18 <iwakeh> next?
14:46:22 <karsten> yes.
14:46:27 <karsten> * geolocation databases (karsten)
14:46:36 <karsten> see the short thread on tor-dev@.
14:46:37 <iwakeh> I skimmed the web page.
14:47:10 <iwakeh> Just another semi-commercial offer without further comparison.
14:47:46 <iwakeh> I wonder why these things are updated so much?
14:47:47 <karsten> so, there's a paper that compares their database to maxmind's.
14:47:57 <iwakeh> Didn't read yet.
14:48:06 <karsten> what things are updated?
14:48:16 <iwakeh> Well, the differences that
14:48:31 <iwakeh> become apparent when comparing onionoo's details.
14:49:02 <iwakeh> Just minor coordinates and spelling and a big city has no 'real' point referencing it.
14:49:21 <karsten> ah.
14:49:27 <karsten> you refer to minor updates.
14:49:33 <karsten> I don't think we care much about those.
14:49:41 <karsten> it's wrong countries that we care about.
14:49:48 <iwakeh> yes, but they introduce differences unnecessarily.
14:49:59 <karsten> right.
14:50:01 <iwakeh> ok, wrong countries or new ones is fine.
14:50:06 <iwakeh> or cities.
14:50:22 <iwakeh> I'll take a look at the paper.
14:50:35 <karsten> we should think about a few things here.
14:50:48 <karsten> how much of a priority is it to switch geolocation databases?
14:50:52 <karsten> this affects a lot of things.
14:50:59 <karsten> it affects onionoo, but that's just minor.
14:51:11 <iwakeh> Higher priority is to really archive old geoip dbs used.
14:51:18 <karsten> it also affects src/config/geoip[6] that we ship with all tors.
14:51:29 <iwakeh> yep.
14:51:35 <karsten> and the archive is relevant, yes.
14:51:44 <karsten> though I think we could do something with archive.org.
14:52:02 <iwakeh> at some point we intended to use these (src/config/tor)?
14:52:06 <karsten> another question is how we can evaluate accuracy ourselves.
14:52:13 <karsten> for onionoo? yes.
14:52:20 <karsten> or for other things in metrics? yes.
14:52:31 <karsten> they're based on maxmind's files, too.
14:52:39 <iwakeh> but accuracy is important too.
14:52:45 <karsten> it is!
14:53:01 <iwakeh> we should define what accuracy means to Tor/Tor Metrics here.
14:53:07 <karsten> so, we care about accuracy for relay IPs and accuracy for client IPs.
14:53:50 <karsten> we could evaluate relay IP accuracy by using two databases in onionoo and showing both results to atlas users.
14:53:52 <iwakeh> accurate to be in the correct city/country.
14:54:33 <karsten> but there's no good way to evaluate client IP accuracy.
14:54:55 <karsten> and that's what we care a bit more about.
14:54:59 <iwakeh> well, if the relays are correct that's a start.
14:55:06 <karsten> it's a start, yes.
14:55:23 <iwakeh> That could maybe be leveraged to client ips?
14:55:24 <karsten> I mean, we could also rely on that paper and maybe others.
14:55:33 <iwakeh> True,
14:55:44 <iwakeh> should we investigate?
14:56:10 <karsten> I'd say, if we want to investigate, now's a better time than in 3 or 6 months.
14:56:30 <karsten> because then we won't have support by the ip2location folks.
14:56:35 <karsten> probably.
14:56:54 <iwakeh> Well, I could get started and read the referenced paper and what else I find
14:56:58 <iwakeh> without
14:57:04 <iwakeh> major searching.
14:57:15 <karsten> sounds good to me.
14:57:26 <iwakeh> Just to get an idea what ought to be done.
14:57:56 <karsten> ok.
14:58:15 <iwakeh> Onionoo second back-end: #23244
14:58:15 <karsten> great, sounds good! next?
14:58:17 <iwakeh> :-)
14:58:25 <karsten> okay, what's the status there?
14:58:48 <iwakeh> I listed some differences (as trac permitted).
14:59:06 <iwakeh> Needs feedback, if there is a show stopper.
14:59:06 <karsten> should I update the geoip file on omeiense?
14:59:09 <iwakeh> but,
14:59:21 <iwakeh> I think backup would be fine for sure.
14:59:27 <iwakeh> and, rotation
14:59:35 <iwakeh> might be ok, but will
14:59:40 <iwakeh> always have differences
14:59:44 <karsten> true.
14:59:52 <iwakeh> b/c of the different update times.
15:00:43 <karsten> yes, we can't really get rid of that issue.
15:00:56 <karsten> "onionoo consensus"...
15:01:02 <iwakeh> I think there is not a major reason to switch geoip db, but the comparison would get easier.
15:01:03 <karsten> noooo.
15:01:14 <iwakeh> yes, consensus!
15:01:15 <karsten> yes, and it needs an update anyway.
15:01:24 <karsten> I just didn't want to change it in the middle of your analysis.
15:01:33 <karsten> but I'll do that later today if you don't mind.
15:01:38 <iwakeh> ok, fine with me. just copy from the hetzner host.
15:01:50 <karsten> note that it might not affect non-running relays.
15:02:04 <iwakeh> true.
15:02:31 <iwakeh> I can take a look once the geo-db is the same.
15:02:33 <karsten> so, do you want to do more analysis after that?
15:02:39 <iwakeh> yes
15:02:52 <iwakeh> maybe the diffs get very small.
15:03:09 <karsten> should we aim for a specific date for starting rotation?
15:03:17 <karsten> maybe not friday night though.
15:03:23 <karsten> monday, for example?
15:03:32 <iwakeh> next monday?
15:03:41 <iwakeh> or rather in a week?
15:03:47 <karsten> in a few days.
15:03:52 <karsten> 21st
15:03:57 <iwakeh> wednesday?
15:04:10 <iwakeh> oh well, why not.
15:04:18 <iwakeh> 2017-8-21
15:04:20 <karsten> I couldn't do much on thursday or friday.
15:04:29 <iwakeh> that's fine.
15:04:30 <karsten> yes, monday 21st sounds good.
15:04:43 <karsten> (btw, I won't be around next thu for the meeting.)
15:04:59 <iwakeh> the differences found so far shouldn't show a lot in Onionoo clients anyway.
15:05:31 <karsten> cool!
15:05:48 <karsten> next topic?
15:06:02 <iwakeh> yes
15:06:09 <karsten> * next regular CollecTor and Onionoo releases (should these be planned?)
15:06:20 <karsten> the webstats part is highest priority, right?
15:06:32 <iwakeh> True, I'm waiting.
15:06:36 <karsten> oh!
15:06:41 <karsten> what about the spec ticket?
15:06:58 <karsten> I should say that I'm not the master of my inbox anymore.
15:07:03 <iwakeh> Yes, you said there is more to come?
15:07:08 <iwakeh> hehe
15:07:11 <karsten> argh.
15:07:34 <karsten> well, not exactly. but that we need to find out a few things.
15:07:35 <iwakeh> Right, I can take the spec-ticket and finalize a first draft (should be small)
15:08:02 <iwakeh> If we agree, I revise some of the webstats code accordingly.
15:08:03 <karsten> would that be similar to the bridge descriptor spec?
15:08:24 <iwakeh> Smaller,
15:08:40 <iwakeh> too small for xml/xslt.
15:08:49 <karsten> okay.
15:09:03 <iwakeh> after all it's only loglines.
15:09:10 <iwakeh> I usually,
15:09:29 <iwakeh> choose some tickets from the next releases, going
15:10:03 <iwakeh> from needs_review etc to new, when I'm waiting.
15:10:32 <iwakeh> That is currently onionoo and collector.
15:10:47 <iwakeh> but no need to set any release dates now.
15:10:47 <karsten> ok, for the spec, you should be able to reuse large parts of the python script.
15:11:00 <karsten> though the documentation there does not always match the code.
15:12:52 <karsten> alright, continuing on the pad.
15:12:54 <karsten> #endmeeting