14:29:41 #startmeeting metrics team
14:29:41 Meeting started Thu Aug 17 14:29:41 2017 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:29:41 Useful Commands: #action #agreed #help #info #idea #link #topic.
14:29:47 hi!
14:29:49 iwakeh: hi!
14:30:20 https://storm.torproject.org/shared/Ou-1QRctynWbF4yedi-MfDsjImFMFSIEP20fbVGCPRa <- agenda pad
14:30:37 lots of topics.
14:30:48 true
14:30:58 should we re-order, prioritize?
14:30:59 shall we start before more topics appear? ;)
14:31:06 sure!
14:31:06 better is,
14:31:14 but they could appear anyway :-)
14:31:33 want to start with whatever is highest priority for you?
14:31:51 (some can be really fast)
14:32:14 CollecTor 1.2.1
14:32:22 ok.
14:32:23 should it be released?
14:32:41 no objections.
14:32:48 it's running fine.
14:32:53 tomorrow?
14:32:55 it's a tiny change, but that's what patch releases are for.
14:32:57 yep.
14:33:07 ok, topic solved ;-)
14:33:28 yep.
14:33:39 I reordered
14:33:49 okay, next is exonerator?
14:34:04 * iwakeh reviewing #16596
14:34:13 there's the branch with four commits.
14:34:18 that ticket, yes.
14:34:28 I'm at it.
14:34:28 there will be another commit with a move to JSP.
14:34:31 cool!
14:34:44 the JSP move will be necessary in order to move the page over to metrics-web.
14:34:58 (or is there an easier way that uses our existing top.jsp and bottom.jsp?)
14:35:01 ok
14:35:12 I can look when reviewing.
14:35:23 ok. that fifth commit does not exist yet.
14:35:28 There should be some 'magic' available.
14:35:30 I started rewriting the servlet as JSP.
14:35:37 okay, I'll wait for you then.
14:36:21 the db could be separated from the move to metrics-web, if time is an issue.
14:36:36 yes.
14:36:59 I figure that the db has the higher prio?
14:37:17 not necessarily. it's the bigger task.
14:37:37 that could mean it's okay to finish the smaller task and then do the bigger one.
14:37:37 ok
14:38:07 okay. I didn't start with the db parts yet, so there's nothing to discuss today.
14:38:16 next topic?
14:38:21 yep.
14:38:26 collector mirrors.
14:38:39 I'm not sure what to do with mirrors.
14:38:44 unofficial mirrors, that is.
14:38:45 I removed the not functioning notice.
14:38:51 yes, that's fine.
14:39:00 Yeah, maybe just a collection to list?
14:39:13 collection of unofficial mirrors.
14:39:28 it doesn't hurt, but maybe it doesn't help, either.
14:39:38 hmm true
14:39:46 and we should think about whether we indicate to mirror operators that they're contributing something useful or not.
14:40:04 I also don't want to take something away from them.
14:40:24 (the ability to contribute)
14:40:25 That would need a ruleset
14:40:31 for good mirrors.
14:41:04 Or, we just don't list them?
14:41:13 possibly. not certain yet.
14:41:26 we can still provide the sources and good installation instructions.
14:41:35 so far, there is no queue for mirror operation.
14:41:42 but if we're not doing anything with the mirrors, there's no real point in having them.
14:41:55 true.
14:41:59 yes, we could sync?
14:42:15 not sure if we want that.
14:42:21 we haven't done that in the past.
14:42:25 we considered doing it.
14:42:29 but then we didn't.
14:42:43 I think we're in a good situation with two hosts that we control.
14:42:47 Maybe we should investigate how much that improves?
14:42:54 there might be even three of them in the future. but we should run them.
14:43:03 I mean, how much another mirror contributes?
14:43:13 yes, we should run them.
14:43:21 There could be a
14:43:33 non-Tor network of sync'ing mirrors.
14:43:51 but what's the goal?
14:44:03 traffic is not really scarce.
14:44:08 Test and use CollecTor at most.
14:44:23 yes, not really a goal.
14:44:44 okay, no need to decide anything today. but maybe worth thinking about.
14:44:50 and the traffic might be too much. But, we did call for operators a while ago.
14:45:24 Could be toned down to test network and CollecTors in there.
14:46:08 ok.
14:46:18 next?
14:46:22 yes.
14:46:27 * geolocation databases (karsten)
14:46:36 see the short thread on tor-dev@.
14:46:37 I skimmed the web page.
14:47:10 Just another semi-commercial offer without further comparison.
14:47:46 I wonder why these things are updated so much?
14:47:47 so, there's a paper that compares their database to maxmind's.
14:47:57 Didn't read yet.
14:48:06 what things are updated?
14:48:16 Well, the differences that
14:48:31 become apparent when comparing onionoo's details.
14:49:02 Just minor coordinates and spelling and a big city has no 'real' point referencing it.
14:49:21 ah.
14:49:27 you refer to minor updates.
14:49:33 I don't think we care much about those.
14:49:41 it's wrong countries that we care about.
14:49:48 yes, but they introduce differences unnecessarily.
14:49:59 right.
14:50:01 ok, wrong countries or new ones is fine.
14:50:06 or cities.
14:50:22 I'll take a look at the paper.
14:50:35 we should think about a few things here.
14:50:48 how much of a priority is it to switch geolocation databases?
14:50:52 this affects a lot of things.
14:50:59 it affects onionoo, but that's just minor.
14:51:11 Higher priority is to really archive old geoip dbs used.
14:51:18 it also affects src/config/geoip[6] that we ship with all tors.
14:51:29 yep.
14:51:35 and the archive is relevant, yes.
14:51:44 though I think we could do something with archive.org.
14:52:02 at some point we intended to use these (src/config/tor)?
14:52:06 another question is how we can evaluate accuracy ourselves.
14:52:13 for onionoo? yes.
14:52:20 or for other things in metrics? yes.
14:52:31 they're based on maxmind's files, too.
14:52:39 but accuracy is important too.
14:52:45 it is!
14:53:01 we should define what accuracy means to Tor/Tor Metrics here.
14:53:07 so, we care about accuracy for relay IPs and accuracy for client IPs.
14:53:50 we could evaluate relay IP accuracy by using two databases in onionoo and showing both results to atlas users.
14:53:52 accurate to be in the correct city/country.
14:54:33 but there's no good way to evaluate client IP accuracy.
14:54:55 and that's what we care a bit more about.
14:54:59 well, if the relays are correct that's a start.
14:55:06 it's a start, yes.
14:55:23 That could maybe be leveraged to client ips?
14:55:24 I mean, we could also rely on that paper and maybe others.
14:55:33 True,
14:55:44 should we investigate?
14:56:10 I'd say, if we want to investigate, now's a better time than in 3 or 6 months.
14:56:30 because then we won't have support by the ip2location folks.
14:56:35 probably.
14:56:54 Well, I could get started and read the referenced paper and what else I find
14:56:58 without
14:57:04 major searching.
14:57:15 sounds good to me.
14:57:26 Just to get an idea what ought to be done.
14:57:56 ok.
14:58:15 Onionoo second back-end: #23244
14:58:15 great, sounds good! next?
14:58:17 :-)
14:58:25 okay, what's the status there?
14:58:48 I listed some differences (as trac permitted).
14:59:06 Needs feedback, if there is a show stopper.
14:59:06 should I update the geoip file on omeiense?
14:59:09 but,
14:59:21 I think backup would be fine for sure.
14:59:27 and, rotation
14:59:35 might be ok, but will
14:59:40 always have differences
14:59:44 true.
14:59:52 b/c of the different update times.
15:00:43 yes, we can't really get rid of that issue.
15:00:56 "onionoo consensus"...
15:01:02 I think there is not a major reason to switch geoip db, but the comparison would get easier.
15:01:03 noooo.
15:01:14 yes, consensus!
15:01:15 yes, and it needs an update anyway.
15:01:24 I just didn't want to change it in the middle of your analysis.
15:01:33 but I'll do that later today if you don't mind.
15:01:38 ok, fine with me. just copy from the hetzner host.
15:01:50 note that it might not affect non-running relays.
15:02:04 true.
15:02:31 I can take a look once the geo-db is the same.
15:02:33 so, do you want to do more analysis after that?
15:02:39 yes
15:02:52 maybe the diffs get very small.
15:03:09 should we aim for a specific date for starting rotation?
15:03:17 maybe not friday night though.
15:03:23 monday, for example?
15:03:32 next monday?
15:03:41 or rather in a week?
15:03:47 in a few days.
15:03:52 21st
15:03:57 wednesday?
15:04:10 oh well, why not.
15:04:18 2017-8-21
15:04:20 I couldn't do much on thursday or friday.
15:04:29 that's fine.
15:04:30 yes, monday 21st sounds good.
15:04:43 (btw, I won't be around next thu for the meeting.)
15:04:59 the differences found so far shouldn't show a lot in Onionoo clients anyway.
15:05:31 cool!
15:05:48 next topic?
15:06:02 yes
15:06:09 * next regular CollecTor and Onionoo releases (should these be planned?)
15:06:20 the webstats part is highest priority, right?
15:06:32 True, I'm waiting.
15:06:36 oh!
15:06:41 what about the spec ticket?
15:06:58 I should say that I'm not the master of my inbox anymore.
15:07:03 Yes, you said there is more to come?
15:07:08 hehe
15:07:11 argh.
15:07:34 well, not exactly. but that we need to find out a few things.
15:07:35 Right, I can take the spec-ticket and finalize a first draft (should be small)
15:08:02 If we agree, I revise some of the webstats code accordingly.
15:08:03 would that be similar to the bridge descriptor spec?
15:08:24 Smaller,
15:08:40 too small for xml/xslt.
15:08:49 okay.
15:09:03 after all it's only loglines.
15:09:10 I usually,
15:09:29 choose some tickets from the next releases, going
15:10:03 from needs_review etc to new, when I'm waiting.
15:10:32 That is currently onionoo and collector.
15:10:47 but no need to set any release dates now.
15:10:47 ok, for the spec, you should be able to reuse large parts of the python script.
15:11:00 though the documentation there does not always match the code.
15:12:52 alright, continuing on the pad.
15:12:54 #endmeeting
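
Editor's note: the comparison of two Onionoo back-ends discussed under #23244 (differences in relay geolocation caused by different geoip files) could be sketched roughly as below. This is not the analysis that was actually run for the ticket. It assumes each back-end's details document has already been loaded as a dict with a `relays` list whose entries carry `fingerprint`, `country` and `city_name` fields; the function name `geo_differences` and the sample data are hypothetical and exist only for illustration.

```python
def geo_differences(details_a, details_b, fields=("country", "city_name")):
    """Compare geolocation fields of relays that appear in both documents.

    Returns {fingerprint: {field: (value_a, value_b)}} for every relay
    whose value differs in at least one of the given fields.
    Relays present in only one document are ignored.
    """
    relays_a = {r["fingerprint"]: r for r in details_a.get("relays", [])}
    relays_b = {r["fingerprint"]: r for r in details_b.get("relays", [])}
    diffs = {}
    for fingerprint in sorted(relays_a.keys() & relays_b.keys()):
        relay_a, relay_b = relays_a[fingerprint], relays_b[fingerprint]
        changed = {
            field: (relay_a.get(field), relay_b.get(field))
            for field in fields
            if relay_a.get(field) != relay_b.get(field)
        }
        if changed:
            diffs[fingerprint] = changed
    return diffs


# Hypothetical sample data, not taken from any real back-end.
backend_a = {"relays": [
    {"fingerprint": "A" * 40, "country": "de", "city_name": "Berlin"},
    {"fingerprint": "B" * 40, "country": "us", "city_name": "Seattle"},
]}
backend_b = {"relays": [
    {"fingerprint": "A" * 40, "country": "de", "city_name": "Berlin"},
    {"fingerprint": "B" * 40, "country": "ca", "city_name": "Vancouver"},
    {"fingerprint": "C" * 40, "country": "fr"},
]}

sample_diffs = geo_differences(backend_a, backend_b)
for fingerprint, changed in sample_diffs.items():
    print(fingerprint, changed)
```

Restricting `fields` to `("country",)` would match the point made in the meeting that wrong countries matter more than minor coordinate or spelling updates.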