14:29:41 <karsten> #startmeeting metrics team
14:29:41 <MeetBot> Meeting started Thu Aug 17 14:29:41 2017 UTC.  The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:29:41 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:29:47 <iwakeh> hi!
14:29:49 <karsten> iwakeh: hi!
14:30:20 <karsten> https://storm.torproject.org/shared/Ou-1QRctynWbF4yedi-MfDsjImFMFSIEP20fbVGCPRa <- agenda pad
14:30:37 <karsten> lots of topics.
14:30:48 <iwakeh> true
14:30:58 <iwakeh> should we re-order, prioritize?
14:30:59 <karsten> shall we start before more topics appear? ;)
14:31:06 <karsten> sure!
14:31:06 <iwakeh> better is,
14:31:14 <iwakeh> but they could appear anyway :-)
14:31:33 <karsten> want to start with whatever is highest priority for you?
14:31:51 <karsten> (some can be really fast)
14:32:14 <iwakeh> CollecTor 1.2.1
14:32:22 <karsten> ok.
14:32:23 <iwakeh> should it be released?
14:32:41 <karsten> no objections.
14:32:48 <karsten> it's running fine.
14:32:53 <iwakeh> tomorrow?
14:32:55 <karsten> it's a tiny change, but that's what patch releases are for.
14:32:57 <karsten> yep.
14:33:07 <iwakeh> ok, topic solved ;-)
14:33:28 <karsten> yep.
14:33:39 <iwakeh> I reordered
14:33:49 <karsten> okay, next is exonerator?
14:34:04 * iwakeh reviewing #16596
14:34:13 <karsten> there's the branch with four commits.
14:34:18 <karsten> that ticket, yes.
14:34:28 <iwakeh> I'm at it.
14:34:28 <karsten> there will be another commit with a move to JSP.
14:34:31 <karsten> cool!
14:34:44 <karsten> the JSP move will be necessary in order to move the page over to metrics-web.
14:34:58 <karsten> (or is there an easier way that uses our existing top.jsp and bottom.jsp?)
14:35:01 <iwakeh> ok
14:35:12 <iwakeh> I can look when reviewing.
14:35:23 <karsten> ok. that fifth commit does not exist yet.
14:35:28 <iwakeh> There should be some 'magic' available.
14:35:30 <karsten> I started rewriting the servlet as JSP.
14:35:37 <karsten> okay, I'll wait for you then.
14:36:21 <iwakeh> the db could be separated from the move to metrics-web, if time is an issue.
14:36:36 <karsten> yes.
14:36:59 <iwakeh> I figure that the db has the higher prio?
14:37:17 <karsten> not necessarily. it's the bigger task.
14:37:37 <karsten> that could mean it's okay to finish the smaller task and then do the bigger one.
14:37:37 <iwakeh> ok
14:38:07 <karsten> okay. I didn't start with the db parts yet, so there's nothing to discuss today.
14:38:16 <karsten> next topic?
14:38:21 <iwakeh> yep.
14:38:26 <karsten> collector mirrors.
14:38:39 <karsten> I'm not sure what to do with mirrors.
14:38:44 <karsten> unofficial mirrors, that is.
14:38:45 <iwakeh> I removed the not-functioning notice.
14:38:51 <karsten> yes, that's fine.
14:39:00 <iwakeh> Yeah, maybe just a collection to list?
14:39:13 <iwakeh> collection of unofficial mirrors.
14:39:28 <karsten> it doesn't hurt, but maybe it doesn't help, either.
14:39:38 <iwakeh> hmm true
14:39:46 <karsten> and we should think about whether we indicate to mirror operators that they're contributing something useful or not.
14:40:04 <karsten> I also don't want to take something away from them.
14:40:24 <karsten> (the ability to contribute)
14:40:25 <iwakeh> That would need a ruleset
14:40:31 <iwakeh> for good mirrors.
14:41:04 <iwakeh> Or, we just don't list them?
14:41:13 <karsten> possibly. not certain yet.
14:41:26 <karsten> we can still provide the sources and good installation instructions.
14:41:35 <iwakeh> so far, there is no queue for mirror operation.
14:41:42 <karsten> but if we're not doing anything with the mirrors, there's no real point in having them.
14:41:55 <karsten> true.
14:41:59 <iwakeh> yes, we could sync?
14:42:15 <karsten> not sure if we want that.
14:42:21 <karsten> we haven't done that in the past.
14:42:25 <karsten> we considered doing it.
14:42:29 <karsten> but then we didn't.
14:42:43 <karsten> I think we're in a good situation with two hosts that we control.
14:42:47 <iwakeh> Maybe we should investigate how much that improves?
14:42:54 <karsten> there might be even three of them in the future. but we should run them.
14:43:03 <iwakeh> I mean, how much another mirror contributes?
14:43:13 <iwakeh> yes, we should run them.
14:43:21 <iwakeh> There could be a
14:43:33 <iwakeh> non-Tor network of sync'ing mirrors.
14:43:51 <karsten> but what's the goal?
14:44:03 <karsten> traffic is not really scarce.
14:44:08 <iwakeh> Test and use CollecTor at most.
14:44:23 <iwakeh> yes, not really a goal.
14:44:44 <karsten> okay, no need to decide anything today. but maybe worth thinking about.
14:44:50 <iwakeh> and the traffic might be too much. But, we did call for operators a while ago.
14:45:24 <iwakeh> Could be toned down to test network and CollecTors in there.
14:46:08 <karsten> ok.
14:46:18 <iwakeh> next?
14:46:22 <karsten> yes.
14:46:27 <karsten> * geolocation databases (karsten)
14:46:36 <karsten> see the short thread on tor-dev@.
14:46:37 <iwakeh> I skimmed the web page.
14:47:10 <iwakeh> Just another semi-commercial offer without further comparison.
14:47:46 <iwakeh> I wonder why these things are updated so much?
14:47:47 <karsten> so, there's a paper that compares their database to maxmind's.
14:47:57 <iwakeh> Didn't read yet.
14:48:06 <karsten> what things are updated?
14:48:16 <iwakeh> Well, the differences that
14:48:31 <iwakeh> become apparent when comparing onionoo's details.
14:49:02 <iwakeh> Just minor coordinates and spelling and a big city has no 'real' point referencing it.
14:49:21 <karsten> ah.
14:49:27 <karsten> you refer to minor updates.
14:49:33 <karsten> I don't think we care much about those.
14:49:41 <karsten> it's wrong countries that we care about.
14:49:48 <iwakeh> yes, but they introduce differences unnecessarily.
14:49:59 <karsten> right.
14:50:01 <iwakeh> ok, wrong countries or new ones is fine.
14:50:06 <iwakeh> or cities.
14:50:22 <iwakeh> I'll take a look at the paper.
14:50:35 <karsten> we should think about a few things here.
14:50:48 <karsten> how much of a priority is it to switch geolocation databases?
14:50:52 <karsten> this affects a lot of things.
14:50:59 <karsten> it affects onionoo, but that's just minor.
14:51:11 <iwakeh> Higher priority is to really archive old geoip dbs used.
14:51:18 <karsten> it also affects src/config/geoip[6] that we ship with all tors.
14:51:29 <iwakeh> yep.
14:51:35 <karsten> and the archive is relevant, yes.
14:51:44 <karsten> though I think we could do something with archive.org.
14:52:02 <iwakeh> at some point we intended to use these (src/config/tor)?
14:52:06 <karsten> another question is how we can evaluate accuracy ourselves.
14:52:13 <karsten> for onionoo? yes.
14:52:20 <karsten> or for other things in metrics? yes.
14:52:31 <karsten> they're based on maxmind's files, too.
14:52:39 <iwakeh> but accuracy is important too.
14:52:45 <karsten> it is!
14:53:01 <iwakeh> we should define what accuracy means to Tor/Tor Metrics here.
14:53:07 <karsten> so, we care about accuracy for relay IPs and accuracy for client IPs.
14:53:50 <karsten> we could evaluate relay IP accuracy by using two databases in onionoo and showing both results to atlas users.
14:53:52 <iwakeh> accurate to be in the correct city/country.
14:54:33 <karsten> but there's no good way to evaluate client IP accuracy.
14:54:55 <karsten> and that's what we care a bit more about.
14:54:59 <iwakeh> well, if the relays are correct that's a start.
14:55:06 <karsten> it's a start, yes.
14:55:23 <iwakeh> That could maybe be leveraged to client ips?
14:55:24 <karsten> I mean, we could also rely on that paper and maybe others.
14:55:33 <iwakeh> True,
14:55:44 <iwakeh> should we investigate?
14:56:10 <karsten> I'd say, if we want to investigate, now's a better time than in 3 or 6 months.
14:56:30 <karsten> because then we won't have support by the ip2location folks.
14:56:35 <karsten> probably.
14:56:54 <iwakeh> Well, I could get started and read the referenced paper and what else I find
14:56:58 <iwakeh> without
14:57:04 <iwakeh> major searching.
14:57:15 <karsten> sounds good to me.
14:57:26 <iwakeh> Just to get an idea what ought to be done.
14:57:56 <karsten> ok.
14:58:15 <iwakeh> Onionoo second back-end: #23244
14:58:15 <karsten> great, sounds good! next?
14:58:17 <iwakeh> :-)
14:58:25 <karsten> okay, what's the status there?
14:58:48 <iwakeh> I listed some differences (as trac permitted).
14:59:06 <iwakeh> Needs feedback, if there is a show stopper.
14:59:06 <karsten> should I update the geoip file on omeiense?
14:59:09 <iwakeh> but,
14:59:21 <iwakeh> I think backup would be fine for sure.
14:59:27 <iwakeh> and, rotation
14:59:35 <iwakeh> might be ok, but will
14:59:40 <iwakeh> always have differences
14:59:44 <karsten> true.
14:59:52 <iwakeh> b/c of the different update times.
15:00:43 <karsten> yes, we can't really get rid of that issue.
15:00:56 <karsten> "onionoo consensus"...
15:01:02 <iwakeh> I think there is not a major reason to switch geoip db, but the comparison would get easier.
15:01:03 <karsten> noooo.
15:01:14 <iwakeh> yes, consensus!
15:01:15 <karsten> yes, and it needs an update anyway.
15:01:24 <karsten> I just didn't want to change it in the middle of your analysis.
15:01:33 <karsten> but I'll do that later today if you don't mind.
15:01:38 <iwakeh> ok, fine with me. just copy from the hetzner host.
15:01:50 <karsten> note that it might not affect non-running relays.
15:02:04 <iwakeh> true.
15:02:31 <iwakeh> I can take a look once the geo-db is the same.
15:02:33 <karsten> so, do you want to do more analysis after that?
15:02:39 <iwakeh> yes
15:02:52 <iwakeh> maybe the diffs get very small.
15:03:09 <karsten> should we aim for a specific date for starting rotation?
15:03:17 <karsten> maybe not friday night though.
15:03:23 <karsten> monday, for example?
15:03:32 <iwakeh> next monday?
15:03:41 <iwakeh> or rather in a week?
15:03:47 <karsten> in a few days.
15:03:52 <karsten> 21st
15:03:57 <iwakeh> wednesday?
15:04:10 <iwakeh> oh well, why not.
15:04:18 <iwakeh> 2017-8-21
15:04:20 <karsten> I couldn't do much on thursday or friday.
15:04:29 <iwakeh> that's fine.
15:04:30 <karsten> yes, monday 21st sounds good.
15:04:43 <karsten> (btw, I won't be around next thu for the meeting.)
15:04:59 <iwakeh> the differences found so far shouldn't show a lot in Onionoo clients anyway.
15:05:31 <karsten> cool!
15:05:48 <karsten> next topic?
15:06:02 <iwakeh> yes
15:06:09 <karsten> * next regular CollecTor and Onionoo releases (should these be planned?)
15:06:20 <karsten> the webstats part is highest priority, right?
15:06:32 <iwakeh> True, I'm waiting.
15:06:36 <karsten> oh!
15:06:41 <karsten> what about the spec ticket?
15:06:58 <karsten> I should say that I'm not the master of my inbox anymore.
15:07:03 <iwakeh> Yes, you said there is more to come?
15:07:08 <iwakeh> hehe
15:07:11 <karsten> argh.
15:07:34 <karsten> well, not exactly. but that we need to find out a few things.
15:07:35 <iwakeh> Right, I can take the spec-ticket and finalize a first draft (should be small)
15:08:02 <iwakeh> If we agree, I revise some of the webstats code accordingly.
15:08:03 <karsten> would that be similar to the bridge descriptor spec?
15:08:24 <iwakeh> Smaller,
15:08:40 <iwakeh> too small for xml/xslt.
15:08:49 <karsten> okay.
15:09:03 <iwakeh> after all it's only loglines.
15:09:10 <iwakeh> I usually,
15:09:29 <iwakeh> choose some tickets from the next releases, going
15:10:03 <iwakeh> from needs_review etc to new, when I'm waiting.
15:10:32 <iwakeh> That is currently onionoo and collector.
15:10:47 <iwakeh> but no need to set any release dates now.
15:10:47 <karsten> ok, for the spec, you should be able to reuse large parts of the python script.
15:11:00 <karsten> though the documentation there does not always match the code.
15:12:52 <karsten> alright, continuing on the pad.
15:12:54 <karsten> #endmeeting