14:00:08 <karsten> #startmeeting metrics team
14:00:08 <MeetBot> Meeting started Thu Aug  4 14:00:08 2016 UTC.  The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:08 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:26 <karsten> hi anathema_db. you're here for the meeting?
14:00:29 <anathema_db> yep
14:00:40 <karsten> and you already found the agenda pad: https://pad.riseup.net/p/zUNzEIFRq5S4
14:01:10 <anathema_db> it seems so :)
14:01:17 <karsten> and iwakeh can't connect to oftc..
14:01:34 <anathema_db> :/
14:01:39 <anathema_db> so just me and you today ?
14:01:52 <karsten> well, let's wait a bit, maybe that gets resolved.
14:02:06 <karsten> also, let's collect topics until :05.
14:02:20 <anathema_db> sure
14:03:12 <anathema_db> so, I wrote down a couple of things
14:03:30 <anathema_db> one that I was working on since this morning
14:04:22 <iwakeh> Hi there :-)
14:04:24 <anathema_db> hurray
14:04:27 <anathema_db> hi iwakeh
14:04:28 <karsten> hi iwakeh!
14:04:42 <iwakeh> They block Tor here :-(
14:04:51 <karsten> boo!
14:05:01 <karsten> anything else that should go on the agenda?
14:05:50 <iwakeh> all listed, I think.
14:06:20 <karsten> alright!
14:06:26 <karsten> * Monthly team report (karsten)
14:06:35 <karsten> I wrote mail about that to the list.
14:06:49 <iwakeh> any replies, yet?
14:07:09 <karsten> tl;dr: let's write monthly reports, and let's start with july 2016. any more input than what we already have from the MOSS work?
14:07:12 <karsten> none.
14:07:24 <iwakeh> ok.
14:08:04 <karsten> I also guess this will become clearer next month.
14:08:22 <karsten> and there's always the chance to mention progress next month, even if it started last month.
14:08:35 <anathema_db> so even progress is ok?
14:08:48 <anathema_db> I thought we should include only "finished" stuff
14:08:56 <karsten> ah, bad wording on my part.
14:08:57 <anathema_db> "finished" or "released"
14:09:00 <karsten> yep.
14:09:30 <iwakeh> a release is measurable progress ;-)
14:09:38 <anathema_db> aha
14:09:41 <karsten> let's focus on completed stuff in these reports.
14:10:20 <anathema_db> do we have any?
14:10:25 <karsten> and let's just see how this works out and whether we need to adjust criteria.
14:10:34 <karsten> any releases?
14:10:47 <anathema_db> something that will go into the monthly report
14:10:52 <karsten> ah, yes.
14:10:53 <anathema_db> (just a sneaky preview)
14:11:14 <karsten> - Added a new graph to Tor Metrics that shows a possible range of the number of clients by country and transport and which reveals the most popular pluggable transports in any given country [2] (#19544).
14:11:41 <iwakeh> Takes a looong time to preview everything.
14:11:50 <anathema_db> cool
14:12:33 <karsten> alright, let's see how this first report goes and maybe talk more next week or around end of august to make the next one even better.
14:12:42 <anathema_db> agree
14:12:50 <iwakeh> fine.
14:12:56 <karsten> ok.
14:13:20 <karsten> * CollecTor release(s) (iwakeh)
14:13:23 <iwakeh> #19813
14:13:33 <iwakeh> shall we stick to the 10th?
14:13:57 <karsten> sure!
14:14:16 <karsten> just move my new tickets out of 1.0.0 if they threaten the release date.
14:14:36 <iwakeh> Right, except for the small bugfixes.
14:15:07 <iwakeh> For 1.1.0 shall we aim at a date?
14:15:09 <karsten> yep, many small fixes.
14:15:23 <karsten> hmm
14:15:26 <iwakeh> (good that these keep trickling in ...)
14:15:49 <karsten> should we discuss the 1.1.0 release date next week after 1.0.0 is released?
14:15:58 <iwakeh> well, before 31st. Ok.
14:16:14 <karsten> yes, before end of month.
14:16:45 <iwakeh> That's it from me.
14:16:51 <karsten> ok!
14:16:58 <karsten> * onionoo-ng - an Onionoo python implementation (anathema)
14:17:05 <karsten> how's this going? :)
14:17:19 <anathema_db> I've sent an email
14:17:38 <anathema_db> to give you an update and the first release of the project
14:17:56 <karsten> okay, I should read that.
14:18:00 <anathema_db> so
14:18:10 <anathema_db> karsten: yeah :)
14:18:14 <anathema_db> just here as a reference: https://github.com/davinerd/onionoo-ng
14:18:22 <anathema_db> and this is live: http://138.201.90.124:8080/details
14:19:22 <iwakeh> http://138.201.90.124:8080/summary
14:19:28 <iwakeh> 405: Method Not Allowed
14:19:30 <anathema_db> there are some differences from the original protocol (positive and negative) that I'll maybe discuss on the list
14:19:39 <anathema_db> iwakeh: I implemented only details
14:19:40 <Lunar> irl: keep me for now, please
14:19:55 <anathema_db> as discussed with karsten at the beginning
14:19:58 <iwakeh> ok.
14:19:59 <karsten> is it difficult to implement the others?
14:20:03 <anathema_db> nah
14:20:06 <karsten> ok, great.
14:20:14 <karsten> should we talk about the differences to the current protocol?
14:20:17 <anathema_db> details was the 'difficult' part
14:20:19 <karsten> yep.
14:20:22 <anathema_db> sure, here?
14:20:31 <karsten> sure. we're only 20 minutes into the meeting.
14:20:39 <anathema_db> :)
14:20:42 <karsten> - results are returned for both bridges and relays when using 'limit' and 'offset'
14:20:52 <karsten> so, limit 20 gives you 20 bridges and 20 relays?
14:20:56 <anathema_db> yep
14:21:08 <iwakeh> Maybe this: "bridges_published": null,
14:21:26 <anathema_db> iwakeh: I wrote about that in another email
14:21:38 <iwakeh> no, I meant to ask here.
14:22:00 <iwakeh> discuss that question here.
14:22:04 <anathema_db> well, should we talk about protocol differences or missed implementation? :)
14:22:15 <anathema_db> one at a time
14:22:23 <karsten> regarding limit and offset, the idea was to support result pagination.
14:22:34 <karsten> happy to wait.
14:22:49 <karsten> so, bridges_published is just a missed implementation that will be fixed later?
14:22:50 <iwakeh> no, protocol first.
14:22:57 <anathema_db> aha
14:23:03 <iwakeh> :-) I interrupted.
14:23:20 <karsten> okay, back to limit and offset? :)
14:23:25 <anathema_db> ok
14:23:35 <karsten> ok. so, I don't know what's most useful for clients.
14:23:35 <anathema_db> so what do you mean by 'result pagination' ?
14:23:46 <karsten> ah, show 10 results per page, give me page 5.
14:23:53 <karsten> so, limit = 10, offset = 40.
14:23:54 <anathema_db> like elasticsearch scroll
14:24:03 <karsten> uhmm, maybe?
14:24:20 <anathema_db> yeah elasticsearch can use pagination
14:24:31 <karsten> okay, so that's the idea behind those parameters.
14:24:45 <anathema_db> it's already implemented
14:24:46 <karsten> trouble is, if we change semantics of parameters, that's bad.
14:25:05 <karsten> we'd have to add new parameters, or raise the protocol version and tell clients they're behind.
14:25:13 <karsten> but we should only do that for really important changes.
14:25:18 <anathema_db> if you say offset=40 limit=10, onionoo-ng will skip the first 40 results and return the next 10 results
14:25:19 <karsten> this one doesn't seem as important.
14:25:37 <karsten> ok?
14:25:48 <anathema_db> yeah we should minimise semantic changes
14:25:49 <karsten> but doesn't the current protocol do the same?
14:26:05 <karsten> I thought you'd return 10 bridges and 10 relays.
14:26:10 <anathema_db> yes
14:26:15 <anathema_db> but if you do details?limit=10
14:26:23 <anathema_db> current protocol will give you only 10 relays.
14:26:26 <anathema_db> and 0 bridges
14:26:27 <karsten> yep.
14:26:33 <karsten> because you wanted 10 results.
14:26:37 <karsten> and relays come first.
14:26:42 <anathema_db> 10 results total, yes
14:26:46 <karsten> if you want bridges, you'd say type=bridge.
14:27:04 <anathema_db> I interpreted 10 results of both
14:27:15 <karsten> okay, let's talk more about that using email.
14:27:34 <anathema_db> of course that's open implementation, we can change that, but I thought it was more useful
14:27:35 <karsten> ok?
14:27:45 <karsten> yes, we'll have to think about that.
14:27:50 <karsten> I'm open to improving the protocol.
14:27:53 <iwakeh> Sorry, found that offset alone triggers 500.
14:27:55 <anathema_db> sure
14:28:07 <anathema_db> let me check iwakeh
14:28:26 <iwakeh> http://138.201.90.124:8080/details?offset=3
14:28:36 <anathema_db> yeah will check later, thanks
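The two readings of 'limit' and 'offset' discussed above can be sketched as follows. This is a minimal illustration, not code from Onionoo or onionoo-ng: the function names and the list-based inputs are assumptions, and the "combined" variant only follows karsten's description (10 results total, relays come first).

```python
def paginate_combined(relays, bridges, limit=None, offset=0):
    """Reading described for the current protocol: one window over
    relays followed by bridges, so limit=10 means 10 results total."""
    combined = [("relay", r) for r in relays] + [("bridge", b) for b in bridges]
    end = None if limit is None else offset + limit
    window = combined[offset:end]
    return (
        [item for kind, item in window if kind == "relay"],
        [item for kind, item in window if kind == "bridge"],
    )


def paginate_per_type(relays, bridges, limit=None, offset=0):
    """Reading described for onionoo-ng: the same window is applied
    to relays and to bridges separately, so limit=10 can return 20 entries."""
    end = None if limit is None else offset + limit
    return relays[offset:end], bridges[offset:end]


if __name__ == "__main__":
    # Hypothetical data: 50 relays and 50 bridges.
    relays = ["r%d" % i for i in range(50)]
    bridges = ["b%d" % i for i in range(50)]
    print(paginate_combined(relays, bridges, limit=10, offset=45))
    print(paginate_per_type(relays, bridges, limit=10, offset=45))
```

With the combined reading, page 5 at 10 results per page is limit=10, offset=40, and bridges only show up once the window runs past the last relay.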
14:28:40 <karsten> - 'order' parameter's value can be any field, so it's not limited to the 'consensus_weight'
14:28:44 <karsten> great!
14:29:03 <anathema_db> :D
14:29:17 <karsten> - 'fields' parameter's value can be any field
14:29:20 <karsten> not sure I understand.
14:29:25 <anathema_db> I did some tests but yes, the idea was to give it to you so you can test it
14:30:04 <anathema_db> karsten: yeah my bad
14:30:14 <karsten> ok.
14:30:18 <anathema_db> I thought the original protocol could only support few fields
14:30:19 <karsten> in theory, that should already be the case.
14:30:20 <karsten> ok.
14:30:23 <anathema_db> but you can specify any field
14:30:26 <anathema_db> yep
14:30:27 <karsten> - 'lookup' is not implemented: I was not able to find a difference between 'lookup' and 'fingerprint': can you provide some real examples?
14:30:46 <karsten> I think fingerprint is not limited to relays and bridges running in the past week.
14:30:48 <anathema_db> yeah I did not understand the difference between lookup and fingerprint
14:30:52 <karsten> I'd have to look, too.
14:30:57 <karsten> - 'search' does not implement: "any 4 hex characters of a space-separated fingerprint" and "beginning of a base64-encoded fingerprint without trailing equal signs": I was not able to find any relevant case for those
14:30:58 <anathema_db> cool, thanks
14:31:15 <karsten> sometimes users find fingerprint parts and paste them in.
14:31:27 <karsten> for example, fingerprints are often printed in blocks of 4 hex chars.
14:31:39 <karsten> and if you add them, onionoo will think it's 10 x 4 search terms.
14:31:50 <anathema_db> yeah but so it should support "abcd 0123 abcd" ?
14:32:00 <karsten> it's useful, yes.
14:32:02 <anathema_db> but also "abcd" ?
14:32:05 <karsten> yes.
14:32:15 <anathema_db> ok, gotcha
14:32:18 <karsten> note that search terms are boolean-and-ed.
14:32:23 <anathema_db> and what about the base64 encoding ?
14:32:27 <anathema_db> yep
14:32:29 <anathema_db> done that
14:32:32 <karsten> this is not to support searches for "0123" above.
14:32:38 <karsten> which would make little sense.
14:32:48 <karsten> base64
14:32:53 <karsten> is contained in raw consensuses.
14:33:08 <karsten> https://collector.torproject.org/recent/relay-descriptors/consensuses/2016-08-04-14-00-00-consensus
14:33:13 <karsten> r PDrelay1 AAFJ5u9xAqrKlpDW6N0pMhJLlKs 6kNxzNyUwLoRAlCxQTo2cK4QN0A 2016-08-04 10:54:12 95.215.44.189 8080 0
14:33:25 <karsten> if you want to find that relay without converting that base64 to hex,
14:33:33 <karsten> just search for AAFJ5u9xAqrKlpDW6N0pMhJLlKs
14:33:42 <karsten> this is useful for people debugging the network.
14:34:01 <anathema_db> aah ok
14:34:07 <anathema_db> now it's clear, thanks
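The two search cases discussed above (a 4-hex-character block of a fingerprint, and the beginning of a base64-encoded fingerprint without trailing equal signs) can be sketched like this. The helper names are hypothetical and the matching rule is a simplified assumption, not the actual Onionoo implementation; only the base64 value is taken from the consensus line quoted above.

```python
import base64
import binascii


def base64_to_hex(b64_fingerprint):
    """Convert an unpadded base64 fingerprint (as in consensus "r" lines)
    to an upper-case hex fingerprint."""
    padded = b64_fingerprint + "=" * (-len(b64_fingerprint) % 4)
    return binascii.hexlify(base64.b64decode(padded)).decode("ascii").upper()


def hex_to_base64(hex_fingerprint):
    """Convert a hex fingerprint to base64 without trailing equal signs."""
    raw = binascii.unhexlify(hex_fingerprint)
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def matches_term(hex_fingerprint, term):
    """Assumed rule: a term matches if it equals one of the 4-character
    blocks of the hex fingerprint, or if it is a prefix of the unpadded
    base64 form of the fingerprint."""
    fingerprint = hex_fingerprint.upper()
    blocks = [fingerprint[i:i + 4] for i in range(0, len(fingerprint), 4)]
    if len(term) == 4 and term.upper() in blocks:
        return True
    return hex_to_base64(fingerprint).startswith(term)


if __name__ == "__main__":
    fingerprint = base64_to_hex("AAFJ5u9xAqrKlpDW6N0pMhJLlKs")
    print(fingerprint)
    print(matches_term(fingerprint, "AAFJ5u9x"))
```

Since search terms are boolean-and-ed, a pasted "abcd 0123 abcd" would be handled by calling matches_term once per space-separated term and requiring all of them to match.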
14:34:13 <karsten> okay, let me read that mail in more detail and reply.
14:34:18 <anathema_db> sure!
14:34:28 <anathema_db> to go back to iwakeh: it will be implemented
14:34:29 <karsten> the goal should be to make as few backward-incompatible changes as possible.
14:34:43 <karsten> ideally zero. :)
14:34:48 <anathema_db> but I was not able to find the "source" of that field
14:34:53 <anathema_db> karsten: agree :)
14:35:19 <anathema_db> note: I think I can get the same info by asking for the most recent node in the dataset
14:35:32 <anathema_db> (the one with the most recent "last_seen" timestamp)
14:35:49 <anathema_db> but Onionoo's approach is different, I just was not able to locate the code
14:35:55 <anathema_db> (it's all written in the email)
14:36:18 <karsten> right, onionoo doesn't read all details files.
14:36:24 <karsten> it reads the summary file.
14:36:55 <anathema_db> for the relays_published?
14:37:02 <anathema_db> it reads the consensus
14:37:15 <anathema_db> but the class is defined outside the project I think
14:37:18 <karsten> wait, there are two pieces of onionoo.
14:37:22 <anathema_db> hu
14:37:28 <karsten> the first reads the consensus and other descriptors and writes the files I gave you.
14:37:36 <karsten> the second reads those files and answers client requests.
14:37:43 <anathema_db> hu
14:38:01 <karsten> you're replacing the second part.
14:38:04 <anathema_db> so…which field fills the "relays_published" in the output ?
14:38:24 <anathema_db> I mean, where in the snapshot data can I find that value ?
14:38:49 <karsten> max value of last_seen would work.
14:38:58 <anathema_db> ok
14:39:06 <karsten> the current onionoo looks at the summary file to determine that.
14:39:11 <anathema_db> so it's already done (in dev env)
14:39:12 <karsten> whereas you'd look at the details files.
14:39:41 <anathema_db> it's easy and fast to do in elasticsearch so no problem
14:39:45 <karsten> ok.
14:39:46 <anathema_db> already implemented a stub
14:39:55 <anathema_db> just not published as I was unsure about that
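The "max value of last_seen" idea for filling relays_published can be sketched as below. This is a minimal illustration under assumptions: the function name and the sample documents are hypothetical, and the timestamp format "YYYY-MM-DD hh:mm:ss" is assumed for last_seen. It is not how the current Onionoo derives the value, which looks at the summary file instead.

```python
from datetime import datetime

# Assumed timestamp format of the "last_seen" field.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def latest_last_seen(details_documents):
    """Return the most recent "last_seen" value among the given details
    documents, or None if no document carries that field."""
    seen = [
        datetime.strptime(doc["last_seen"], TIMESTAMP_FORMAT)
        for doc in details_documents
        if doc.get("last_seen")
    ]
    return max(seen).strftime(TIMESTAMP_FORMAT) if seen else None


if __name__ == "__main__":
    # Hypothetical details documents, reduced to the fields used here.
    relays = [
        {"nickname": "relayA", "last_seen": "2016-08-04 13:00:00"},
        {"nickname": "relayB", "last_seen": "2016-08-04 14:00:00"},
        {"nickname": "relayC", "last_seen": "2016-08-03 22:00:00"},
    ]
    response = {"relays_published": latest_last_seen(relays), "relays": relays}
    print(response["relays_published"])
```

The same function applied to the bridge documents would give a candidate value for bridges_published.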
14:40:14 <karsten> cool, let me respond to your mail, hopefully tomorrow.
14:40:18 <anathema_db> no rush
14:40:32 <karsten> :)
14:40:33 <karsten> moving on?
14:40:37 <anathema_db> just as a note
14:40:50 <anathema_db> at the moment, there is no webserver in front of onionoo-ng
14:41:00 <anathema_db> so you're interfacing with the raw app
14:41:30 <anathema_db> I don't think we need varnish or whatever
14:41:36 <anathema_db> but more tests need to be done
14:41:48 <anathema_db> we can move on
14:41:51 <karsten> oh, for a production environment it would probably make sense.
14:41:59 <karsten> reduces the load so much.
14:42:17 <karsten> okay, moving on.
14:42:31 <iwakeh> sorry, I got disconnected.
14:42:33 <karsten> (when did we lose iwakeh?)
14:42:42 <karsten> ah, 14:41.
14:42:47 <anathema_db> the varnish thing
14:42:49 <iwakeh> ok :-)
14:43:14 <anathema_db> well, maybe earlier :)
14:43:22 <karsten> will be in the logs!
14:43:26 <karsten> moving on:
14:43:28 <karsten> * onionoo-ng dataset integration into metrics.torproject.org (or: OnionStats next step) ?? (anathema)
14:43:41 <anathema_db> yeah so
14:44:28 <anathema_db> the original idea behind OnionStats was to play with the data, allowing flexibility in creating graphs
14:44:33 <anathema_db> stat graphs
14:45:03 <anathema_db> so users can interactively create the graphs they want
14:45:04 <karsten> ok.
14:45:29 <karsten> that would be different data though.
14:45:39 <anathema_db> yes
14:45:40 <anathema_db> exactly
14:45:54 <karsten> and, more data.
14:46:09 <anathema_db> so, as onionoo uses the data from CollecTor, I think we have a good amount of 'raw' data there
14:46:26 <karsten> I'd think that this onionoo-ng project might keep you busy for longer than you'll expect. :)
14:46:39 <anathema_db> sure, my point is:
14:46:42 <karsten> making it run smoothly might take a bit.
14:46:55 <karsten> and guess what happens when people notice that somebody adds features:
14:46:59 <iwakeh> ensuring data integrity.
14:47:00 <karsten> they ask for even more features!
14:47:06 <iwakeh> true.
14:47:49 <anathema_db> we can leverage ElasticSearch (like, the one used in onionoo-ng but with more data) to create dynamic metric graphs
14:48:30 <anathema_db> just as an example, by using a simple query in ES I was able to plot a bar graph with relays divided by country
14:49:03 <karsten> right, but that's just for the current network.
14:49:11 <karsten> which is a limitation of onionoo data.
14:49:19 <anathema_db> and with a single click, I was able to plot a graph with the number of platforms, divided by country
14:49:29 <anathema_db> well, I think it's ok to stay with the current network
14:49:42 <anathema_db> because I wanna see _right now_ the metrics of Tor network
14:49:53 <anathema_db> however, we can integrate more data
14:50:06 <karsten> true, but then you want to see if that's normal and whether it looked like that in the past or not.
14:50:21 <karsten> yes, I'd say let's complete the current project first before moving on to the next.
14:50:35 <karsten> for example, we didn't talk about deploying your code yet.
14:50:49 <karsten> I wonder if you'd want to run your own onionoo instance.
14:51:05 <karsten> and have people use that.
14:51:29 <iwakeh> An Onionoo mirror.
14:51:57 <karsten> yep, just one that provides more options, like sorting by different things than consensus weight.
14:52:02 <karsten> maybe that requires changes to atlas, too.
14:52:07 <karsten> so, your version of atlas.
14:52:28 <anathema_db> ok, so
14:52:33 <karsten> but even without those additions, it would be good to test your code by having actual users use it.
14:52:47 <anathema_db> I don't think we should duplicate stuff
14:53:01 <iwakeh> It needs more instances for that data.
14:53:11 <iwakeh> Decentralize.
14:53:18 <karsten> ah, setting up onionoo mirrors is something we'd like to do anyway.
14:53:30 <anathema_db> no I was talking about atlas
14:53:51 <iwakeh> But they should serve the same data or we all are busy explaining differences.
14:53:56 <karsten> so, in this case we'd just need a slightly modified atlas version that points to your onionoo server.
14:54:07 <anathema_db> karsten: exactly
14:54:08 <karsten> just a different url.
14:54:32 <anathema_db> I'll work to make onionoo-ng as backward compatible as possible
14:54:37 <karsten> we could extend that atlas version to make use of features that are only in your instance, but we don't have to.
14:54:50 <anathema_db> in the meanwhile, we can test it to find any bugs
14:54:54 <karsten> we should!
14:55:21 <karsten> the thing is, if you provide a unique feature, you'll have more testers.
14:55:46 <anathema_db> is it a good thing or a bad thing? :)
14:55:49 <karsten> hah
14:55:52 <iwakeh> depends
14:56:02 <anathema_db> I know :P
14:56:07 <iwakeh> :-)
14:56:29 <anathema_db> we also didn't talk about the new features we'd love to see in the new onionoo protocol version
14:56:37 <anathema_db> one is the sorting thing
14:56:47 <karsten> yep.
14:56:47 <anathema_db> next?
14:56:54 <karsten> very good question.
14:57:06 <karsten> I haven't looked at the open tickets for a while.
14:57:14 <karsten> and they might not have the answer.
14:57:19 <anathema_db> I still need to register. shameonme
14:57:38 <karsten> for trac? yes, you should. though you can look without registering.
14:57:40 <iwakeh> you can read anonymously ;-)
14:57:50 <anathema_db> yeah but I'd like to be notified of any new ticket
14:57:57 <anathema_db> or any changes
14:58:03 <karsten> there's a mailing list for that.
14:58:11 <karsten> tor-bugs@
14:58:12 <anathema_db> just to catch up with all the stuff
14:58:14 <karsten> you can subscribe to that.
14:58:19 <anathema_db> ah ok, cool, thanks
14:58:27 <karsten> np.
14:58:32 <karsten> okay, how about we talk more next week?
14:58:37 <anathema_db> sure!
14:58:46 <anathema_db> I'll implement the missing parts
14:58:57 <karsten> I'll test and reply to your latest email.
14:59:00 <anathema_db> then we'll talk via email
14:59:03 <karsten> btw, mind if we move that back to the mailing list?
14:59:12 <anathema_db> no problem karsten
14:59:15 <karsten> (I admit, it was my fault that it went from there to private email.)
14:59:16 <anathema_db> feel free to do it
14:59:19 <karsten> okay.
14:59:39 <anathema_db> great
14:59:42 <karsten> cool. thanks for a great meeting, anathema_db and iwakeh!
14:59:47 <anathema_db> thanks to you guys
14:59:48 <karsten> talk to you more in a week.
15:00:00 <karsten> #endmeeting