14:00:08 <karsten> #startmeeting metrics team
14:00:08 <MeetBot> Meeting started Thu Aug 4 14:00:08 2016 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:08 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:26 <karsten> hi anathema_db. you're here for the meeting?
14:00:29 <anathema_db> yep
14:00:40 <karsten> and you already found the agenda pad: https://pad.riseup.net/p/zUNzEIFRq5S4
14:01:10 <anathema_db> it seems so :)
14:01:17 <karsten> and iwakeh can't connect to oftc..
14:01:34 <anathema_db> :/
14:01:39 <anathema_db> so just me and you today ?
14:01:52 <karsten> well, let's wait a bit, maybe that gets resolved.
14:02:06 <karsten> also, let's collect topics until :05.
14:02:20 <anathema_db> sure
14:03:12 <anathema_db> so, I wrote down a couple of things
14:03:30 <anathema_db> one that I was working since this morning
14:04:22 <iwakeh> Hi there :-)
14:04:24 <anathema_db> hurray
14:04:27 <anathema_db> hi iwakeh
14:04:28 <karsten> hi iwakeh!
14:04:42 <iwakeh> They block Tor here :-(
14:04:51 <karsten> boo!
14:05:01 <karsten> anything else that should go on the agenda?
14:05:50 <iwakeh> all listed, I think.
14:06:20 <karsten> alright!
14:06:26 <karsten> * Monthly team report (karsten)
14:06:35 <karsten> I wrote mail about that to the list.
14:06:49 <iwakeh> any replies, yet?
14:07:09 <karsten> tl;dr: let's write monthly reports, and let's start with july 2016. any more input than what we already have from the MOSS work?
14:07:12 <karsten> none.
14:07:24 <iwakeh> ok.
14:08:04 <karsten> I also guess this will become clearer next month.
14:08:22 <karsten> and there's always the chance to mention progress next month, even if it started last month.
14:08:35 <anathema_db> so even progresses are ok?
14:08:48 <anathema_db> I thought we should include only "finished" stuff
14:08:56 <karsten> ah, bad wording on my part.
14:08:57 <anathema_db> "finished" or "released"
14:09:00 <karsten> yep.
14:09:30 <iwakeh> a release is a measurable progress ;-)
14:09:38 <anathema_db> aha
14:09:41 <karsten> let's focus on completed stuff in these reports.
14:10:20 <anathema_db> do we have any?
14:10:25 <karsten> and let's just see how this works out and whether we need to adjust criteria.
14:10:34 <karsten> any releases?
14:10:47 <anathema_db> something that will go into the monthly report
14:10:52 <karsten> ah, yes.
14:10:53 <anathema_db> (just a sneaky preview)
14:11:14 <karsten> - Added a new graph to Tor Metrics that shows a possible range of the number of clients by country and transport and which reveals the most popular pluggable transports in any given country [2] (#19544).
14:11:41 <iwakeh> Takes a looong time to preview everything.
14:11:50 <anathema_db> cool
14:12:33 <karsten> alright, let's see how this first report goes and maybe talk more next week or around end of august to make the next one even better.
14:12:42 <anathema_db> agree
14:12:50 <iwakeh> fine.
14:12:56 <karsten> ok.
14:13:20 <karsten> * CollecTor release(s) (iwakeh)
14:13:23 <iwakeh> #19813
14:13:33 <iwakeh> shall we stick to the 10th?
14:13:57 <karsten> sure!
14:14:16 <karsten> just move my new tickets out of 1.0.0 if they threaten the release date.
14:14:36 <iwakeh> Right, except for the small bugfixes.
14:15:07 <iwakeh> For 1.1.0 shall we aim at a date?
14:15:09 <karsten> yep, many small fixes.
14:15:23 <karsten> hmm
14:15:26 <iwakeh> (good that these keeep trickling in ...)
14:15:49 <karsten> should we discuss the 1.1.0 release date next week after 1.0.0 is released?
14:15:58 <iwakeh> well, before 31st. Ok.
14:16:14 <karsten> yes, before end of month.
14:16:45 <iwakeh> That's it from me.
14:16:51 <karsten> ok!
14:16:58 <karsten> * onionoo-ng - an Onionoo python implementation (anathema)
14:17:05 <karsten> how's this going? :)
14:17:19 <anathema_db> I've sent an email
14:17:38 <anathema_db> to give you an update and the first release of the project
14:17:56 <karsten> okay, I should read that.
14:18:00 <anathema_db> so
14:18:10 <anathema_db> karsten: yeah :)
14:18:14 <anathema_db> just here as a reference: https://github.com/davinerd/onionoo-ng
14:18:22 <anathema_db> and this is live: http://138.201.90.124:8080/details
14:19:22 <iwakeh> http://138.201.90.124:8080/summary
14:19:28 <iwakeh> 405: Method Not Allowed
14:19:30 <anathema_db> there are some differences from the original protocol (positive and negative) that I'll discuss maybe in list
14:19:39 <anathema_db> iwakeh: I implemented only details
14:19:40 <Lunar> irl: keep me for now, please
14:19:55 <anathema_db> as discussed with karsten at the beginning
14:19:58 <iwakeh> ok.
14:19:59 <karsten> is it difficult to implement the others?
14:20:03 <anathema_db> nah
14:20:06 <karsten> ok, great.
14:20:14 <karsten> should we talk about the differences to the current protocol?
14:20:17 <anathema_db> details was the 'difficult' part
14:20:19 <karsten> yep.
14:20:22 <anathema_db> sure, here?
14:20:31 <karsten> sure. we're only 20 minutes into the meeting.
14:20:39 <anathema_db> :)
14:20:42 <karsten> - results are returned for both bridges and relays when using 'limit' and 'offset'
14:20:52 <karsten> so, limit 20 gives you 20 bridges and 20 relays?
14:20:56 <anathema_db> yep
14:21:08 <iwakeh> Maybe this: "bridges_published": null,
14:21:26 <anathema_db> iwakeh: I wrote about that in another email
14:21:38 <iwakeh> no, I meant to ask here.
14:22:00 <iwakeh> discuss, that question here.
14:22:04 <anathema_db> well, should we talk about protocol differences or missed implementation? :)
14:22:15 <anathema_db> one at the time
14:22:23 <karsten> regarding limit and offset, the idea was to support result pagination.
14:22:34 <karsten> happy to wait.
14:22:49 <karsten> so, bridges_published is just a missed implementation that will be fixed later?
14:22:50 <iwakeh> no, protocol first.
14:22:57 <anathema_db> aha
14:23:03 <iwakeh> :-) I interrupted.
14:23:20 <karsten> okay, back to limit and offset? :)
14:23:25 <anathema_db> ok
14:23:35 <karsten> ok. so, I don't know what's most useful for clients.
14:23:35 <anathema_db> so what do you mean by 'result pagination' ?
14:23:46 <karsten> ah, show 10 results per page, give me page 5.
14:23:53 <karsten> so, limit = 10, offset = 40.
14:23:54 <anathema_db> like elasticsearch scroll
14:24:03 <karsten> uhmm, maybe?
14:24:20 <anathema_db> yeah elasticsearch can use pagination
14:24:31 <karsten> okay, so that's the idea behind those parameters.
14:24:45 <anathema_db> it's already implemented
14:24:46 <karsten> trouble is, if we change semantics of parameters, that's bad.
14:25:05 <karsten> we'd have to add new parameters, or raise the protocol version and tell clients they're behind.
14:25:13 <karsten> but we should only do that for really important changes.
14:25:18 <anathema_db> if you say offset=40 limit=10, onionoo-ng will skip the first 40 results and returns the next 10 results
14:25:19 <karsten> this one doesn't seem as important.
14:25:37 <karsten> ok?
14:25:48 <anathema_db> yeah we should minimise semantic changes
14:25:49 <karsten> but doesn't the current protocol do the same?
14:26:05 <karsten> I thought you'd return 10 bridges and 10 relays.
14:26:10 <anathema_db> yes
14:26:15 <anathema_db> but if you do details?limit=10
14:26:23 <anathema_db> current protocol will give you only 10 relays.
14:26:26 <anathema_db> and 0 bridges
14:26:27 <karsten> yep.
14:26:33 <karsten> because you wanted 10 results.
14:26:37 <karsten> and relays come first.
14:26:42 <anathema_db> 10 results total, yes
14:26:46 <karsten> if you want bridges, you'd say type=bridge.
14:27:04 <anathema_db> I interpreted 10 results of both
14:27:15 <karsten> okay, let's talk more about that using email.
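
The two readings of 'limit' and 'offset' discussed above can be sketched as follows. This is a minimal illustration, not Onionoo or onionoo-ng code: the function names and the plain-list data model are assumptions made for the example. The first function applies one window to relays followed by bridges ("10 results total", "relays come first"); the second applies the same window to each type separately ("10 results of both").

```python
def paginate_total(relays, bridges, offset=0, limit=None):
    """One window over relays followed by bridges (hypothetical helper)."""
    combined = [("relay", r) for r in relays] + [("bridge", b) for b in bridges]
    end = None if limit is None else offset + limit
    window = combined[offset:end]
    kept_relays = [item for kind, item in window if kind == "relay"]
    kept_bridges = [item for kind, item in window if kind == "bridge"]
    return kept_relays, kept_bridges


def paginate_per_type(relays, bridges, offset=0, limit=None):
    """The same window applied to relays and bridges separately (hypothetical helper)."""
    end = None if limit is None else offset + limit
    return relays[offset:end], bridges[offset:end]


# Made-up sample data: 50 relays and 30 bridges.
sample_relays = ["relay%d" % i for i in range(50)]
sample_bridges = ["bridge%d" % i for i in range(30)]

total_relays, total_bridges = paginate_total(sample_relays, sample_bridges, limit=10)
each_relays, each_bridges = paginate_per_type(sample_relays, sample_bridges, limit=10)
print(len(total_relays), len(total_bridges))
print(len(each_relays), len(each_bridges))
```

With limit=10 and no offset, the first reading yields 10 relays and no bridges, while the second yields 10 of each, which is the difference raised in the discussion.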
14:27:34 <anathema_db> of course that's open implementation, we can change that, but I thought it was more useful
14:27:35 <karsten> ok?
14:27:45 <karsten> yes, we'll have to think about that.
14:27:50 <karsten> I'm open to improving the protocol.
14:27:53 <iwakeh> Sorry, found that offset alone triggers 500.
14:27:55 <anathema_db> sure
14:28:07 <anathema_db> let me check iwakeh
14:28:26 <iwakeh> http://138.201.90.124:8080/details?offset=3
14:28:36 <anathema_db> yeah will check later, thanks
14:28:40 <karsten> - 'order' parameter's value can be any field, so it's not limited to the 'consensus_weight'
14:28:44 <karsten> great!
14:29:03 <anathema_db> :D
14:29:17 <karsten> - 'fields' parameter's value can be any field
14:29:20 <karsten> not sure I understand.
14:29:25 <anathema_db> I did some tests but yes, the idea was to give it to you so you can test it
14:30:04 <anathema_db> karsten: yeah my bad
14:30:14 <karsten> ok.
14:30:18 <anathema_db> I thought the original protocol could only support few fields
14:30:19 <karsten> in theory, that should already be the case.
14:30:20 <karsten> ok.
14:30:23 <anathema_db> but you can specify any field
14:30:26 <anathema_db> yep
14:30:27 <karsten> - 'lookup' is not implemented: I was not able to find a difference between 'lookup' and 'fingerprint': can you provide some real examples?
14:30:46 <karsten> I think fingerprint is not limited to relays and bridges running in the past week.
14:30:48 <anathema_db> yeah I did not understand the difference between lookup and fingerprint
14:30:52 <karsten> I'd have to look, too.
14:30:57 <karsten> - 'search' does not implement: "any 4 hex characters of a space-separated fingerprint" and "beginning of a base64-encoded fingerprint without trailing equal signs": I was not able to find any relevant case for those
14:30:58 <anathema_db> cool, thanks
14:31:15 <karsten> sometimes users find fingerprint parts and paste them in.
14:31:27 <karsten> for example, fingerprints are often printed in blocks of 4 hex chars.
14:31:39 <karsten> and if you add them, onionoo will think it's 10 x 4 search terms.
14:31:50 <anathema_db> yeah but so it should support "abcd 0123 abcd" ?
14:32:00 <karsten> it's useful, yes.
14:32:02 <anathema_db> but also "abcd" ?
14:32:05 <karsten> yes.
14:32:15 <anathema_db> ok, gotcha
14:32:18 <karsten> note that search terms are boolean-and-ed.
14:32:23 <anathema_db> and what about the base64 encoding ?
14:32:27 <anathema_db> yep
14:32:29 <anathema_db> done that
14:32:32 <karsten> this is not to support searches for "0123" above.
14:32:38 <karsten> which would make little sense.
14:32:48 <karsten> base64
14:32:53 <karsten> is contained in raw consensuses.
14:33:08 <karsten> https://collector.torproject.org/recent/relay-descriptors/consensuses/2016-08-04-14-00-00-consensus
14:33:13 <karsten> r PDrelay1 AAFJ5u9xAqrKlpDW6N0pMhJLlKs 6kNxzNyUwLoRAlCxQTo2cK4QN0A 2016-08-04 10:54:12 95.215.44.189 8080 0
14:33:25 <karsten> if you want to find that relay without converting that base64 to hex,
14:33:33 <karsten> just search for AAFJ5u9xAqrKlpDW6N0pMhJLlKs
14:33:42 <karsten> this is useful for people debugging the network.
14:34:01 <anathema_db> aah ok
14:34:07 <anathema_db> now it's clear, thanks
14:34:13 <karsten> okay, let me read that mail in more detail and reply.
14:34:18 <anathema_db> sure!
14:34:28 <anathema_db> to go back to iwakeh: it will be implemente
14:34:29 <karsten> the goal should be to make as few backward-incompatible changes as possible.
14:34:31 <anathema_db> *implemented
14:34:43 <karsten> ideally zero. :)
14:34:48 <anathema_db> but I was not able to find the "source" of that field
14:34:53 <anathema_db> karsten: agree :)
14:35:19 <anathema_db> note: I think I can get the same info by asking for the most recent node in the dataset
14:35:32 <anathema_db> (the one with the most recent "last_seen" timestamp)
14:35:49 <anathema_db> but the Onionoo's approach is different, just I was not able to locate the code
14:35:55 <anathema_db> (it's all written in the email)
14:36:18 <karsten> right, onionoo doesn't read all details files.
14:36:24 <karsten> it reads the summary file.
14:36:55 <anathema_db> for the relays_published?
14:37:02 <anathema_db> it reads the consensus
14:37:15 <anathema_db> but the class is defined outside the project I think
14:37:18 <karsten> wait, there are two pieces of onionoo.
14:37:22 <anathema_db> hu
14:37:28 <karsten> the first reads the consensus and other descriptors and writes the files I gave you.
14:37:36 <karsten> the second reads those files and answers client requests.
14:37:43 <anathema_db> hu
14:38:01 <karsten> you're replacing the second part.
14:38:04 <anathema_db> so…which field fills the "relays_published" in the output ?
14:38:24 <anathema_db> I mean, where in the snapshot data I can find that value ?
14:38:49 <karsten> max value of last_seen would work.
14:38:58 <anathema_db> ok
14:39:06 <karsten> the current onionoo looks at the summary file to determine that.
14:39:11 <anathema_db> so it's already done (in dev env)
14:39:12 <karsten> whereas you'd look at the details files.
14:39:41 <anathema_db> it's easy and fast to do in elasticsearch so no problem
14:39:45 <karsten> ok.
14:39:46 <anathema_db> already implemented a stub
14:39:55 <anathema_db> just not published as I was unsure about that
14:40:14 <karsten> cool, let me respond to your mail, hopefully tomorrow.
14:40:18 <anathema_db> no rush
14:40:32 <karsten> :)
14:40:33 <karsten> moving on?
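
The base64 fingerprint search described earlier in this topic can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not code from Onionoo or onionoo-ng: the helper names are made up for the example, and the matching rule is only the "beginning of a base64-encoded fingerprint without trailing equal signs" case quoted above. The sample value is the one from the consensus line pasted in the log.

```python
import base64
import binascii


def base64_fingerprint_to_hex(b64_fingerprint):
    """Convert an unpadded base64 fingerprint to upper-case hex (hypothetical helper)."""
    # Consensus entries omit the trailing '=' signs, so restore the padding first.
    padded = b64_fingerprint + "=" * (-len(b64_fingerprint) % 4)
    raw = base64.b64decode(padded)
    return binascii.hexlify(raw).decode("ascii").upper()


def hex_fingerprint_to_base64(hex_fingerprint):
    """Convert a hex fingerprint to base64 without trailing equal signs (hypothetical helper)."""
    raw = binascii.unhexlify(hex_fingerprint)
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def matches_base64_search(term, hex_fingerprint):
    """True if the term is the beginning of the fingerprint's unpadded base64 form."""
    return hex_fingerprint_to_base64(hex_fingerprint).startswith(term)


# Fingerprint taken from the consensus line quoted in the log.
hex_fp = base64_fingerprint_to_hex("AAFJ5u9xAqrKlpDW6N0pMhJLlKs")
print(hex_fp)
print(matches_base64_search("AAFJ5u9x", hex_fp))
```

A server that stores hex fingerprints could precompute the unpadded base64 form once per relay and match search terms against it, so that people debugging the network do not need to convert anything by hand.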
14:40:37 <anathema_db> just as a note
14:40:50 <anathema_db> at the moment, there is no webserver in front of onionoo-ng
14:41:00 <anathema_db> so you're interfacing with the raw app
14:41:30 <anathema_db> I don't think we need varnish or whatever
14:41:36 <anathema_db> but more tests need to be done
14:41:48 <anathema_db> we can move on
14:41:51 <karsten> oh, for a production environment it would probably make sense.
14:41:59 <karsten> reduces the load so much.
14:42:17 <karsten> okay, moving on.
14:42:31 <iwakeh> sorry, I got disconnected.
14:42:33 <karsten> (when did we lose iwakeh?)
14:42:42 <karsten> ah, 14:41.
14:42:47 <anathema_db> the varnish thing
14:42:49 <iwakeh> ok :-)
14:43:14 <anathema_db> well, maybe earlier :)
14:43:22 <karsten> will be in the logs!
14:43:26 <karsten> moving on:
14:43:28 <karsten> * onionoo-ng dataset integration into metrics.torproject.org (or: OnionStats next step) ?? (anathema)
14:43:41 <anathema_db> yeah so
14:44:28 <anathema_db> the original idea behind OnionStats was to play with the data, allowing flexibility in creating graphs
14:44:33 <anathema_db> stat graphs
14:45:03 <anathema_db> so users can interactively create the graphs they want
14:45:04 <karsten> ok.
14:45:29 <karsten> that would be different data though.
14:45:39 <anathema_db> yes
14:45:40 <anathema_db> exact
14:45:54 <karsten> and, more data.
14:46:09 <anathema_db> so, as onionoo use the data from CollecTor, I think we have a good amount of 'raw' data there
14:46:26 <karsten> I'd think that this onionoo-ng project might keep you busy for longer than you'll expect. :)
14:46:39 <anathema_db> sure, my point is:
14:46:42 <karsten> making it run smoothly might take a bit.
14:46:55 <karsten> and guess what happens when people notice that somebody adds features:
14:46:59 <iwakeh> ensuring data integrity.
14:47:00 <karsten> they ask for even more features!
14:47:06 <iwakeh> true.
14:47:49 <anathema_db> we can leverage ElasticSearch (like, the one used in onionoo-ng but with more data) to create dynamic metric graphs
14:48:30 <anathema_db> just as an example, by using a simple query in ES I was able to plot a bar graph with relays divided by country
14:49:03 <karsten> right, but that's just for the current network.
14:49:11 <karsten> which is a limitation of onionoo data.
14:49:19 <anathema_db> and with a single click, I was able to plot a graph with the number of platform, divided by country
14:49:29 <anathema_db> well, I think it's ok to stay with the current network
14:49:42 <anathema_db> because I wanna see _right now_ the metrics of Tor network
14:49:53 <anathema_db> however, we can integrate more data
14:50:06 <karsten> true, but then you want to see if that's normal and whether it looked like that in the past or not.
14:50:21 <karsten> yes, I'd say let's complete the current project first before moving on to the next.
14:50:35 <karsten> for example, we didn't talk about deploying your code yet.
14:50:49 <karsten> I wonder if you'd want to run your own onionoo instance.
14:51:05 <karsten> and have people use that.
14:51:29 <iwakeh> An Onionoo mirror.
14:51:57 <karsten> yep, just one that provides more options, like sorting by different things than consensus weight.
14:52:02 <karsten> maybe that requires changes to atlas, too.
14:52:07 <karsten> so, your version of atlas.
14:52:28 <anathema_db> ok, so
14:52:33 <karsten> but even without those additions, it would be good to test your code by having actual users use it.
14:52:47 <anathema_db> I don't think we should deduplicate stuff
14:52:57 <anathema_db> *duplicate
14:53:01 <iwakeh> It needs more instances for that data.
14:53:11 <iwakeh> Decentralize.
14:53:18 <karsten> ah, setting up onionoo mirrors is something we'd like to do anyway.
14:53:30 <anathema_db> no I was talking about atals
14:53:33 <anathema_db> *atlas
14:53:51 <iwakeh> But they should serve the same data or we all are busy explaining differences.
14:53:56 <karsten> so, in this case we'd just need a slightly modified atlas version that points to your onionoo server.
14:54:07 <anathema_db> karsten: exact
14:54:08 <karsten> just a different url.
14:54:32 <anathema_db> I'll work to make onionoo-ng as more backward compatible as possible
14:54:37 <karsten> we could extend that atlas version to make use of features that are only in your instance, but we don't have to.
14:54:50 <anathema_db> in the meanwhile, we can test it to find any bugs
14:54:54 <karsten> we should!
14:55:21 <karsten> the thing is, if you provide a unique feature, you'll have more testers.
14:55:46 <anathema_db> is it a good thing or a bad thing? :)
14:55:49 <karsten> hah
14:55:52 <iwakeh> depends
14:56:02 <anathema_db> I know :P
14:56:07 <iwakeh> :-)
14:56:29 <anathema_db> we also didn't talk about the new feature we'd love to see in the new onionoo protocol version
14:56:37 <anathema_db> one is the sorting thing
14:56:47 <karsten> yep.
14:56:47 <anathema_db> then?
14:56:51 <anathema_db> *next
14:56:54 <karsten> very good question.
14:57:06 <karsten> I haven't looked at the open tickets for a while.
14:57:14 <karsten> and they might not have the answer.
14:57:19 <anathema_db> I still need to register. shameonme
14:57:38 <karsten> for trac? yes, you should. though you can look without registering
14:57:39 <karsten> .
14:57:40 <iwakeh> you can read anonymously ;-)
14:57:50 <anathema_db> yeah but I'd like to be notified of any new ticket
14:57:57 <anathema_db> or any changes
14:58:03 <karsten> there's a mailing list for that.
14:58:11 <karsten> tor-bugs@
14:58:12 <anathema_db> just to catch up with all the stuff
14:58:14 <karsten> you can subscribe to that.
14:58:19 <anathema_db> ah ok, cool, thanks
14:58:27 <karsten> np.
14:58:32 <karsten> okay, how about we talk more next week?
14:58:37 <anathema_db> sure!
14:58:46 <anathema_db> I'll implement the missing parts
14:58:57 <karsten> I'll test and reply to your latest email.
14:59:00 <anathema_db> then we'll talk via email
14:59:03 <karsten> btw, mind if we move that back to the mailing list?
14:59:12 <anathema_db> no problem karsten
14:59:15 <karsten> (I admit, it was my fault that it went from there to private email.)
14:59:16 <anathema_db> feel free to do it
14:59:19 <karsten> okay.
14:59:39 <anathema_db> great
14:59:42 <karsten> cool. thanks for a great meeting, anathema_db and iwakeh!
14:59:47 <anathema_db> thanks to you guys
14:59:48 <karsten> talk to you more in a week.
15:00:00 <karsten> #endmeeting