14:00:08 #startmeeting metrics team
14:00:08 Meeting started Thu Aug 4 14:00:08 2016 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:08 Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:26 hi anathema_db. you're here for the meeting?
14:00:29 yep
14:00:40 and you already found the agenda pad: https://pad.riseup.net/p/zUNzEIFRq5S4
14:01:10 it seems so :)
14:01:17 and iwakeh can't connect to oftc..
14:01:34 :/
14:01:39 so just me and you today ?
14:01:52 well, let's wait a bit, maybe that gets resolved.
14:02:06 also, let's collect topics until :05.
14:02:20 sure
14:03:12 so, I wrote down a couple of things
14:03:30 one that I was working since this morning
14:04:22 Hi there :-)
14:04:24 hurray
14:04:27 hi iwakeh
14:04:28 hi iwakeh!
14:04:42 They block Tor here :-(
14:04:51 boo!
14:05:01 anything else that should go on the agenda?
14:05:50 all listed, I think.
14:06:20 alright!
14:06:26 * Monthly team report (karsten)
14:06:35 I wrote mail about that to the list.
14:06:49 any replies, yet?
14:07:09 tl;dr: let's write monthly reports, and let's start with july 2016. any more input than what we already have from the MOSS work?
14:07:12 none.
14:07:24 ok.
14:08:04 I also guess this will become clearer next month.
14:08:22 and there's always the chance to mention progress next month, even if it started last month.
14:08:35 so even progresses are ok?
14:08:48 I thought we should include only "finished" stuff
14:08:56 ah, bad wording on my part.
14:08:57 "finished" or "released"
14:09:00 yep.
14:09:30 a release is a measurable progress ;-)
14:09:38 aha
14:09:41 let's focus on completed stuff in these reports.
14:10:20 do we have any?
14:10:25 and let's just see how this works out and whether we need to adjust criteria.
14:10:34 any releases?
14:10:47 something that will go into the monthly report
14:10:52 ah, yes.
14:10:53 (just a sneaky preview)
14:11:14 - Added a new graph to Tor Metrics that shows a possible range of the number of clients by country and transport and which reveals the most popular pluggable transports in any given country [2] (#19544).
14:11:41 Takes a looong time to preview everything.
14:11:50 cool
14:12:33 alright, let's see how this first report goes and maybe talk more next week or around end of august to make the next one even better.
14:12:42 agree
14:12:50 fine.
14:12:56 ok.
14:13:20 * CollecTor release(s) (iwakeh)
14:13:23 #19813
14:13:33 shall we stick to the 10th?
14:13:57 sure!
14:14:16 just move my new tickets out of 1.0.0 if they threaten the release date.
14:14:36 Right, except for the small bugfixes.
14:15:07 For 1.1.0 shall we aim at a date?
14:15:09 yep, many small fixes.
14:15:23 hmm
14:15:26 (good that these keep trickling in ...)
14:15:49 should we discuss the 1.1.0 release date next week after 1.0.0 is released?
14:15:58 well, before 31st. Ok.
14:16:14 yes, before end of month.
14:16:45 That's it from me.
14:16:51 ok!
14:16:58 * onionoo-ng - an Onionoo python implementation (anathema)
14:17:05 how's this going? :)
14:17:19 I've sent an email
14:17:38 to give you an update and the first release of the project
14:17:56 okay, I should read that.
14:18:00 so
14:18:10 karsten: yeah :)
14:18:14 just here as a reference: https://github.com/davinerd/onionoo-ng
14:18:22 and this is live: http://138.201.90.124:8080/details
14:19:22 http://138.201.90.124:8080/summary
14:19:28 405: Method Not Allowed
14:19:30 there are some differences from the original protocol (positive and negative) that I'll discuss maybe in list
14:19:39 iwakeh: I implemented only details
14:19:40 irl: keep me for now, please
14:19:55 as discussed with karsten at the beginning
14:19:58 ok.
14:19:59 is it difficult to implement the others?
14:20:03 nah
14:20:06 ok, great.
14:20:14 should we talk about the differences to the current protocol?
14:20:17 details was the 'difficult' part
14:20:19 yep.
14:20:22 sure, here?
14:20:31 sure. we're only 20 minutes into the meeting.
14:20:39 :)
14:20:42 - results are returned for both bridges and relays when using 'limit' and 'offset'
14:20:52 so, limit 20 gives you 20 bridges and 20 relays?
14:20:56 yep
14:21:08 Maybe this: "bridges_published": null,
14:21:26 iwakeh: I wrote about that in another email
14:21:38 no, I meant to ask here.
14:22:00 discuss, that question here.
14:22:04 well, should we talk about protocol differences or missed implementation? :)
14:22:15 one at the time
14:22:23 regarding limit and offset, the idea was to support result pagination.
14:22:34 happy to wait.
14:22:49 so, bridges_published is just a missed implementation that will be fixed later?
14:22:50 no, protocol first.
14:22:57 aha
14:23:03 :-) I interrupted.
14:23:20 okay, back to limit and offset? :)
14:23:25 ok
14:23:35 ok. so, I don't know what's most useful for clients.
14:23:35 so what do you mean by 'result pagination' ?
14:23:46 ah, show 10 results per page, give me page 5.
14:23:53 so, limit = 10, offset = 40.
14:23:54 like elasticsearch scroll
14:24:03 uhmm, maybe?
14:24:20 yeah elasticsearch can use pagination
14:24:31 okay, so that's the idea behind those parameters.
14:24:45 it's already implemented
14:24:46 trouble is, if we change semantics of parameters, that's bad.
14:25:05 we'd have to add new parameters, or raise the protocol version and tell clients they're behind.
14:25:13 but we should only do that for really important changes.
14:25:18 if you say offset=40 limit=10, onionoo-ng will skip the first 40 results and returns the next 10 results
14:25:19 this one doesn't seem as important.
14:25:37 ok?
14:25:48 yeah we should minimise semantic changes
14:25:49 but doesn't the current protocol do the same?
14:26:05 I thought you'd return 10 bridges and 10 relays.
14:26:10 yes
14:26:15 but if you do details?limit=10
14:26:23 current protocol will give you only 10 relays.
14:26:26 and 0 bridges
14:26:27 yep.
14:26:33 because you wanted 10 results.
14:26:37 and relays come first.
14:26:42 10 results total, yes
14:26:46 if you want bridges, you'd say type=bridge.
14:27:04 I interpreted 10 results of both
14:27:15 okay, let's talk more about that using email.
14:27:34 of course that's open implementation, we can change that, but I thought it was more useful
14:27:35 ok?
14:27:45 yes, we'll have to think about that.
14:27:50 I'm open to improving the protocol.
14:27:53 Sorry, found that offset alone triggers 500.
14:27:55 sure
14:28:07 let me check iwakeh
14:28:26 http://138.201.90.124:8080/details?offset=3
14:28:36 yeah will check later, thanks
14:28:40 - 'order' parameter's value can be any field, so it's not limited to the 'consensus_weight'
14:28:44 great!
14:29:03 :D
14:29:17 - 'fields' parameter's value can be any field
14:29:20 not sure I understand.
14:29:25 I did some tests but yes, the idea was to give it to you so you can test it
14:30:04 karsten: yeah my bad
14:30:14 ok.
14:30:18 I thought the original protocol could only support few fields
14:30:19 in theory, that should already be the case.
14:30:20 ok.
14:30:23 but you can specify any field
14:30:26 yep
14:30:27 - 'lookup' is not implemented: I was not able to find a difference between 'lookup' and 'fingerprint': can you provide some real examples?
14:30:46 I think fingerprint is not limited to relays and bridges running in the past week.
14:30:48 yeah I did not understand the difference between lookup and fingerprint
14:30:52 I'd have to look, too.
14:30:57 - 'search' does not implement: "any 4 hex characters of a space-separated fingerprint" and "beginning of a base64-encoded fingerprint without trailing equal signs": I was not able to find any relevant case for those
14:30:58 cool, thanks
14:31:15 sometimes users find fingerprint parts and paste them in.
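Note on 'limit' and 'offset' (14:23 to 14:27): the semantics karsten describes for the current protocol can be sketched in a few lines of Python. This is only an illustration of what was said in the meeting (results are counted in total, relays come first), not code from Onionoo or onionoo-ng; the function names page_to_offset and apply_offset_and_limit are hypothetical.

```python
def page_to_offset(page, per_page):
    """Translate a 1-based page number into an 'offset' value."""
    return (page - 1) * per_page


def apply_offset_and_limit(relays, bridges, offset=0, limit=None):
    """Skip `offset` results, then keep at most `limit` results in total.

    Relays are counted before bridges, as described in the meeting.
    """
    # Tag each document so the two lists can be separated again afterwards.
    combined = [("relay", r) for r in relays] + [("bridge", b) for b in bridges]
    end = None if limit is None else offset + limit
    window = combined[offset:end]
    kept_relays = [doc for kind, doc in window if kind == "relay"]
    kept_bridges = [doc for kind, doc in window if kind == "bridge"]
    return kept_relays, kept_bridges
```

With 10 results per page, page 5 corresponds to limit=10 and offset=40. Under this reading, details?limit=10 returns 10 relays and no bridges whenever at least 10 relays match, which is the difference from the "10 of each" interpretation in onionoo-ng.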
14:31:27 for example, fingerprints are often printed in blocks of 4 hex chars.
14:31:39 and if you add them, onionoo will think it's 10 x 4 search terms.
14:31:50 yeah but so it should support "abcd 0123 abcd" ?
14:32:00 it's useful, yes.
14:32:02 but also "abcd" ?
14:32:05 yes.
14:32:15 ok, gotcha
14:32:18 note that search terms are boolean-and-ed.
14:32:23 and what about the base64 encoding ?
14:32:27 yep
14:32:29 done that
14:32:32 this is not to support searches for "0123" above.
14:32:38 which would make little sense.
14:32:48 base64
14:32:53 is contained in raw consensuses.
14:33:08 https://collector.torproject.org/recent/relay-descriptors/consensuses/2016-08-04-14-00-00-consensus
14:33:13 r PDrelay1 AAFJ5u9xAqrKlpDW6N0pMhJLlKs 6kNxzNyUwLoRAlCxQTo2cK4QN0A 2016-08-04 10:54:12 95.215.44.189 8080 0
14:33:25 if you want to find that relay without converting that base64 to hex,
14:33:33 just search for AAFJ5u9xAqrKlpDW6N0pMhJLlKs
14:33:42 this is useful for people debugging the network.
14:34:01 aah ok
14:34:07 now it's clear, thanks
14:34:13 okay, let me read that mail in more detail and reply.
14:34:18 sure!
14:34:28 to go back to iwakeh: it will be implemente
14:34:29 the goal should be to make as few backward-incompatible changes as possible.
14:34:31 *implemented
14:34:43 ideally zero. :)
14:34:48 but I was not able to find the "source" of that field
14:34:53 karsten: agree :)
14:35:19 note: I think I can get the same info by asking for the most recent node in the dataset
14:35:32 (the one with the most recent "last_seen" timestamp)
14:35:49 but the Onionoo's approach is different, just I was not able to locate the code
14:35:55 (it's all written in the email)
14:36:18 right, onionoo doesn't read all details files.
14:36:24 it reads the summary file.
14:36:55 for the relays_published?
14:37:02 it reads the consensus
14:37:15 but the class is defined outside the project I think
14:37:18 wait, there are two pieces of onionoo.
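Note on the base64 search case (14:32 to 14:33): the fingerprint in a consensus "r" line is the base64 encoding of the 20-byte identity digest with the trailing "=" removed. A minimal sketch of the conversion to the hex form, and of the blocks of 4 hex characters mentioned above, could look like this. The function names are hypothetical and this is not taken from Onionoo or onionoo-ng.

```python
import base64


def base64_fingerprint_to_hex(b64_fingerprint):
    """Convert an unpadded base64 fingerprint to 40 upper-case hex characters."""
    # Restore the '=' padding that consensus documents leave out.
    padded = b64_fingerprint + "=" * (-len(b64_fingerprint) % 4)
    raw = base64.b64decode(padded)
    if len(raw) != 20:
        raise ValueError("expected a 20-byte fingerprint")
    return raw.hex().upper()


def hex_blocks(hex_fingerprint):
    """Split a hex fingerprint into the blocks of 4 characters often shown to users."""
    return [hex_fingerprint[i:i + 4] for i in range(0, len(hex_fingerprint), 4)]
```

For example, base64_fingerprint_to_hex("AAFJ5u9xAqrKlpDW6N0pMhJLlKs") yields the 40-character hex fingerprint of the PDrelay1 entry quoted above, and hex_blocks() of that value gives the ten blocks a user might paste into a search.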
14:37:22 hu
14:37:28 the first reads the consensus and other descriptors and writes the files I gave you.
14:37:36 the second reads those files and answers client requests.
14:37:43 hu
14:38:01 you're replacing the second part.
14:38:04 so…which field fills the "relays_published" in the output ?
14:38:24 I mean, where in the snapshot data I can find that value ?
14:38:49 max value of last_seen would work.
14:38:58 ok
14:39:06 the current onionoo looks at the summary file to determine that.
14:39:11 so it's already done (in dev env)
14:39:12 whereas you'd look at the details files.
14:39:41 it's easy and fast to do in elasticsearch so no problem
14:39:45 ok.
14:39:46 already implemented a stub
14:39:55 just not published as I was unsure about that
14:40:14 cool, let me respond to your mail, hopefully tomorrow.
14:40:18 no rush
14:40:32 :)
14:40:33 moving on?
14:40:37 just as a note
14:40:50 at the moment, there is no webserver in front of onionoo-ng
14:41:00 so you're interfacing with the raw app
14:41:30 I don't think we need varnish or whatever
14:41:36 but more tests need to be done
14:41:48 we can move on
14:41:51 oh, for a production environment it would probably make sense.
14:41:59 reduces the load so much.
14:42:17 okay, moving on.
14:42:31 sorry, I got disconnected.
14:42:33 (when did we lose iwakeh?)
14:42:42 ah, 14:41.
14:42:47 the varnish thing
14:42:49 ok :-)
14:43:14 well, maybe earlier :)
14:43:22 will be in the logs!
14:43:26 moving on:
14:43:28 * onionoo-ng dataset integration into metrics.torproject.org (or: OnionStats next step) ?? (anathema)
14:43:41 yeah so
14:44:28 the original idea behind OnionStats was to play with the data, allowing flexibility in creating graphs
14:44:33 stat graphs
14:45:03 so users can interactively create the graphs they want
14:45:04 ok.
14:45:29 that would be different data though.
14:45:39 yes
14:45:40 exact
14:45:54 and, more data.
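Note on "relays_published" (14:38 to 14:39): the workaround agreed on, taking the maximum "last_seen" value over the details documents, can be sketched as follows. This assumes the documents are already loaded as Python dicts and that "last_seen" uses the "YYYY-MM-DD hh:mm:ss" UTC format; the function name latest_last_seen is hypothetical, and the current Onionoo derives the value from its summary file instead.

```python
from datetime import datetime

# Assumed timestamp layout of the "last_seen" field.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def latest_last_seen(details_documents):
    """Return the most recent 'last_seen' value, or None if there is none."""
    stamps = [
        datetime.strptime(doc["last_seen"], TIME_FORMAT)
        for doc in details_documents
        if doc.get("last_seen")
    ]
    if not stamps:
        return None
    # Format the result the same way it was read.
    return max(stamps).strftime(TIME_FORMAT)
```

The same helper would apply to "bridges_published" when fed only bridge documents, under the same assumption.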
14:46:09 so, as onionoo use the data from CollecTor, I think we have a good amount of 'raw' data there
14:46:26 I'd think that this onionoo-ng project might keep you busy for longer than you'll expect. :)
14:46:39 sure, my point is:
14:46:42 making it run smoothly might take a bit.
14:46:55 and guess what happens when people notice that somebody adds features:
14:46:59 ensuring data integrity.
14:47:00 they ask for even more features!
14:47:06 true.
14:47:49 we can leverage ElasticSearch (like, the one used in onionoo-ng but with more data) to create dynamic metric graphs
14:48:30 just as an example, by using a simple query in ES I was able to plot a bar graph with relays divided by country
14:49:03 right, but that's just for the current network.
14:49:11 which is a limitation of onionoo data.
14:49:19 and with a single click, I was able to plot a graph with the number of platform, divided by country
14:49:29 well, I think it's ok to stay with the current network
14:49:42 because I wanna see _right now_ the metrics of Tor network
14:49:53 however, we can integrate more data
14:50:06 true, but then you want to see if that's normal and whether it looked like that in the past or not.
14:50:21 yes, I'd say let's complete the current project first before moving on to the next.
14:50:35 for example, we didn't talk about deploying your code yet.
14:50:49 I wonder if you'd want to run your own onionoo instance.
14:51:05 and have people use that.
14:51:29 An Onionoo mirror.
14:51:57 yep, just one that provides more options, like sorting by different things than consensus weight.
14:52:02 maybe that requires changes to atlas, too.
14:52:07 so, your version of atlas.
14:52:28 ok, so
14:52:33 but even without those additions, it would be good to test your code by having actual users use it.
14:52:47 I don't think we should deduplicate stuff
14:52:57 *duplicate
14:53:01 It needs more instances for that data.
14:53:11 Decentralize.
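Note on the "relays divided by country" graph (14:48): the query itself was not shown in the meeting. One plausible form is an Elasticsearch terms aggregation, sketched here as a request body built in Python. The field name "country" and its mapping as a non-analyzed field are assumptions about the onionoo-ng index, and both function names are hypothetical.

```python
def relays_by_country_query(max_countries=250):
    """Build a request body that counts documents per country."""
    return {
        "size": 0,  # return only the aggregation, no individual documents
        "aggs": {
            "relays_by_country": {
                "terms": {"field": "country", "size": max_countries}
            }
        },
    }


def buckets_to_counts(response):
    """Turn the aggregation part of a search response into {country: count}."""
    buckets = response["aggregations"]["relays_by_country"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}
```

The resulting dict can be passed to any plotting tool to draw the bar graph; as noted in the discussion, it only reflects the current network, not its history.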
14:53:18 ah, setting up onionoo mirrors is something we'd like to do anyway.
14:53:30 no I was talking about atals
14:53:33 *atlas
14:53:51 But they should serve the same data or we all are busy explaining differences.
14:53:56 so, in this case we'd just need a slightly modified atlas version that points to your onionoo server.
14:54:07 karsten: exact
14:54:08 just a different url.
14:54:32 I'll work to make onionoo-ng as backward compatible as possible
14:54:37 we could extend that atlas version to make use of features that are only in your instance, but we don't have to.
14:54:50 in the meanwhile, we can test it to find any bugs
14:54:54 we should!
14:55:21 the thing is, if you provide a unique feature, you'll have more testers.
14:55:46 is it a good thing or a bad thing? :)
14:55:49 hah
14:55:52 depends
14:56:02 I know :P
14:56:07 :-)
14:56:29 we also didn't talk about the new feature we'd love to see in the new onionoo protocol version
14:56:37 one is the sorting thing
14:56:47 yep.
14:56:47 then?
14:56:51 *next
14:56:54 very good question.
14:57:06 I haven't looked at the open tickets for a while.
14:57:14 and they might not have the answer.
14:57:19 I still need to register. shameonme
14:57:38 for trac? yes, you should. though you can look without registering
14:57:39 .
14:57:40 you can read anonymously ;-)
14:57:50 yeah but I'd like to be notified of any new ticket
14:57:57 or any changes
14:58:03 there's a mailing list for that.
14:58:11 tor-bugs@
14:58:12 just to catch up with all the stuff
14:58:14 you can subscribe to that.
14:58:19 ah ok, cool, thanks
14:58:27 np.
14:58:32 okay, how about we talk more next week?
14:58:37 sure!
14:58:46 I'll implement the missing parts
14:58:57 I'll test and reply to your latest email.
14:59:00 then we'll talk via email
14:59:03 btw, mind if we move that back to the mailing list?
14:59:12 no problem karsten
14:59:15 (I admit, it was my fault that it went from there to private email.)
14:59:16 feel free to do it
14:59:19 okay.
14:59:39 great
14:59:42 cool. thanks for a great meeting, anathema_db and iwakeh!
14:59:47 thanks to you guys
14:59:48 talk to you more in a week.
15:00:00 #endmeeting