14:42:37 #startmeeting metrics team
14:42:37 Meeting started Thu May 4 14:42:37 2017 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:42:37 Useful Commands: #action #agreed #help #info #idea #link #topic.
14:42:51 Aside from 'new graphs = new code', what CSV time-granularity format wouldn't require new code?
14:43:24 well, all graphs use 1 data point per UTC day.
14:43:34 so, graphing that would be easiest.
14:44:04 but some .csv files already grow quite big, and I don't know how much data you want to include per data point.
14:44:15 I was mainly thinking of possible ways to reduce that.
14:44:23 Oh, woops. Yes okay I got you now. (I forgot that 1 day _is_ a smoothed function of 24 consensii)
14:44:57 I think everything would work fine at day-level granularity
14:45:13 okay, then maybe start with that and possibly optimize later.
14:45:35 So my next question is what should the next steps be?
14:46:11 I want to make interactive graphs using javascript controls and d3.js.
14:46:41 So that's the clientside stuff. Would that be easy to port into the Metrics frontend?
14:48:25 there's no established process for adding graphs to metrics, so I'm not exactly sure what the next step would be.
14:48:42 I could imagine that we discuss the .csv file format a bit more in the next step.
14:48:52 like, can we avoid dynamic column sets?
14:49:17 and maybe there will be similar issues once we look at actual files.
14:49:19 or formats.
14:50:15 I guess I can refactor the schema... it will add additional clientside processing (which will slow graph generation) though. Since we'll have to walk over the data and re-assemble it into something with dynamic columns for graphing in d3
14:50:18 regarding the client side, I could imagine just writing some R code based on your .csv file format and use the existing web code.
14:50:48 Hm... would that be something your team does if/when you want to adopt the graphs?
14:50:58 R code? yes.
14:51:18 Cool :)
14:51:20 see the last part in my mail where I wrote which parts we'd need help with.
14:51:52 Okay
14:51:57 This seems easy enough then...
14:52:15 The main thing I need to do is refactor the database schema
14:52:18 do you have a specific graph you want to start with, or do you think it's easier to do them all together?
14:52:35 regarding the database, can you use psql on henryi?
14:53:03 That is a question for weasel or someone similar
14:53:17 Fallback Directory Authorities Running is a trivial graph compared to any dirauth related
14:54:02 how do you know which directories are fallback directories?
14:54:06 from the tor sources?
14:54:30 I ask stem and stem pulls them dynamically from source i believe
14:54:51 After i refactor the schema I'll make my own interactive graphs, and then just show you the python generation code, the python convert-to-csv code, the d3 code; and you can decide if/when you want to adopt these. And if/when you do I'll help with the database schema, descriptions, and data format
14:54:52 okay, that makes it slightly more difficult for us to port this to metrics.
14:55:29 sounds like a fine plan.
14:55:52 cool
14:56:20 so, regarding graph choice, it might be best to start with data that is contained in votes and consensuses.
14:56:47 because we also don't have a good process for adding new data sources to collector/metrics yet. ;)
14:57:00 okay, shall we move on?
14:57:40 fine by me
14:57:54 great!
14:57:56 hiro: hey
14:58:01 hey
14:58:07 shall we quickly talk about onionperf?
14:58:09 sure
14:58:27 okay. :) looks like the three op-?? are quite stable now.
14:58:31 yes
14:58:39 op-hl, op-nl, op-us.
14:58:55 and we have 2-3 more in the queue, right?
14:59:01 the ideal dev-ops part here would be to do some orchestration I was starting to look into that before I took the time off to finish phd
14:59:18 (did you succeed? :))
14:59:32 (yes submitted - it's basically over till the defence)
14:59:38 yay!!
14:59:41 I have to catch up with irl
14:59:51 but I think that is online already the -ab instance
15:00:04 I read something about issues there.
15:00:06 as the subdomain was created before my break
15:00:14 I didn't look though.
15:00:22 I will check the data, catch up with him and get back to you regarding that
15:00:26 great!
15:00:30 what about op-se?
15:00:38 will update the ticket anyway
15:00:40 we recently lost siv.
15:01:00 which was the torperf instance running on the op-se host.
15:01:24 yes have to catch up with ln5 too regarding that
15:01:45 where "lost" means there was a problem that I didn't want to fix anymore because we were moving over to op-se anyway.
15:01:50 so I took it out.
15:01:53 ok.
15:01:54 so I have also seen your ticket regarding the old tor-perf
15:02:11 that will be retired right?
15:02:18 all torperfs are retired by now.
15:02:25 moria, siv, and torperf (ferrinii).
15:02:35 ok got it
15:02:53 also the onionperf.tpo is retired by now
15:02:53 the last is phantomtrain.
15:02:57 okay, good.
15:03:13 I was wondering if we should leave phantomtrain out of collector/metrics and keep it as testing instance.
15:03:21 I didn't talk to rob about this plan yet.
15:03:36 testing for onionperf?
15:03:40 yes.
15:03:46 ok
15:03:53 he'll sure want to test new client models etc.
15:04:21 and that probably shouldn't happen on a production system.
15:04:44 well there is no problem on creating test instances on greenhost
15:04:45 also, we'd have to redo all the tarballs to include historic data from phantomtrain.
15:04:59 if we just want to have a testing environment
15:05:10 and that might be useful as well. but I think rob wants to test on his own machine.
15:05:12 we can create a op-dev
15:05:15 ah ok
15:05:41 but I realize that we should have this discussion together with rob. so I'll move this to email and copy you, okay?
15:06:07 by the way, note the "Server (beta)" option here: https://metrics.torproject.org/torperf.html
15:06:18 we're now plotting onion server performance.
15:06:36 it's still in beta, because it's not reviewed yet. but it exists.
15:07:03 so neat
15:07:27 hmm, maybe I need to look at the drop there: https://metrics.torproject.org/torperf.html?start=2017-02-03&end=2017-05-04&source=op-hk&server=onion&filesize=50kb
15:07:31 I have to check hk instance.. I see no data there
15:07:39 oh yes that's what I meant
15:08:07 it's consistent
15:08:14 I think we are missing some data
15:08:22 yes. looks like an issue with the new onionperf module. beta...
15:08:34 I'll look into that. probably a problem on the metrics-web side.
15:09:04 alright. so much about onionperf for today?
15:09:45 I think it's all about it
15:09:51 great. thanks!
15:10:04 thanks again
15:10:17 Samdney: still here? :)
15:10:19 https://trac.torproject.org/projects/tor/query?status=!closed&component=^Metrics%2Fmetrics-lib&group=milestone&col=id&col=summary&col=component&col=owner&col=type&col=priority&col=version&order=priority
15:10:20 yes
15:10:55 https://trac.torproject.org/projects/tor/query?status=accepted&status=assigned&status=merge_ready&status=needs_information&status=needs_review&status=needs_revision&status=new&status=reopened&component=%5EMetrics%2Fmetrics-lib&group=milestone&col=id&col=summary&col=component&col=status&col=type&order=priority
15:11:05 (different columns)
15:11:23 how's your java.nio?
15:12:27 mmm, my last java code was "long" time ago. :)
15:12:45 I'm learning fast :)
15:13:32 I think it should work.
15:13:47 so, the easy part of the java.nio related ticket (or tickets?) is that you don't have to learn as much about metrics-lib before producing something useful.
15:14:31 https://trac.torproject.org/projects/tor/ticket/17831
15:14:40 ah
15:14:42 https://trac.torproject.org/projects/tor/ticket/21751
15:14:54 not exactly java.nio but related to performance improvements.
15:15:29 ok, I will have a look on this the next day and see what is the best for me to start
15:15:36 thank you for your suggestions
15:15:50 I don't know if they're good suggestions.
15:16:15 maybe take a look, don't sink too much time into them, and give me feedback what you had really expected from an easy ticket?
15:16:53 I'm just thinking that a ticket like #19640 might be more difficult if you're not as familiar with metrics-lib.
15:17:19 metrics-lib is ok. I spent some time with it :)
15:17:21 so, maybe start with #21751 which comes with a simple patch.
15:17:24 * hiro knows why there are drops on op-hk
15:17:44 (did I say simple?! I meant rudimentary!)
15:18:00 I will be afk in some minutes. Will look on this tomorrow.
15:18:05 Samdney: okay, cool. let me know how this goes!
15:18:08 hiro: oh?
15:18:22 ok, maybe will send you an email :)
15:18:38 sure!
15:19:08 so we have "good" data since the 11th of April, before it was the testing fase when I was not understanding what was happening between routing the traffic and having time outs
15:19:26 so that data was imported on collector but should be deleted
15:19:31 oh.
15:19:36 *phase
15:19:55 I really wonder why that got imported...
15:19:58 I saw iwakeh saying that. i.e. deleting the data
15:20:01 good catch!
15:20:04 yes, and I deleted it.
15:20:13 on all of them?
15:20:18 but maybe it was still on the metrics host..
15:20:23 ahh i see
15:20:31 should be easy to fix.
15:20:38 yay
15:20:39 thanks for spotting that!
15:21:03 alright, are we done?
15:21:06 yep
15:21:22 perfect. thanks, and let's talk more next week! bye!
15:21:29 bye!
15:21:34 #endmeeting
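
Note on the .csv discussion (14:48:52 to 14:50:15): one way to avoid dynamic column sets is to keep the file in a fixed "long" layout, one row per day and series, and re-assemble the per-series columns only on the client side. The Python below is a rough sketch of that re-assembly step, not the code anyone in the meeting wrote. The column names (date, series, value), the series names and the numbers are all made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical long-format file: the column set is fixed no matter how
# many series exist. All names and values here are illustrative only.
LONG_CSV = """date,series,value
2017-05-01,auth-a,10
2017-05-01,auth-b,12
2017-05-02,auth-a,11
"""


def pivot_long_to_wide(text):
    """Regroup fixed-column rows into one record per day, one key per series."""
    by_date = defaultdict(dict)
    series_names = set()
    for row in csv.DictReader(io.StringIO(text)):
        by_date[row["date"]][row["series"]] = int(row["value"])
        series_names.add(row["series"])
    columns = sorted(series_names)
    # A series with no row for a given day becomes None in that day's record.
    wide = [
        {"date": date, **{name: values.get(name) for name in columns}}
        for date, values in sorted(by_date.items())
    ]
    return columns, wide


columns, wide = pivot_long_to_wide(LONG_CSV)
for record in wide:
    print(record)
```

This is the extra client-side walk over the data mentioned at 14:50:15: the file stays at three columns, and the cost of building the dynamic columns moves to whoever draws the graph.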