14:42:37 <karsten> #startmeeting metrics team
14:42:37 <MeetBot> Meeting started Thu May 4 14:42:37 2017 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:42:37 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:42:51 <tjr> Aside from 'new graphs = new code', what CSV time-granularity format wouldn't require new code?
14:43:24 <karsten> well, all graphs use 1 data point per UTC day.
14:43:34 <karsten> so, graphing that would be easiest.
14:44:04 <karsten> but some .csv files already grow quite big, and I don't know how much data you want to include per data point.
14:44:15 <karsten> I was mainly thinking of possible ways to reduce that.
14:44:23 <tjr> Oh, woops. Yes okay I got you know. (I forgot that 1 day _is_ a smoothed function of 24 consensii)
14:44:57 <tjr> I think everything would work fine at day-level granularity
14:45:13 <karsten> okay, then maybe start with that and possibly optimize later.
14:45:35 <tjr> So my next question is what should the next steps be?
14:46:11 <tjr> I want to make interactive graphs using javascript controls and d3.js.
14:46:41 <tjr> So that's the clientside stuff. Would that be easy to port into the Metrics frontend?
14:48:25 <karsten> there's no established process for adding graphs to metrics, so I'm not exactly sure what the next step would be.
14:48:42 <karsten> I could imagine that we discuss the .csv file format a bit more in the next step.
14:48:52 <karsten> like, can we avoid dynamic column sets?
14:49:17 <karsten> and maybe there will be similar issues once we look at actual files.
14:49:19 <karsten> or formats.
14:50:15 <tjr> I guess I can refactor the schema... it will add additional clientside processing (which will slow graph generation ) though.
Since we'll have to walk over the data and re-assemble it into something with dynamic columns for graphing in d3
14:50:18 <karsten> regarding the client side, I could imagine just writing some R code based on your .csv file format and use the existing web code.
14:50:48 <tjr> Hm... would that be something your team does if/when you want to adopt the graphs?
14:50:58 <karsten> R code? yes.
14:51:18 <tjr> Cool :)
14:51:20 <karsten> see the last part in my mail where I wrote which parts we'd need help with.
14:51:52 <tjr> Okay
14:51:57 <tjr> This seems easy enough then...
14:52:15 <tjr> The main thing I need to do is refactor the database schema
14:52:18 <karsten> do you have a specific graph you want to start with, or do you think it's easier to do them all together?
14:52:35 <karsten> regarding the database, can you use psql on henryi?
14:53:03 <tjr> That is a question for weasel or someone similar
14:53:17 <tjr> Fallback Directory Authorities Running is a trivial graph compared to any dirauth related
14:54:02 <karsten> how do you know which directories are fallback directories?
14:54:06 <karsten> from the tor sources?
14:54:30 <tjr> I ask stem and stem pulls them dynamically from source i believe
14:54:51 <tjr> After i refactor the schema I'll make my own interactive graphs, and then just show you the python generation code, the python convert-to-csv code, the d3 code; and you can decide if/when you want to adopt these. And if/when you do I'll help with the database schema, descriptions, and data format
14:54:52 <karsten> okay, that makes it slightly more difficult for us to port this to metrics.
14:55:29 <karsten> sounds like a fine plan.
14:55:52 <tjr> cool
14:56:20 <karsten> so, regarding graph choice, it might be best to start with data that is contained in votes and consensuses.
14:56:47 <karsten> because we also don't have a good process for adding new data sources to collector/metrics yet. ;)
14:57:00 <karsten> okay, shall we move on?
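[Editor's note] The trade-off discussed above (a .csv file with a fixed column set versus one with dynamic columns) can be illustrated with a minimal sketch. It assumes a hypothetical "long" layout with the fixed columns `date`, `series` and `value`, one row per series per UTC day. These column names, the series names and the numbers are invented for illustration; the log does not show the real schema. The function does the re-assembly step tjr mentions: it walks the rows and rebuilds one record per day with one key per series, which is roughly the shape a plotting layer would want.

```python
import csv
import io
from collections import defaultdict

# Hypothetical long-format file: the column set is fixed no matter
# how many series exist. All names and values here are made up.
LONG_CSV = """date,series,value
2017-05-01,auth_a,7012
2017-05-01,auth_b,6990
2017-05-02,auth_a,7050
"""


def long_to_wide(text):
    """Pivot fixed-column rows into one record per UTC day.

    Each returned dict has a "date" key plus one key per series
    found anywhere in the input (the "dynamic columns").
    """
    by_day = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        by_day[row["date"]][row["series"]] = float(row["value"])

    # Collect every series name so all records share the same keys.
    series = sorted({name for cols in by_day.values() for name in cols})

    # A series with no value on a given day is filled with None.
    return [
        {"date": day, **{name: by_day[day].get(name) for name in series}}
        for day in sorted(by_day)
    ]


wide = long_to_wide(LONG_CSV)
for record in wide:
    print(record)
```

The cost tjr points out is visible here: the pivot is an extra pass over every row before anything can be drawn, in exchange for a file format whose header never changes when a series is added or removed.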
14:57:40 <tjr> fine by me
14:57:54 <karsten> great!
14:57:56 <karsten> hiro: hey
14:58:01 <hiro> hey
14:58:07 <karsten> shall we quickly talk about onionperf?
14:58:09 <hiro> sure
14:58:27 <karsten> okay. :) looks like the three op-?? are quite stable now.
14:58:31 <hiro> yes
14:58:39 <karsten> op-hl, op-nl, op-us.
14:58:55 <karsten> and we have 2-3 more in the queue, right?
14:59:01 <hiro> the ideal dev-ops part here would be to do some orchestration I was starting to look into that before I took the time off to finish phd
14:59:18 <karsten> (did you succeed? :))
14:59:32 <hiro> (yes submitted - it's basically over till the defence)
14:59:38 <karsten> yay!!
14:59:41 <hiro> I have to catch up with irl
14:59:51 <hiro> but I think that is online already the -ab instance
15:00:04 <karsten> I read something about issues there.
15:00:06 <hiro> as the subdomain was created before my break
15:00:14 <karsten> I didn't look though.
15:00:22 <hiro> I will check the data, catchup it him and get back to you regarding that
15:00:26 <karsten> great!
15:00:30 <karsten> what about op-se?
15:00:38 <hiro> will update the ticket anyway
15:00:40 <karsten> we recently lost siv.
15:01:00 <karsten> which was the torperf instance running on the op-se host.
15:01:24 <hiro> yes have to catchup with ln5 too regarding that
15:01:45 <karsten> where "lost" means there was a problem that I didn't want to fix anymore because we were moving over to op-se anyway.
15:01:50 <karsten> so I took it out.
15:01:53 <karsten> ok.
15:01:54 <hiro> so I have also seen your ticket regarding the old tor-perf
15:02:11 <hiro> that will be retired right?
15:02:18 <karsten> all torperfs are retired by now.
15:02:25 <karsten> moria, siv, and torperf (ferrinii).
15:02:35 <hiro> ok got it
15:02:53 <hiro> also the onionperf.tpo is retired by now
15:02:53 <karsten> the last is phantomtrain.
15:02:57 <karsten> okay, good.
15:03:13 <karsten> I was wondering if we should leave phantomtrain out of collector/metrics and keep it as testing instance.
15:03:21 <karsten> I didn't talk to rob about this plan yet.
15:03:36 <hiro> testing for onionperf?
15:03:40 <karsten> yes.
15:03:46 <hiro> ok
15:03:53 <karsten> he'll sure want to test new client models etc.
15:04:21 <karsten> and that probably shouldn't happen on a production system.
15:04:44 <hiro> well there is no problem on creating test instances on greenhost
15:04:45 <karsten> also, we'd have to redo all the tarballs to include historic data from phantomtrain.
15:04:59 <hiro> if we just want to have a testing environment
15:05:10 <karsten> and that might be useful as well. but I think rob wants to test on his own machine.
15:05:12 <hiro> we can create a op-dev
15:05:15 <hiro> ah ok
15:05:41 <karsten> but I realize that we should have this discussion together with rob. so I'll move this to email and copy you, okay?
15:06:07 <karsten> by the way, note the "Server (beta)" option here: https://metrics.torproject.org/torperf.html
15:06:18 <karsten> we're now plotting onion server performance.
15:06:36 <karsten> it's still in beta, because it's not reviewed yet. but it exists.
15:07:03 <hiro> so neat
15:07:27 <karsten> hmm, maybe I need to look at the drop there: https://metrics.torproject.org/torperf.html?start=2017-02-03&end=2017-05-04&source=op-hk&server=onion&filesize=50kb
15:07:31 <hiro> I have to check hk instance.. I see no data there
15:07:39 <hiro> oh yes that's what I meant
15:08:07 <hiro> it's consistent
15:08:14 <hiro> I think we are missing some data
15:08:22 <karsten> yes. looks like an issue with the new onionperf module. beta...
15:08:34 <karsten> I'll look into that. probably a problem on the metrics-web side.
15:09:04 <karsten> alright. so much about onionperf for today?
15:09:45 <hiro> I think it's all about it
15:09:51 <karsten> great. thanks!
15:10:04 <hiro> thanks again
15:10:17 <karsten> Samdney: still here?
:)
15:10:19 <karsten> https://trac.torproject.org/projects/tor/query?status=!closed&component=^Metrics%2Fmetrics-lib&group=milestone&col=id&col=summary&col=component&col=owner&col=type&col=priority&col=version&order=priority
15:10:20 <Samdney> yes
15:10:55 <karsten> https://trac.torproject.org/projects/tor/query?status=accepted&status=assigned&status=merge_ready&status=needs_information&status=needs_review&status=needs_revision&status=new&status=reopened&component=%5EMetrics%2Fmetrics-lib&group=milestone&col=id&col=summary&col=component&col=status&col=type&order=priority
15:11:05 <karsten> (different columns)
15:11:23 <karsten> how's your java.nio?
15:12:27 <Samdney> mmm, my last jave code was "long" time ago. :)
15:12:45 <Samdney> I'm learning fast :)
15:13:32 <Samdney> I think it should work.
15:13:47 <karsten> so, the easy part of the java.nio related ticket (or tickets?) is that you don't have to learn as much about metrics-lib before producing something useful.
15:14:31 <karsten> https://trac.torproject.org/projects/tor/ticket/17831
15:14:40 <Samdney> ah
15:14:42 <karsten> https://trac.torproject.org/projects/tor/ticket/21751
15:14:54 <karsten> not exactly java.nio but related to performance improvements.
15:15:29 <Samdney> ok, I will have a look on this the next day and see what is the best for me to start
15:15:36 <Samdney> thank you for your suggestions
15:15:50 <karsten> I don't know if they're good suggestions.
15:16:15 <karsten> maybe take a look, don't sink too much time into them, and give me feedback what you had really expected from an easy ticket?
15:16:53 <karsten> I'm just thinking that a ticket like #19640 might be more difficult if you're not as familiar with metrics-lib.
15:17:19 <Samdney> metrics-lib is ok. I spent some time with it :)
15:17:21 <karsten> so, maybe start with #21751 which comes with a simple patch.
15:17:24 * hiro knows why there are drops on op-hk
15:17:44 <karsten> (did I say simple?! I meant rudimentary!)
15:18:00 <Samdney> I will be afk in some minutes. Will look on this tomorrow.
15:18:05 <karsten> Samdney: okay, cool. let me know how this goes!
15:18:08 <karsten> hiro: oh?
15:18:22 <Samdney> ok, maybe will send you an email :)
15:18:38 <karsten> sure!
15:19:08 <hiro> so we have "good" data since the 11th of April, before it was the testing fase when I was not understanding what was happening between routing the traffic and having time outs
15:19:26 <hiro> so that data was imported on collector but should be deleted
15:19:31 <karsten> oh.
15:19:36 <hiro> *phase
15:19:55 <karsten> I really wonder why that got imported...
15:19:58 <hiro> I saw iwakeh saying that. i.e. deleting the data
15:20:01 <karsten> good catch!
15:20:04 <karsten> yes, and I deleted it.
15:20:13 <hiro> on all of them?
15:20:18 <karsten> but maybe it was still on the metrics host..
15:20:23 <hiro> ahh i see
15:20:31 <karsten> should be easy to fix.
15:20:38 <hiro> yay
15:20:39 <karsten> thanks for spotting that!
15:21:03 <karsten> alright, are we done?
15:21:06 <hiro> yep
15:21:22 <karsten> perfect. thanks, and let's talk more next week! bye!
15:21:29 <Samdney> bye!
15:21:34 <karsten> #endmeeting