15:01:42 <karsten> #startmeeting metrics team
15:01:42 <MeetBot> Meeting started Thu Jul 25 15:01:42 2019 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:42 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
15:02:10 <karsten> https://storm.torproject.org/shared/5h1Goax5eNusxjXJ_Ty5Wl7hFR1uqCReUiN8xdlBG8T <- agenda pad
15:02:33 <karsten> anything else to add to the agenda before we start?
15:03:02 <irl> i will add more if we have time
15:03:11 <irl> lower priority things
15:03:17 <karsten> okay.
15:03:20 <djackson> Added an item for browser metrics stuff
15:03:34 <karsten> cool!
15:03:40 <acute> nothing from me
15:03:48 <karsten> okay.
15:03:53 <karsten> start with the first item:
15:03:56 <karsten> starting*
15:04:00 <karsten> Using an alternative build system for Metrics Java codebases (karsten)
15:04:08 <irl> oh dear
15:04:15 <karsten> this is from a discussion on one of the tickets where we update to debian buster libs.
15:04:30 <irl> #31197
15:04:42 <karsten> our current system has its limitations, as I had to realize over the past day or so.
15:05:15 <karsten> I'm having trouble getting libraries updated (manually) to make metrics-web and exonerator work.
15:05:33 <karsten> my current idea is to revert the metrics-base and metrics-lib changes.
15:05:47 <karsten> which will break the jenkins buster build (and fix the stretch build),
15:06:02 <karsten> and then we can figure out a better dependency management system for us without pressure.
15:06:17 <karsten> the current situation is that we cannot merge and deploy anything, which is bad.
15:06:40 <karsten> thoughts on this overall plan?
15:06:54 <irl> i think we still need to see this as urgent, we're not removing *that* much pressure
15:06:59 <irl> but reverting the changes seems sensible
15:07:35 <karsten> okay.
15:07:36 <irl> i don't think we should prioritise the fixes that will make jenkins work, but rather the ones that match our priorities
15:07:43 <irl> which is being able to maintain the codebases and run them
15:07:54 <karsten> agreed.
15:08:35 <karsten> okay, I'll move forward with that. it's a pity to revert, and it's a pity to give up 2 (?) days of work. but this won't succeed otherwise.
15:08:47 <irl> well, the work was needed to find out that we had the problem
15:08:50 <irl> better to have found it than not
15:08:58 <karsten> right.
15:09:16 <irl> for the future, you mentioned switching to maven perhaps
15:09:41 <karsten> yes, I didn't look much yet. any opinions on that vs. another system?
15:09:47 <irl> this is probably not a terrible idea as the debian java team is under-resourced and is increasingly going to be a limiting factor
15:09:48 <karsten> I think there are some systems that are closer related to ant.
15:10:11 <irl> maven seems ok, it has inheritance so we can still maintain metrics-base if we want to
15:10:19 <irl> the alternative i would seriously consider is gradle
15:10:32 <irl> it is not as mature as maven but it does have a large userbase so we do still get network effect
15:10:54 <karsten> okay, good to know.
15:11:15 <karsten> I think maven comes with a bigger change to overall project directories and all that.
15:11:30 <karsten> which isn't bad per se, but which takes time.
15:11:33 <irl> if we have to do it then we have to do it
15:11:51 <karsten> I'll look into those and we can discuss how to proceed from there.
15:11:56 <irl> ok
15:12:12 <karsten> cool!
15:12:15 <karsten> next topic:
15:12:20 <karsten> Non-metrics, but sponsor, work (irl)
15:12:45 <irl> at the meeting there were some anti-censorship team tasks that i have agreed to do as they are sponsor work
15:13:04 <irl> mostly this agenda item is to make you aware that i will have some time not working on metrics to complete those tasks
15:13:12 <karsten> okay.
15:13:17 <irl> it's not much but it might be a couple of weeks of points
15:13:51 <irl> some of it is on bridgedb so at least i will be more familiar with that later
15:13:54 <karsten> sounds good. when we do sprints of 1 or 2 (?) weeks, we'll include that in the planning.
15:14:01 <irl> ok cool
15:14:30 <irl> that's all for this topic
15:14:41 <karsten> sounds good!
15:15:14 <karsten> GitLab and CI (irl)
15:15:36 <irl> we have moved onionperf into gitlab
15:15:38 <irl> https://dip.torproject.org/torproject/metrics/onionperf
15:15:45 <irl> also the CI for onionperf is running in gitlab
15:16:03 <irl> the canonical location for the repo is still git.tpo but it gets mirrored there and CI runs on every commit
15:16:11 <karsten> nice!
15:16:14 <irl> when i figure out how to do it we can also do this for merge requests
15:16:40 <irl> i know gaba is keen that we migrate issue tracking there too, and probably for onionperf this is an easy first project to do that with
15:16:49 <karsten> oh!
15:16:51 <irl> acute now has an ldap account so can access gitlab and git.tpo
15:17:08 <karsten> migrating issue tracking seems like a major step.
15:17:15 <irl> i think gitlab has some kanban like thing that goes across projects
15:17:28 <irl> i don't want to do this for other codebases in this 6-month roadmap
15:17:36 <irl> but to try out the issue tracking we can do it for onionperf
15:17:38 <karsten> okay.
15:18:02 <irl> we can also look at adding other projects here for CI
15:18:14 <irl> the first one would be metrics-lib
15:18:17 <karsten> like metrics-lib... heh
15:18:33 <irl> the way the CI works is we can run whatever commands we want in any docker container we want
15:18:40 <karsten> yes, let's do that. in particular now that we're breaking the jenkins build again.
15:18:50 <irl> for onionperf we use a debian stable VM, install it and then run the tests
15:19:07 <irl> so it's pretty flexible
15:19:10 <karsten> how different is that from what we're doing with jenkins and metrics-lib?
15:19:37 <irl> the main difference is that the CI steps are maintained in the codebase and we can edit them just by committing a new config
15:19:52 <karsten> okay.
15:19:55 <irl> so we can experiment a bit to make sure CI is serving us, rather than us serving CI
15:20:33 <karsten> sounds great to me!
15:20:56 <irl> that's all on gitlab and ci unless there are more questions
15:21:03 <karsten> that would best happen after we have a better build/dependency management system in place.
15:21:20 <irl> yes
15:21:24 <karsten> no questions from me. curious how this works!
15:21:44 <karsten> alright, moving to:
15:21:47 <karsten> Scaling simulations (irl)
15:22:13 <irl> so this is something that might be quite a bit of upfront work for us but later makes everything a lot easier
15:22:36 <irl> we want to perform simulations like changing fast/guard cutoffs for voting and seeing how it affects network capacity
15:23:00 <irl> we can write scripts that do this as one-off things, but we really need a framework for this as we're going to be doing it a lot
15:23:33 <irl> i turned a small ticket into a 10 points ticket to allow us time to get tooling together for this, probably using an sql database
15:24:14 <karsten> okay.
15:24:15 <djackson> (sorry, do you have a link to the ticket handy?)
15:24:17 <irl> the vision i talked about with mikeperry is that other teams can just get the data they need out of metrics by crafting an sql query, or through some other interface we provide, it shouldn't always be that we need to help
15:24:26 <irl> maybe there's a ticket
15:24:49 <karsten> can we give them a database dump?
15:24:53 <irl> the task is called "Use emulated consensus with historical OnionPerf data to predict Tor performance with modified consensus Fast/Guard cutoff values" but I don't know if a ticket exists
15:25:14 <irl> no, not a database dump, we would maintain the database and provide tools to perform the query they need
15:25:41 <karsten> hmm, okay.
15:26:04 <irl> i'm not sure exactly what it looks like yet, but it's going to be different to what we've done in the past
15:26:27 <karsten> indeed.
15:26:30 <irl> maven/gradle might actually really help here allowing more rapid development/prototyping
15:26:39 <irl> not having to worry so much about the dependencies being in debian
15:26:51 <djackson> fwiw, I think having a single source of truth database (compared to csvs etc) would be super handy.
15:27:14 <irl> it's not a single source of truth database, that's collector
15:27:15 <karsten> I'm just careful that adding a new service for this means we'll have another service to maintain.
15:27:28 <djackson> I don't think you need to have live database access though, automated daily dumps would be frequent enough without the hassle of securing it
15:27:37 <karsten> I very much agree that having a database rather than reprocessing files every time makes sense.
15:27:42 <irl> but it's got summary/metadata information for the last X time (whatever capacity we can manage) to allow simple queries
15:28:19 <irl> data would expire from it, although if you want to rent an AWS instance and put the entire history in RAM then our tool should let you do this
15:28:45 <irl> this is not only useful for scaling but also for sponsor either 28 or 31 don't remember which
15:28:52 <irl> the one click anti censorship thing
15:28:53 <karsten> maybe we could start with a database dump and think about better interfaces as step 2.
15:29:03 <irl> yeah that would be an ok starting point
15:29:21 <djackson> I am obviously unfamiliar with a lot of the details, but wouldn't you end up having to maintain the collector->DB pipeline? Long run seems like it would be easier to have everything go into the DB directly
15:29:53 <irl> collector is the actual documents that we saw in the tor network, the database would be derived from those documents but we still want to keep the originals
15:30:08 <irl> the database would not hold data older than, say, 1 month
15:30:12 <irl> to keep it fast
15:30:44 <djackson> Uh, you can fit all onion perf data forever in a sqlite DB and it's fast on the average laptop :)
15:30:51 <djackson> (and all consensus data)
15:31:00 <irl> and votes and relay and extra info descriptors?
15:31:00 <karsten> there's more on collector. :)
15:31:40 <irl> we might also then have DB->other things in the network health team pipelines
15:31:47 <djackson> Votes probably a little harder, but not out of reach I think
15:32:22 <karsten> we can figure out those questions while working on it, I guess.
15:32:31 <irl> we could have different retention policies for different things
15:32:36 <djackson> Yeah, sorry :)
15:33:22 <irl> so when mikeperry comes to talk to you about this task, you'll have the context and won't be surprised (:
15:33:27 <karsten> hehe
15:33:30 <karsten> sounds cool!
15:33:51 <karsten> okay, moving on?
15:33:55 <irl> yeah
15:34:02 <karsten> Update on browser metrics (djackson)
15:34:11 <djackson> So I've been working on gathering some user experience metrics using the Tor Browser.
15:34:24 <djackson> Rather than measuring raw network latency/throughput like onionperf, the intent is to measure how the Tor network performs from the perspective of users.
15:34:41 <djackson> I've been building tooling / doing integration work with WebPageTest, which is a testing/orchestration suite for Firefox/Chrome/Edge/etc. Shouldn't be too much work to connect it up to the Tor Browser.
15:34:57 <djackson> Example output: https://www.webpagetest.org/result/190725_BA_e075e435a914bd66cbf7b07813ed719c/
15:35:15 <djackson> Also spits out JSON / machine readable results for proper experiments/batch testing
15:35:26 <djackson> Hopefully some more details and preliminary results soon.
15:35:32 <irl> does it give you a HAR archive?
15:35:37 <djackson> Yep
15:35:42 <irl> ok cool
15:35:53 <djackson> This leads into my next question
15:36:05 <djackson> Any specific requests/suggestions for data?
15:36:17 <djackson> Feel free to have a think and email me or whatever.
15:36:33 <karsten> how is this related to matt's work on selenium?
15:36:48 <djackson> That was my starting point, but this should supersede it.
15:37:01 <djackson> the tl;dr is that selenium is great for network metrics
15:37:04 <djackson> but not good for visual metrics
15:37:21 <djackson> WPT gives stuff like % visual completion at various points, interactivity, etc
15:37:42 <djackson> And supports traffic shaping for various conditions and stuff. It's a much more complete package.
15:37:47 <irl> are these not just metrics you can compute from events you can see in selenium?
15:38:07 <djackson> Sure, but this already has the tooling built and the edge cases handled
15:38:22 <djackson> It grabs the videos of the page loading and does the inference etc.
15:38:31 <irl> ok, is it automatable?
15:38:32 <djackson> Yes
15:38:37 <djackson> Already have that bit up and running
15:38:52 <karsten> how would we match these requests with tor controller events?
15:39:09 <djackson> That is undecided. My current thinking is to leverage Tor Button
15:39:28 <djackson> Tor Button can write the events into the DOM and WPT can grab them by executing javascript (existing functionality)
15:39:36 <djackson> but I am open to better ideas!
15:39:51 <djackson> (and Tor button already matches requests to circuits for me)
15:40:30 <karsten> did you show this to matt?
15:40:38 <djackson> Yes, you are cc'd on the mail exchange :)
15:40:49 <karsten> oh, heh, it's in a pile... ;)
15:40:53 <djackson> I figured :)
15:41:03 <djackson> More details to come
15:41:12 <karsten> cool!
15:41:15 <djackson> but as I said, if you have specific ideas/requests/suggestions, do email me
15:41:28 <djackson> I currently plan to grab some visual metrics and a screenshot of the final page
15:41:43 <djackson> but if say you want 100 000 HAR archives of Tor Browser page loads, that can be done I think
15:42:18 <irl> my initial thoughts would be to fetch bbc.com or some other busy page and get the atf time and just see what sort of variation is going on there
15:42:52 <irl> is it similar to onionperf in the variation? is it better or worse?
15:42:55 <karsten> wait, but this is using firefox, not tor browser, right?
15:43:18 <djackson> So I currently have it working with Firefox over Tor. But am actively working on Tor Browser
15:43:23 <karsten> ah.
15:43:24 <djackson> It's fiddly, but getting there
15:43:29 <karsten> okay.
15:43:38 <djackson> irl: the atf time?
15:44:03 <irl> time until the content above the fold has been rendered and a user could consume/interact with it excluding 3rd party advertisements
15:44:20 <djackson> ah :) yes that's gathered
15:44:43 <irl> i think it would be an interesting distribution to compare
15:44:44 <djackson> excluding ads is harder, but I have a Ublock integration sorted so could diff-test the two :)
15:45:04 <irl> often a banner ad can be the thing that takes forever to load but the user doesn't care about it
15:45:13 <djackson> Absolutely
15:45:16 <irl> so as far as the user is concerned the page is loaded
15:45:37 <irl> i guess if it never loads then your tool considers that it was fully loaded because no dom event fired?
15:46:12 <djackson> Yeah. Visual completion is defined by looking at the end state and working backwards
15:46:32 <djackson> 'end state' is hard to define of course, but am trusting their metrics
15:46:41 <irl> so late banner ads and scrolling carousels are things we should watch out for
15:46:50 <djackson> Suspect it could be thrown off by auto playing videos, but need to test
15:46:55 <djackson> yeah indeed
15:47:09 <irl> there was a french university group that looked at removing these false positives
15:47:14 <irl> i will try and dig up the paper for you
15:47:33 <djackson> Summary: Work ongoing. Please feel free to email me requests/suggestions/reading material. :)
15:47:38 <irl> cool
15:47:38 <djackson> That would be great, thanks!
15:47:43 <karsten> nice!
15:47:48 <karsten> thanks for the update!
15:47:51 <djackson> np
15:48:05 <karsten> Lower priority things, as time permits (irl)
15:48:18 <irl> i guess we can do one of these
15:48:28 <irl> onionperf experiments
15:48:47 <irl> as part of the scaling work, we're going to deploy some onionperf instances that are experimental to try out some wacky ideas
15:48:56 <irl> acute and mikeperry are going to be running this mostly
15:49:49 <acute> yes, I will deploy a collector for these results, and we will have different tor clients running
15:50:09 <acute> the testbed will be automated with ansible
15:50:22 <karsten> how will the tor clients be different?
15:50:32 <acute> (based on the same playbooks we use for the current onionperf)
15:50:44 <djackson> This sounds super interesting. It would be great to compare the results with the analysis I sent to tor-scaling.
15:51:32 <acute> different configurations
15:52:30 <djackson> Is there some ticket or place to follow along?
15:52:47 <irl> i think tickets will appear as we get into turning the roadmap into tickets
15:53:22 <djackson> Okay. Any idea on the rough timeline? Are we talking weeks? Months?
15:53:27 <irl> acute should probably also subscribe to tor-scaling and send updates there
15:53:45 <irl> (unless it's a sekrit list, but i think it is not now)
15:53:50 <acute> from the roadmapping session, we're looking at the following 2-3 months
15:53:56 <djackson> Okay, great :)
15:54:08 <karsten> sounds exciting!
15:54:27 <acute> yes!
15:54:34 <djackson> I think tor-scaling is semi public. You can register if you can find the right link. but the link is hidden in the basement behind the door marked beware of the lion
15:55:34 <karsten> cool! can we talk about the other lower priority things next week?
15:55:40 <irl> yes
15:55:43 <karsten> approaching the 60 minutes mark...
15:55:48 <karsten> okay!
15:56:09 <karsten> lots of stuff going on. it shows that this was the first meeting in july. ;)
15:56:25 <karsten> thanks, everyone, see you all next week! bye!
15:56:33 <irl> bye!
15:56:42 <karsten> #endmeeting
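
Note on the "Scaling simulations" item: the derived database irl describes (summary data derived from the collector documents, expiring after "say, 1 month", queryable with plain SQL) could look roughly like the sketch below. This is only an illustration under assumptions, not the team's design: the table `relay_summary`, its columns, the 30-day window, and the idea of approximating a modified Fast cutoff by summing the bandwidth of relays above a threshold are all hypothetical. It uses SQLite because djackson mentions it; the log only says "probably using an sql database".

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumed retention window, taken from the "say, 1 month" remark in the log.
RETENTION = timedelta(days=30)


def open_db(path=":memory:"):
    """Open the database and create the hypothetical summary table."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS relay_summary (
               valid_after  INTEGER NOT NULL,  -- consensus time, Unix seconds
               fingerprint  TEXT    NOT NULL,
               bandwidth_kb INTEGER NOT NULL,
               PRIMARY KEY (valid_after, fingerprint)
           )"""
    )
    return db


def add_summary(db, valid_after, fingerprint, bandwidth_kb):
    """Store one derived row; the original documents stay in the archive."""
    db.execute(
        "INSERT OR REPLACE INTO relay_summary VALUES (?, ?, ?)",
        (int(valid_after.timestamp()), fingerprint, bandwidth_kb),
    )


def expire_old_rows(db, now):
    """Delete rows older than the retention window; return how many went."""
    cutoff = int((now - RETENTION).timestamp())
    cur = db.execute("DELETE FROM relay_summary WHERE valid_after < ?", (cutoff,))
    return cur.rowcount


def capacity_above_cutoff(db, min_bandwidth_kb):
    """Sum the bandwidth of relays at or above a hypothetical cutoff."""
    row = db.execute(
        "SELECT COALESCE(SUM(bandwidth_kb), 0) FROM relay_summary "
        "WHERE bandwidth_kb >= ?",
        (min_bandwidth_kb,),
    ).fetchone()
    return row[0]
```

Calling `expire_old_rows` from a periodic job would keep the table small, and a different `RETENTION` per table would cover the "different retention policies for different things" idea raised later in the discussion.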