15:01:42 <karsten> #startmeeting metrics team
15:01:42 <MeetBot> Meeting started Thu Jul 25 15:01:42 2019 UTC.  The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:42 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
15:02:10 <karsten> https://storm.torproject.org/shared/5h1Goax5eNusxjXJ_Ty5Wl7hFR1uqCReUiN8xdlBG8T <- agenda pad
15:02:33 <karsten> anything else to add to the agenda before we start?
15:03:02 <irl> i will add more if we have time
15:03:11 <irl> lower priority things
15:03:17 <karsten> okay.
15:03:20 <djackson> Added an item for browser metrics stuff
15:03:34 <karsten> cool!
15:03:40 <acute> nothing from me
15:03:48 <karsten> okay.
15:03:53 <karsten> starting with the first item:
15:04:00 <karsten> Using an alternative build system for Metrics Java codebases (karsten)
15:04:08 <irl> oh dear
15:04:15 <karsten> this is from a discussion on one of the tickets where we update to debian buster libs.
15:04:30 <irl> #31197
15:04:42 <karsten> our current system has its limitations, as I had to realize over the past day or so.
15:05:15 <karsten> I'm having trouble getting libraries updated (manually) to make metrics-web and exonerator work.
15:05:33 <karsten> my current idea is to revert the metrics-base and metrics-lib changes.
15:05:47 <karsten> which will break the jenkins buster build (and fix the stretch build),
15:06:02 <karsten> and then we can figure out a better dependency management system for us without pressure.
15:06:17 <karsten> the current situation is that we cannot merge and deploy anything, which is bad.
15:06:40 <karsten> thoughts on this overall plan?
15:06:54 <irl> i think we still need to see this as urgent, we're not removing *that* much pressure
15:06:59 <irl> but reverting the changes seems sensible
15:07:35 <karsten> okay.
15:07:36 <irl> i don't think we should prioritise the fixes that will make jenkins work, but rather the ones that match our priorities
15:07:43 <irl> which is being able to maintain the codebases and run them
15:07:54 <karsten> agreed.
15:08:35 <karsten> okay, I'll move forward with that. it's a pity to revert, and it's a pity to give up 2 (?) days of work. but this won't succeed otherwise.
15:08:47 <irl> well, the work was needed to find out that we had the problem
15:08:50 <irl> better to have found it than not
15:08:58 <karsten> right.
15:09:16 <irl> for the future, you mentioned switching to maven perhaps
15:09:41 <karsten> yes, I didn't look much yet. any opinions on that vs. another system?
15:09:47 <irl> this is probably not a terrible idea as the debian java team is under-resourced and is increasingly going to be a limiting factor
15:09:48 <karsten> I think there are some systems that are closer related to ant.
15:10:11 <irl> maven seems ok, it has inheritance so we can still maintain metrics-base if we want to
15:10:19 <irl> the alternative i would seriously consider is gradle
15:10:32 <irl> it is not as mature as maven but it does have a large userbase so we still get the network effect
15:10:54 <karsten> okay, good to know.
15:11:15 <karsten> I think maven comes with a bigger change to overall project directories and all that.
15:11:30 <karsten> which isn't bad per se, but which takes time.
15:11:33 <irl> if we have to do it then we have to do it
15:11:51 <karsten> I'll look into those and we can discuss how to proceed from there.
15:11:56 <irl> ok
15:12:12 <karsten> cool!
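Since irl mentions Maven's inheritance as a way to keep maintaining metrics-base, a parent POM along those lines could look roughly like the sketch below. The coordinates, version, and properties are illustrative guesses, not an actual Metrics build file; a child project such as metrics-lib would reference it via a `<parent>` element.

```xml
<!-- Hypothetical sketch of a metrics-base parent POM; all names and
     versions here are illustrative, not the real project setup. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.torproject</groupId>
  <artifactId>metrics-base</artifactId>
  <version>1.0.0</version>
  <packaging>pom</packaging>
  <properties>
    <!-- Shared compiler settings inherited by child projects. -->
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
</project>
```

With this layout, dependency versions and build plugins could be declared once in the parent and inherited everywhere, which is the property irl points to when comparing Maven against keeping metrics-base under ant.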
15:12:15 <karsten> next topic:
15:12:20 <karsten> Non-metrics, but sponsor, work (irl)
15:12:45 <irl> at the meeting there were some anti-censorship team tasks that i have agreed to do as they are sponsor work
15:13:04 <irl> mostly this agenda item is to make you aware that i will have some time not working on metrics to complete those tasks
15:13:12 <karsten> okay.
15:13:17 <irl> it's not much but it might be a couple of weeks of points
15:13:51 <irl> some of it is on bridgedb so at least i will be more familiar with that later
15:13:54 <karsten> sounds good. when we do sprints of 1 or 2 (?) weeks, we'll include that in the planning.
15:14:01 <irl> ok cool
15:14:30 <irl> that's all for this topic
15:14:41 <karsten> sounds good!
15:15:14 <karsten> GitLab and CI (irl)
15:15:36 <irl> we have moved onionperf into gitlab
15:15:38 <irl> https://dip.torproject.org/torproject/metrics/onionperf
15:15:45 <irl> also the CI for onionperf is running in gitlab
15:16:03 <irl> the canonical location for the repo is still git.tpo but it gets mirrored to gitlab and CI runs on every commit
15:16:11 <karsten> nice!
15:16:14 <irl> when i figure out how to do it we can also do this for merge requests
15:16:40 <irl> i know gaba is keen that we migrate issue tracking there too, and probably for onionperf this is an easy first project to do that with
15:16:49 <karsten> oh!
15:16:51 <irl> acute now has an ldap account so can access gitlab and git.tpo
15:17:08 <karsten> migrating issue tracking seems like a major step.
15:17:15 <irl> i think gitlab has some kanban like thing that goes across projects
15:17:28 <irl> i don't want to do this for other codebases in this 6-month roadmap
15:17:36 <irl> but to try out the issue tracking we can do it for onionperf
15:17:38 <karsten> okay.
15:18:02 <irl> we can also look at adding other projects here for CI
15:18:14 <irl> the first one would be metrics-lib
15:18:17 <karsten> like metrics-lib... heh
15:18:33 <irl> the way the CI works is we can run whatever commands we want in any docker container we want
15:18:40 <karsten> yes, let's do that. in particular now that we're breaking the jenkins build again.
15:18:50 <irl> for onionperf we use a debian stable VM, install it and then run the tests
15:19:07 <irl> so it's pretty flexible
15:19:10 <karsten> how different is that from what we're doing with jenkins and metrics-lib?
15:19:37 <irl> the main difference is that the CI steps are maintained in the codebase and we can edit them just by committing a new config
15:19:52 <karsten> okay.
15:19:55 <irl> so we can experiment a bit to make sure CI is serving us, rather than us serving CI
15:20:33 <karsten> sounds great to me!
15:20:56 <irl> that's all on gitlab and ci unless there are more questions
15:21:03 <karsten> that would best happen after we have a better build/dependency management system in place.
15:21:20 <irl> yes
15:21:24 <karsten> no questions from me. curious how this works!
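The setup irl describes, where CI steps live in the codebase and run in a chosen Docker container, would be expressed in a `.gitlab-ci.yml` at the repository root. The following is a minimal sketch for a Java project like metrics-lib; the image, packages, and ant target are assumptions, not the actual onionperf or metrics-lib configuration.

```yaml
# Hypothetical .gitlab-ci.yml sketch: run the test suite in a Debian
# stable container on every commit. Commands are illustrative only.
image: debian:stable

test:
  script:
    - apt-get update && apt-get install -y --no-install-recommends default-jdk ant
    - ant test
```

Because this file is versioned with the code, changing the CI steps is just another commit, which is the flexibility irl contrasts with the jenkins setup.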
15:21:44 <karsten> alright, moving to:
15:21:47 <karsten> Scaling simulations (irl)
15:22:13 <irl> so this is something that might be quite a bit of upfront work for us but later makes everything a lot easier
15:22:36 <irl> we want to perform simulations like changing fast/guard cutoffs for voting and seeing how it affects network capacity
15:23:00 <irl> we can write scripts that do this as one-off things, but we really need a framework for this as we're going to be doing it a lot
15:23:33 <irl> i turned a small ticket into a 10 points ticket to allow us time to get tooling together for this, probably using an sql database
15:24:14 <karsten> okay.
15:24:14 <djackson> (sorry, do you have a link to the ticket handy?)
15:24:17 <irl> the vision i talked about with mikeperry is that other teams can just get the data they need out of metrics by crafting an sql query, or through some other interface we provide. it shouldn't always be that we need to help
15:24:26 <irl> maybe there's a ticket
15:24:49 <karsten> can we give them a database dump?
15:24:53 <irl> the task is called "Use emulated consensus with historical OnionPerf data to predict Tor performance with modified consensus Fast/Guard cutoff values" but I don't know if a ticket exists
15:25:14 <irl> no, not a database dump, we would maintain the database and provide tools to perform the query they need
15:25:41 <karsten> hmm, okay.
15:26:04 <irl> i'm not sure exactly what it looks like yet, but it's going to be different to what we've done in the past
15:26:27 <karsten> indeed.
15:26:30 <irl> maven/gradle might actually really help here allowing more rapid development/prototyping
15:26:39 <irl> not having to worry so much about the dependencies being in debian
15:26:51 <djackson> fwiw, I think having a single source of truth database (compared to csvs etc) would be super handy.
15:27:14 <irl> it's not a single source of truth database, that's collector
15:27:15 <karsten> I'm just careful that adding a new service for this means we'll have another service to maintain.
15:27:28 <djackson> I don't think you need to have live database access though, automated daily dumps would be frequent enough without the hassle of securing it
15:27:37 <karsten> I very much agree that having a database rather than reprocessing files every time makes sense.
15:27:42 <irl> but it's got summary/metadata information for the last X time (whatever capacity we can manage) to allow simple queries
15:28:19 <irl> data would expire from it, although if you want to rent an AWS instance and put the entire history in RAM then our tool should let you do this
15:28:45 <irl> this is not only useful for scaling but also for sponsor 28 or 31, i don't remember which
15:28:52 <irl> the one click anti censorship thing
15:28:53 <karsten> maybe we could start with a database dump and think about better interfaces as step 2.
15:29:03 <irl> yeah that would be an ok starting point
15:29:21 <djackson> I am obviously unfamiliar with a lot of the details, but wouldn't you end up having to maintain the collector->DB pipeline? Long run it seems like it would be easier to have everything go into the DB directly
15:29:53 <irl> collector is the actual documents that we saw in the tor network, the database would be derived from those documents but we still want to keep the originals
15:30:08 <irl> the database would not hold data older than, say, 1 month
15:30:12 <irl> to keep it fast
15:30:44 <djackson> Uh, you can fit all OnionPerf data forever in a sqlite DB and it's fast on the average laptop :)
15:30:51 <djackson> (and all consensus data)
15:31:00 <irl> and votes and relay and extra info descriptors?
15:31:00 <karsten> there's more on collector. :)
15:31:40 <irl> we might also then have DB->other things in the network health team pipelines
15:31:47 <djackson> Votes probably a little harder, but not out of reach I think
15:32:22 <karsten> we can figure out those questions while working on it, I guess.
15:32:31 <irl> we could have different retention policies for different things
15:32:36 <djackson> Yeah, sorry :)
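The per-table retention policies irl suggests could be sketched as follows. This is a toy illustration using sqlite; the table names, columns, and retention windows are invented for the example and are not the team's actual schema.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical retention windows per table; real values would be tuned
# to whatever capacity the service can manage.
RETENTION = {
    "consensuses": timedelta(days=30),
    "onionperf_measurements": timedelta(days=365),
}

def expire_old_rows(conn, now):
    """Delete rows older than each table's retention window."""
    for table, keep_for in RETENTION.items():
        cutoff = (now - keep_for).isoformat()
        # ISO-8601 timestamps sort lexicographically, so a string
        # comparison is enough here.
        conn.execute(f"DELETE FROM {table} WHERE valid_after < ?", (cutoff,))
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE consensuses (valid_after TEXT, relay_count INTEGER)")
conn.execute("CREATE TABLE onionperf_measurements (valid_after TEXT, ttfb REAL)")

now = datetime(2019, 7, 25)
conn.execute("INSERT INTO consensuses VALUES (?, ?)",
             ((now - timedelta(days=40)).isoformat(), 6500))
conn.execute("INSERT INTO consensuses VALUES (?, ?)",
             ((now - timedelta(days=5)).isoformat(), 6600))

expire_old_rows(conn, now)
remaining = conn.execute("SELECT COUNT(*) FROM consensuses").fetchone()[0]
# Only the consensus inside the 30-day window survives.
```

Keeping the window small is what keeps queries fast, while the "rent an AWS instance and load the full history" case would just use the same tool with longer retention.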
15:33:22 <irl> so when mikeperry comes to talk to you about this task, you'll have the context and won't be surprised (:
15:33:27 <karsten> hehe
15:33:30 <karsten> sounds cool!
15:33:51 <karsten> okay, moving on?
15:33:55 <irl> yeah
15:34:02 <karsten> Update on browser metrics (djackson)
15:34:11 <djackson> So I've been working on gathering some user experience metrics using the Tor Browser.
15:34:24 <djackson> Rather than measuring raw network latency/throughput like onionperf, the intent is to measure how the Tor network performs from the perspective of users.
15:34:41 <djackson> I've been building tooling / doing integration work with WebPageTest, which is a testing/orchestration suite for Firefox/Chrome/Edge/etc. Shouldn't be too much work to connect it up to the Tor Browser.
15:34:57 <djackson> Example output: https://www.webpagetest.org/result/190725_BA_e075e435a914bd66cbf7b07813ed719c/
15:35:15 <djackson> Also spits out JSON / machine readable results for proper experiments/batch testing
15:35:26 <djackson> Hopefully some more details and preliminary results soon.
15:35:32 <irl> does it give you a HAR archive?
15:35:37 <djackson> Yep
15:35:42 <irl> ok cool
15:35:53 <djackson> This leads into my next question
15:36:05 <djackson> Any specific requests/suggestions for data?
15:36:17 <djackson> Feel free to have a think and email me or whatever.
15:36:33 <karsten> how is this related to matt's work on selenium?
15:36:48 <djackson> That was my starting point, but this should supersede it.
15:37:01 <djackson> the tl;dr is that selenium is great for network metrics
15:37:04 <djackson> but not good for visual metrics
15:37:21 <djackson> WPT gives stuff like % visual completion at various points, interactivity, etc
15:37:42 <djackson> And supports traffic shaping for various conditions and stuff. It's a much more complete package.
15:37:47 <irl> are these not just metrics you can compute from events you can see in selenium?
15:38:07 <djackson> Sure, but this already has the tooling built and the edge cases handled
15:38:22 <djackson> It grabs the videos of the page loading and does the inference etc.
15:38:31 <irl> ok, is it automatable?
15:38:32 <djackson> Yes
15:38:37 <djackson> Already have that bit up and running
15:38:52 <karsten> how would we match these requests with tor controller events?
15:39:09 <djackson> That is undecided. My current thinking is to leverage Tor Button
15:39:28 <djackson> Tor Button can write the events into the DOM and WPT can grab them by executing javascript (existing functionality)
15:39:36 <djackson> but I am open to better ideas!
15:39:51 <djackson> (and Tor button already matches requests to circuits for me)
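One simple way to answer karsten's question about matching requests to controller events is to correlate by timestamp: pair each page-load request (e.g. from a HAR file) with the nearest recorded STREAM event within a tolerance window. The data shapes below are simplified stand-ins invented for the sketch, not real HAR or control-port output.

```python
from datetime import datetime, timedelta

# Request timestamps as a HAR file might record them (simplified).
har_requests = [
    {"url": "https://example.com/", "started": datetime(2019, 7, 25, 15, 40, 0, 120000)},
    {"url": "https://example.com/app.js", "started": datetime(2019, 7, 25, 15, 40, 1, 300000)},
]

# (timestamp, circuit id) pairs as a controller event listener might log them.
stream_events = [
    (datetime(2019, 7, 25, 15, 40, 0, 100000), "CIRC12"),
    (datetime(2019, 7, 25, 15, 40, 1, 250000), "CIRC12"),
]

def match_requests_to_circuits(requests, events, tolerance=timedelta(seconds=1)):
    """Map each request URL to the circuit of the nearest-in-time event."""
    matched = {}
    for req in requests:
        best = min(events, key=lambda ev: abs(ev[0] - req["started"]))
        if abs(best[0] - req["started"]) <= tolerance:
            matched[req["url"]] = best[1]
    return matched

matched = match_requests_to_circuits(har_requests, stream_events)
```

Timestamp correlation is fragile under concurrency, which is presumably why djackson prefers having Tor Button, which already knows the request-to-circuit mapping, write the events into the DOM instead.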
15:40:30 <karsten> did you show this to matt?
15:40:38 <djackson> Yes, you are cc'd on the mail exchange :)
15:40:49 <karsten> oh, heh, it's in a pile... ;)
15:40:53 <djackson> I figured :)
15:41:03 <djackson> More details to come
15:41:12 <karsten> cool!
15:41:15 <djackson> but as I said, if you have specific ideas/requests/suggestions, do email me
15:41:28 <djackson> I currently plan to grab some visual metrics and a screenshot of the final page
15:41:43 <djackson> but if say you want 100 000 HAR archives of Tor Browser page loads, that can be done I think
15:42:18 <irl> my initial thoughts would be to fetch bbc.com or some other busy page and get the atf time and just see what sort of variation is going on there
15:42:52 <irl> is it similar to onionperf in the variation? is it better or worse?
15:42:55 <karsten> wait, but this is using firefox, not tor browser, right?
15:43:18 <djackson> So I currently have it working with Firefox over Tor. But am actively working on Tor Browser
15:43:23 <karsten> ah.
15:43:24 <djackson> It's fiddly, but getting there
15:43:29 <karsten> okay.
15:43:38 <djackson> irl: the atf time?
15:44:03 <irl> time until the content above the fold has been rendered and a user could consume/interact with it excluding 3rd party advertisements
15:44:20 <djackson> ah :) yes that's gathered
15:44:43 <irl> i think it would be an interesting distribution to compare
15:44:44 <djackson> excluding ads is harder, but I have a uBlock integration sorted so could diff-test the two :)
15:45:04 <irl> often a banner ad can be the thing that takes forever to load but the user doesn't care about it
15:45:13 <djackson> Absolutely
15:45:16 <irl> so as far as the user is concerned the page is loaded
15:45:37 <irl> i guess if it never loads then your tool considers that it was fully loaded because no dom event fired?
15:46:12 <djackson> Yeah. Visual completion is defined by looking at the end state and working backwards
15:46:32 <djackson> 'end state' is hard to define of course, but am trusting their metrics
15:46:41 <irl> so late banner ads and scrolling carousels are things we should watch out for
15:46:50 <djackson> Suspect it could be thrown off by auto playing videos, but need to test
15:46:55 <djackson> yeah indeed
15:47:09 <irl> there was a french university group that looked at removing these false positives
15:47:14 <irl> i will try and dig up the paper for you
15:47:33 <djackson> Summary: Work ongoing. Please feel free to email me requests/suggestions/reading material. :)
15:47:38 <irl> cool
15:47:38 <djackson> That would be great, thanks!
15:47:43 <karsten> nice!
15:47:48 <karsten> thanks for the update!
15:47:51 <djackson> np
15:48:05 <karsten> Lower priority things, as time permits (irl)
15:48:18 <irl> i guess we can do one of these
15:48:28 <irl> onionperf experiments
15:48:47 <irl> as part of the scaling work, we're going to deploy some onionperf instances that are experimental to try out some wacky ideas
15:48:56 <irl> acute and mikeperry are going to be running this mostly
15:49:49 <acute> yes, I will deploy a collector for these results, and we will have different tor clients running
15:50:09 <acute> the testbed will be automated with ansible
15:50:22 <karsten> how will the tor clients be different?
15:50:32 <acute> (based on the same playbooks we use for the current onionperf)
15:50:44 <djackson> This sounds super interesting. It would be great to compare the results with the analysis I sent to tor-scaling.
15:51:32 <acute> different configurations
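As an illustration of what "different configurations" could mean for experimental clients, a few torrc variations are sketched below. The specific experiments are guesses for illustration, not the team's actual plans, though the options shown are standard tor client settings.

```
# Hypothetical torrc variations for experimental OnionPerf instances.

# Instance A: baseline client, stock defaults (empty torrc).

# Instance B: fixed circuit build timeout instead of the learned one.
LearnCircuitBuildTimeout 0
CircuitBuildTimeout 10

# Instance C: no entry guards, to sample paths more widely.
UseEntryGuards 0
```

Running each variant side by side against the same measurement targets would let the collected results be compared directly.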
15:52:30 <djackson> Is there some ticket or place to follow along?
15:52:47 <irl> i think tickets will appear as we get into turning the roadmap into tickets
15:53:22 <djackson> Okay. Any idea on the rough timeline? Are we talking weeks? Months?
15:53:27 <irl> acute should probably also subscribe to tor-scaling and send updates there
15:53:45 <irl> (unless it's a sekrit list, but i think it is not now)
15:53:50 <acute> from the roadmapping session, we're looking at the following 2-3 months
15:53:56 <djackson> Okay, great :)
15:54:08 <karsten> sounds exciting!
15:54:27 <acute> yes!
15:54:34 <djackson> I think tor-scaling is semi-public. You can register if you can find the right link. But the link is hidden in the basement behind the door marked beware of the lion
15:55:34 <karsten> cool! can we talk about the other lower priority things next week?
15:55:40 <irl> yes
15:55:43 <karsten> approaching the 60 minutes mark...
15:55:48 <karsten> okay!
15:56:09 <karsten> lots of stuff going on. it shows that this was the first meeting in july. ;)
15:56:25 <karsten> thanks, everyone, see you all next week! bye!
15:56:33 <irl> bye!
15:56:42 <karsten> #endmeeting