14:58:46 <karsten> #startmeeting metrics team meeting
14:58:46 <MeetBot> Meeting started Thu Sep  3 14:58:46 2020 UTC.  The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:58:46 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:58:53 <karsten> https://pad.riseup.net/p/tor-metricsteam-2020.1-keep <- pad
14:58:53 <acute> hi!
14:59:06 <gaba> hi!
14:59:07 <woswos> o/
14:59:10 <jnewsome> o/
14:59:13 <karsten> hi everyone!
15:00:13 <karsten> please add agenda topics!
15:02:24 <karsten> shall we start?
15:02:39 <karsten> let's start.
15:02:43 <karsten> OnionPerf 0.7 release and deployment
15:02:57 <karsten> we resolved #33399 as the last remaining issue for 0.7 and put out the release.
15:03:05 <karsten> (which really means git tag it.)
15:03:36 <karsten> acute and I spent quite a few hours yesterday and today on deploying 0.7 on three new hosts.
15:03:47 <karsten> while documenting what we did at the same time.
15:04:00 <karsten> we're currently blocked by not having DNS entries for the three new hosts.
15:04:08 <karsten> gaba: can you ping hiro about that once more?
15:04:39 <gaba> yes
15:04:56 <karsten> thanks! acute and I would like to finish this tomorrow in the UTC morning.
15:04:58 * gaba just sent hiro a msg
15:05:02 <karsten> thanks!
15:05:17 <karsten> long term we'll need a better way to add DNS entries than bugging the admins.
15:05:23 <karsten> but that's for later.
15:05:52 <karsten> okay, that's the update on this. any questions?
15:06:44 <karsten> alright. next topic:
15:06:48 <karsten> Experimental runs (--drop-guards?)
15:07:03 <karsten> mikeperry: do you want to start running experiments soon?
15:07:39 <karsten> with the idea being that I'm still around for ~2 weeks and can help with any issues coming up.
15:07:52 <mikeperry> yah I was just looking at https://gitlab.torproject.org/tpo/metrics/onionperf/-/issues/40001 trying to find doc urls but many seem moved?
15:08:08 <karsten> that documentation does not exist yet, it's in the making.
15:08:17 <karsten> but there's a README.md in the onionperf repo.
15:08:27 <karsten> in theory that should be a good start.
15:09:08 <mikeperry> ok great
15:09:31 <karsten> what else do you need to start this?
15:10:47 <mikeperry> the droptimeouts branch
15:10:55 <karsten> that's the 0.7 release now.
15:10:56 <mikeperry> and some example graphing scripts
15:10:57 <karsten> or, master.
15:11:12 <karsten> well, master is the 0.7 release until there's going to be a 0.8 release in a few weeks.
15:11:39 <karsten> graphing scripts are part of onionperf's visualize mode. you don't need anything else for graphing.
15:12:03 <karsten> https://gitlab.torproject.org/tpo/metrics/onionperf#visualizing-measurement-results
15:12:35 <karsten> that mode produces a CSV and a PDF file with lots of graphs in it.
15:13:26 <mikeperry> ok cool
15:13:45 <karsten> okay, great, please go through that README and shout if anything important is missing.
15:14:13 <mikeperry> I am still finishing up the tor browser android proxy audit but by mid next week I should have time for this
15:14:32 <mikeperry> I think in the meantime I will just try to get the 0.7 onionperf release running as-is and collecting data?
15:14:50 <karsten> yes, that's a good plan.
15:15:28 <karsten> cool!
15:15:30 <karsten> moving on?
15:16:33 <karsten> OnionPerf 0.8 open issues
15:16:42 <karsten> we have four issues open.
15:17:03 <karsten> and I think we have some good plans for all of them, even though none of them is trivial and quickly done.
15:17:15 <karsten> let's go through these issues:
15:17:22 <karsten> tpo/metrics/onionperf#33260
15:17:43 <karsten> acute: I just added another comment with a suggestion to improve usability.
15:18:06 <karsten> if you have 15 minutes, please take a look and let me know whether you like it. I'll then write more code.
15:18:17 <acute> ok, will do!
15:18:23 <karsten> thanks!
15:18:32 <karsten> tpo/metrics/onionperf#33421
15:18:48 <karsten> that's a beast, but a beast with a plan now.
15:19:10 <karsten> it would be great to have some early feedback on that plan, too.
15:19:20 <karsten> might require 30 minutes, though, and wouldn't be the final review.
15:20:26 <karsten> would you be able to take a look at that, too? I'm aware that I'm stealing a lot of your time lately.
15:20:59 <karsten> or, wait, maybe we can ask mikeperry for an initial review of the concept there?
15:21:04 <acute> ok, no problem
15:21:27 <acute> but also, that would be a good idea too!
15:21:53 <karsten> mikeperry: what do you think?
15:22:33 <karsten> acute: the part that you know better than mikeperry is how we would add guard information to analysis files.
15:22:40 <mikeperry> when I looked at just my control port, UP/DOWN did seem to be what we need. but I too did not get an UP for a while when connecting
15:23:00 <mikeperry> it might be the case that if you're not connected early enough in startup, you don't get that UP
15:23:06 <karsten> okay. we might have solved that by sending DROPGUARDS right at the start.
15:23:07 <mikeperry> and then have to DROPGUARDS? idk
15:23:08 <mikeperry> yah
15:23:16 <mikeperry> nice ok
15:23:22 <karsten> yes, that's already part of 0.7.
15:23:38 <karsten> I figured it probably doesn't do any harm to do it once more.
15:23:55 <gaba> karsten: it seems that the DNS entry is added already and resolved.
15:24:06 <karsten> gaba: oh? will check in a minute.
15:24:52 <karsten> mikeperry: do you want to take a look at the linked PDF to say if that's potentially useful?
15:25:50 <karsten> acute: and do you want to take a look at the commit to say if adding a list of guards next to circuits and streams can work?
15:26:01 <acute> will have a look as well, and also happy to review the code
15:26:14 <acute> yes!
15:26:22 <mikeperry> woah this number of guards over time graph is interesting
15:26:29 <mikeperry> tor actually spending a lot of time using 3 guards?
15:26:59 <karsten> acute: great! the code needs more work, but having an initial sense whether it's doing the right thing would be good. there will be another final review later. thanks!
15:27:02 <karsten> mikeperry: yep!
15:27:06 <mikeperry> or it said it had that many, but then only used 1?
15:27:11 <mikeperry> am confused about last two slides
15:27:39 <karsten> 3 guards means three GUARD UP events without any GUARD DOWN or GUARD DROPPED event in the middle.
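[editor's note: the guard-counting rule karsten states here can be sketched as follows. The event tuples and function name are illustrative, not OnionPerf's real parser output.]

```python
# Sketch of the rule above: the active guard count is the number of
# GUARD UP events not yet cancelled by a GUARD DOWN or GUARD DROPPED
# for the same fingerprint. Event format here is hypothetical.

def guard_count_over_time(events):
    """events: list of (timestamp, status, fingerprint) tuples with
    status in {"UP", "DOWN", "DROPPED"}; returns [(timestamp, count)]."""
    active = set()
    timeline = []
    for ts, status, fp in events:
        if status == "UP":
            active.add(fp)
        elif status in ("DOWN", "DROPPED"):
            active.discard(fp)
        timeline.append((ts, len(active)))
    return timeline

events = [
    (0, "UP", "A"),
    (10, "UP", "B"),
    (20, "UP", "C"),      # three guards active, as in the graph
    (30, "DROPPED", "B"),
]
print(guard_count_over_time(events))
# [(0, 1), (10, 2), (20, 3), (30, 2)]
```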
15:28:23 <mikeperry> and then the next slide is actual use of the guard in a circuit?
15:28:46 <karsten> yes. that slide shows whether it used one of the 1/2/3 guards it had at the time.
15:29:06 <karsten> it doesn't say whether it used its first/second/third guard, though.
15:29:35 <mikeperry> but 1.0 means it stuck with that same choice the whole time?
15:30:11 <karsten> no, it just says that it used one of its guards in the circuit path in first position.
15:30:48 <mikeperry> hrm but tor always only uses 1 guard node per circuit :)
15:31:14 <karsten> yes, but 1.0 means it had a GUARD UP event for that first relay before it built the circuit.
15:31:30 <karsten> rather than picking some random other relay as guard.
15:31:47 <karsten> except for the first few hours where we didn't have GUARD UP events. (as discussed above.)
15:31:59 <karsten> I'm not saying that this is the most useful graph ever made! ;)
15:32:28 <karsten> what else are you interested in?
15:32:36 <karsten> we can make better graphs.
15:32:53 <karsten> as long as we have data.
15:33:00 <mikeperry> hrm I am still confused.. what we want to somehow graph or check is that the actual first hop in circuits is staying the same, and if not, we want a count of how many unique relays show up in that first hop position
15:33:41 <karsten> I see.
15:34:21 <karsten> I'll see what I can do with the data.
15:34:42 <mikeperry> and we also want to check that first position to make sure it agrees with GUARD UP (to watch for really scary bugs)
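[editor's note: the two checks mikeperry asks for here — counting unique relays in the first-hop position and flagging first hops that never got a GUARD UP — could look roughly like this. Input shapes are illustrative, not OnionPerf's actual analysis format.]

```python
# Hypothetical sketch of the consistency check: for each circuit,
# verify the first hop was announced via GUARD UP, and count distinct
# relays seen in the first-hop position.

def first_hop_report(circuits, active_guards):
    """circuits: list of relay-fingerprint paths; active_guards: set of
    fingerprints with a GUARD UP and no later DOWN/DROPPED event."""
    first_hops = [path[0] for path in circuits if path]
    mismatches = [fp for fp in first_hops if fp not in active_guards]
    return {
        "unique_first_hops": len(set(first_hops)),
        "mismatches": mismatches,  # non-empty would be the "really scary bug"
    }

report = first_hop_report(
    circuits=[["G1", "M1", "E1"], ["G1", "M2", "E2"], ["X9", "M3", "E3"]],
    active_guards={"G1", "G2"},
)
print(report)
# {'unique_first_hops': 2, 'mismatches': ['X9']}
```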
15:35:05 <karsten> that's what the last slide in the PDF is for at the moment.
15:35:35 <karsten> but I'll add another one for which guard was used if there have been more than one GUARD UP event before.
15:35:57 <karsten> that would show what tor does if it currently has 3 guards:
15:36:07 <karsten> a) does it stick to the latest one it picked?
15:36:13 <karsten> b) does it pick one of the three at random?
15:36:25 <karsten> c) does it go back to the second or first if the third fails?
15:36:33 <karsten> something like that.
15:37:05 <karsten> acute: feel free to wait with the initial review until I have done this.
15:37:36 <acute> ok
15:37:49 <mikeperry> yeah. the main thing we want to know is how many guards are in use in those circuits for each period between DROPGUARDS. having a distribution on guard choice in circuits may also be useful, but probably not immediately
15:38:58 <karsten> okay.
15:39:00 <mikeperry> like when we try out two guards, we will want to take a peek at that usage distribution. it should be uniform, but CBT will bias it if we do not count timed out circuits
15:39:54 <karsten> what's the torrc option for running with two guards again?
15:40:24 <karsten> maybe I should try that before making new graphs.
15:40:26 <acute> NumEntryGuards 2?
15:41:07 <karsten> in addition to UseEntryGuards 1, right?
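[editor's note: the torrc fragment discussed here would be the following; both are real tor options, values as suggested in the log.]

```
# torrc fragment for the two-guard experiment
UseEntryGuards 1
NumEntryGuards 2
```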
15:41:32 <mikeperry> yah
15:41:44 <karsten> okay.
15:42:03 <karsten> alright, let's quickly touch on the other two issues:
15:42:10 <karsten> tpo/metrics/onionperf#33420
15:42:35 <karsten> I'd like to work on the guards thing first, because it might have an impact on this issue.
15:42:58 <karsten> I was thinking that we could include build timeout information just like we would add guards information to the analysis file.
15:43:09 <karsten> rather than including full parsed events.
15:43:51 <karsten> but that needs more thinking and coding and discussion and review.
15:44:17 <karsten> I'd hold that ticket in my hands until then.
15:44:22 <karsten> if that's okay. :)
15:44:45 <karsten> tpo/metrics/onionperf#40001
15:44:55 <karsten> acute and I discussed this today.
15:45:21 <karsten> our current plan, which we didn't put on the issue, is to ask for a new host for onionperf.torproject.org.
15:45:47 <karsten> that host would contain half a dozen static html pages with documentation and lists of experiments and such things.
15:46:08 <karsten> that host would also serve tarballs from past experimental onionperf instances and past long-running onionperf instances.
15:46:38 <karsten> the static html pages would be under version control in onionperf-web.git, created today.
15:46:53 <karsten> that's the plan.
15:47:11 <karsten> and we both realized that it's a bold plan that won't be implemented quickly.
15:47:36 <karsten> but the goal is to keep all relevant documentation parts and data in one place.
15:48:17 <karsten> okay, I guess we're proceeding with this plan until there's something to see, and bring it back for discussion here.
15:48:21 <woswos> Also the onionperf link in https://metrics.torproject.org/sources.html can point to that documentation. Currently it points to https://github.com/robgjansen/onionperf
15:48:47 <karsten> good idea. either to onionperf.torproject.org or to the gitlab project page.
15:48:53 * karsten makes a note.
15:48:58 <karsten> thanks!
15:49:34 <acute> nice spot :)
15:49:38 <karsten> okay, that's all on the four remaining issues for 0.8.
15:50:07 <karsten> anything else on that topic or on another topic?
15:51:04 <acute> the DNS entries are there!
15:51:10 <karsten> ah, yes, I had checked, too.
15:51:13 <karsten> thanks, hiro!
15:51:33 <karsten> that means we can finish deployment tomorrow, acute!
15:52:10 <karsten> alright, if there isn't anything else, let's end this meeting and meet again in a week from now!
15:52:15 <mikeperry> i am curious about the status of using onionperf with shadow
15:52:19 <mikeperry> if jnewsome is around
15:52:20 <karsten> oh.
15:52:34 <jnewsome> o/
15:52:35 <karsten> let's talk about that.
15:52:47 <karsten> in the remaining 5 minutes before we're being kicked out.
15:52:49 <mikeperry> I might also want to try that but probably not until I get the basics working with 0.7 and live network tor
15:53:06 <jnewsome> what are you trying to do exactly?
15:53:47 <mikeperry> jnewsome: run a tor network, put some synthetic traffic over it, measure the performance and other characteristics of some test clients (ideally onionperf)
15:54:12 <mikeperry> for things like this: https://trac.torproject.org/projects/tor/wiki/org/roadmaps/CoreTor/PerformanceExperiments
15:54:22 <jnewsome> gotcha
15:55:10 <jnewsome> normally you'd set up shadow to include some tgen instances to add/measure traffic
15:56:02 <jnewsome> my understanding is that onionperf does the same, but on the real network
15:56:24 <mikeperry> what is the best/easiest way to get comparable data between
15:56:26 <jnewsome> you wouldn't run onionperf itself under shadow; you'd add the corresponding tgen jobs into the shadow config
15:57:19 <jnewsome> Presumably you'd want to set up the same kind of tgen jobs as onionperf does, and then use the onionperf log parsing tools to parse the resulting tgen logs
15:57:30 <karsten> maybe we'll have to talk about getting comparable data out of the two as one of our next milestones.
15:57:49 <karsten> onionperf would also need the torctl logs for some analysis parts.
15:57:52 <mikeperry> ok. I don't need it immediately
15:57:55 <jnewsome> This is a good question though; I've been thinking I'd like to see how hard this currently is and streamline as needed
15:58:11 <mikeperry> but the goal is to be able to take our live testing and go back and make sure we're seeing the same things in shadow
15:58:11 <karsten> okay. I'm afraid we'll have to end the meeting now.
15:58:21 <karsten> yes, this makes a lot of sense.
15:58:32 <karsten> sorry, folks! more next week!
15:58:35 <jnewsome> o/
15:58:36 <karsten> #endmeeting