16:59:46 #startmeeting Network team meeting, 20 december 2021
16:59:46 Meeting started Mon Dec 20 16:59:46 2021 UTC. The chair is ahf. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:59:46 Useful Commands: #action #agreed #help #info #idea #link #topic.
16:59:49 hello hello
16:59:56 our pad is at https://pad.riseup.net/p/tor-netteam-2021.1-keep
17:00:20 last meeting of 2021!
17:00:27 o/
17:00:40 * ahf pokes jnewsome, dgoulet, nickm, mikeperry, eta
17:00:53 o/
17:01:13 o/
17:01:23 o/
17:01:24 i am gonna do this real quick because i think everybody is trying to get their stuff done before checking out
17:01:40 i don't see anything on the boards that requires any special attention, it seems like people are OK there
17:01:54 * jnewsome is on vacation starting today but available to check in or answer q's
17:02:11 hello
17:02:22 jnewsome: ah! enjoy the vacation then! :-D
17:02:48 dgoulet: anything we need to talk about re last week's releases? i added some bigger discussions that we can have about comms on releases in the new year
17:03:06 nothing on my radar
17:03:39 i don't see any tasks from other teams
17:03:59 no announcements, no discussions
17:04:21 the first meeting in 2022 will be monday the 10th at 17 UTC
17:04:36 ian will start on the 6th and start ramping up on arti things
17:04:40 oh I had not updated the pad yet with s61
17:04:44 mikeperry: you want to talk about s61?
17:04:53 no worries
17:05:09 sure
17:06:16 jnewsome and hiro switched us to a new baseline for the sim, using a flooded network period in sept, and jnewsome created some sim parameters for consensus params
17:06:54 I have basically tuned to within shadow's variance within runs. there's still some more tuning to double-check and re-run, but we have some pretty good results
17:07:38 with 1ms KIST and the flooded network model (which should more accurately represent capacities), we're seeing some really high throughputs, and not much impact on latency
17:07:48 https://gitlab.torproject.org/jnewsome/sponsor-61-sims/-/jobs/73790/artifacts/file/public/hk/tornet.plot.pages.pdf
17:07:53 https://gitlab.torproject.org/jnewsome/sponsor-61-sims/-/jobs/73790/artifacts/file/public/de/tornet.plot.pages.pdf
17:08:22 I will continue to queue up some remaining whole-network sims over the break
17:08:53 and in january, we will start looking at negotiation, circewma, final calibration, and then mixed network sims
17:09:05 nice, really nice
17:09:13 first time i look at this pdf
17:09:14 epic
17:09:54 yeah. 1ms KIST plus congestion control really takes the speed limits away
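The linked plots compare result sets by summarizing the transfers each run completed; roughly, throughput per transfer is size over elapsed time. The sketch below only makes that comparison concrete: the set names, transfer sizes, and timings are invented, and this is not the actual tornettools output or data layout.

```python
# Illustrative only: summarize two invented sets of transfer timings the way
# a "baseline" run would be contrasted against a tuned "experiment" run.
# All sizes and timings here are made up.
import statistics

def throughput_mbit(transfer_bytes, elapsed_s):
    """Throughput in Mbit/s for one completed transfer."""
    return transfer_bytes * 8 / elapsed_s / 1e6

# Invented (bytes, seconds) records for 5 MiB downloads in two result sets.
baseline = [(5 * 2**20, t) for t in (6.1, 5.4, 7.9, 6.6, 5.8)]
experiment = [(5 * 2**20, t) for t in (1.2, 0.9, 1.6, 1.1, 1.4)]

for name, runs in (("baseline", baseline), ("experiment", experiment)):
    rates = [throughput_mbit(size, secs) for size, secs in runs]
    print(f"{name}: median {statistics.median(rates):.1f} Mbit/s, "
          f"max {max(rates):.1f} Mbit/s")
```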
17:09:57 so what is meant here by baseline? it's what a model should be able to do? and experiment is what tor is able to do?
17:10:32 the baseline is a sim of 0.4.6 stock, using a network model derived from relay capacities during Rob's flooding experiment
17:10:47 and public tor is a graph of the onionperfs from the live network during that period
17:11:15 so there are still some small discrepancies in the sim vs public tor, which we will dig into in Jan
17:11:20 i see
17:11:32 ya, not a big difference
17:12:15 but yeah, some xfers have up to 300Mbit throughput, according to the sim
17:13:02 and this causes very little/no additional latency
17:13:15 circewma may occasionally be turning on still, though
17:14:00 gonna be very exciting to see this out in the hands of our users \o/
17:14:10 for reference, we are almost done with round4 in the sim plan, thanks to the cloud runners jnewsome added the past couple weeks
17:14:13 https://gitlab.torproject.org/mikeperry/tor/-/blob/cc_shadow_experiments_v2/SHADOW_EXPERIMENTS.txt#L586
17:14:41 round 5 is our goal in January, before we release an alpha with negotiation and new default params
17:14:57 round 6 is the sims to complete before a stable
17:15:42 mikeperry: how are runners?
17:15:44 jnewsome: ^
17:16:21 anarcat: mostly good; the new one runs a lot slower for some reason, but I haven't had time to investigate
17:16:35 we changed its label to 'shadow-small' for now
17:16:39 anarcat: we put the two runners you made in 'shadow-small', since they are 4-5X slower than the beefy one we have in 'shadow'. jnewsome is using them for test sims
17:16:46 https://gitlab.torproject.org/jnewsome/sponsor-61-sims/-/issues/12
17:17:10 I will run a sim in both 'shadow-small' and 'shadow' at the end of round4, just to see how the results compare between the two sets
17:17:17 also I think we'll need to start garbage collecting old sim results from the persistent volume pretty soon
17:17:33 there still is a bit of variance between runs in shadow in general tho. it is making the final tuning a bit tricky
17:17:33 the beefy one is a machine that is possibly a decade newer than the other
17:17:40 i am not surprised by a 4-5x slowdown
17:17:53 we are at the point where changing things is not really making as much difference as the variance in shadow
17:18:06 which is good. it means we're pretty much as well-tuned as we can get
17:18:20 at least, with the current shadow run length and run count
17:18:22 we can reduce variance if we need to. easiest is to run more trials per experiment
17:18:50 but using a larger network also makes a pretty big difference. simulating > 30m would also probably help (rob usually uses 60m)
17:19:06 chi-node-14 has 80 cores, chi-node-12 has 24... with your workload, it's bound to make at least a 4x difference, even disregarding the difference in CPU generation
17:19:27 chi-node-12: https://paste.anarc.at/publish/2021-12-20-Fweg1rPpketidZlNg2QTxZjReyIrFxJqRIBXZ9EGlcU/clipboard.txt
17:19:39 yeah, in round5 in january, maybe we can try larger networks/more runs to investigate some of these ties
17:19:47 chi-node-14: https://paste.anarc.at/publish/2021-12-20-oeK7_ZvgYuyT8qbgv4c6DtiPHfoJ5znkgTJ-70t20gM/clipboard.txt
17:21:21 anything else for today?
17:21:22 anarcat: yeah, maybe that's enough to explain the performance difference
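On the variance point above, where jnewsome notes that the easiest way to reduce run-to-run variance is more trials per experiment: the standard error of a per-experiment mean shrinks roughly with the square root of the trial count. A minimal sketch of that arithmetic, with invented numbers standing in for per-trial summary metrics:

```python
# Illustrative only: shows how the standard error of a per-experiment mean
# shrinks as more Shadow trials are averaged. All numbers are invented.
import math
import random
import statistics

random.seed(61)

def simulate_trials(n, true_mean=100.0, run_to_run_sd=12.0):
    """Pretend each trial reports one summary metric (e.g. median Mbit/s)."""
    return [random.gauss(true_mean, run_to_run_sd) for _ in range(n)]

for n_trials in (3, 6, 12, 24):
    results = simulate_trials(n_trials)
    mean = statistics.mean(results)
    stderr = statistics.stdev(results) / math.sqrt(n_trials)
    print(f"{n_trials:2d} trials: mean {mean:6.1f}, standard error {stderr:4.1f}")
```

Quadrupling the trial count roughly halves the standard error, which is why more runs (and longer or larger simulated networks, as mentioned above) become the main lever once parameter changes are smaller than the noise between runs.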
17:21:51 geko and juga have anything? are they here today?
17:22:02 juga has been examining sbws graphs
17:22:44 hey
17:22:49 no news today
17:22:55 sbws looks good.. geko and I wanted to look at a gabelmoo graph from before and after the change, overlaid
17:22:56 still processing gabelmoo's data
17:23:00 ok
17:23:08 i am
17:23:24 but not much in network health land...
17:23:30 ok
17:23:59 well that might be it then. I am very excited for january
17:24:40 and 2022 in general
17:24:59 ditto
17:25:24 ok folks, let's call it then. hope everybody will have a really nice holiday, and thanks for all the awesome work this year \o/
17:25:35 #endmeeting