16:59:57 <ahf> #startmeeting Network team meeting, 14th of june 2021
16:59:57 <MeetBot> Meeting started Mon Jun 14 16:59:57 2021 UTC.  The chair is ahf. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:59:57 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
17:00:03 <ahf> hello hello everybody
17:00:11 <nickm> oh hi!
17:00:13 <ahf> pad is at https://pad.riseup.net/p/tor-netteam-2021.1-keep
17:00:13 <mikeperry> o/
17:00:19 * nickm does ceremonial meeting dance
17:00:26 <ahf> we are missing david today who is out
17:00:45 <GeKo> o/
17:01:29 <ahf> how are folks gitlab dashboards looking?
17:01:43 <nickm> I'm happy with mine
17:02:07 <asn> o/
17:02:08 <ahf> i am not super happy with mine but that is my own fault
17:02:10 <asn> all good here updated
17:02:14 <ahf> excellent
17:03:29 <ahf> our two queries for releases looks the same as last week, with exception of the leap second ticket having gotten some attention since last meeting
17:04:25 * ahf is considering closing tor#40185 but that can be done post-meeting
17:05:07 <ahf> ok!
17:05:14 <asn> (ugh out of battery. relocating 3 mins brb)
17:05:19 <ahf> good luck
17:05:33 <ahf> i see nothing new from other team who needs something from us that ISNT being handled already
17:05:42 <ahf> the s30 tickets are there and david seems to be on the NHT tickets
17:06:23 <ahf> we have a single announcement: today is the day where i enable 2fa enforcement for /tpo/core/** membership on gitlab, but i think everybody have enabled it now
17:06:32 <ahf> if not, now is a great time
17:06:49 <ahf> that is all our mandatory meeting agenda items. i think mikeperry have some s61 stuff for us to talk about
17:06:51 <nickm> {also: reminder that i'm doing reduced hours this month; see tor-internal post}
17:06:58 <ahf> ya!
17:07:21 <mikeperry> yes, very exciting news
17:07:45 <mikeperry> I went down a congestion control rabbithole last week and into the weekend and implemented 3 congestion control algs
17:08:01 <mikeperry> I tested them with onion services, and all out-perform SENDME on the live network
17:08:27 <nickm> by how much?
17:08:32 <mikeperry> (I don't advocate or endorse working on weekends, but the live testing meant I could find and fix issues quickly and just got into an obsessive loop)
17:08:45 <mikeperry> I will probably take a day off this week
17:08:52 <ahf> sweet!
17:09:14 <nickm> Oh: Reminder that Friday we're observing a US holiday.
17:09:22 <ahf> and yes, please do - is it something people can start toying around with already to build up some understanding of these code changes?
17:09:25 <mikeperry> oh shit, I didn't know
17:09:29 <mikeperry> thanks
17:09:36 <ahf> i assume in the end we may include multiple CC algorithms in tor and let the consensus pick the one we prefer?
17:10:27 <mikeperry> anyway the alg perf compared to sendme is hard to say because I was just using random circuits. I saw anywhere from 3-5X faster.. but because I was tuning against live which is all SENDME, I may have made them too aggressive (which means they will over-queue when deployed en-masse)
17:10:36 <mikeperry> we will still need shadow to learn that
17:11:18 <ahf> exciting
17:11:23 <mikeperry> ahf: yes, the plan is to tune them in shadow and get an idea on best algs and params there, and then verify on live once enough relays and clients have upgraded
17:11:32 <ahf> very nice
17:11:45 <asn> mikeperry: nicee wrt congestion control algos
17:12:10 <jnewsome> nod
17:12:19 <mikeperry> I need to do a lot of refactoring and implement stream flow control still.. I'd guess maybe a couple weeks before ready for prelim review of code structure
17:13:07 <mikeperry> (so we can review code structure and refactor per review feedback while we churn through endless shadow sims)
17:13:26 <ahf> if you need relays on the production network some hax patches applied before, we can also do that to get some feeling with it there too
17:13:27 <jnewsome> btw I chatted with Rob last week about modeling CPU latency; he agrees about the basic approach we discussed but we both think it'll be a pretty rough estimate. hopefully we'll end up showing that the CPU usage doesn't make much difference and we can ignore it
17:13:35 <mikeperry> it will also be useful to set up an onion service test on live, with fixed paths
17:14:20 <mikeperry> to further check live behavior, since this is a faster testing loop than shadow sims
17:14:56 <mikeperry> I hacked this in my branch, but it can be made much more rigorous for better datapoints
17:15:14 <ahf> makes sense. i have no idea how to pin the entire path for an OS though
17:15:39 <mikeperry> yes I am hoping dgoulet and/or asn can help with that
17:16:08 <mikeperry> it might also help if we pin some relays we run, so we can look at their queues
17:17:08 <ahf> yeah
17:17:17 <mikeperry> jnewsome: how about client models from rob's previous experiments? one thing he may not have done that we will need is an uploading client
17:17:37 <mikeperry> because that is where the flow control behavior will matter
17:17:57 <jnewsome> mikeperry: that should be an easy tgen config
17:18:28 <nickm> mikeperry: perhaps a foolish question: do your tests verify that the correct data, and the correct amount of data is actually transferred?  I've messed up in the past by only counting the time up to an EOF
17:18:41 <jnewsome> actually one nice thing about shadow 2.0 is if you already have a specific client you want to test, we may be able to run it directly in shadow
17:19:11 <mikeperry> nickm: I was just testing with a curl + onionshare. but good call I will check sha's and such
17:19:21 <nickm> mikeperry: great
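
A minimal sketch of the check nickm raises above, assuming the test harness already knows the expected byte count and SHA-256 of the file served over the onion service. The names `sha256_of_file` and `verify_transfer` are hypothetical and not part of tor, curl, or onionshare; the idea is only that a timing result should count once both the size and the hash of the received file match, rather than as soon as an EOF is seen.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(path, expected_size, expected_sha256):
    """Return (ok, reason) for a downloaded file.

    Assumption: expected_size and expected_sha256 were computed from the
    original file on the serving side before the test run.
    """
    actual_size = os.path.getsize(path)
    if actual_size != expected_size:
        # A short read that still ends in EOF would be caught here.
        return False, f"size mismatch: got {actual_size}, expected {expected_size}"
    if sha256_of_file(path) != expected_sha256.lower():
        return False, "sha256 mismatch"
    return True, "ok"
```

A test run would record its throughput figure only when `verify_transfer` returns `(True, "ok")` for the downloaded file, and discard or flag the sample otherwise.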
17:20:33 <mikeperry> in general I want better metrics from the live tests than I have... need to gather RTT stats, and get better throughput infos
17:20:52 <mikeperry> but I still learned a lot and fixed a lot of things from the basic testing
17:21:14 <mikeperry> i forgot to say some things in the proposal too that I realized after re-reading my old posts :/
17:21:23 <mikeperry> so there will be proposal updates
17:22:11 <ahf> cool
17:22:12 <mikeperry> anw that's super exciting.
17:22:15 <ahf> ya
17:22:42 <ahf> we should probably make some comms noise too when we start having this in the tor code
17:22:58 <mikeperry> also remember re perf: this is a network effect thing.. the perf increases I saw will be small compared to what happens when all clients switch
17:23:06 <mikeperry> if we tune it right in shadow
17:24:01 <mikeperry> I think that's it for congestion control, unless there are other questions
17:24:10 <ahf> cool
17:24:15 <mikeperry> GeKo,juga: did you want to talk about the flooding experiment and/or sbws?
17:24:30 <GeKo> i can
17:24:41 <GeKo> we found a critical bug in sbws
17:24:54 <GeKo> so we asked one dir auth op to move back to torflow
17:25:03 <asn> oof
17:25:06 <GeKo> until we have investigated and a fix
17:25:18 <ahf> oh ok :-/
17:25:22 <GeKo> sbs#40091 is the ticket
17:25:28 <ahf> sbws#40091
17:25:34 <GeKo> it seems juga has something to test
17:25:36 <juga> hi
17:25:45 <GeKo> and i'll review it tomorrow
17:26:02 <GeKo> hopefully we can get the third bwauth back to sbws like next week
17:26:16 <GeKo> hihi
17:26:23 <gaba> o/
17:26:31 <GeKo> juga: anything you feel we should add here?
17:27:09 <GeKo> for the flooding part
17:27:21 <GeKo> i collected all the questions we gathered so far
17:27:26 <juga> hmm, not really, just that maybe tjr and i are finding the issue with sbws stalling and/or out of memory
17:27:34 <GeKo> and created a new ticket to not mess up the old one
17:27:37 <GeKo> https://gitlab.torproject.org/tpo/metrics/analysis/-/issues/40001
17:28:04 <GeKo> juga: right, that's one of the other sbws mysteries we try to figure out
17:28:19 <GeKo> it's only happening intermittently and only on one bwauth...
17:28:22 <GeKo> so, exciting :)
17:28:35 * juga will update #40092 with the findings
17:28:50 <ahf> nice
17:28:51 <GeKo> i think that's all for those two items
17:29:26 * GeKo hands mic back to mikeperry
17:30:13 <mikeperry> I think I am good unless there are more questions. also happy to get volunteers for help with live testing and instrumented relays
17:30:27 <mikeperry> sad dgoulet is out, I am guessing he can help there tho
17:30:34 <mikeperry> I have refactoring and cleanup to do anyway
17:30:54 <ahf> i will happily help with the relay stuff and testing on live network. david also have shell on those boxes though if he dives into it faster
17:31:46 <ahf> shall we call it then and get back to our non-irc shells
17:31:56 <ahf> anything we are missing?
17:32:00 <mikeperry> jnewsome: lmk once you get that docker thing going on the new box. it is a bit early to test this branch in shadow still - it is hacky. but soon! :)
17:32:28 <gaba> mikeperry: only a note to say that anarcat already has all the info we need to setup the shadow server. lavamind will do it.
17:32:32 <jnewsome> mikeperry: cool, still waiting on @anarcat to get access
17:32:40 <jnewsome> who I believe is waiting on cymru for credentials
17:32:52 <gaba> jnewsome: not waiting anymore. all set and it will happen this week.
17:33:10 <jnewsome> gaba: woohoo!
17:33:33 <anarcat> i'm going to double-check that right now
17:34:14 <anarcat> confirmed, i do have some access
17:34:24 <anarcat> the setup of those boxes is tricky and might take a while
17:36:39 <ahf> ok
17:36:43 <ahf> i am gonna disable the bot now
17:36:48 <ahf> talk to you all in the other channel(s)
17:36:50 <ahf> #endmeeting