16:59:31 #startmeeting Network team meeting, 31 January 2022
16:59:31 Meeting started Mon Jan 31 16:59:31 2022 UTC. The chair is ahf. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:59:31 Useful Commands: #action #agreed #help #info #idea #link #topic.
16:59:35 hello hello welcome
16:59:41 our pad is at https://pad.riseup.net/p/tor-netteam-2022.1-keep
16:59:46 hihi
16:59:50 Evening all.
16:59:51 hi
16:59:58 o/
17:00:20 o/
17:00:36 how is our boards doin' at https://gitlab.torproject.org/groups/tpo/core/-/boards ?
17:02:09 I think we have a few things to untangle at arti, but I also think we're on it
17:02:19 o/
17:02:22 I also need to pick up a little more hacking work once I'm done with reviews
17:02:28 (hi hiro!)
17:02:41 i need to finish those two s30 tickets so i can get out of that this week
17:02:45 o/
17:02:52 Today has been a busy review day (and fixing random stuff day) for me. Volunteers turn up at the weekend and Do Stuff!
17:03:02 may need some help understanding how we handle our flag calculations at some point from either nick, david or mike
17:03:27 Diziet: :-D
17:03:59 ok i don't see anything looking off on the board at least
17:04:05 ahf: as in flags in the directory votes?
17:04:13 o/
17:04:21 nickm: yes, in this case for the bridge auth
17:04:26 (hi, juga, eta!)
17:04:30 we have an annoying bug that i cannot reproduce locally with stable flag assignments there
17:04:35 o/
17:04:47 on gman's bridge auth
17:05:10 nickm: maybe i can prod you about it wed/thu this week if you have a moment there? i think i could use your brainz there
17:05:18 * gman999 seeped in $work but around-ish
17:05:29 sure.
I think I understand how it's supposed to work, though the others may have a better handle on how it really behaves
17:05:34 gman999: don't need you, yet, but we will need to test some things at some point :-/
17:05:45 i'll ping you, probably over email this week on this
17:05:45 cool cool
17:05:49 np
17:05:55 nickm: awesome
17:06:00 dgoulet: anything on releases this week?
17:06:41 not really. I'll let mikeperry update us about s61 which is in relation to 047
17:06:57 excellent
17:07:04 we can do that in the s61 section
17:07:21 ugh, it's not the first monday of the month.. it's the last
17:07:52 doesn't look like there is any uncaught items from other teams
17:08:14 eta added an excellent item today to our discussion section:
17:08:18 [2022-01-31] giving external contributors more access / authority to merge (specifically trinity for their work on CI, etc) ~eta
17:08:47 we *finally* have a project where a lot of external people seems able to contribute a bit every now and then
17:09:05 and we have never really found a good way to handle that in our team in general, but now that we have arti we should probably start doing something here
17:09:42 we have in the org the ability to make people are "core contributor" which happens every now and then, so i think the goal with people we get in this way is eventually they should become that
17:09:43 but
17:09:52 what can we do to make it more attractive here and now to contribute to the project(s) ?
17:10:01 We should come up with a skeleton process for making someone into a maintainer. It doesn't have to be heavyweight, but we should write down the steps (and which parts of the discussion should be public) etc.
17:10:17 yeah
17:10:53 it seems to me, for example with trinity, that they already have some domains where they contribute a lot to: infrastructure, testing, and CI, right?
17:10:56 yeah
17:10:58 Inevitably it will involve some discussions of someone as a person so there needs to be a substantial element not done in the glare of publicity.
17:11:13 like, why I asked the question in the first place was because I'd kind of rather they review CI MRs
17:11:30 they know a fair deal more about it than me anyhow, since they've worked on it a fair deal
17:11:47 I got assigned a CI MR from trinity for review this morning and was like "urrrr no idea, err, let me throw it at nickm"...
17:11:56 maybe we should do the easy thing then for now while we build up a process for this and ask them if they are up for taking on the review task for CI/test ?
17:12:02 they still need a reviewer for their ow nchanges tho
17:12:04 own changes*
17:12:05 eta: That's really valuable expertise
17:12:41 ahf: Right, having a 2nd eye on everything is a good idea, even for maintainers.
17:13:47 so i think there is two things in this: we need to find someone who is up for talking with trinity if they are up for this? if nobody wants to, i'm cool with it and we should say this is a new thing for us that we are trying to build and the other thing i will have to do here is to figure out if we can make the review assignment bot be a bit smarter on who to assign things to based on files touched?
17:14:09 SGTM on both counts
17:14:26 does anybody want to talk with trinity here or should i? :-)
17:14:58 * eta doesn't feel a strong desire to, but can if required
17:15:06 i can do it, that is fine
17:15:26 ok, i think that was this item for now. i'll start a discussion on the process in our team and then prod trinity here and now
17:15:33 and look into what the triage-bot can be made to do
17:15:34 ok!
17:15:38 :-)
17:15:40 mikeperry: wanna do s61 stuff?
17:15:47 ok
17:17:02 so last week, I began running sims after switching the simulator over to negotiation, and while simultaneously checking this nagging issue of some guard relays having large-ish circuit queues, I noticed that something in the negotiation branch made it worse
17:17:39 jnewsome and I suspect that the geoip file update could have changed the network characteristics of the network model that the simulator builds, but I am making a list of other potential issues from the diff since rebase as well
17:18:03 so we will be trying to confirm the cause of that regression while I prepare the branch for review
17:18:37 while preparing the branch and checking the spec, I also noticed a missing piece from onion service negotiation, which dgoulet quickly fixed
17:19:02 hm, interesting with the geoip file
17:19:05 but I need to clean up commit structure now and make sure everything is clean, so review is not annoying
17:19:33 so I am a bit behind on that
17:20:08 I will be updating the sim plan soon wrt this investigation, as well as onion service testing, and other things the alpha needs tested wrt negotiation
17:20:24 cool!
17:20:37 jnewsome: how is the onion service sim support? hiro was asking me about what/when to do there?
17:21:39 mikeperry: i need to make another pass over it and send it back to rob for review. I'll try to make sure I get that done today
17:21:43 with this regression, there's now a few things that need simming. those can be done first, to find that regression, while the onion service stuff moves further along
17:22:15 or we could run a full scale sim with onion services just to see how it behaves. I can check for some things with just scripts, before graphs exist
17:23:05 i think i may have just forgotten to check this off in my list, but does reviews happen in parallel with this investigation or do we continue on the code review after this have been looked into?
17:23:12 ok. let me know.
once we have that, it might be good just to kick off a full size sim, so hiro can have a pipeline output with results to look at graphing
17:23:13 * ahf have not gone over GL yet today
17:24:21 ahf: the regression is extremely minor. I think running sims to track down the commit that caused it is ok to do in parallel with code review
17:24:30 excellent
17:24:45 but I also have some things to do on the branch and the spec to make it cleaner and easier to review
17:24:58 ok, you wanna do that first?
17:25:06 so I am juggling a few things here. it may be another couple days before MRs are ready
17:25:11 ok!
17:25:16 let's wait a few days then, no worries
17:25:18 cool
17:25:28 i assume you'll prod us on irc when things are ready
17:26:03 we also have the report metrics, which were in fact impacted by something that happened last quarter. either the DoS and/or relay removal made our performance indicators worse, in the report
17:26:36 it looks like we mentioned the DoS in the indicators table, but maybe both are worth mentioning in the report, if we are not going to filter the dates out of the indicator metrics
17:26:46 gaba: that report is due like, today, yeah?
17:26:52 yes
17:26:54 yes
17:27:06 what time?
17:27:14 well. I already sent it to bekeela
17:27:20 if we are changing something, it should be now
17:27:20 do you think we should mentioned the dos and relay removal in more places, or is what we have enough?
17:27:23 yes
17:27:23 ok
17:27:31 let's add that into the narrative
17:27:35 the summary
17:27:48 ok. I can put a paragraph in there right after the meeting
17:27:52 thanks!
17:27:56 on the nextcloud link, yeah? that is still canonical?
17:28:24 yes
17:28:27 ok
17:28:29 still it is the right place
17:30:23 so I will be updating the simulator plan for the alpha today.
that is top priority for me, so we're less scatterbrained about what we're doing for the negotiation branch and this overload issue
17:31:03 there may be some switching of ordering there, depending on how the onion service graphing comes along vs other things we can test sooner
17:31:21 shouldn't we just mention this dos and relay removal in the next report?
17:31:32 seems a bit close to the deadline to revise a doc today if we are sending it off?
17:31:35 hiro,jnewsome: let's follow up later once rob does that review and/or we have something that can run in a full-size sim
17:31:51 mikeperry: sg
17:32:04 ahf: the problem is they will look at our metrics, like they always do, and ask "why did things get worse", like they always do
17:32:06 ahf: the report is already done. We are only adding one paragraph
17:32:45 so we need a paragraph that says we had two diff attacks on the network, so they at least know why
17:32:54 ah
17:32:55 ok
17:33:02 makes sense
17:34:16 gotta cross the T's on this stuff. this is why we're trying to create a process wrt ticket tags and tracking these date ranges. this stuff has to go into the report if it changes metrics, which it did
17:34:31 ya
17:34:33 so we need to make sure we see that earlier next time, I guess. it is a pretty clear change
17:35:39 anything else for today?
17:35:48 we're also a little handicapped not having geko on full capacity for that :/
17:36:02 i am back! :)
17:36:09 juga,geko: anything wrt sbws and network-health?
17:36:31 we could use your input on https://gitlab.torproject.org/tpo/network-health/sbws/-/issues/40119 for sbws
17:36:39 yes
17:36:40 to figure out the prio of that one
17:37:06 also, new gabelmoo's CDF graph
17:37:15 none of them is urgent
17:37:25 otherwise not much from me. i made it out alive of the tor browser dungeon and ramping up on n-h work again
17:37:49 oh yes, that difference with the weight sum is likely gabelmoo's latency vs long claw.
I meant to comment but I have been very distracted with the report, negotiation, and this overload cell queue issue
17:38:01 np
17:38:03 but juga's guess is likely right
17:38:27 so you would say no need to investigate more and it's fine?
17:39:19 yeah, it is the kind of thing that we should look into after sbws switches to congestion control
17:39:31 ok
17:39:46 before that point, performance in sbws is *heavily* dominated by latency of dirauth to fast relays
17:40:55 the graph in https://gitlab.torproject.org/tpo/network-health/metrics/analysis/-/issues/33077#note_2773414 looks more sane, btw
17:41:21 it looks a lot more in-line, and sbws is measuring some relays as faster, where as torflow was not
17:41:35 i see
17:41:46 sbws is the red line, yeah?
17:41:55 1min
17:42:05 I think this is ok
17:42:13 yes
17:42:29 and yeah, the previous discrepancy was likely because of the network attacks and flooding experiments in those months
17:42:34 many variables to control for, heh
17:42:43 yep
17:42:50 but I think this is good
17:42:57 it looks like sbws is working, from this last graph
17:43:09 (phew ;))
17:43:14 \o/
17:43:25 :-)
17:44:41 ok. I think that is it. I am of course displeased by how chaotic this is right now, but I will organize the sim plan at least, so it's more clear what we're doing there
17:45:25 i don't think you should be displeased by anything. there are many gears that needs to fit together in all of this and we are not that much off by the plan
17:45:52 i have one small thing i forgot to ask
17:45:56 yeah I didn't expect to have these kinds of sim issues at this point. but we will get to the bottom of these issues!
17:46:29 nickm, eta, Diziet: could y'all do 17 BBB tomorrow for arti api discussion?
everybody else is welcome too, but based on my experience with the last round of this topic i will say that you need to have a good idea of rust if you want to dive into this conversation
17:46:49 mikeperry: \o/
17:46:49 I'm free then
17:46:52 ahf: wfm
17:46:58 I can to 1700.
17:47:02 *do
17:47:06 awesome, let's go for that then
17:47:09 (that's what you meant, right?)
17:47:12 yes
17:47:34 nickm: Not 17 BBB's all at once.
17:47:40 ok, thanks all for a good meeting. next monday is the meeting where s61 is split into its own meeting since it's the first monday of february
17:47:58 i'll merge the january and december report for the forum as december was pretty low in stuff, but we can talk about that next week
17:48:00 thanks all o/
17:48:01 #endmeeting