18:00:35 #startmeeting network health
18:00:35 Meeting started Thu Jan 23 18:00:35 2020 UTC. The chair is GeKo. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:35 Useful Commands: #action #agreed #help #info #idea #link #topic.
18:00:44 okay
18:00:44 O/
18:00:45 dgoulet: ping
18:00:48 aha
18:00:49 nice
18:00:51 hey!
18:00:57 let's get started
18:01:07 pad here: https://pad.riseup.net/p/tor-networkhealth-2020.1-keep
18:01:21 thanks, i was looking for the link :)
18:01:25 hey
18:01:29 o/
18:01:36 the first item is to review the vision for the work of the team
18:01:48 so, this is the kickoff meeting for the new team
18:01:59 time to bring the big pieces on the table :)
18:02:12 yeah "vision" might be too much here
18:02:16 nice that roger added gamification. a friend brought that up at the cccamp relay operators meetup
18:02:43 but i thought about writing down the long lines we orient our work on might be worth it
18:02:50 yes
18:02:56 should we start with (1) in line 32?
18:03:07 we could!
18:03:10 it would be great to sort out this list by priority
18:03:45 track community standards about what makes a good relay
18:04:06 i've been thinking that because it's just one person for now, there should be a big focus on automation, because automation will be the only way the one person can tackle a growing set of topics.
18:04:11 at the moment we don't have any of these items connected with deliverables for sponsors, right?
18:04:30 ggus: right
18:04:31 ggus: we do not
18:04:44 maybe the performance/scalability related pieces
18:04:48 but others, no
18:04:50 automation and scoping i think, even with automation there will be a bit more of maintaining the automated processes over time, so they accumulate
18:04:53 arma: i am a fan
18:04:59 ahf: yep
18:05:05 so also wearing the "no hat" is good to keep focus 8)
18:05:18 ggus has a good point, which is "and writing funding proposals to keep this going" goes next to every line
18:05:40 so the priority of that list should be "(5) maintain the components of the network "
18:05:45 and writing funding proposals :)
18:06:18 as automation goes in that bucket
18:06:26 i think we should be more granular than, say, point to (5)
18:06:31 geko: agreed
18:06:41 there are pieces in each of the big items that are more or less important
18:07:03 * arma puts an X on the pieces that seem high priority
18:07:04 yes, that will guide the creation of the roadmap for this year
18:07:43 there, i placed my four X's.
18:08:38 everybody gets to vote on GeKo's todo list? :-P
18:08:44 i'm integrating geko stuff into the list
18:08:46 one thing that might be worthwhile to think about right from the beginning is where to draw the line between network health work and community work in the relay community related things
18:09:04 the list is where it says 'vision for network health', it was a list that arma started some time ago
18:09:04 because that's one of the pieces that are less clear to me
18:09:14 yes
18:09:22 geko: yep. to start that, we should ask ggus how the relay advocacy thing is going from his side. because maybe he is happy to have help or maybe he is happy to help you.
18:09:33 exactly
18:09:38 I added some info on who I think the network health team should be working with in each item but drawing the line is a good idea
18:09:59 i've always assumed that the relay advocacy thing fell into the community team's list accidentally and they never knew quite why it was there
18:09:59 i have added to my 2020 roadmap to start reaching out relays orgs
18:10:18 nice
18:10:33 is that somethong for Q1 or Q2, or...?
18:10:38 *something
18:11:15 are we jumping into the roadmap?
18:11:21 no
18:11:32 i was just wondering about ggus' planning here
18:11:41 we didn't finish the roadmap yet, but we have some ideas about a relay operator meetup in US and Europe
18:12:03 okay
18:12:10 since i'm going to fosdem next week, i want to talk with some folks from the orgs about these ideas
18:12:26 sounds good
18:12:29 Any other thing in the 'vision' list that should be priority?
18:12:30 but first, i want to introduce myself
18:12:34 (other than arma's Xs)
18:12:53 geko: feel free to disagree with me about my X's. i just put them there to start.
18:13:30 yeah, i think i'd wait with the relationship strengthening to see where ggus is going
18:13:41 but the other three sound important to me
18:14:14 for the relationship one, figuring out everything we want to learn/get/offer from the relay orgs is a worthwhile thing
18:14:17 i am not sure about the state of our performance/scalability work
18:14:25 where in that list is the stuff on scalability/performance?
18:14:32 like, how would the network health team benefit from having a good line of communication with relay operators
18:14:34 but depending on where we are (3) the first bullet could be important, too
18:14:40 and making sure ggus includes that. or doing it yourself alongside hi.
18:14:41 m
18:15:05 geko: let's add an X in that one too then
18:15:13 arma: yes
18:15:26 i think if ggus would include that for now i am fine
18:15:31 and see how that goes
18:15:44 ok
18:16:13 * ahf has put 3 X'es
18:16:37 thanks
18:16:51 i think those are all good starting points
18:16:57 anything else there? I'm going to copy those Xs into priorities for this year
18:17:08 and i am fine starting with them as our items to focus on for now
18:17:31 so hmmm
18:17:45 there is no X about "network disruption or problems"
18:17:49 which is sorta of a big piece tbh
18:18:00 dgoulet: this is why i put an X next to baselines, for Q1.
18:18:11 step one, automate as much as possible knowing what is normal
18:18:15 step two can come later
18:18:31 otherwise you find yourself into the rat's nest wondering how you get there and which way is up
18:18:57 yeah
18:18:57 that point is confusing to me on how you'll get action items out of it...
18:19:11 like Metrics graph right now are pretty much our "baseline" and when it spikes, we look at it
18:19:31 it's also why, in my wishlist items for Q1, i wrote "- build a roadmap / brainstorm for all the future things we might automate measuring"
18:19:33 dunno, maybe others have a clear idea with this :)
18:20:08 i think looking at current and past network anomalies, to think of items we ought to measure, is a good way to start
18:20:14 maybe we need to be more specific on this and include dgoulet's point
18:20:29 makes sense
18:20:35 well, i mean if that what metrics gives us today are the baselines for expected behavior fine with me
18:20:40 but i am not sure about it
18:20:46 another angle might be "enumerate and track network disruptions", so it is explicitly there
18:21:15 but the plan is not to fix every one of them by himself, but rather to use it as motivation for what to automate looking at
18:21:17 yeah out of this you'll have a big action item which is what arma mentions: "items we ought to measure" (and how)
18:21:19 ok, I
was adding 'monitor network disruption or problems'
18:21:52 in an ideal world, when we're doing Q2 roadmapping, we have that menu to choose from.
18:22:09 yes
18:22:10 if Q2 arrives and we still only know "hm maybe we should measure something, wonder what" we have done it wrong
18:22:17 agree
18:22:22 let's move into roadmap for Q1, ok?
18:22:31 yes
18:22:48 that's important for the performance stuff too
18:22:50 there is a bunch of things there from roger's list to geko's tickets to the ask from the network team to help with sbws
18:22:53 as a preparation
18:23:40 i am fine helping with sbws
18:23:52 sbws is in theory maintained by net team but not really in reality :S
18:23:54 it's an important thing
18:24:05 outstanding bugs get some eyes though
18:24:08 yea, sbws needs leadership from the network team side, and also it needs somebody from the network health side telling folks how it's working.
18:24:31 that could be some kind of role division here, yes
18:24:56 i think sbws needs a funded juga as the big thing if we want it to fly
18:25:09 hrm, ideally yes
18:25:21 ahf and me are going to meet with juga next week
18:25:22 but we should try making plans to get it to fly without that, too
18:25:25 "ideally yes and what are we going to do in the meantime"
18:25:34 and we can go over a plan and help them with proposal for funding
18:25:54 yes, but we should not wait for funding to arrive here i think
18:25:59 +1
18:26:12 this is the #1 issue with the tor network right now i think
18:26:16 is sbws holding anything back with the network?
18:26:19 really
18:26:30 so some way of splitting work up in the meantime seems to be smart
18:26:33 i had the impression that it was on the very nice to have list.
18:26:34 to move this forward
18:27:09 when people set up relays they get wildly weird weights, and often their weights never go up even though they have capacity, and they stop their relays and move on with their life
18:27:39 this is just an impression though.
we should tie it into "talk to relay orgs" to get real data, and data over time.
18:27:58 is this actually the case? like do we know people says that and turns off their relay?
18:28:03 yes
18:28:12 okay
18:28:25 a recent tor-relays mail from quintex was him saying "why do i get so little traffic, this is making me sad, nothing changed, what's wrong"
18:28:50 ok, for the roadmap i think we should get the priorities we have and write down issues/tickets that we have or we need to have
18:28:53 yeah, i read that email
18:29:22 ahf: gaba: so it might be smart while talking to juga to think about ways to move this forward without funding arriving immediately
18:29:50 i am happy to help with work here in some capacity, as said
18:29:58 yeah, i think i need to understand the blockers too with current sbws
18:30:17 before doing what gaba said
18:30:35 are we good with the big pieces for say the next 2-3 months?
18:30:46 (if i had to pick a #2 issue with the tor network today, it would be that some relays can't handle their traffic well. but that issue might come down to appropriate weighting too.)
18:31:03 sbws work is not incorporated into the current "q1" network team roadmap i think, and it goes "until costa rica"
18:31:29 ahf: there are some deployment blockers that teor mentioned
18:31:31 ahf: for sbws
18:31:42 ah
18:31:44 ahf: i added yesterday to the network team roadmap in the temporal spreadsheet
18:31:45 maybe that is enough
18:31:56 they are around 7 tickets
18:32:07 here: https://nc.torproject.net/s/TGeW5CX4GqaagMN
18:32:22 look at everything marked as performance+scalability
18:32:27 in green
18:32:31 it would be awesome for the network team to prioritize getting sbws to the point that the network health team can provide proper feedback on what else is missing
18:32:48 (and in the mean time, for the network health team to figure out what they need, in order to provide proper feedback)
18:33:07 arma: yes, the issue is that so far we have other projects going on. I need to understand if this should be more important than the other ones or not OR to see how to squize them in.
18:33:12 squeeze*
18:33:39 yeah and if not teor doing it, we need someone else to ramp up to sbws
18:33:40 yep. not saying we have to drop everything else. just saying it would be awesome. :) good thing we have project managers, roadmaps etc
18:33:59 yes
18:34:03 teor can not take them right now
18:34:41 dgoulet, ahf: let's try to check with other people in the team to see who else can take it.
18:35:02 wonderful net team email? :)
18:35:06 gaba: yes
18:35:07 hehe
18:35:10 nobody will volunteer in a public IRC setup :P
18:35:21 and maybe the network health team's first role there would be to help figure out a minimum set of critical path things
18:35:24 i was mostly thinking about 1:1s syncs :)
18:35:32 dgoulet: why not?
18:35:39 ignorant question: sbws is deployed 3 places now and the problem is we want 100% deployment, but can't right now?
18:35:40 GeKo: just the net team group dynamic in some ways :P
18:35:47 huh
18:35:52 okay
18:36:05 _since_ Montreal we tried to get someone from the net team to help maintain sbws
18:36:08 and here we are so ....
18:36:17 sorta thing it needs to be _pushed_ on someone ;)
18:36:20 think*
18:36:24 catalyst was going to help with it a little. Maybe they can do it. We need to check.
18:36:27 ahf: yes
18:36:50 yeah and sbws is off by like 1000 relays from the old bwauth iirc ?
18:36:59 http://tgnv2pssfumdedyw.onion/#bwauthstatus
18:37:22 could be also due to geo location as in one is in Hong Kong so
18:38:00 i don't know what the sbws blockers are, but at a high level, i vote 'simple simple simple'. like, measure relays, produce number. then it should be more possible to figure out bugs like 'why are half the numbers missing'.
18:38:16 okay, let's move on here
18:38:30 great
18:38:45 i think we won't finish the roadmapping thing
18:39:00 so let's squeeze in our regular meeting time discussion
18:39:21 do we have a potential day/time that could work for like most of us?
18:39:32 what about the proposed 1900 utc on mondays?
18:39:47 fine by me, just after net team meeting
18:39:49 plausible! who is the 'most of us'?
18:39:56 we'll see!
18:40:00 ha
18:40:04 i think i can do that, but i might also have 1:1's there with netteam folks
18:40:11 i have moved all my 1:1 to monday now
18:40:15 Monday 19:00 UTC was the time that worked from the doodle
18:40:19 aha
18:40:43 ggus: what about you?
18:40:46 Next Monday I might be 30 minutes late
18:40:56 just in general
18:41:02 but only that monday, in general works
18:41:17 okay, then let's try that one?
18:41:32 starting with 2/3
18:41:49 ok
18:41:52 february 2nd
18:41:55 not this monday then
18:42:04 ok!
18:42:05 no, i'll be in berlin at all hands
18:42:40 good. so let's got back to the roadmapping exercise for the remaining minutes, i guess
18:42:45 *go
18:43:10 feb 2 is a sunday. let's call it feb 3 monday.
18:43:16 feb 3rd, yes
18:43:30 anyway, roadmap
18:43:39 we can continue looking at the roadmap between meetings
18:43:46 yep
18:43:54 I think we should add the priorities and add issues/tickets to each
18:44:05 with a must to the ones that needs to be done
18:44:05 in particular as other teams have not finished their roadmap
18:44:24 or sbws related parts show up etc.
18:46:12 are people ok with this process on how we add priorities and tickets?
18:47:00 yep
18:47:01 i am fine with it fwiw
18:47:51 sounds great. i will look forward to seeing how it goes. let me know if there are specific tickets i am the best person to file.
18:48:05 will do, thanks
18:48:38 i would suggest not getting too distracted by the mishmash of existing tickets. instead, figure out some organization and do things in an organized way. :)
18:49:00 yeah
18:49:27 are we using network-health tag in trac tickets?
18:49:35 though 'do triage on existing tickets' might be a fun periodic exercise
18:49:37 for now as keyword, yes
18:51:13 ok, and how long are you staying in berlin?
18:51:35 the whole week until saturday morning (assuming i am meant) :)
18:52:55 geko: until jan 31st?
18:53:24 well, yes, my train back leaves on feb 1 in the morning
18:53:51 so i expect to be the whole next week kind of distracted
18:54:05 but hope to sync at least with mike regarding network health stuff
18:54:33 in particular how it related to performance and scalability work under way
18:54:39 *relates
18:55:04 okay, i think i have all the items that come to mind right now for the bad relay part
18:55:27 ok, it seems we may be ok for the meeting today?
18:55:39 we can continue on the roadmap after and sync again in next meeting
18:55:52 yeah, i think so
18:56:03 geko: what would be helpful for me, at the end of your roadmapping process, is to hear a handful of goal statements, and put them on the network health team trac page,
18:56:09 do we have any last words/thoughts/ideas/complaints?
18:56:10 and then at the end of q1 we can ask how those goals are going,
18:56:18 yes, i agree
18:56:23 and while we are filing and working on tickets we can ask ourselves if this ticket works towards one of those goals
18:56:31 yes
18:56:36 we need to update the wiki page
18:56:40 and trac..
18:56:48 and everything... :)
18:57:12 geko: do you send the notes to tor-project@ ?
18:57:20 yes, i can do that
18:57:53 i guess we are good for now. thanks everyone for this productive meeting. very nice to see things moving here!
18:57:59 #endmeeting