13:01:18 <hiro> #startmeeting network-health 2025-05-12
13:01:18 <MeetBot> Meeting started Mon May 12 13:01:18 2025 UTC. The chair is hiro. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:18 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
13:07:40 <hiro> https://grafana1.torproject.org/d/xfpJB9FGz/8b2109d?var-interval=2m&orgId=1&from=now-2d&to=now&timezone=browser&var-origin_prometheus=&var-job=node&var-hostname=$__all&var-node=colchicifolium.torproject.org:9100&var-device=$__all&var-maxmount=%2Fhome&var-show_hostname=alberti&var-total=94 is the link for those interested
13:07:41 <hiro> now that it seems things have stabilized a bit I was thinking to write a post-mortem of the latests issues. GeKo (IRC) where do you think this could go? the wiki or some of the analysis tickets?
13:07:42 <hiro> I was worried that if I write it up again in one of the fix issues it gets buried into gitlab so to speak
13:07:43 <hiro> other than this I do not have anything else besides what is in my tasks list and some bugs for the current metrics website reg some of the graphs not behaving properly that I have to dig into
13:07:49 <hiro> GeKo (IRC), juga do you have anything to discuss for this week sync?
13:08:09 <juga> hiro: hmm, i don't think so
13:09:53 <GeKo> hiro: i'd be fine in some of the tickets
13:10:25 <hiro> the fix tickets for the service or the analysis one?
13:10:42 <GeKo> or you mean analysis#96?
13:10:42 <tor> Uhm, which one of [tpo/network-health/analysis, tpo/network-health/metrics/analysis] did you mean?
13:10:52 <hiro> yeah
13:10:57 <GeKo> i'd reserve that for the actual ddos
13:11:08 <hiro> ok then
13:11:27 <GeKo> collector is kind of 2nd order collateral damage
13:12:07 <hiro> oh GeKo (IRC) one operator at the meetup said that they have observed a ddos to exits that we didn't know about (the specific to exits part)
13:12:22 <GeKo> interesting
13:12:30 <GeKo> do we have more info about that?
13:12:33 <hiro> we have asked for more info and to write to bad-relays so if that mail arrives we will know
13:12:40 <GeKo> ah, okay
13:13:33 <GeKo> hiro: is it expected that the memory usage for collector is still jumping?
13:13:56 <GeKo> it's essentially the same pattern as before, just the spikes are not that high right now
13:14:35 <GeKo> and the RAM cache + buffer does not folow the RAM used, hrm
13:14:39 <GeKo> *follow
13:15:44 <hiro> sometimes it is possible but the service seems more stable now
13:15:45 <hiro> the average is 60% free memory
13:15:45 <hiro> also we shouldn't use more than 80% memory anyways
13:16:02 <GeKo> okay
13:16:20 <GeKo> i don't have anything else
13:16:39 <GeKo> hiro: oh
13:16:48 <GeKo> did the votes document thing succeed?
13:16:50 <hiro> great I'll clean up the MR for collector and add the post mortem then
13:17:01 <hiro> yes
13:17:07 <hiro> we should have that in the db
13:17:24 <GeKo> awesome, so you could fix up that mr as well and we are done with that part
13:17:26 <hiro> but creating statuses takes 6h nowadays... so me and sarthik are working on a solution for that
13:17:35 <GeKo> yeah...
13:18:13 <hiro> so yeah I can fix up the MR for the vote part .. probably doing a rebase
13:18:18 <hiro> but the long term solution is this: https://gitlab.torproject.org/tpo/network-health/metrics/aggreagator.rs/-/issues/1
13:18:29 <hiro> moving creating statuses out of the parser
13:19:42 <hiro> ok so that's all I guess
13:20:11 <hiro> if everybody is good we can end the meeting
13:21:17 <GeKo> +1
13:21:27 <hiro> #endmeeting