13:01:34 <hiro> #startmeeting network-health 2026-03-09
13:01:34 <MeetBot> Meeting started Mon Mar  9 13:01:34 2026 UTC.  The chair is hiro. Information about MeetBot at https://wiki.debian.org/MeetBot.
13:01:34 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
13:02:01 <hiro> and the pad
13:02:02 <hiro> #link https://pad.riseup.net/p/tor-nethealthteam-2025-keep
13:02:36 <Rohithh[mds]> It is https://pad.riseup.net/p/tor-nethealthteam-2026-keep
13:02:36 <sarthikg[mds]> umm, this one? https://pad.riseup.net/p/tor-nethealthteam-2026-keep
13:02:53 <hiro> oh yeah!
13:02:55 <hiro> sorry
13:03:03 <hiro> #link https://pad.riseup.net/p/tor-nethealthteam-2026-keep
13:04:26 <hiro> ok who wants to go?
13:04:39 <juga> My updates:
13:04:39 <juga> Last week i worked on some exitmap, sbws and services issues, reviewed others and continued to work on tor_anomalies.
13:04:42 <juga> This week will continue with tor_anomalies and collector.
13:04:53 <hiro> I have been fighting OOM errors lol
13:05:42 <juga> hiro: those are hard...
13:06:08 <hiro> yeah can't wait for collector-rs \o/
13:06:19 <juga> >_>
13:07:41 <GeKo> i guess we discuss p183 issues in our monthly sync later, right?
13:07:53 <GeKo> apart from that i had a week of reviews
13:07:58 <GeKo> and i survived
13:07:59 <sarthikg[mds]> My update: Added the clickhouse backend to sea-query. Currently extracting it as a separate crate, since maintaining a fork of sea-query is tougher in the long term than maintaining this crate. In parallel, migrating the queries to use the new crate. Also in discussion: adding support for happy families in aggregator.
13:08:15 <hiro> I think so GeKo (IRC)
13:10:12 <GeKo> hiro: for the grafana dashboards, i think i found some issues
13:10:16 <hiro> alright! so if there isn't anything else we can talk at our voice sync in 1h
13:10:30 <GeKo> i'll stop at that and see how fixing those goes
13:10:32 <hiro> GeKo (IRC): which one?
13:10:43 <GeKo> both dashboards
13:11:05 <GeKo> i filed issues in datastore
13:11:14 <hiro> the bw one needs some work for sure... and I have to review the clients/onions counts again but the OOM issues have blocked me there
13:11:23 <GeKo> https://gitlab.torproject.org/tpo/network-health/metrics/datastore/-/issues/14 is the one for the bw
13:11:36 <GeKo> there are some issues with the bw_line_stream at least
13:12:02 <GeKo> well, i am not concerned with the layout of the panes etc. but the data displayed/used
13:12:04 <hiro> maybe I am filtering too aggressively
13:12:24 <GeKo> and unrelated but while we are here we should talk about the happy family part
13:12:46 <GeKo> we need to display that somehow on relay-search
13:13:08 <GeKo> so, just having family-cert and -id in onionoo doesn't cut it, i think
13:13:21 <GeKo> although it's a necessary step
13:13:25 <hiro> @geko the MR for onionoo is almost there and I left you a comment there
13:13:35 <hiro> I think I am able to keep both systems
13:13:48 <hiro> I just hope onionoo doesn't choke in the process
13:13:57 <GeKo> yeah, the question is relevant for aggregator-rs as well
13:14:14 <GeKo> as we want to have something related to happy families in the statuses as well
13:14:23 <GeKo> so, it's not onionoo specific
13:14:38 <hiro> well for aggregator it is easier because the result is a query away
13:14:49 <GeKo> i was wondering whether we could just reuse effective_family here
13:15:03 <hiro> yes
13:15:06 <GeKo> if there is no happy family configured use the old mechanism
13:15:17 <GeKo> otherwise populate it with the new one
13:15:24 <hiro> so effective family should be everything that is verified via the family_ids
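The fallback discussed above (use the legacy mechanism when no happy family is configured, otherwise derive the family from verified family ids) could be sketched roughly as follows. This is only an illustration under assumptions: the `Relay` class, its field names and the `effective_family` helper are hypothetical and are not taken from onionoo or aggregator-rs, and the legacy branch assumes the usual mutual-declaration rule.

```python
from dataclasses import dataclass, field


@dataclass
class Relay:
    # Hypothetical record, not the onionoo or aggregator-rs schema.
    fingerprint: str
    # Fingerprints listed in the legacy `family` line of the server descriptor.
    declared_family: set[str] = field(default_factory=set)
    # Family ids assumed to be already verified for this relay.
    family_ids: set[str] = field(default_factory=set)


def effective_family(relay: Relay, relays: list[Relay]) -> set[str]:
    """Return fingerprints of other relays in the same effective family."""
    others = [r for r in relays if r.fingerprint != relay.fingerprint]
    if relay.family_ids:
        # New system: every relay sharing at least one verified family id.
        return {r.fingerprint for r in others if r.family_ids & relay.family_ids}
    # Legacy system: only mutual declarations count.
    return {
        r.fingerprint
        for r in others
        if r.fingerprint in relay.declared_family
        and relay.fingerprint in r.declared_family
    }
```

In this sketch a relay configured on both systems is resolved from its family ids only, which matches the "pick the happy families config and run with that" idea but is a choice, not a settled decision.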
13:15:35 <GeKo> for declared_family it's a bit trickier
13:15:45 <GeKo> as we have all the fingerprints in the relay descs
13:15:53 <GeKo> but for happy family it's just the cert
13:16:08 <hiro> yes
13:16:22 <hiro> this is what I have done in onionoo
13:16:44 <GeKo> so, maybe we just leave declared_family empty in case happy families is configured
13:17:18 <hiro> why?
13:17:34 <GeKo> because the relay only declares a certificate
13:17:45 <hiro> well it saves strings in the db
13:17:45 <GeKo> and no family in the server desc anymore
13:17:54 <hiro> but for onionoo as well?
13:18:00 <GeKo> i don't follow
13:18:33 <GeKo> declared_family in the old system contains all the fingerprints in the relay descriptors
13:18:52 <GeKo> but there is no such thing with happy families
13:18:56 <hiro> well if you want to visualize the effective family the same way, why don't you want the alleged family to match?
13:18:58 <hiro> sorry declared
13:19:14 <hiro> ah because you are thinking of the field in the descriptor ok
13:19:19 <GeKo> yes
13:19:35 <hiro> ok makes sense then
13:19:44 <GeKo> they might not even match because folks made a mistake with the family configuration
13:20:03 <GeKo> sarthikg[mds]: ^
13:20:08 <hiro> yeah that's why I was doing the check
13:20:13 <GeKo> does that make sense?
13:20:14 <hiro> to see if it was matching
13:20:31 <hiro> I thought that was relevant information to have
13:20:39 <GeKo> i am not sure what we should do in case the old and new system contradict each other
13:21:01 <GeKo> maybe we just pick the happy families config in that case and run with that
13:21:38 <sarthikg[mds]> GeKo: yeah, for the purpose of happy families, we will only be having effective_family based on the certs/id matching? but should we also introduce a new field to mark that the relay uses the happy_families instead of the legacy families?
13:22:06 <GeKo> i don't think we'd need that
13:22:19 <GeKo> hiro: how do you take several family-certs into account?
13:22:43 <GeKo> because that's possible now while we only had one `family` in the server desc
13:22:52 <GeKo> do you merge them somehow?
13:23:49 <GeKo> sarthikg[mds]: but, yeah, i think we can leave declared_family empty and focus only on the effective_family for now
13:24:15 <GeKo> one thing i was thinking about was to use declared_family in the happy family case to somehow display all the family-certs
13:24:34 <GeKo> because there are now several possible, meaning a relay can now belong to several families
13:24:36 <hiro> I was treating the cert like declared fps
13:24:40 <hiro> and then using the ids to verify... but not sure what we should do when we have two ids for example...
13:25:05 <hiro> that's why I thought in the beginning that the effective_family is not something we should maintain
13:25:06 <GeKo> okay, well, we can have several certs to begin with
13:25:12 <hiro> just list the certs and the ids
13:25:50 <GeKo> we don't need to maintain that but just populate that from the microdescs
13:26:00 <GeKo> we get that for free from the dir-auths
13:26:10 <hiro> I mean I think we should keep the two systems separate
13:26:31 <hiro> yes and just publish the family_ids
13:26:47 <hiro> and then one could query all the relays with the same family_id(s)
13:27:41 <GeKo> how would we populate the effective family members on relay search?
13:28:25 <hiro> from the old system
13:28:41 <hiro> and for the new system you would be able to visualize the ids the relay belongs to
13:28:42 <GeKo> but there are relays with new system only
13:29:07 <hiro> yes and you'd get the ids for those
13:30:00 <hiro> and when you click on an id, you'd get a list of relays
13:30:20 <GeKo> hrm
13:30:38 <hiro> I think happy family might open more possibilities for relays belonging to different families and that might have to be visualized differently imo than a list
13:30:47 <sarthikg[mds]> i believe the happy_families require having family as its own entity, since a relay can be part of multiple families, and not all families have the same set of relays. I'm not sure how it will work with the current design, but that's just what i think for now...
13:31:47 <GeKo> okay, so in the old system we keep the fingerprint list in effective_family
13:31:49 <hiro> but I am happy to add all the relays sharing the same family ids to the effective family if that's what one would expect.
13:32:07 <GeKo> and in the new one we only add the family ids?
13:32:21 <hiro> yes that's what I thought in the beginning
13:33:02 <GeKo> we are fingerprint based, though, so i expect no one can make use of the family ids
13:33:08 <GeKo> at least not on relay-search
13:33:23 <GeKo> and i'd expect everyone clicking on the ids to get the fingerprints/other relays
13:33:25 <hiro> but one could search by family_id
13:33:26 <hiro> or family_cert
13:33:42 <hiro> yes that should be the idea, that you get a list
13:33:46 <hiro> of fp sharing that id
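The lookup described here (click or search a family id, get the fingerprints sharing it) can be pictured as an inverted index from family id to relays. A minimal sketch, assuming the input is a plain mapping from fingerprint to the set of family ids that relay lists; `index_by_family_id` and the sample values are hypothetical, not an existing onionoo or relay-search API.

```python
from collections import defaultdict


def index_by_family_id(relay_family_ids: dict[str, set[str]]) -> dict[str, list[str]]:
    """Invert fingerprint -> family ids into family id -> sorted fingerprints."""
    index: dict[str, set[str]] = defaultdict(set)
    for fingerprint, family_ids in relay_family_ids.items():
        for family_id in family_ids:
            index[family_id].add(fingerprint)
    return {family_id: sorted(fps) for family_id, fps in index.items()}


# Made-up sample data: one relay belongs to two families.
sample = {
    "AAAA": {"id1", "id2"},
    "BBBB": {"id1"},
    "CCCC": {"id2"},
    "DDDD": set(),  # legacy-only relay, lists no family ids
}
by_id = index_by_family_id(sample)
```

Because a relay can list several ids, the same fingerprint can appear under more than one key, which is where a flat list per relay stops describing the whole picture.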
13:33:57 <GeKo> yeah, searching by id makes sense
13:34:36 <GeKo> but i think we should populate the Effective Family Members part with the actual fingerprint directly
13:35:06 <GeKo> the looking glass next to it could contain the link to the family id(s) or something
13:35:30 <GeKo> where we now have https://metrics.torproject.org/rs.html#search/family:E4F9B844BB53B27EC4394C34A75CFBCC06E5F266
13:36:09 <GeKo> but things get tricky quickly with several ids
13:36:58 <hiro> this is what onionoo is doing atm in the MR
13:37:14 <GeKo> great
13:37:21 <hiro> but the relationship now is more graph like imo and the list might be deceiving
13:38:17 <GeKo> we can test whether we get the right thing with: https://metrics.torproject.org/rs.html#search/contact:applied
13:38:46 <sarthikg[mds]> GeKo: how about we rename effective_family & declared_family as legacy_effective_family & legacy_declared_family? and introduce the family_ids key which contains all the listed ids. happy families will only use family_ids? and the family_id will be searchable to get all relays that list that family_id in the website?
13:38:56 <GeKo> they only have happy families configured and i thought all their fps should show up in the effective family section
13:39:17 <hiro> quetzacoal I think is on both systems for example
13:39:17 <GeKo> and there should be a (104) again after the nicknames
13:39:31 <hiro> which is another candidate
13:39:50 <GeKo> sarthikg[mds]: we could do that
13:40:29 <GeKo> it might be cleaner that way
13:40:51 <GeKo> i was a bit unhappy with legacy_* stuff
13:41:00 <hiro> if we use the family_ids key we do not need to rename the old system? it'll still be valid to analyse old data
13:41:03 <GeKo> but maybe we need to bite that bullet for better clarity
13:41:53 <sarthikg[mds]> hiro: we can just do the renaming in the website maybe, not in the db
13:42:06 <GeKo> yeah, i was more thinking about having effective_family_new and declared_family_new or something
13:42:23 <GeKo> so, something in parallel
13:42:52 <GeKo> because at the end family_cert and family_id is something in parallel as well
13:43:21 <GeKo> (compared to "family")
13:44:06 <hiro> I am not sure about having all these family fields in onionoo apis
13:44:26 <GeKo> yeah
13:44:28 <hiro> as far as the DB is concerned we can add as many fields as we want
13:45:02 <sarthikg[mds]> GeKo: but we won't be able to represent the graph structure well enough with the lists imo... or at all tbh.
13:45:08 <GeKo> that's why i was thinking we might try to bolt the new system on top of what we have with onionoo as well as we can
13:45:23 <GeKo> sarthikg[mds]: true
13:45:30 <hiro> yes that's what is happening
13:45:59 <GeKo> so, maybe we try to stick to what we have with onionoo as well as possible
13:46:08 <hiro> anyways I have to change location before the 183 meeting. let'
13:46:10 <GeKo> while not following those constraints with the db
13:46:24 <GeKo> and do the right thing there instead
13:46:32 <GeKo> hiro: kk
13:47:21 <hiro> let me end the meeting. I'll be 5 minutes late for 183 I think
13:47:29 <hiro> #endmeeting