19:00:16 <carnil> #startmeeting
19:00:16 <MeetBot> Meeting started Wed May 29 19:00:16 2024 UTC.  The chair is carnil. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:16 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
19:00:31 <ema> hi!
19:00:34 <bwh> hi
19:00:54 <carnil> Hello everybody. As discussed on the mailing list we would like to try to bootstrap some regular meetings for the kernel-team to see where we can get more traction on issues, merge requests and topics to discuss
19:01:04 <waldi> hi
19:01:20 <carnil> first of all: I'm quite inexperienced in running online meetings, so if someone feels like (co)chairing it feel free to speak up
19:01:36 <diederik> hi
19:01:43 <carnil> for today I have looked through some of the items and put together a rough agenda:
19:02:47 <carnil> agree on the team meetings, looking through the most important open bugs filed or updated recently, looking at merge requests, an item on handling checks on kernel-team projects, and if time remains at least mention the trixie kernel maintenance issue
19:03:16 <carnil> #topic can we agree on trying to schedule regular team meetings every week
19:04:11 <bwh> I will usually be available on Mon-Thu evenings (CET)
19:04:13 <carnil> #info My proposal here would be that we hold those meetings every Wednesday, 21:00 CEST/19:00 UTC, trying to keep them focused on the most important bits and not make them overlong
19:04:35 <carnil> would a fixed weekday work for the interested people?
19:04:46 <bwh> For me, yes
19:05:04 <waldi> for now, yes
19:05:12 <ema> +1
19:05:43 <diederik> football season is over, so I can now too
19:05:51 <carnil> it should be noted that we can cancel the event if the experiment fails, it's fair to say
19:06:24 <carnil> so let's try that and make it every Wednesday (until there is need to change something)
19:06:51 <bwh> OK, updated my calendar
19:06:51 <carnil> #agreed hold kernel-team meetings every week on Wednesday 21:00 CEST/19:00 UTC
19:07:25 <ema> excellent, the first topic was easy :)
19:07:27 <carnil> next topic would be to go over at least the grave, serious and important bugs where we can say something
19:07:39 <carnil> #topic recent open bugs / recently updated bugs
19:07:55 <diederik> Sure about important? That's a LOT of bugs
19:08:00 <bwh> For #1063754, the reporter is trying to investigate in his own way and not doing the test I asked for
19:08:26 <bwh> I intend to reply and mark it moreinfo unreproducible
19:09:09 <carnil> that sounds good to me, if we do not have an actionable hint we cannot do much, if users cannot perform the tests we ask for then we might be out of luck
19:09:56 <carnil> #action bwh will reply to #1071378 on the reporter performing the needed tests and eventually mark it moreinfo and unreproducible for now
19:10:30 <diederik> ... that's a different bug
19:10:33 <carnil> #1039883 goes in a very similar direction: the reporter seems to be affected, but nobody was really able to track this down.
19:11:00 <carnil> Theodore Ts'o was looped in a while back (July 2023) but there was no followup from upstream
19:11:21 <carnil> should we do the same here as well and mark it further unreproducible?
19:11:55 <bwh> The reporter gave a script to reproduce it; did anyone try that?
19:12:23 <carnil> #info reproducer from the reporter is in message #40, did anyone try to reproduce it with it?
19:13:16 <carnil> I guess the answer is no
19:13:18 <waldi> carnil: #info and #action are standalone, you have to provide all context
19:14:30 <diederik> I haven't tried it
19:14:33 <carnil> waldi: thanks, will try to improve on it.
19:14:56 <bwh> Could someone try it, then?
19:15:15 <ema> I can
19:15:46 <carnil> #info With respect to #1039883 the reporter provided a reproducer script in message https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039883#40 ; it is unclear if anyone tried to reproduce it on their own as well
19:16:08 <ema> #action ema will try to reproduce #1039883
19:17:05 <carnil> thanks ema
19:17:08 <bwh> thanks
19:17:17 <ema> np!
19:17:52 <bwh> For #1071378, carnil asked the reporter to bisect and report upstream. Should it again be tagged moreinfo then?
19:18:53 <carnil> #action carnil will tag #1071378 moreinfo as we asked the reporter to bisect the issue and report upstream
19:19:25 <diederik> Waiting for the reporter to respond seems more useful. If new questions then arise, they could be added (imo)
19:19:36 <carnil> #info #1057282 is affecting ci.debian.net infrastructure after updating the kernel on the arm64 hosts. Ben asked Paul if more recent kernels fix the issue
19:20:10 <carnil> #info Paul Gevers has a question back to us, though, before trying that on ci.debian.net in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1057282#42
19:20:37 <bwh> So what do we say?
19:20:42 <carnil> #info the summarizing question is whether it is worth knowing, given it puts workload on the ci.debian.net maintainers
19:20:59 <carnil> his reply:
19:21:01 <carnil> If you think it worth enough knowing if either is the case, I can
19:21:02 <carnil> install the backports kernel again on the arm64 hosts, but obviously
19:21:02 <carnil> that will be annoying for us. Please let me know if I should pursue this
19:22:04 <bwh> I think they're going to have to upgrade at some point so we might as well find out whether the issue is fixed rather than waiting for them to upgrade to trixie and potentially hit it then
19:22:06 <carnil> the problem here, if I understand correctly, is that the stable kernel has other issues they were facing, so switching to the bpo kernel for the arm64 hosts was a possibility
19:23:12 <diederik> that bug doesn't mention that they're currently having problems with the Stable kernel
19:23:44 <carnil> diederik: it is in the very first message: Thursday 30 November I upgraded the ci.debian.net workers. We're running
19:23:47 <carnil> the backports kernel there due to issues we discussed earlier, but after
19:23:49 <carnil> upgrading, we lost access to our arm64 hosts one after the other.
19:24:40 <diederik> Yes, and that's now half a year old. The most recent message said they switched back to the Stable kernel
19:24:42 <carnil> but it no longer explicitly mentions which issues those were, and I at least have lost the overview of which ones. there is at least the apparmor issue, to be looked at right after this, which has an impact on them (but with upstream approaching a solution)
19:24:56 <diederik> (but are willing to upgrade again if it helps fixing the issue)
19:25:18 <carnil> I suggest we ask Paul to please test the updated version
19:25:52 <bwh> #action bwh to follow up to #1057282
19:26:07 <carnil> thanks, bwh you were faster to write that
19:26:21 <carnil> Last bug to quickly look at until we start to run out of time
19:27:34 <carnil> #info #1072004 is actually affecting QA tasks for the release, breaking autopkgtest qemu jobs. We lowered the severity for now, but bluca is asking to make it actually RC so this version of linux does not migrate to testing
19:28:05 <carnil> it would be ideal to know if this is fixed in 6.9.y, has anyone from the kernel-team tried that?
19:28:33 <diederik> technically it isn't RC. But I do think it's very important as it affects important Debian infra
19:28:35 <waldi> they can run qemu jobs with the kernel from stable, so no problem at this time?
19:30:12 <waldi> anyway, is it fixed somewhere?
19:30:14 <bwh> This is for isolation-machine, where the package gets run in a QEMU VM
19:30:43 <bwh> so I think it makes sense to install the kernel from testing in the VM
19:32:16 <bluca> yes it's a guest kernel issue, not a host kernel issue
19:32:44 <bwh> waldi: I don't see anything in next referring to the breaking commit, so probably no
19:33:00 <waldi> bwh: i don't even see a log. just a SIGTERM
19:33:11 <carnil> Paul in any case seems to indicate to defer the decision to us (in his last message)
19:33:56 <waldi> maybe we should just refer people to migrate to virtiofs for any new usecase, which this would be
19:34:00 <carnil> so if I understand you all correctly: we would not increase the severity to RC, but seek someone who can verify whether the issue is addressed in 6.9.y upstream (maybe pursuing the stalled upstream thread again)
19:34:09 <diederik> I (then) want to highlight another part: "I would be expecting a bit quicker turn around on this bug if you say yes now ;)"
19:35:52 <diederik> increasing the severity (to RC) or not is a decision that the maintainer(s) need to make
19:36:03 <ema> apparently canonical did verify that reverting one commit is sufficient to get a working 6.8: https://bugs.launchpad.net/ubuntu/+source/autopkgtest/+bug/2056461/comments/13
19:36:05 <diederik> That's also what Paul explicitly said
19:36:41 <bwh> ema: Saw that. I worry a bit whether that will still work when we upgrade to 6.9
19:37:45 <waldi> anyway, i have to run, not enough time today
19:37:56 <carnil> and any patch where we diverge from upstream, if it is not the solution applied by upstream, will hit us later in some way.
19:38:18 <carnil> ok waldi, thanks for participating, and this is highlighting a good point that time is running fast
19:38:32 <diederik> The problem seems consistent and (thus) reproducible. So if someone has time, try to reproduce it in a VM (or sth like that)?
19:38:50 <diederik> If reproduced, try it with a 6.9 kernel to see if the issue is still there
19:39:18 <bwh> Yes I think the difficulty is to make a simple reproducer rather than the whole of autopkgtest
19:41:09 <diederik> That would be better as it's likely quicker. But this doesn't seem like an issue which *only* the reporter can trigger/hit (as it's not dependent on certain HW)
19:41:10 <carnil> so how about approaching the people in the stalled upstream thread again, to get more information and an understanding of whether 6.9.y is still affected? The point here is likely to find someone with enough free time, or otherwise do the experimenting on our own.
19:42:08 <carnil> diederik: would you be in the position to schedule enough time for trying to reproduce the issue and verifying it against 6.9.y as well?
19:42:15 <diederik> no
19:42:39 <ema> AFAIU reproducing is a matter of installing a 6.9 kernel in a VM and using such a VM in an autopkgtest run with the qemu backend? If so I can hopefully give it a go in the next couple of days
19:43:01 <ema> s/reproducing/checking if 6.9 is affected/
19:43:17 <diederik> Yes, first make sure you can reproduce it with a 6.8 kernel
19:43:43 <carnil> #info if our understanding is correct, reproducing #1072004 is a matter of installing the kernel in a VM and using such a VM in autopkgtest with the qemu backend, first verifying it is reproducible with a 6.8.y kernel
19:44:10 <carnil> #action ema might have time to try to reproduce the issue in the next few days
19:44:16 <bwh> ema: If you haven't used autopkgtest before, don't underestimate the difficulty of setting it up
19:44:24 <ema> bwh: I have :-)
19:44:31 <bwh> Ah, good
19:45:08 <carnil> perfect, so I actually had on the agenda at least to talk about how we move forward with the firmware-nonfree rebases, and the rebases for 6.9.y and 6.10-rcX in experimental
19:45:31 <ema> #action ema to try reproduce #1072004 with 6.8 and 6.9 as guest kernel in autopkgtest-virt-qemu
19:45:56 <carnil> do we have capacity to still at least look at the firmware-nonfree situation?
19:46:23 <bwh> I hope to work on it "soon", but can't promise anything
19:46:39 <carnil> #topic firmware-nonfree lagging behind upstream versions
19:47:12 <carnil> #info situation is rather unfortunate for firmware-nonfree. We lag behind several upstream versions containing both security fixes and updates which have real impact for users with recent HW
19:47:19 <diederik> I want to make a remark about the previous topic if that's ok
19:47:32 <carnil> #info diederik did a lot of work rebasing the versions but the main problem is reviewing those MRs and the lack of automation
19:47:46 <diederik> "approaching again the people in the upstream stalled thread" the last message was YESTERDAY, not 6+ months ago
19:47:48 <carnil> #info bwh hopes to work on it "soon" but there cannot be a promise for it
19:48:24 <diederik> I don't think the problem is automation, but someone making the time to review them
19:49:04 <bwh> I do mean to review the MRs
19:49:37 <diederik> If you look at the procedure I described in https://lists.debian.org/debian-kernel/2024/05/msg00049.html and try that out on f.e. 202309XY, that should (hopefully) reveal it's not that hard ...
19:49:51 <diederik> ... after the huge one (and the next) are cleared
19:49:55 <daissi> I'm willing to help with firmware-nonfree
19:50:10 <diederik> bwh: ah ok, thanks :)
19:50:37 <diederik> I also saw a different approach which does look like automation and I thought that was referred to
19:51:05 <bwh> So I want to automate things more for future updates
19:51:51 <bwh> Something I worked on in Berlin was support for wildcard file lists, so there's no need to add individual files to a package any more
19:52:00 <bwh> (usually)
19:52:58 <bwh> Anyway, I think the rest of this can be discussed in the relevant MRs
19:52:59 <diederik> The main problem I encountered was coming up with a 'random string of characters' to use as Description
19:53:14 <diederik> IOW: I think that field is mostly useless
19:53:41 <bwh> Right, it is pretty pointless
19:53:50 <carnil> thanks bwh and diederik for summarizing the current situation.
19:54:27 <bwh> since we now have automation for finding firmware packages relevant to your hardware
19:54:32 <carnil> (in case you missed it above, daissi offered to help with firmware-nonfree as well)
19:55:08 <bwh> daissi: Thank you, but at the moment the limiting factor is review by maintainers (like me)
19:55:14 <bluca> just a note: reproducing the qemu hang with autopkgtest is really easy, two steps: build an image with 'autopkgtest-build-qemu unstable /path/to/some/img' and then 'autopkgtest -B dpdk -- autopkgtest-virt-qemu /path/to/some/img'
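bluca's two steps can be wrapped in a small script; a minimal sketch, reusing the `dpdk` test package and adding a hypothetical image path plus a `DO_RUN` guard (neither is from the discussion) so the commands are only echoed unless explicitly enabled:

```shell
#!/bin/sh
# Sketch of bluca's two-step #1072004 reproduction.
# IMG is a hypothetical default path; DO_RUN=1 actually executes the commands,
# otherwise they are only printed, so this is safe to run without autopkgtest.
set -eu
IMG="${IMG:-/tmp/unstable.img}"

run() {
    echo "+ $*"
    if [ "${DO_RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Step 1: build a sid image for the QEMU runner
run autopkgtest-build-qemu unstable "$IMG"
# Step 2: run the dpdk test suite against that image with the qemu backend
run autopkgtest -B dpdk -- autopkgtest-virt-qemu "$IMG"
```

To check whether 6.9.y is affected, the same image can be reused after installing the newer kernel in it, per the approach discussed above.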
19:55:38 <ema> thanks bluca
19:55:40 <diederik> I think there's another important issue wrt firmware and that's on the kernel side
19:55:45 <daissi> Okay so don't hesitate to ping me if I can do something
19:56:05 <carnil> #info limiting factor for firmware-nonfree is right now rather reviewing the MRs by kernel-team maintainers
19:56:34 <diederik> We 'upgrade' every firmware message to an error, which in turn causes reporters to focus on that (and report more, as it's an error, which itself is understandable)
19:56:39 <carnil> #info some information we put in the package is quite useless (descriptions for firmware) as we do have automation for finding firmware packages relevant to present hardware
19:57:22 <diederik> also, people look for file names/paths, not some random string I came up with
19:58:14 <bwh> diederik: Yes the firmware logging patches do need attention (possibly deletion, depending on whether d-i still wants that reporting)
19:58:57 <diederik> yeah, they still need the messages, but I hope they don't require them to be errors
19:59:45 <diederik> imo, those (2 IIRC) patches should be deleted OR split up (as described in the patch description)
20:00:35 <diederik> but right now I think it's causing more harm than providing value
20:00:50 <bwh> #action bwh to overhaul firmware logging patches
20:01:09 <carnil> I think we have bugs for those, but I do not have them at hand right now
20:01:32 <bwh> There's one relating to iwlwifi-yoyo.bin which is a debug thing not present in linux-firmware.git
20:02:07 <bwh> and I've been meaning to deal with the patches since I saw that, quite some time ago now  :-/
20:03:01 <carnil> okay we are running out of time and I propose we close it here. We can discuss the relevant items further in the MRs for firmware-nonfree, and how to rework the firmware logging patches, at the next meeting or off-meeting
20:03:09 <bwh> I agree
20:03:56 <carnil> so thanks to all for participating in this experiment. Again, if someone feels better suited to chair the meeting please do speak up.
20:04:15 <ema> sounds good, I also wanted to briefly discuss other items but nothing that needs sync communication, I'll send emails :-)
20:04:43 <carnil> #info next meeting will be On Wed. 5th June 21:00 CEST
20:04:54 <bwh> Thanks carnil
20:05:06 <carnil> #endmeeting