20:02:05 <bwh> #startmeeting
20:02:05 <MeetBot> Meeting started Wed Feb 26 20:02:05 2025 UTC. The chair is bwh. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:02:05 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
20:02:37 <bwh> #chair waldi carnil
20:02:37 <MeetBot> Current chairs: bwh carnil waldi
20:03:00 <bwh> Hi all
20:03:01 <ukleinek> before we start, I want to highlight that carnil closes quite a some old bugs. Thanks a lot \o/.
20:03:29 <bwh> Thanks carnil, that's always appreciated
20:03:31 <ukleinek> There was some negative feedback and caring for his mental health I wonder if there is some need for support from the rest of us.
20:03:50 <bwh> #topic Bugs #1076372 and #1090717: NVMe corruption
20:05:26 * ukleinek didn't follow the discussion. does someone else know if there is something new?
20:05:30 <bwh> I don't see any news from the reporter on the upstream bug
20:05:51 <carnil> I do neither have dug into the long bug
20:06:00 <carnil> just noticed another comment that say "So BIOS firmware 4.10 seems to have solved the problem."
20:06:11 <bwh> Right
20:06:11 <carnil> (comment 128 in the upstream issue)
20:06:35 <carnil> so at least there seems to be some indication that's not actually a Linux issue to handle
20:06:57 <bwh> In any case I don't think there's anything we can do at this stage
20:07:11 <bwh> #topic Bug #1098661: linux: fails to boot on VisionFive 2: Unhandled exception: Store/AMO access fault
20:07:56 <ukleinek> there was also some discussion in #d-arm about efi zboot
20:08:01 <bwh> I think MR #1384 is related to this?
20:08:03 <ukleinek> (also involving aurel32)
20:08:45 <ukleinek> salsa 500s for me, but yes, there is an MR by aurel32 related to that
20:08:47 <bwh> OK, the MR subject only mentions riscv64 (I can't load the page for it at the moment)
20:09:16 <ukleinek> Topic in #salsa is "Maintenance in progress"
20:09:28 <waldi> and it is not flly right. even that systm can boot the kernel. except if run via grub
20:09:36 <carnil> (ah that's unlucky for the meeting)
20:10:07 * ukleinek is happy to have a clone of meeting.git :-)
20:10:18 <ukleinek> but that doesn't help that much
20:10:28 <aurel32> yes, it's just a revert of the riscv64 specific part of the commit
20:11:39 <bwh> aurel32: I didn't follow why it is broken for riscv64 and not arm64?
20:11:40 <aurel32> https://paste.debian.net/1356413/
20:11:49 <ukleinek> waldi: I'm not aware what the motivation was to enable that efi-zboot stuff. Is there a masterplan somewhere?
20:12:20 <waldi> ukleinek: be able to compress the stuff on arm64 as well
20:12:38 <aurel32> bwh: arm64 seems also borken, just i haven't tested it
20:12:59 <ukleinek> If I understand correctly vmlinuz.efi is (theoretically) superior to Image, but people/machines are not prepared to handle that?
20:13:22 <ukleinek> waldi: what is "the stuff"?
20:14:11 <aurel32> there are two issues on riscv64: 1) a bug somewhere that prevents the decompressor to work when using grub 2) kernel stopped working in the non-uefi case
20:14:24 <aurel32> AFAIK arm64 is only affected 2)
20:14:55 <aurel32> 1) is probably fixable, just that debugging on real hardware takes time (it can't be reproduced under QEMU, or at least I have not been able too so far)
20:15:26 <waldi> aurel32: which bootloader uses the non-uefi case?
20:15:44 <waldi> i know about flash-kernel
20:15:51 <bwh> Are most Debian arm64 systems booting with EFI?
20:16:16 <bwh> I would have guessed not, but I just don't know
20:16:18 <aurel32> waldi: u-boot with extlinux.conf, kvm
20:16:18 * ukleinek thinks most arm64 systems are booted by U-Boot
20:16:36 <aurel32> (or the rpi bootloader)
20:16:36 <waldi> aurel32: "kvm"?
20:16:56 <ukleinek> aurel32: ack, was just about to mention that.
20:17:10 <bwh> ukleinek: Yes or whatever Android uses, but Debian may be different
20:17:11 <waldi> aurel32: rpi uses flash-kernel, which copies stuff around. so this needs zboot support
20:17:14 <aurel32> waldi: qemu -enable-kvm, current S-mode is not supported under KVM, so you need to load the kernel directly
20:17:46 <ukleinek> "my" rpi doesn't use flash-kenrel
20:18:16 <aurel32> also IIRC, i guess a few testsuites are calling qemu with the default kernel and initrd, they need to be updated
20:18:33 <ukleinek> raspi-firmware handles kernel updates I think, but the intention is similar to f-k
20:19:13 <bwh> It seems like we need to revert this for both arm64 and riscv64 for now, then plan a transition together with the relevant boot loader maintainers
20:19:17 <waldi> aurel32: we already had an arch with zboot before, so they need to support it anyway
20:19:27 <ukleinek> I think the goal of migrating to efi-zboot is consensus?
20:19:37 <waldi> bwh: for experimental?
20:20:07 <ukleinek> waldi: if this influences backports, IMHO yes
20:20:12 <aurel32> note also that the situation between arm64 and riscv64 is a bit different as the later is the is already using EFI_STUB=y with an uncompressed kernel, so the kernel already works fine for both EFI non non-EFI, even with systemd-boot
20:20:31 <aurel32> so efi-zboot only brings compression
20:20:38 <bwh> waldi: Less important for experimental, but still
20:20:56 <aurel32> on arm64 it is needed to keep compression with systemd-boot
20:21:43 <ukleinek> Is systemd-boot one (or the?) motivation to migrate to efi?
20:22:38 <ukleinek> Just for me: efi-zboot is a compressed binary, the bios/bootloader is expected to extract it to the right location in memory and then dive into it?
20:23:00 <waldi> ukleinek: no. the included efi binary decompresses it
20:23:19 <ukleinek> ah, so it's like a zImage, just as efi binary.
20:23:29 <waldi> if you don't use efi, you need to recreate that decompressor somehow
20:23:57 <waldi> yes. there are even patches floating around to convert x86-64 to zboot
20:24:51 <aurel32> qemu/arm64 is also able to do the decompression, but only for gzip, not zstd
20:24:59 <aurel32> qemu/riscv64 is not able to
20:25:12 <waldi> issue is reported. someone needs to implement it
20:25:31 <waldi> neither does loog64
20:25:39 <ukleinek> And there is no zImage for arm64 and riscv64? That would be an alternative, right?
20:25:49 <bwh> Right, there is no zImage
20:26:01 * ukleinek guesses upstream doesn't want zImage
20:26:32 <waldi> zboot is supperior, as it runs in a capable environment already
20:27:00 <aurel32> the alternative is uncompressed kernel with CONFIG_EFI_STUB=y. It's what we were using on riscv64 before that change
20:27:23 <bwh> It really seems premature to make this switch when we know some boot loaders and QEMU do not support it
20:27:50 <carnil> maybe the question could be: We know this is experimental only, and trixie is not yet released so trixie-backports is not yet impacted, but would a tempoary revert be helpful to get some baseline done first on the other fronts and then re-apply the implementations (or maybe postpone it for after the trixie release at all?)
20:27:51 <aurel32> that make the kernel way bigger, but if you look at kernel + initrd the difference is not some important
20:27:54 <bwh> I know this is in experimental but that will go to unstable in the middle of the year
20:28:33 <ukleinek> For a softer transition it would be good if we could have both. Maybe a separate kernel image package providing the efi-zboot image to work on bootloader support?
20:28:56 <waldi> ukleinek: add a script to /etc/kernel that decompresses the kernel
20:29:18 <aurel32> well also people are using experimental during the freeze to get a newer kernel for newer hardware
20:29:46 <ukleinek> waldi: fine for me, that script should run automatically at install time to ensure that unprepared machines can still boot.
20:31:12 <ukleinek> maybe plus making sure that only Image or the efi-image is in /boot to not excessively eat partition space?
20:31:51 <bwh> So let's revert the change until we have something like that implemented
20:31:56 <ukleinek> ack
20:32:14 <waldi> bwh: if we set a definitive date
20:32:42 <ukleinek> waldi: what do you imagine approximately?
20:33:14 <ukleinek> something like "trixie release + X"?
20:33:22 <waldi> six months. this should be enough for people to follow
20:34:16 <ukleinek> trixie + six months?
20:34:19 <bwh> We also need a NEWS entry for any change like this
20:34:59 <aurel32> i still don't get what it brings besides compression to the riscv64 case
20:35:42 <ukleinek> Does upstream push into the efi direction? On both arm64 and riscv64?
20:37:26 * ukleinek eyes the agenda and wonders how we can close this discussion to have opportunity to handle the other issues, too.
20:37:54 <bwh> #agreed EFI_ZBOOT should be disabled for now on arm64 and riscv64
20:38:23 <ukleinek> I can care to look over aurel32's MR and merge it
20:38:26 <waldi> #agreed trixie + 6 months is re-enable
20:38:38 <waldi> ukleinek: no. just revet the original one
20:38:59 <bwh> #topic Bugs #1086028, #1087809, #1093200: mips spurious EFAULTs
20:39:24 <carnil> bwh: asked upstream to backport two commits to 6.1: https://lore.kernel.org/stable/Z79tTfjD-rCIa6EV@eldamar.lan/T/#u
20:39:34 <bwh> Yes, I saw that, thanks!
20:40:04 <ukleinek> waldi: fine for me
20:40:07 <bwh> So do we want to apply those already to unblock builds, or should we wait for a stable update?
20:41:24 <carnil> bwh: given it's mips6el and that we have the problem since 6.1.37-1 I'm not sure we should hurry up a next upload. I have 6.1.129 already prepared and a point release is upcoming so latest then we should have all in I believe
20:41:35 <carnil> 2025-03-15 is point release
20:41:43 <carnil> but if you think it should happen earlier I can do that
20:42:02 <bwh> That makes sense to me
20:42:05 <ukleinek> looking at https://buildd.debian.org/stats/graph-week-big.png the situation doesn't seem too bad. (But where is mipsel?)
20:42:16 <carnil> so I would have waited until upstream really queues up at least the two commits
20:43:10 <bwh> So no specific action needed here, I think
20:43:28 <bwh> #topic Bug #1071562: nfsd blocks indefinitely in nfsd4_destroy_session
20:43:49 <carnil> this one has two commits in 6.1.129-1 as well
20:44:17 <carnil> according to Chuck and other upstream people there are still known issues, for the above bug one reporter said that after applying the patches situation is stable
20:44:29 <carnil> so I have added closer for this bug in 6.1.129-1
20:44:41 <bwh> which is not released yet, right?
20:44:51 <carnil> no not in Debian
20:45:02 <bwh> OK
20:45:11 <carnil> 6.1.129-1 is just in https://salsa.debian.org/kernel-team/linux/-/merge_requests/1381
20:45:28 <bwh> Well, we can come back to this bug if it turns out not to be fixed
20:45:39 <carnil> yes
20:45:46 <bwh> #topic #1085178: linux-signed-amd64: Some BPF fentry hooks silently fail
20:46:24 <carnil> Relevant comment here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1085178#30 "I would guess that series will merge into 6.15, but we'll have to see."
20:46:57 <ukleinek> ..ooOO(The reporter has an @crowdstrike.com address ...)
20:47:04 <bwh> I have no idea what's going on here
20:47:28 <bwh> ukleinek: Yep, they use BPF to crash^Wsecure Linux systems now
20:48:18 <bwh> I don't think we have anything to do here
20:48:39 <bwh> If and when the fixes land upstream we maybe need to request backporting to stable
20:48:44 <carnil> wait until it lands in mainline, eventually maybe they go back to stable then
20:48:53 <ukleinek> apart from keeping an eye on that and check it's properly backported
20:49:02 <bwh> (but again I don't really understand the bug and I don't know whether it's practical to backport)
20:49:33 <bwh> #topic Bug #1095745: rockchip: NVMe unavailable on rk3568 platform
20:50:25 <carnil> this is a regression for all stable series. There is a commit/patch acked, but still (not checked today) it did not get up to mainline
20:51:13 <carnil> > the patch already is in the fixes-branch of the phy-tree [0], sho should
20:51:16 <carnil> make its way into 6.14-rc shortly.
20:51:21 <carnil> https://lore.kernel.org/lkml/6647031.K2JlShyGXD@diego/
20:51:55 <bwh> It's in next
20:52:26 <carnil> ok
20:52:29 <ukleinek> It's not in linus/master
20:52:40 <bwh> Maybe we should cherry-pick it?
20:53:07 <carnil> I tried here to get it a bit faster in: https://lore.kernel.org/lkml/Z7gosm7PJMR0zCg4@eldamar.lan/ but apparently it does not seem that important
20:53:13 <bwh> because it's probably going to miss the point release otherwise
20:53:24 <ukleinek> sounds reasonable to cherry-pick
20:54:04 <ukleinek> hmm, the commit that is noted in the Fixes: line is only in v6.13-rc5? That's not what I expected.
20:54:24 <carnil> ok yes we can do that, so cherry-pick for the next experimental (debian/latest) and unstable upload (and then cerry-pick it as wlel for 6.1.y for bookworm)
20:55:09 <carnil> ukleinek: fbcbffbac994aca1264e3c14da96ac9bfd90466e is in 6.13-rc5 but it got backported to 6.1.123, 6.6.69. 6.12.8
20:55:25 <ukleinek> carnil: ah, that explains it. thx
20:55:44 <bwh> carnil: Will you take an action for that? (Only if you have the time)
20:55:59 <carnil> bwh: yes you can assign a an action for that to me
20:56:23 <bwh> #action carnil will apply upstream fix for #1095745 to affected branches
20:56:27 <carnil> I prefer that it lands officially in the stable series but if that fails I will cherry-pick it
20:56:37 <ukleinek> carnil: If you hit problems, feel free to get in touch. I probably have more time than usual for such things next week.
20:56:41 <carnil> "if taht fails in time for the next upload I mean"
20:56:43 <bwh> #topic Bug #1050578: linux-image-6.1.0-11-amd64: kernel disk device cache coherency issue: stale reads on /dev/sda1
20:56:58 <carnil> ukleinek: ok noted!
20:57:18 <bwh> I think the user did something crazy and this is not a bug
20:57:44 <ukleinek> bwh: that's what I thought when I saw "hexedit /dev/sda"
20:58:05 <carnil> bwh: some background on this if you all are interested (but we are short in time)
20:58:08 <carnil> I will try to be brief
20:58:25 <carnil> - this was an old bug without relevant action, I closed
20:58:31 <carnil> - reporter did not agreed
20:58:43 <carnil> - short interaction to let him explain, where he mentions he reported upstream
20:59:02 <bwh> I think the issue is the page cache of whole-disk and partition block devices are independent; this is known upstream and wontfix
20:59:05 <carnil> - since coonveration was a bit difficult, researched and there is https://lore.kernel.org/lkml/CA+jjjYTk=5wn2o46uNB+bJYX8xLgMP==dsJuvC94DvtN2f_6Yw@mail.gmail.com/ upstream
20:59:22 <carnil> which is "intersting to read"
20:59:38 <carnil> but at this point I think we can try to close the bug again and hoping reporter does not play pingpong on reopening
20:59:39 <bwh> So I propose I will tag this as wontfix
20:59:44 <bwh> and downgrade to normal
20:59:45 <ukleinek> ..ooOO(If it stings, I won't read it :-)
21:00:14 <bwh> Does anyone disagree with that resolution?
21:00:20 <ukleinek> I don't know
21:00:22 <carnil> bwh: yes please in this case.
21:00:38 <carnil> not sure about to close it as well, as this brings it away from open bugs plate, but we might diagree here
21:00:44 <bwh> #action bwh will mark #1050578 wontfix and reduce severity
21:00:59 <ukleinek> (FTR: I don't know = I don't disagree)
21:01:00 <carnil> (my goal is to keep our bugs somehow overviewable in the BTS)
21:01:05 <bwh> #topic Bug #1087981: linux-image-6.1.0-27-amd64: detected stalls in kernel log, system very slow on IO (regression)
21:01:28 <bwh> carnil: Yes, I get that. Can you easily exclude bugs marked wontfix?
21:01:43 <carnil> bwh: yes sure
21:02:18 <carnil> bwh: about #1087981: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1087981#37 but not sure if that's helpful
21:03:06 <carnil> if someone has better ideas that would be welcome I think
21:03:35 <ukleinek> I like the reporter testing a newer upstream version
21:03:36 <bwh> Uh why are they doing drop_caches periodically?
21:04:31 <bwh> That's probably not going to help with "computer got slow"
21:04:48 <ukleinek> bwh: where do you see that?
21:04:59 <bwh> In the latest kernel.log.xz
21:05:15 <bwh> The times are several minutes away from the stall reports though
21:06:13 <bwh> So I have no idea what's going on
21:06:30 <carnil> so maybe let's wait that Willi reports back
21:06:34 <bwh> OK
21:06:45 <ukleinek> +1, so I think testing upstream and then eventually involving upstream sounds reasonable
21:06:57 <bwh> #topic Bug #1098354: linux-image-6.1.0: FriendlyElec R5S, one Ethernet-Port / PCI-Device missing on Kernels beyond 6.1.0-28-arm64
21:07:32 <carnil> duplicate of the previous discussed on, together with #1095745
21:07:37 <bwh> Oh right
21:07:48 <carnil> (and #1098250)
21:08:09 <bwh> #topic Bug #1098698: linux: Segfault and system hang on larger network file transfers
21:08:55 <bwh> carnil: I think you believe this to be fixed, but the reporter ran into another regression, right?
21:09:52 <carnil> bwh: yes the original trace posted was a known bug which got fixed, but reporter experiences those hangs in my understanding with the most recent kernels still (asked to explicitly confirm). But then this is still an open issue (again unrelated to the original posted trace)
21:10:34 <bwh> Can you ask for another crash log?
21:11:05 <carnil> ok yes. Is there anything else we can ask at this stage already on the problem?
21:11:46 <bwh> I'm guessing there may be some difficulty getting a log
21:12:20 <bwh> So you may need to point to netconsole documentation
21:12:24 <carnil> right because the system get unresponsive, but maybe attaching a netconsole could get enugh information
21:12:27 <carnil> ah same idea :)
21:12:42 <ukleinek> Is earlyprintk + sysrq a thing on amd64?
21:12:55 <ukleinek> That might be easier than netconsole
21:13:01 <waldi> yes, it is
21:13:27 <bwh> I'm not seeing how that would help
21:14:11 <ukleinek> bwh: because the output would appear, but getting that in a mail is difficult?
21:14:57 <bwh> This is not a boot failure so earlyprintk is irrelevant
21:15:56 <bwh> #action carnil will ask for new kernel log for #1088826
21:15:59 <ukleinek> ack, that's just an automatism of mine to thing earlyprintk when considering sysrq
21:16:12 <ukleinek> s/thing/think/
21:16:32 <bwh> Scratch that
21:16:39 <bwh> #action carnil will ask for new kernel log for #1098698
21:16:58 <bwh> #topic Bug #1088826: /usr/share/bug/linux-image-686-pae/presubj: Fails to boot after 6.1.0-22
21:17:10 <carnil> I do not expect an answer here on the moreinfo question
21:17:14 <ukleinek> #1088826 waits for report feedback
21:17:28 <ukleinek> the reporter's email address looks suspicous
21:17:37 <carnil> has "sapammer @ ..." address
21:17:37 <bwh> Right
21:18:08 <bwh> So, nothing to do for this bug for now
21:18:25 <bwh> I will skip all the bugs < important
21:18:55 <bwh> #topic Issue linux#6: trixie kernel maintenance
21:19:09 <bwh> I still need to talk to the release team about this
21:19:40 <bwh> #topic New upstream versions
21:19:54 <bwh> I need to update firmware-nonfree, ktls-utils, and wireless-regdb.
21:20:19 <bwh> Did anyone look at a linux updatie to 6.14 yet?
21:20:33 * ukleinek didn't
21:20:39 * carnil is doing basic testing with the current 6.12.17-rc2 and 6.13.5-rc2 but no work done on 6.14 at all
21:21:06 <carnil> and I wonder when it's the best time to switch from 6.13.y stable series in experimental to an RC version of 6.14
21:21:06 <bwh> Well, if I ever get through my actions I will take a look at it
21:21:14 <carnil> early would be nice so that we "keep the pace"
21:21:21 <bwh> yes
21:21:49 <bwh> #topic Merge requests
21:22:20 <bwh> I looked at all the initramfs-tools MRs but didn't fully review all of them yet. I'm planning to make a release this week.
21:22:46 <bwh> We discussed linux#1384. Do either of the others need discussion?
21:23:09 <carnil> there would be as well #1359 were I think is disagreement
21:23:24 <carnil> but we are almost over the time and not sure how pressing a decision is here
21:23:38 * ukleinek doesn't understand the problem in !1359
21:23:39 <carnil> and waldi is anyway in the best position to explain the current issue
21:23:44 <bwh> We are well over time :-/
21:24:15 <bwh> I haven't looked at #1359 yet
21:25:00 <ukleinek> I'll look into the arm64 MRs (!1313 + !1301)
21:25:34 <bwh> ukleinek: Thank you
21:25:55 <bwh> I will try to look at !1359 but can't promise it
21:26:02 <ukleinek> there are some more, !1295
21:26:53 <bwh> The script only shows MRs that have been changed in the last week
21:27:06 * ukleinek also looks into !1321 as he was involved into that already
21:27:31 <carnil> does it skip as well such in Draft and with failed CI?
21:27:41 <bwh> It skips drafts, yes
21:27:47 <carnil> (do not neet to answer, can look up in code myself later)
21:27:53 <carnil> bwh: ack
21:28:01 <ukleinek> though a look from someone with more knowledge about the debian/rules targets would be welcome
21:28:03 <bwh> failed CI should be indicated in the St(atus) column
21:28:18 <bwh> #topic AOB
21:28:33 * ukleinek will not be able to attend the meeting next week.
21:28:55 <carnil> for chair: I think it is in meanwhile my turn again
21:29:02 <bwh> Thank you
21:29:24 <ukleinek> so in two weeks it will be my turn. Feel free to assign me next week.
21:29:36 <carnil> ukleinek: ok!
21:30:18 <bwh> #endmeeting