19:02:15 <h01ger> #startmeeting
19:02:15 <MeetBot> Meeting started Wed Sep 28 19:02:15 2016 UTC.  The chair is h01ger. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:15 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
19:02:48 <jcristau> somebody stole our meeting time
19:03:03 <h01ger> #topic jenkins.debian.org monthly meeting - please say hi if you intend to participate - else thanks for being able to use this channel for a while :)
19:03:19 <mapreri> h01ger: btw, you can add the meeting topic to the #startmeeting line
19:03:20 <h01ger> jcristau: yeah, i just noticed…
19:03:28 <h01ger> mapreri: ah nice
19:03:36 <mapreri> that will also put the meeting name in the html main page of the report, iirc.
19:03:41 <mapreri> anyway
19:03:43 <mapreri> hi! o/
19:03:51 <tnnn> hi :)
19:03:58 <h01ger> hi. i'm here and just updated todo/status, see 4dc586b
19:05:17 <KGB-0> Holger Levsen master feb9028 jenkins.debian.net TODO add OOMs to agenda
19:05:56 <h01ger> #topic jenkins.d.n status
19:06:17 * h01ger guesses fil and helmut are afk and hilights them for good measure
19:06:29 <h01ger> i've added pb-build7-amd64 running stretch…
19:06:37 <h01ger> for the reproducible fdroid jobs
19:06:58 <h01ger> and thus shutdown fil's jenkins-test vm
19:07:14 <h01ger> else there is nothing really new on the status, except OOM stuff
19:07:21 <h01ger> or am i missing something? :)
19:09:22 <mapreri> about general jenkins stuff, yeah, nothing new.
19:09:41 <h01ger> #topic jenkins.d.n status - OOMs and how to fix/address them
19:10:14 <h01ger> automatic zipping of the dumps is still missing, right?
19:10:28 <mapreri> right, but I do hope to not have to implement such things.
19:10:37 <mapreri> tnnn: do you had any progress in analyzing dumps/logs'
19:10:39 <mapreri> ? *
19:11:28 <h01ger> /var/lib/jenkins/userContent/ has 145G free space, so we can automatically publish them and remove dumps older than 2 days and modify the initscript to send mail on jenkins start maybe?
19:11:45 <tnnn> Not as much as I'd like. I did find a few interesting references but I need a little more time to get to something useful.
19:11:47 <mapreri> we could, yep.
19:12:06 <tnnn> I will try to do that later today.
19:12:22 <h01ger> mapreri: i think we should… we have the diskspace so meh?!?!
19:12:39 <h01ger> (and the initscript is in git already anyway…)
19:12:43 <mapreri> yeah.  like, put stuff in a http dir, then mail the url to the 3 of us?
19:13:04 <mapreri> and something to remove them a couple of days later, umh.
19:13:41 <h01ger> jenkins maintenance job, it already exists
19:13:46 <mapreri> yep
19:13:58 <h01ger> bin/maintenance.sh in git
19:14:26 <mapreri> yeah
19:14:44 <mapreri> h01ger: feel free to put that item on me, I'll get to it…
19:15:03 <mapreri> who knows, maybe RL will start to match a bit more my planning.
19:15:11 <tnnn> It would be really usefull to be able those dumps that are generated after OOMs force jenkins restart.
19:15:31 <mapreri> tnnn: missing verb?
19:15:33 <tnnn> **to be able to look at those
19:15:39 <mapreri> like the last one?
19:15:49 <tnnn> mapreri: yup ;)
19:15:57 <tnnn> mapreri: yup (2nd time) ;)
19:16:00 <h01ger> #agreed jenkins initscript (hosts/jenkins/etc/init.d/jenkins) should be modified to put heapdumps into /var/lib/jenkins/userContent/OOM/ and bin/maintenance.sh should then clean them up after three days. the initscript should also send mail to inform (more) people about restarts
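The cleanup half of this agreement could look roughly like the sketch below. It is only an illustration: the function name `cleanup_old_ooms` is hypothetical and is not taken from bin/maintenance.sh, and it assumes each export lives in its own subdirectory of the OOM directory.

```shell
#!/bin/sh
# Hypothetical helper (not from bin/maintenance.sh):
# cleanup_old_ooms DIR DAYS removes subdirectories of DIR that are older than DAYS days.
cleanup_old_ooms() {
    oom_dir=$1
    max_age_days=$2
    # Nothing to clean up if the export directory was never created.
    [ -d "$oom_dir" ] || return 0
    # find ignores fractional days, so "-mtime +3" only matches
    # directories last modified at least four full days ago.
    find "$oom_dir" -mindepth 1 -maxdepth 1 -type d \
        -mtime +"$max_age_days" -exec rm -rf {} +
}
```

A call such as `cleanup_old_ooms /var/lib/jenkins/userContent/OOM 3` would match the three days mentioned in the agreement.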
19:16:14 <fil> hi (sorry I'm late)
19:16:16 <h01ger> tnnn: please send patches :)
19:16:41 <h01ger> hi fil, backlog at meetbot.debian.net…
19:17:16 <h01ger> to where should these mails go to?
19:17:35 <mapreri> h01ger: also, the initscript could take dumps and move them to an http place.  At the next restart clean the http place and copy any newer heap dump.  That way the maintenance job doesn't get more work to do.  Is it a problem to keep the dump for a full "cycle"? (i.e. between 2 restarts)
19:18:01 <mapreri> I'd say privately.  There could be secrets?  (like our passwords, I suppose).
19:18:03 <h01ger> #save
19:18:46 <tnnn> h01ger: Patches to that script or java memory handler? ;)
19:19:29 <h01ger> mapreri: put the dumps in …userContent/OOMs/$(pwgen -s 16) and mail those urls (gpg encrypted?) to people
19:19:39 <h01ger> tnnn: both? :-D
19:19:47 <mapreri> sounds like an idea
19:20:02 <mapreri> not sure if I want to fiddle around with gpg though
19:20:08 * h01ger nods
19:20:23 <mapreri> (root would then have to have our key somehow?  bah, sounds too much trouble for no gain)
19:20:33 <h01ger> #info the dumps could contain passwords so maybe better put the dumps in …userContent/OOMs/$(pwgen -s 16) and mail those urls to specific people
19:21:05 <mapreri> (also log too, not only dump)
19:21:25 * h01ger nods
19:21:32 <h01ger> (you can use #info too)
19:21:38 <tnnn> mapreri: very mych true
19:21:41 <tnnn> **much
19:21:52 <mapreri> #info logs should be included near the heap dumps in the export
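The two #info items above (an unguessable directory, with logs next to the heap dumps) could be sketched as follows. This is a minimal sketch under stated assumptions: the function name `publish_oom_artifacts` is hypothetical, the `*.hprof` and `*.log` patterns are guesses at how the dumps and logs are named, and the fallback to /dev/urandom when pwgen is not installed is an addition for illustration, not something discussed in the meeting.

```shell
#!/bin/sh
# Hypothetical helper: publish_oom_artifacts SRC_DIR DEST_BASE
# Moves heap dumps and logs from SRC_DIR into DEST_BASE/<random token>
# and prints the directory it created.
publish_oom_artifacts() {
    src_dir=$1
    dest_base=$2
    if command -v pwgen >/dev/null 2>&1; then
        # 16 character random name, as suggested in the meeting
        token=$(pwgen -s 16 1)
    else
        # Assumption: fall back to 8 random bytes printed as 16 hex digits
        token=$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')
    fi
    dest="$dest_base/$token"
    mkdir -p "$dest" || return 1
    # Assumed file name patterns; adjust to the real dump and log names.
    for f in "$src_dir"/*.hprof "$src_dir"/*.log; do
        [ -e "$f" ] && mv "$f" "$dest/"
    done
    printf '%s\n' "$dest"
}
```

The printed path is what would then be mailed to the people who should see the dump.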
19:22:00 <h01ger> #save
19:22:29 <mapreri> I just hope this will really be something temporary and not a "somebody in the future will fix it for good, here is a handy means"
19:22:31 <h01ger> anything else to discuss right now?
19:22:49 <tnnn> mapreri: as for me, getting the logs between restarts should be ok. Just don't restart too often...
19:22:51 <h01ger> i'm not sure we can remove much usage of the logparse plugin :/
19:22:54 <mapreri> h01ger: can we have a team@jenkins.debian.net alias mailing us personally instead of the ML, so private stuff like this can go there, instead of being between us?
19:23:01 <mapreri> and we avoid hardcoding list of us
19:23:14 <mapreri> root@ could go there too?  or maybe i can just use root@ for that.
19:23:36 <mapreri> logparse could be fixed to be less leaky and be more memory-friendly, I suppose.
19:23:47 * h01ger nods
19:24:05 <h01ger> mapreri: please make up your mind regarding mail aliases :)
19:24:21 <StevenC99_> you're already in the root: alias, right?
19:24:22 <mapreri> yeah, that was just a wild idea that came to me right now, without proper thought
19:24:31 <mapreri> StevenC99_: I am.
19:24:49 <mapreri> h01ger: writing while thinking during a meeting considered bad habit, sorry :)
19:24:52 <StevenC99_> oh good, just making sure
19:25:10 <h01ger> the reproducible builder jobs use the logparse plugin… and run 6000 times a day
19:25:33 * tnnn is probably not on any jenkins list, so please add if not mailing directly
19:25:34 <h01ger> hi StevenC99_ :)
19:25:36 <mapreri> "Last Release Date: Oct 20, 2015".  For such a plugin that looks quite in the past.   (last logparser plugin release date)
19:25:40 <StevenC99_> holger: (hi!)
19:25:54 <tnnn> maybe there is something newer that can replace logparser?
19:27:04 <tnnn> because as far as I can tell quite a few leads point to it and file parsing
19:27:07 <mapreri> ftr: https://github.com/jenkinsci/log-parser-plugin/commits/master
19:27:47 <helmut> wrt putting heapdumps to userContent: they may contain credentials or similar stuff. having them public may not be the best idea
19:27:55 <h01ger> tnnn: i dont think we can replace logparser / stop using it / i dont know of any replacement. we might be able to use it less
19:28:24 <helmut> ah. you figured as well :)
19:28:28 <h01ger> helmut: hi! already answered in http://meetbot.debian.net/debian-qa/2016/debian-qa.2016-09-28-19.02.html
19:28:52 <h01ger> next topic (or is there still stuff to discuss about OOM situation improvements?)
19:29:15 <tnnn> h01ger: I know that something depends on it, can you tell me later what exactly (after the meeting maybe)?
19:29:15 <mapreri> helmut: do you think having them in a non-easily guessable path is enough protection?  or shall we put them in a directory covered by e.g. an .htpasswd ?  (just to keep things easy)
19:29:25 <helmut> mapreri: yes
19:29:29 <mapreri> yes what?
19:29:33 <helmut> mapreri: sufficient
19:29:42 <mapreri> ok
19:30:15 <helmut> h01ger: we can implement logparser in /bin/sh. ;)
19:30:37 <KGB-2> Holger Levsen master 1bce24d jenkins.debian.net job-cfg/reproducible.yaml reproducible debian: stop using logparser plugin for build jobs
19:30:39 <mapreri> or in perl... that ought to be better memory-wise, and also runtime-wise.
19:30:41 <h01ger> helmut: please do
19:30:46 <tnnn> h01ger: would be probably much more memory efficient ;)
19:30:59 <h01ger> tnnn: rgrep logparse job-cfg
19:31:12 <tnnn> **that was supposed to go to helmut
19:31:30 <tnnn> h01ger: kk, will check, thx
19:31:32 * mapreri thinks tnnn is mistyping a lot today :)
19:31:33 <h01ger> next topic?
19:31:57 <h01ger> #topic jenkins.d.o migration - next steps
19:32:17 <tnnn> mapreri: I could pretend that it's due to a new laptop... But the truth is that is't just a Bad Writing Day ;>
19:32:23 <h01ger> so status is: we have jenkins installed, we can login into the UI and we have plugins installed
19:32:30 * tnnn did that again...
19:32:33 <h01ger> #info so status is: we have jenkins installed, we can login into the UI and we have plugins installed
19:33:40 <h01ger> the next steps should probably be to update update_jdn.sh so it yells when packages are missing (so we can tell DSA to install them) and to deploy the jenkins stuff in /srv/jenkins.d.o
19:34:35 <h01ger> and then to add a node and a test job, i'd suggest to go with pb7 and the reproducible fdroid job. (cause thats 1 job which runs on 1 node and is nicely isolated from all the rest)
19:35:08 <mapreri> h01ger: another job that could go, is to set up chroot-run to run elsewhere.
19:35:15 <h01ger> (and then follow the rest of the migration plan as outlined in TODO in git)
19:35:22 <helmut> h01ger: telling dsa == sending patches for a meta package
19:35:25 <mapreri> doesn't fdroid need to store artifacts to jenkins.d.n/userContent/reproducible/ too?
19:35:53 <h01ger> mapreri: yes, to jenkins.d.o/userContent/reproducible/ - ah - humpf
19:36:11 <mapreri> s/\.o/.n/
19:37:15 <mapreri> imho, chroot-run is the best suited to be moved first.  They store nothing but jenkins logs (afaik, icbw), and they could run just about everywhere.
19:37:16 <h01ger> jdn should become pb-build0 and so far i thought it would stay hosting reproducible.debian.net and tests.r-b.o. now i guess i need to rethink that
19:37:33 <h01ger> mapreri: thats not a job?!?
19:37:44 <mapreri> chroot-run-based jobs *
19:37:56 <mapreri> chroot-run.sh based jobs **
19:38:10 * h01ger wants a single job for testing stuff… once that works, migrate jobs
19:38:28 <mapreri> I mean, I bet that if moving lintian building away is hard, everything is wrong...
19:38:56 <h01ger> g-i-installation_debian_sid_daily_lxde whatever. the results are displayed within jenkins…
19:39:15 <h01ger> and anyway
19:40:15 <helmut> mapreri: chroot-run.sh has provisions to copy stuff. that's used by some haskell jobs
19:40:24 <h01ger> #info next step is to update update_jdn.sh to deploy scripts+configs on jenkins.d.o too (to a changed path, /srv/jenkins.debian.org/ so we need to deal with that…) and to tell us (to send patches to DSA) which packages are missing
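The "tell us which packages are missing" part could be sketched like this. It is an assumption-laden illustration: `list_missing_packages` is a hypothetical name, not a function from update_jdn.sh, and the check relies on the status string printed by `dpkg-query -W -f='${Status}'`.

```shell
#!/bin/sh
# Hypothetical helper (not from update_jdn.sh):
# list_missing_packages PKG... prints every package dpkg does not report as installed.
list_missing_packages() {
    for pkg in "$@"; do
        # dpkg-query fails for unknown packages; treat that as "not installed".
        status=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null) || status=""
        case $status in
            *"install ok installed"*) ;;       # present, nothing to report
            *) printf '%s\n' "$pkg" ;;         # missing or only partly installed
        esac
    done
}
```

The output, one package name per line, is the list that would go into a patch for the DSA metapackage.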
19:41:01 <mapreri> anyway, given that in jerea nothing will run, I expect nearly no new packages to be installed there.
19:41:47 <h01ger> #info after that, we can start with running 1 job on jenkins.d.o: self_maintenance
19:42:41 <h01ger> sounds somewhat sane? :)
19:43:32 <mapreri> sounds good to me.
19:43:37 <helmut> can't we just move to using /srv/jenkins.d.o on jenkins.d.n as well by creating suitable symlinks once?
19:43:51 <h01ger> helmut: probably
19:43:57 <mapreri> helmut: but it's really not only about that.
19:43:58 <fil> h01ger: would it not be better to have the packages installed by a meta package, and just maintain that outside update_jdn.sh, so that all the package installing in update_jdn would be replaced by an attempt to install the meta package?
19:44:18 <h01ger> fil: probably ;) (DSA has such metapackages)
19:44:22 <helmut> fil: that's a dsa requirement anyway
19:44:41 <mapreri> but really, in jenkins.d.o itself we should need nothing anyway.
19:44:51 <mapreri> only jenkins itself should run there, right?
19:44:51 <fil> mapreri: quite
19:45:32 * h01ger nods
19:45:37 <mapreri> self_maintenance only runs df/gzip/find and all such utilities, and probably that's the only job that is going to run there
19:46:05 <KGB-0> Holger Levsen master d68df54 jenkins.debian.net TODO update next steps for j.d.o
19:46:13 <h01ger> #topic AOB
19:47:05 <mapreri> none from me.
19:47:21 * fil suspects that the subject of where the artifacts went and/or why $WORKSPACE is not where we think is not really meeting material, right?
19:48:14 * mapreri once thought there was a hidden symlink to /dev/null, considering that they seem to vanish in your tests :P
19:48:35 <h01ger> fil: at least mentioning that this problem still exists surely is!
19:49:14 <h01ger> fil: and that problem will become more urgent as we will run everything on remote nodes soon^wwhen we migrate to j.d.o
19:49:20 <fil> Well, last I saw it was, but then I went on holiday and forgot everything :-))
19:49:34 <h01ger> :)
19:50:59 <h01ger> fil: i suppose we need to look into this problem again…
19:52:15 <h01ger> #info the logparser plugin has been removed from all 88 reproducible build jobs (which daily run ~6000 times)
19:52:15 * fil wants root on j.d.n so I can strace everything and watch the meltdown ... Hmm, maybe not -- any better ideas?
19:52:50 <h01ger> fil: i intend to bring your test vm back soon, btw…
19:54:01 <fil> I guess I could try setting up the test_vm with all the same remote job weirdness, and see if it gets the same behaviour
19:54:07 <tnnn> h01ger: Yup, I've noticed. We will see if that improves situation.
19:54:32 <h01ger> fil: yeah
19:54:48 <h01ger> fil: or you can have root on jenkins.d.n and strace around ;)
19:55:13 <fil> OK Great :-)
19:55:20 <helmut> (any reason why fil doesn't have root? if too many root, swap him for me.)
19:55:32 <h01ger> fil: you know the drill, please send patches for that :)
19:55:44 <h01ger> helmut: because so far he didnt want to :)
19:56:26 <fil> helmut: I think I don't have root because I wrote the bit that allowed people to selectively have root and needed a test subject.  Also, blame avoidance :-)
19:57:58 * h01ger thinks we can close the meeting in under an hour, any disagreements?
19:58:09 <fil> none
19:58:24 <tnnn> nope
19:58:48 <h01ger> #info many thanks to http://profitbricks.com for supporting jenkins.debian.net since 2012 - today with more than 120 cores and >300 GB RAM
19:59:17 <h01ger> many thanks to you reading this too! :)
19:59:55 <h01ger> #endmeeting