16:00:03 <ln5> #startmeeting Debian snapshot service meeting #5
16:00:03 <MeetBot> Meeting started Mon Jun 10 16:00:03 2024 UTC.  The chair is ln5. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:03 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:04 <weasel> Hello everyone,
16:00:15 <ln5> timing is everything
16:00:25 <ln5> there's an agenda in https://pad.sigsum.org/p/2024-06-10_snapshot.do
16:00:45 <MeetBot> weasel: Error: Can't start another meeting, one is in progress.
16:00:55 <weasel> ah, slow me.
16:00:59 <weasel> Mon 18:00:08 [ adsb      ] [ h01ger  ] [ ln5    ] [ MeetBot] [ pkern[m]] [ zigo]
16:00:59 <weasel> Mon 18:00:08 [ aurel32   ] [ jas4711 ] [ lucas  ] [ noahm  ] [ tianon  ]
16:00:59 <weasel> Mon 18:00:08 [ axhn      ] [ jcristau] [ lyknode] [ olasd  ] [ vimer   ]
16:00:59 <weasel> Mon 18:00:08 [ daissi    ] [ josch   ] [ mapreri] [ pabs   ] [ waldi   ]
16:00:59 <weasel> Mon 18:00:08 [ fireonlive] [ kpcyrd  ] [ maytham] [ peb    ] [ weasel  ]
16:01:02 * weasel present
16:01:04 <ln5> #topic Status updates
16:01:11 * ln5 present
16:01:21 <weasel> ln5: status update.  should I go?
16:01:25 <ln5> please do
16:01:42 <weasel> ok,
16:01:49 <weasel> we have new hardware, named snapshot-mlm-01,
16:02:10 <weasel> the DSA setup is done, snapshot software also installed and running
16:02:22 <weasel> currently, we are mirroring the farm from both sanger and leaseweb,
16:02:44 <weasel> progress 37603/65536 done (~57%)
16:02:59 <weasel> so in another month or so we should be done.
16:03:18 <weasel> snapshot-mlm already does import runs of all the suites we care about
16:03:18 <ln5> based on disk utilization i estimated we were at 70%
16:03:34 <weasel> (debian; debian-debug;  debian-ports;  debian-security;  debian-security-debug)
16:04:15 <weasel> snapshot-mlm's apache is also configured to stream files it does not have from another snapshot instance, so it can, theoretically, already answer all the queries
16:04:22 <weasel> plan:
16:04:25 <weasel> (1) keep syncing
16:04:33 <weasel> (2) point snapshot to just sanger,
16:04:48 <weasel> (3) make snapshotdb-manda (which is the DB that leaseweb uses) a replica of snapshot-mlm
16:05:03 <weasel> (4) point snapshot to snapshot-mlm and leaseweb
16:05:17 <weasel> (5) make sallinen be a replica of snapshot-mlm as well
16:05:17 <weasel> (6) serve snapshot from all three until the sync is done;
16:05:44 <weasel> (7) remove sallinen/sanger from snapshot
16:05:51 <weasel> (x) migration is finished.
16:06:14 <ln5> splendid, thanks a lot!
16:06:17 <weasel> (independent) import -ports archives that axhn has
16:06:18 <weasel> any questions/comments?
16:06:49 <axhn> About my -ports story ...
16:07:05 <weasel> go ahead, please.
16:07:05 <ln5> weasel: which of the above points in the plan can i help with?
16:07:43 <weasel> ln5: good question; axhn and ports first, then help stuff?
16:07:43 <axhn> Shall I continue updating my mirror? I have the feeling it's no longer necessary. I still can do it for another few months, there's enough space.
16:08:22 <weasel> axhn: I don't think it should still be necessary, but it also doesn't hurt.  keep it like it is until we have started importing your things maybe?
16:08:28 <axhn> And, possibly at another place and another time, we should discuss how to do the import
16:08:38 <weasel> we totally should.
16:08:56 <weasel> we can do that today, once the other points are done, time permitting.
16:09:02 <axhn> Aye
16:09:08 <weasel> any other ports stuff for now?
16:09:31 <ln5> would like to be part of that but have a hard stop at 17, let's see how that works out
16:09:58 <weasel> ln5: I think your time would be best spent by getting the -dev system working; so we have some place to develop and test changes and fixes to the snapshot code.
16:10:17 <ln5> great, i'll get that started then
16:10:56 <weasel> there are some requests that seem low-hanging and that we should be able to merge, after an iteration or two.  however, that'd be a lot nicer with a dev/test system and maybe even with a testsuite
16:11:11 <ln5> yes, testsuite is on my list too
16:11:31 <weasel> so in general, get the stuff up, maybe add a testsuite, start merging things and investigate how to do software deployments in a saner way.
16:11:57 <ln5> how is sw deployment happening today?
16:12:03 <weasel> git pull
16:12:07 <ln5> sweet
16:12:15 <weasel> and currently the trees are not clean
16:12:20 <ln5> ofc
16:12:25 <weasel> and the webserver and varnish configs live in dsa-puppet
16:12:41 <weasel> so there's room to change things for the better of everyone
16:12:45 <ln5> would you recommend anything in particular for deploying? i saw ansible happening elsewhere
16:13:39 <weasel> not sure; I have a few crazy ideas; like maybe put stuff into containers we autobuild from source.  but in a way I'm happy for other people to come up with solutions if they want to continue running the show
16:14:37 <weasel> anything else or should we move to merging the ports stuff?
16:14:51 <ln5> ah yes, you mentioned that. i've got docker scars from other projects but maybe...
16:15:04 <weasel> ln5: rootless podman :)
16:15:20 <weasel> it has other sharp edges, but if you're careful, you might avoid scars
16:15:30 <ln5> yes, much better. still stuff happens.
16:16:08 <ln5> let's go to merging ports, as part of next steps
16:16:13 <weasel> great;
16:16:18 <ln5> #topic Next steps
16:16:30 <weasel> axhn: anything you want to share up front or should I start poking you with questions?
16:17:01 <axhn> Well, currently it's about 4.1Tbyte of data, organized in zfs snapshots.
16:17:27 <axhn> Main question, how to get that to your place in an efficient way.
16:18:05 <axhn> Another question: should I drop duplicate zfs snapshots and incomplete mirrors first?
16:18:35 <weasel> I think everything that is incomplete should go, and duplicates serve very little purpose
16:19:56 <axhn> Snapshot names are time-based, like "2024-06-10Z10-55-54" and are the time the mirror job was completed.
16:20:28 <weasel> sounds good.
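
The time-based snapshot names axhn describes can be turned into timestamps with a few lines of Python. This is only a sketch: it assumes the "Z" in the name means UTC and that every name follows the pattern of the one example quoted above. The function names are made up for illustration.

```python
from datetime import datetime, timezone

# Pattern inferred from the single example "2024-06-10Z10-55-54".
# Assumption: the literal "Z" marks the time as UTC.
NAME_FORMAT = "%Y-%m-%dZ%H-%M-%S"

def parse_snapshot_name(name: str) -> datetime:
    """Return the mirror-completion time encoded in a snapshot name."""
    return datetime.strptime(name, NAME_FORMAT).replace(tzinfo=timezone.utc)

def sorted_snapshots(names):
    """Return snapshot names ordered by the time they encode."""
    return sorted(names, key=parse_snapshot_name)

if __name__ == "__main__":
    print(parse_snapshot_name("2024-06-10Z10-55-54").isoformat())
```

Because the fields are fixed-width and ordered from most to least significant, a plain string sort of the names gives the same order as sorting by the parsed time.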
16:20:36 <weasel> snapshot has this "dump" format that we used back in the day
16:20:37 <weasel> https://volatile.noreply.org/2024-06-10-luL5x6R6IyE/dump-87318
16:20:39 <weasel> is an example
16:21:00 <weasel> (we used this before postgresql WAL shipping was a thing to transfer the metadata to the secondary site)
16:21:14 <weasel> it should be quite straightforward to generate for a given directory tree
16:21:27 <weasel> so if we could get one of those per snapshot, that'd be a great start.
16:21:35 <axhn> I see
16:21:52 <weasel> and then for the actual objects, maybe you could create a list of the sha1sums of all the files you have,
16:22:07 <axhn> Would avoid transferring the files you have in the farm anyway.
16:22:21 <weasel> and then we compare that against the DB/what we already have, and then produce a list of files we need to copy
16:22:36 <weasel> exactly
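
The comparison discussed here (hash everything on the mirror side, check it against what the farm already has, copy only the rest) could look roughly like the sketch below. It assumes the set of known sha1 digests is available as a Python set; how that set is exported from the snapshot DB is not covered in the meeting, and the function names are hypothetical.

```python
import hashlib
import os

def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex sha1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def hash_tree(root):
    """Map each sha1 digest to one regular file under root with that content."""
    by_digest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only real content is hashed.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_digest.setdefault(sha1_of_file(path), path)
    return by_digest

def files_to_copy(by_digest, known_digests):
    """Return paths whose content is not in known_digests (assumed DB export)."""
    return sorted(path for digest, path in by_digest.items()
                  if digest not in known_digests)
```

Keeping one path per digest means identical files that appear in many zfs snapshots are listed for transfer only once.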
16:23:52 <weasel> does that sound doable?  do you have an alternate idea?
16:24:25 <axhn> Do you have a tool around that creates the dump file?
16:25:31 <weasel> the current snapshot script can make one given a snapshot DB
16:25:33 <axhn> My idea was rather file-system based, converting the snapshots into a hardlink farm. A lot of work, but might be importable right away on your side.
16:25:40 <weasel> I don't have an existing tool that can create one from a filesystem,
16:26:01 <weasel> it should be quite straightforward, though
16:26:12 <weasel> hardlink farms don't buy me anything
16:26:26 <axhn> Okay, I'll try to implement that.
16:26:42 <weasel> I can also help with that if needed.
16:27:07 <axhn> I'd say I'll mail you a few dump files, and you tell me whether it's good.
16:27:26 <weasel> works for me
16:27:42 <weasel> we could also start with the identifying missing files if you prefer
16:27:57 <weasel> copying those might take a bit longer (but then, 4t isn't that much)
16:28:46 <weasel> https://salsa.debian.org/snapshot-team/snapshot/-/blob/master/snapshot?ref_type=heads#L1277
16:28:48 <ln5> we could make the 1G that -mlm has a 10G if that helps
16:28:56 <axhn> Rather not, don't know how much load the local end might handle.
16:29:01 <weasel> is the code that dumps the file
16:29:08 <ln5> axhn: fair enough
16:29:18 <axhn> (early thought included shipping hard drives :-)
16:29:24 <weasel> ln5: If done permanently, I think it would be a good thing
16:29:42 <weasel> ln5: we will have other use-cases for transferring lots of data *out*  (backups, imports into s3, etc)
16:29:52 <axhn> So, I'm good for the moment
16:30:02 <ln5> weasel: yes, i will look into it
16:30:25 <ln5> #action ln5 try to turn -mlm's 1G into 10G
16:30:29 <weasel> axhn: so basically that for loop over the query result could be any recursive filesystem walk
16:30:50 <ln5> #action ln5 get the -dev environment up and running
16:31:11 <ln5> #action ln5 write a testsuite, at least for parts of the code (imports, web app)
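
weasel's remark that the loop over the query result could be any recursive filesystem walk might be sketched as follows. The record fields here (type, path, size, sha1, target) are placeholders chosen for illustration, not the real dump format; the layout the import expects is defined by the snapshot script linked above and would need to be matched against it.

```python
import hashlib
import os

def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex sha1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def walk_records(root, rel="."):
    """Yield one record per directory, regular file, or symlink under root.

    The field names are hypothetical; they are not the snapshot dump format.
    """
    yield {"type": "directory", "path": rel}
    with os.scandir(os.path.join(root, rel)) as entries:
        ordered = sorted(entries, key=lambda entry: entry.name)
    for entry in ordered:
        child = os.path.normpath(os.path.join(rel, entry.name))
        if entry.is_symlink():
            # Symlinks are recorded, never followed.
            yield {"type": "symlink", "path": child,
                   "target": os.readlink(entry.path)}
        elif entry.is_dir(follow_symlinks=False):
            yield from walk_records(root, child)
        elif entry.is_file(follow_symlinks=False):
            yield {"type": "file", "path": child,
                   "size": entry.stat(follow_symlinks=False).st_size,
                   "sha1": sha1_of_file(entry.path)}
```

Run once per zfs snapshot, a generator like this would supply the entries that a dump writer then serializes in whatever format the import side expects.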
16:32:56 <ln5> do we want to talk more about the import or should we move into open questions?
16:33:10 <ln5> #idea for sw deployment, maybe put stuff into containers that we autobuild from source
16:33:28 <weasel> I'm open to move on or to answer questions;  I have no other business
16:33:34 <axhn> I have an idea about the gap from Jul '23 to Jan '24, but perhaps later
16:33:54 <ln5> we seem to have plenty of time, please go ahead
16:34:09 <axhn> Okay ...
16:34:38 <axhn> As a separate matter, I have been scanning the /pool/ directories and writing logs for the past ten years or so.
16:35:16 <axhn> Therefore, I can tell when (+/- 6 hours) a file appeared in the pool, with name and hashsums.
16:35:49 <axhn> So we could use that to re-create /pool/ of the -ports for the gap. /dists/ however is lost.
16:36:15 <axhn> Also, if a particular file cannot be found by hash sum, it's lost as well.
16:36:26 <axhn> Question, should I spend time on this at all?
16:36:31 <weasel> so it'd be just metadata?
16:36:41 <weasel> no files we don't already have will be added, right?
16:36:58 <weasel> or would a reasonable percentage of those be available from other sources?
16:37:36 <axhn> I never checked how much would be around.
16:38:12 <weasel> if we can get a lot of missing files, we could import into another archive ("debian-ports-pool/extra/whatever").
16:38:13 <axhn> But if we ever have the chance to access some other farm, we could fill the gap.
16:38:37 <axhn> (although I doubt many people have archived ia64)
16:38:38 <weasel> it would give us all the .debs and whatnot, make them available when searching for packages versions, but would not "pollute" the full mirror imports
16:39:46 <weasel> I'm up for importing them; I'm not convinced it's useful but if you feel passionate about it, I'm up for doing my part
16:40:23 <axhn> Perhaps we should take that to the list to avoid misunderstandings. So, yes, that's just metadata.
16:42:29 <ln5> ok, do we have bystanders patiently waiting for "open questions" or "other" to start?
16:43:53 <weasel> seems not
16:43:55 <ln5> if not, i propose we skip to "next meeting" and then close
16:44:08 <ln5> #topic Next meeting
16:44:25 <ln5> when do we want to meet next? 4w from today?
16:44:51 <weasel> july 8?
16:44:57 * axhn checks the championship schedule *ducks*
16:45:11 <weasel> hockey?
16:45:14 <ln5> ha
16:45:30 <axhn> soccer
16:45:43 <axhn> July 8th is fine for me
16:45:46 <ln5> if 4w is a good time period and y'all like 1600 UTC, july 8 at 1600 is good
16:46:24 <ln5> ok, going...
16:46:33 <ln5> #agreed next meeting 2024-07-08 1600Z
16:46:47 <weasel> https://volatile.noreply.org/2024-06-10-GhHjY4ghgVA/83B482CE-6D38-487C-BF98-826019E61C28.ics
16:46:59 <ln5> unless there's other things to discuss, let's close
16:47:04 <weasel> thanks
16:47:07 <ln5> #endmeeting