13:59:57 <karsten> #startmeeting metrics team
13:59:57 <MeetBot> Meeting started Thu Jul 28 13:59:57 2016 UTC.  The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:59:57 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:07 <karsten> it's meeting time. who's here for the metrics team meeting?
14:00:32 * karsten already saw iwakeh
14:00:40 <iwakeh> right :-)
14:00:45 * qbi lurks.
14:00:48 <karsten> hi qbi!
14:01:06 * karsten finds the pad..
14:01:33 <karsten> https://pad.riseup.net/p/zUNzEIFRq5S4
14:03:16 <karsten> okay.
14:03:21 <karsten> * Bridge descriptor sanitizer (karsten)
14:03:22 <iwakeh> ok.
14:03:39 <karsten> I spent the last 20 days (well, it felt like 20) writing tests.
14:03:45 <karsten> and spotted many bugs.
14:03:49 <iwakeh> hihi
14:03:58 <iwakeh> good.
14:04:08 <iwakeh> one question
14:04:11 <karsten> I also found out that the batch process that re-processes archives broke.
14:04:16 <karsten> after 13 of 28 or so days.
14:04:20 <karsten> out of memory.
14:04:23 <karsten> sure, what's the question?
14:04:24 <iwakeh> oh no.
14:04:43 <iwakeh> these tests having a TODO; do they fail already?
14:04:44 <karsten> I have an idea what the reason could be. I don't have a good fix though.
14:04:51 <karsten> no, I changed them all to pass.
14:04:57 <karsten> and to fail once we fix things.
14:05:14 <iwakeh> maybe, fix before refactoring?
14:05:25 <karsten> sure!
14:05:40 <iwakeh> but, the batch ...
14:05:48 <karsten> should I fix them, or would you want to look into that?
14:06:00 <iwakeh> that's topic 2
14:06:04 <iwakeh> planning.
14:06:08 <karsten> ok :)
14:06:14 <karsten> yes, the batch.
14:06:26 <karsten> so, we're keeping a data structure of all file digests we're processing.
14:06:31 <karsten> to avoid processing them again.
14:06:38 <karsten> and that data structure grows and grows.
14:06:46 <karsten> apparently, after 13 days, it grew too much.
14:06:49 <iwakeh> needs bigger hw.
14:07:01 <karsten> well, maybe.
14:07:03 <iwakeh> how much RAM?
14:07:09 <karsten> 8g
14:07:31 <karsten> hmm, I wonder if that old mac mini can handle more.
14:07:42 <iwakeh> ok, that was for which amount of files processed?
14:07:59 <karsten> 600g, I think.
14:08:12 <iwakeh> which is ?
14:08:17 <iwakeh> half ?
14:08:21 <karsten> ah!
14:08:26 <karsten> well, 40% or so.
14:08:37 <karsten> 240g.
14:08:44 <karsten> 600g is the total size.
14:08:47 <iwakeh> well, I could offer 32G ram.
14:09:00 <iwakeh> but I'd have to download ...
14:09:18 <karsten> right. and I'd want to keep these archives offline.
14:09:32 <iwakeh> yes. well
14:09:41 <karsten> so, I just restarted the batch where it stopped.
14:09:44 <iwakeh> then we need to improve the processing
14:09:53 <karsten> in theory, it'll break again at 80%, and then it will run through.
14:09:58 <iwakeh> won't it reprocess?
14:10:11 <karsten> nope. I moved the old files away.
14:10:32 <iwakeh> why not let it chew on smaller chunks?
14:10:35 <karsten> improving the processing would also be my favored solution.
14:11:00 <karsten> well, I could have moved the last 20-30% away, too. true.
14:11:03 <iwakeh> such a reprocessing might come up again?
14:11:11 <karsten> right.
14:11:20 <iwakeh> new ticket?
14:11:34 <karsten> so, my plan was to use an LRU cache instead of keeping all digests.
14:11:47 <karsten> but that's also just my guess that it's this data structure. I don't know for sure.
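A minimal sketch of the LRU idea mentioned here, assuming the digests are kept as strings and that forgetting the least recently used ones is acceptable. The class name SeenDigests, the method markSeen, and the capacity parameter are hypothetical and not taken from the CollecTor code base.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical bounded set of already processed digests with LRU eviction. */
class SeenDigests {

  private final Set<String> digests;

  SeenDigests(final int maxEntries) {
    // accessOrder=true keeps the least recently used entry at the head.
    Map<String, Boolean> lru =
        new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(
              Map.Entry<String, Boolean> eldest) {
            // Evict the eldest entry once the configured bound is exceeded.
            return size() > maxEntries;
          }
        };
    this.digests = Collections.newSetFromMap(lru);
  }

  /** Returns true if the digest is new (or was evicted), false if known. */
  boolean markSeen(String digest) {
    return this.digests.add(digest);
  }

  /** Number of digests currently remembered. */
  int size() {
    return this.digests.size();
  }
}
```

The trade-off in this sketch is that an evicted digest counts as unseen, so a file with that digest would be processed again. Whether that is tolerable depends on how far apart duplicates occur in the archives, which the discussion leaves open.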
14:12:01 <karsten> I had jvisualvm running, but that broke after 90 hours for some other reason.
14:12:07 <karsten> new ticket sounds good.
14:12:08 <iwakeh> the processed ones could be stored in a simple db, too.
14:12:49 <karsten> well, switching to a db sounds like a bigger change.
14:13:08 <karsten> which also crossed my mind: fix all the bugs now, do the reprocessing afterwards.
14:13:19 <iwakeh> yes?
14:13:35 <iwakeh> you mean the bugs
14:13:44 <iwakeh> found in the refactoring part?
14:13:47 <iwakeh> and
14:13:48 <karsten> yep.
14:13:53 <iwakeh> ok
14:14:04 <karsten> I don't think they were ever triggered, because tonga was always nice enough not to give us bad data.
14:14:13 <karsten> still, would be good to fix them.
14:14:19 <iwakeh> yes, if reprocessing can wait a little.
14:14:28 <karsten> yes, a week or two.
14:14:42 <iwakeh> then that should be done.
14:14:59 <karsten> alright. let me create that ticket for the out-of-memory problem later today.
14:15:06 <iwakeh> fine.
14:15:16 <karsten> ok.
14:15:33 <karsten> moving to the next topic?
14:15:42 <iwakeh> ok
14:15:47 <karsten> * CollecTor planning (iwakeh)
14:16:11 <iwakeh> well, we have milestones(ms) for
14:16:24 <iwakeh> the collector (ct) release
14:16:38 <iwakeh> I'm wondering when to put out the
14:16:43 <iwakeh> first ct release.
14:16:47 <iwakeh> I'd like
14:17:03 <iwakeh> to have that soon when all the 1.0.0 ms tickets are done.
14:17:25 <iwakeh> https://trac.torproject.org/projects/tor/query?milestone=CollecTor+1.0.0&group=status&order=priority
14:17:56 <iwakeh> #18865 will be ready for review today
14:18:15 <iwakeh> and #19169 could rather be moved to ms 110
14:18:29 <karsten> I'm not sure if we can add #19317 before we add #19755.
14:18:44 <karsten> still, having #19317 in 1.0.0 seems useful.
14:19:13 <iwakeh> move it to 110
14:19:14 <iwakeh> ?
14:19:40 <iwakeh> add release 101?
14:20:19 <karsten> so, if we assume that reprocessing bridges will take another few weeks,
14:20:25 <karsten> do you think 1.1.0 would be out by then?
14:20:54 <iwakeh> depends, what we assign to ms 1.0.x
14:20:55 <karsten> what was your idea for releasing 1.0.0?
14:21:30 <iwakeh> good question.
14:22:04 <iwakeh> just noticing that
14:22:25 <iwakeh> there is a ticket missing for the release process
14:22:37 <iwakeh> the signing, uploading, whatever needs to be done.
14:22:42 <karsten> right.
14:23:32 <iwakeh> before 10th of Aug?
14:23:46 <karsten> so, #2966 needs more discussion before being included in 1.1.0.
14:23:59 <karsten> I'd say unassign from that milestone.
14:24:06 <iwakeh> ok
14:24:23 <karsten> and #19317 goes to 1.1.0?
14:24:42 <iwakeh> isn't done?
14:24:43 <karsten> would it make sense to move #19720 back to 1.0.0?
14:24:53 <karsten> ah, I didn't reload.
14:25:09 <karsten> not done yet, should I move it?
14:25:38 <iwakeh> ok.
14:26:43 <iwakeh> we can have a 1.0.x for the fixes.
14:26:56 <karsten> sure.
14:27:58 <karsten> removed #2966 from milestone.
14:28:15 <iwakeh> so, have priority on the ms 100 tickets?
14:28:31 <iwakeh> I think I work on these mostly.
14:29:21 <karsten> okay, so there are three tickets left?
14:29:39 <karsten> can I add a fourth? :)
14:29:40 <iwakeh> four, if we move the runtime configuration change ticket.
14:29:49 <karsten> which one?
14:29:50 <iwakeh> sure.
14:29:58 <iwakeh> you just named it
14:30:22 <iwakeh> #19720
14:30:41 <karsten> ok. should I move it?
14:30:49 <iwakeh> done.
14:30:54 <karsten> ok.
14:31:13 <iwakeh> i can also add the ant tasks
14:31:27 <iwakeh> for pmd&findbugs this week.
14:31:52 <karsten> hmm, but we wouldn't fix any of those issues before the 10th, right?
14:31:59 <karsten> well, s/any/many/
14:32:02 <iwakeh> you're right.
14:32:30 <karsten> my fourth (now fifth or sixth) ticket would be about improving the scheduler a bit.
14:32:38 <karsten> things like:
14:32:39 <iwakeh> how?
14:32:39 <karsten> undo path changes (everything under out/)
14:32:39 <karsten> make recent/ truly configurable
14:32:39 <karsten> start at 00:00.000 of configured minute, not x minutes from current time
14:32:39 <karsten> add mode to run once immediately
14:32:55 <karsten> things that came up while testing today.
14:33:26 <iwakeh> x minutes from current?
14:33:43 <karsten> here's what I did:
14:33:46 <iwakeh> well, just add tickets for these :-)
14:33:58 <karsten> I edited collector.properties to contain the next minute, like 35.
14:34:09 <karsten> then I started the process at, say, 34:15.
14:34:16 <karsten> and it would start at 35:15.
14:34:23 <karsten> when it should ideally start at 35:00.
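The desired behavior described here (start at 35:00, not 35:15) could be sketched as follows. This assumes the configured minute acts as an offset within the hour and that the period divides 60 evenly. The class StartDelay, the method untilNextRun, and its parameters are hypothetical and do not reflect the actual scheduler code or the real collector.properties keys.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper computing the initial delay of a periodic task. */
class StartDelay {

  /**
   * Returns the time from now until the next minute that matches the
   * configured offset modulo the period, at second 0 and millisecond 0.
   */
  static Duration untilNextRun(LocalDateTime now, int offsetMinutes,
      int periodMinutes) {
    // First candidate in the current hour, aligned to the full minute.
    LocalDateTime candidate = now.truncatedTo(ChronoUnit.HOURS)
        .plusMinutes(offsetMinutes % periodMinutes);
    // Step forward one period at a time until the candidate is in the future.
    while (!candidate.isAfter(now)) {
      candidate = candidate.plusMinutes(periodMinutes);
    }
    return Duration.between(now, candidate);
  }
}
```

With an offset of 35, a period of 10 minutes, and a start at 34:15, this yields a delay of 45 seconds, so the first run would begin at 35:00. The resulting duration could then serve as the initial delay of a fixed-rate schedule.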
14:34:32 <iwakeh> ah, that's interesting.
14:34:38 <karsten> but yes, I can be even more verbose than those four lines in the ticket. ;)
14:34:42 <iwakeh> period was 60 I suppose?
14:34:58 <iwakeh> good ;-)
14:35:03 <karsten> hmm, no, 10.
14:35:23 <iwakeh> oh?
14:35:42 <karsten> but minutes.
14:35:57 <iwakeh> yes, that needs clarification in a ticket ...
14:36:00 <karsten> :)
14:36:11 <iwakeh> :-)
14:36:26 <karsten> are you going to create a ticket for the release?
14:36:36 <iwakeh> yes.
14:36:42 <karsten> I usually follow the instructions for releasing metrics-lib line by line.
14:36:50 <iwakeh> ok.
14:37:22 <karsten> okay, I think that's a good plan for 1.0.0 then.
14:37:29 <iwakeh> right.
14:37:34 <iwakeh> will you begin the
14:37:39 <karsten> let's make a plan for 1.0.1 or 1.1.0 after that.
14:37:44 <iwakeh> sanitizer bugfixes?
14:37:54 <iwakeh> sure.
14:38:19 <karsten> yes, happy to.
14:38:24 <iwakeh> I'd like to make a suggestion for that test class.
14:38:26 <karsten> should I also fix findbugs/pmd issues?
14:38:31 <karsten> please do!
14:38:40 <iwakeh> hmm
14:39:07 <iwakeh> the one-liners, anything else might be a real big change.
14:39:07 <karsten> I don't have to.
14:39:16 <karsten> okay.
14:39:22 <karsten> what's the suggestion?
14:39:39 <iwakeh> the things that are really small.
14:39:44 <iwakeh> and of course
14:39:55 <iwakeh> the potential null dereferences and the like.
14:40:18 <iwakeh> these can be done while working on the functional errors.
14:40:32 <iwakeh> i.e. the TODOs you identified.
14:41:00 <karsten> right.
14:41:05 <karsten> some things need more thoughts.
14:41:10 <karsten> like removing System.gc(); ...
14:41:21 <iwakeh> true, we do not
14:41:27 <karsten> I mean, in theory I agree that it shouldn't have to be there.
14:41:35 <karsten> but then it's there because we ran out of memory before.
14:41:46 <karsten> so maybe we should look what happened and if it still happens.
14:41:52 <iwakeh> need to change some of the rules or toss one or the other.
14:41:57 <karsten> not following findbugs suggestions blindly. ;)
14:42:00 <karsten> right.
14:42:03 <iwakeh> right.
14:42:09 <karsten> what's the suggestion about the test class?
14:42:31 <iwakeh> Configuration needs just an InputStream
14:42:34 <karsten> to be clear, I'd want to make that class better. the goal is not just to increase coverage.
14:42:37 <iwakeh> which can come from a String.
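A small sketch of the suggestion that a configuration can be fed from a String via an InputStream, so a test needs no file on disk. The Configuration class itself is not shown in this log, so java.util.Properties stands in for it here. The class InMemoryConfig, the method fromString, and the property names in the usage note are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Hypothetical test helper: load properties from an in-memory String. */
class InMemoryConfig {

  static Properties fromString(String text) {
    // Properties.load(InputStream) expects ISO 8859-1 encoded input.
    byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
    try (InputStream in = new ByteArrayInputStream(bytes)) {
      Properties props = new Properties();
      props.load(in);
      return props;
    } catch (IOException e) {
      // Not expected for an in-memory stream; rethrown unchecked for tests.
      throw new UncheckedIOException(e);
    }
  }
}
```

A test could then build its configuration inline, for example with InMemoryConfig.fromString("SomeKey = value\nOther = 1\n"), and keep each test case's settings next to its assertions.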
14:42:50 <karsten> the goal is also to write better test classes for other code bases.
14:42:58 <iwakeh> it's about simplifying the test class.
14:43:00 <karsten> hmm, didn't I fix that?
14:43:21 <iwakeh> I didn't have time to look at the class before this meeting.
14:43:33 <iwakeh> so maybe.
14:43:36 <iwakeh> :-)
14:43:50 <karsten> hmm, maybe I fixed it a bit but could fix it even more.
14:44:00 <karsten> so, yes, we should simplify the test class as much as possible.
14:44:16 <iwakeh> It'll also make the test a little more readable.
14:44:51 <iwakeh> maybe rename runTest to prepareTest?
14:44:53 <karsten> I'm conflicted how much code to write that's not actually tests.
14:45:07 <iwakeh> what do you mean?
14:45:08 <karsten> just to make tests more readable.
14:45:26 <iwakeh> more readable means shorter.
14:45:27 <karsten> well, right now, the first @Test annotation comes in line 515. :)
14:45:54 <iwakeh> maybe, I should look at the new version of the test and talk then?
14:46:03 <iwakeh> or write.
14:46:14 <karsten> and I can see us simplifying things even more, but at the cost of the first @Test being in line 700 or 800.
14:46:24 <iwakeh> oh no.
14:46:29 <karsten> yes, I'd very much appreciate your review here.
14:46:42 <iwakeh> ok.
14:46:57 <karsten> having clean tests seems like a good goal, too.
14:47:05 <iwakeh> yes.
14:47:10 <karsten> especially if we want to re-use concepts for other parts of the code.
14:47:32 <karsten> but, that's for 1.1.0.
14:47:41 <karsten> feel free to prioritize 1.0.0 stuff.
14:47:54 <iwakeh> yes or 1.2.0?
14:48:00 <karsten> or that.
14:48:07 <karsten> by the way, am I behind on any reviews?
14:48:15 <iwakeh> there are actually quite some intermodule code duplications, too.
14:48:31 <iwakeh> no, all up to date, I think.
14:48:35 <karsten> ok.
14:48:48 <karsten> hmmmm
14:48:54 <karsten> in theory, there are only 2 real modules.
14:49:04 <iwakeh> huh?
14:49:07 <karsten> exit list stuff is tiny, torperf goes away.
14:49:14 <iwakeh> ah, ok.
14:49:40 <iwakeh> relaydescs and bridgedescs have code in common.
14:49:47 <karsten> okay, we should look at that.
14:50:05 <karsten> we could have a shared package,
14:50:10 <karsten> or we could move things to metrics-lib.
14:50:18 <iwakeh> That's why I wanted to add the tasks this week.
14:50:19 <karsten> depending on how generic the code is.
14:50:43 <iwakeh> yes, that's to be seen when refactoring.
14:50:50 <iwakeh> did you look at
14:51:12 <iwakeh> #19170
14:51:33 <iwakeh> comment:7
14:52:43 <karsten> looked, yes, but I don't know what's the right thing to do there.
14:53:23 <iwakeh> you mean, what data to store?
14:53:24 <karsten> I'll put it on my list.
14:53:29 <karsten> yes.
14:53:32 <iwakeh> fine.
14:53:43 <iwakeh> it needs thinking.
14:53:50 <karsten> yes.
14:53:54 <iwakeh> of several brains :-)
14:53:59 <karsten> ideally, yes!
14:54:16 <karsten> ah, one question about milestones:
14:54:24 <karsten> #18910.
14:54:36 <karsten> that's what we promised for the MOSS award, right?
14:54:50 <iwakeh> yes
14:54:59 <karsten> would it make sense to include that in 1.0.0, just to lower the pressure of getting out 1.1.0 soon?
14:55:11 <karsten> even if that delays 1.0.0 a bit.
14:55:12 <karsten> ?
14:55:40 <iwakeh> I'd first like to have a new instance running with a scheduler.
14:56:21 <karsten> ok. how about we subdivide the current 1.1.0 into one part with that ticket and the rest?
14:56:27 <karsten> and call the rest 1.2.0?
14:56:35 <iwakeh> sure!
14:56:43 <iwakeh> that's a good idea.
14:56:44 <karsten> just to make it more realistic to get 1.1.0 out soon.
14:56:52 <iwakeh> august.
14:57:02 <karsten> august would be great.
14:57:21 <karsten> should I create a 1.2.0 milestone in trac?
14:57:28 <iwakeh> please do.
14:57:43 <karsten> oh, and should I define dates for 1.0.0 and 1.1.0?
14:58:02 <iwakeh> not yet?
14:58:09 <karsten> ok.
14:58:12 <karsten> 1.2.0 created.
14:58:14 <karsten> without date.
14:58:17 <iwakeh> great.
14:58:20 <iwakeh> regarding
14:58:29 <iwakeh> the sync
14:58:55 <iwakeh> the meta-design needs to be done very soon.
14:59:13 <iwakeh> i.e. @source tags if and for what.
14:59:22 <karsten> ah ok.
14:59:27 <karsten> I thought we gave up on those.
14:59:34 <karsten> but I didn't look for a while.
14:59:39 <karsten> adding to the list.
15:00:07 <iwakeh> there is just a long discussion with no decision reached yet.
15:00:13 <karsten> ok.
15:00:32 <karsten> alright, we just crossed 15:00 UTC!
15:00:40 <iwakeh> we could first assume only benevolent collectors.
15:00:42 <karsten> and I have a looooong list of things.
15:00:46 <iwakeh> ok.
15:00:54 <iwakeh> then, back to work.
15:00:55 <karsten> yes, I think that's a good assumption.
15:00:57 <karsten> haha
15:00:58 <iwakeh> :-)
15:01:19 <iwakeh> more in tickets
15:01:20 <karsten> alright, we could talk more on monday, or next thursday.
15:01:24 <karsten> yes, and in tickets.
15:01:26 <iwakeh> sure.
15:01:44 <karsten> monday?
15:01:59 <iwakeh> 9utc
15:02:04 <karsten> sounds good.
15:02:09 <iwakeh> fine.
15:02:21 <iwakeh> are we done?
15:02:22 <karsten> great! thanks for taking the time.
15:02:26 <karsten> yes! bye. :)
15:02:29 <iwakeh> thanks
15:02:30 <karsten> #endmeeting