13:59:57 #startmeeting metrics team
13:59:57 Meeting started Thu Jul 28 13:59:57 2016 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:59:57 Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:07 it's meeting time. who's here for the metrics team meeting?
14:00:32 * karsten already saw iwakeh
14:00:40 right :-)
14:00:45 * qbi lurks.
14:00:48 hi qbi!
14:01:06 * karsten finds the pad..
14:01:33 https://pad.riseup.net/p/zUNzEIFRq5S4
14:03:16 okay.
14:03:21 * Bridge descriptor sanitizer (karsten)
14:03:22 ok.
14:03:39 I spent the last 20 days (well, it felt like 20) writing tests.
14:03:45 and spotted many bugs.
14:03:49 hihi
14:03:58 good.
14:04:08 one question
14:04:11 I also found out that the batch process that re-processes archives broke.
14:04:16 after 13 of 28 or so days.
14:04:20 out of memory.
14:04:23 sure, what's the question?
14:04:24 oh no.
14:04:43 these tests having a TODO; do they fail already?
14:04:44 I have an idea what the reason could be. I don't have a good fix though.
14:04:51 no, I changed them all to pass.
14:04:57 and to fail once we fix things.
14:05:14 maybe, fix before refactoring?
14:05:25 sure!
14:05:40 but, the batch ...
14:05:48 should I fix them, or would you want to look into that?
14:06:00 that's topic 2
14:06:04 planning.
14:06:08 ok :)
14:06:14 yes, the batch.
14:06:26 so, we're keeping a data structure of all file digests we're processing.
14:06:31 to avoid processing them again.
14:06:38 and that data structure grows and grows.
14:06:46 apparently, after 13 days, it grew too much.
14:06:49 needs bigger hw.
14:07:01 well, maybe.
14:07:03 how much RAM?
14:07:09 8g
14:07:31 hmm, I wonder if that old mac mini can handle more.
14:07:42 ok, that was for which amount of files processed?
14:07:59 600g, I think.
14:08:12 which is ?
14:08:17 half ?
14:08:21 ah!
14:08:26 well, 40% or so.
14:08:37 240g.
14:08:44 600g is the total size.
14:08:47 well, I could offer 32G ram.
14:09:00 but I'd have to download ...
14:09:18 right. and I'd want to keep these archives offline.
14:09:32 yes. well
14:09:41 so, I just restarted the batch where it stopped.
14:09:44 then we need to improve the processing
14:09:53 in theory, it'll break again at 80%, and then it will run through.
14:09:58 won't it reprocess?
14:10:11 nope. I moved the old files away.
14:10:32 why not let it chew on smaller chunks?
14:10:35 improving the processing would also be my favored solution.
14:11:00 well, I could have moved the last 20-30% away, too. true.
14:11:03 such a reprocessing might come up again?
14:11:11 right.
14:11:20 new ticket?
14:11:34 so, my plan was to use an LRU cache instead of keeping all digests.
14:11:47 but that's also just my guess that it's this data structure. I don't know for sure.
14:12:01 I had jvisualvm running, but that broke after 90 hours for some other reason.
14:12:07 new ticket sounds good.
14:12:08 the processed ones could be stored in a simple db, too.
14:12:49 well, switching to a db sounds like a bigger change.
14:13:08 which also crossed my mind: fix all the bugs now, do the reprocessing afterwards.
14:13:19 yes?
14:13:35 you mean the bugs
14:13:44 found in the refactoring part?
14:13:47 and
14:13:48 yep.
14:13:53 ok
14:14:04 I don't think they were ever triggered, because tonga was always nice enough not to give us bad data.
14:14:13 still, would be good to fix them.
14:14:19 yes, if reprocessing can wait a little.
14:14:28 yes, a week or two.
14:14:42 then that should be done.
14:14:59 alright. let me create that ticket for the out-of-memory problem later today.
14:15:06 fine.
14:15:16 ok.
14:15:33 moving to the next topic?
14:15:42 ok
14:15:47 * CollecTor planning (iwakeh)
14:16:11 well, we have milestones(ms) for
14:16:24 the collector (ct) release
14:16:38 I'm wondering when to put out the
14:16:43 first ct release.
14:16:47 I'd like
14:17:03 to have that soon when all the 1.0.0 ms tickets are done.
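Note on the LRU idea raised at 14:11:34: the sketch below shows one way a bounded, least-recently-used set of digests could look in Java, so that memory stays capped during a long batch run. It is only an illustration under assumptions. The class name `LruDigestCache`, the method `markIfNew`, and the capacity value are hypothetical and are not taken from the CollecTor code, and the log itself says it is unconfirmed that this data structure causes the out-of-memory problem.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded set of already-processed file digests (not CollecTor code). */
class LruDigestCache {

  private final Map<String, Boolean> digests;

  LruDigestCache(final int maxEntries) {
    // accessOrder=true keeps entries ordered from least to most recently used.
    this.digests = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
        // Called after each put(); drop the least recently used entry when over capacity.
        return size() > maxEntries;
      }
    };
  }

  /** Returns true if the digest was not cached, i.e. the file still needs processing. */
  boolean markIfNew(String digest) {
    if (this.digests.get(digest) != null) { // get() also refreshes recency
      return false;
    }
    this.digests.put(digest, Boolean.TRUE);
    return true;
  }

  int size() {
    return this.digests.size();
  }

  public static void main(String[] args) {
    LruDigestCache cache = new LruDigestCache(2);
    System.out.println(cache.markIfNew("digest-a"));
    System.out.println(cache.markIfNew("digest-a"));
  }
}
```

The trade-off is that a digest evicted from the cache looks new again if it reappears later, so a file could be processed twice. Whether that is acceptable depends on how far apart duplicates occur in the archives.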
14:17:25 https://trac.torproject.org/projects/tor/query?milestone=CollecTor+1.0.0&group=status&order=priority
14:17:56 #18865 will be ready for review today
14:18:15 and #19169 could rather be moved to ms 110
14:18:29 I'm not sure if we can add #19317 before we add #19755.
14:18:44 still, having #19317 in 1.0.0 seems useful.
14:19:13 move it to 110
14:19:14 ?
14:19:40 add release 101?
14:20:19 so, if we assume that reprocessing bridges will take another few weeks,
14:20:25 do you think 1.1.0 would be out by then?
14:20:54 depends, what we assign to ms 1.0.x
14:20:55 what was your idea for releasing 1.0.0?
14:21:30 good question.
14:22:04 just noticing that
14:22:25 there is a ticket missing for the release process
14:22:37 the signing uploading whatever needs to be done.
14:22:42 right.
14:23:32 before 10th of Aug?
14:23:46 so, #2966 needs more discussion before being included in 1.1.0.
14:23:59 I'd say unassign from that milestone.
14:24:06 ok
14:24:23 and #19317 goes to 1.1.0?
14:24:42 isn't done?
14:24:43 would it make sense to move #19720 back to 1.0.0?
14:24:53 ah, I didn't reload.
14:25:09 not done yet, should I move it?
14:25:38 ok.
14:26:43 we can have a 1.0.x for the fixes.
14:26:56 sure.
14:27:58 removed #2966 from milestone.
14:28:15 so, have priority on the ms 100 tickets?
14:28:31 I think I work on these mostly.
14:29:21 okay, so there are three tickets left?
14:29:39 can I add a fourth? :)
14:29:40 four, if we move the runtime configuration change ticket.
14:29:49 which one?
14:29:50 sure.
14:29:58 you just named it
14:30:22 #19720
14:30:41 ok. should I move it?
14:30:49 done.
14:30:54 ok.
14:31:13 i can also add the ant tasks
14:31:27 for pmd&findbugs this week.
14:31:52 hmm, but we wouldn't fix any of those issues before the 10th, right?
14:31:59 well, s/any/many/
14:32:02 you're right.
14:32:30 my fourth (now fifth or sixth) ticket would be about improving the scheduler a bit.
14:32:38 things like:
14:32:39 how?
14:32:39 undo path changes (everything under out/)
14:32:39 make recent/ truly configurable
14:32:39 start at 00:00.000 of configured minute, not x minutes from current time
14:32:39 add mode to run once immediately
14:32:55 things that came up while testing today.
14:33:26 x minutes from current?
14:33:43 here's what I did:
14:33:46 well, just add tickets for these :-)
14:33:58 I edited collector.properties to contain the next minute, like 35.
14:34:09 then I started the process at, say, 34:15.
14:34:16 and it would start at 35:15.
14:34:23 when it should ideally start at 35:00.
14:34:32 ah, that's interesting.
14:34:38 but yes, I can be even more verbose than those four lines in the ticket. ;)
14:34:42 period was 60 I suppose?
14:34:58 good ;-)
14:35:03 hmm, no, 10.
14:35:23 oh?
14:35:42 but minutes.
14:35:57 yes, that needs clarification in a ticket ...
14:36:00 :)
14:36:11 :-)
14:36:26 are you going to create a ticket for the release?
14:36:36 yes.
14:36:42 I usually follow the instructions for releasing metrics-lib line by line.
14:36:50 ok.
14:37:22 okay, I think that's a good plan for 1.0.0 then.
14:37:29 right.
14:37:34 will you begin the
14:37:39 let's make a plan for 1.0.1 or 1.1.0 after that.
14:37:44 sanitizer bugfixes?
14:37:54 sure.
14:38:19 yes, happy to.
14:38:24 I'd like to make a suggestion for that test class.
14:38:26 should I also fix findbugs/pmd issues?
14:38:31 please do!
14:38:40 hmm
14:39:07 the one-liners, anything else might be a real big change.
14:39:07 I don't have to.
14:39:16 okay.
14:39:22 what's the suggestion?
14:39:39 the things that are really small.
14:39:44 and of course
14:39:55 the potential null dereferences and the like.
14:40:18 these can be done while working on the functional errors.
14:40:32 i.e. the TODOs you identified.
14:41:00 right.
14:41:05 some things need more thoughts.
14:41:10 like removing System.gc(); ...
14:41:21 true, we do not
14:41:27 I mean, in theory I agree that it shouldn't have to be there.
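Note on the scheduler start time discussed at 14:33:58 to 14:34:23: the sketch below shows one way the initial delay could be computed so that a run starts at second 00.000 of the configured minute rather than a fixed number of minutes after process start. It is a sketch under assumptions. The class `AlignedStart`, the method `initialDelayMillis`, and its parameters are hypothetical names, not the CollecTor scheduler API, and the calculation assumes the period divides the hour evenly so that "minute 35" has a stable meaning.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for aligning the first scheduled run (not CollecTor code). */
class AlignedStart {

  /**
   * Returns the milliseconds to wait from nowMillis (epoch millis) until the next
   * instant that lies offsetMinutes into a periodMinutes cycle, at second 00.000.
   * Returns 0 if nowMillis is exactly such an instant.
   */
  static long initialDelayMillis(long nowMillis, int offsetMinutes, int periodMinutes) {
    long periodMillis = TimeUnit.MINUTES.toMillis(periodMinutes);
    long offsetMillis = TimeUnit.MINUTES.toMillis(offsetMinutes) % periodMillis;
    // Time elapsed since the most recent aligned start.
    long sinceLastStart = Math.floorMod(nowMillis - offsetMillis, periodMillis);
    return sinceLastStart == 0 ? 0 : periodMillis - sinceLastStart;
  }

  public static void main(String[] args) {
    // 34 minutes 15 seconds past the hour, offset 35, period 10 minutes.
    long now = TimeUnit.MINUTES.toMillis(34) + TimeUnit.SECONDS.toMillis(15);
    System.out.println(initialDelayMillis(now, 35, 10));
  }
}
```

With the values from the log (configured minute 35, period 10, process started at 34:15), the computed delay is 45 seconds, which lands on 35:00. The result could be passed as the initial delay to `ScheduledExecutorService.scheduleAtFixedRate`, using `System.currentTimeMillis()` for the current time.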
14:41:35 but then it's there because we ran out of memory before.
14:41:46 so maybe we should look what happened and if it still happens.
14:41:52 need to change some of the rules or toss one or the other.
14:41:57 not following findbugs suggestions blindly. ;)
14:42:00 right.
14:42:03 right.
14:42:09 what's the suggestion about the test class?
14:42:31 Configuration needs just an InputStream
14:42:34 to be clear, I'd want to make that class better. the goal is not just to increase coverage.
14:42:37 which can come from a String.
14:42:50 the goal is also to write better test classes for other code bases.
14:42:58 it's about simplifying the test class.
14:43:00 hmm, didn't I fix that?
14:43:21 I didn't have time to look at the class before this meeting.
14:43:33 so maybe.
14:43:36 :-)
14:43:50 hmm, maybe I fixed it a bit but could fix it even more.
14:44:00 so, yes, we should simplify the test class as much as possible.
14:44:16 It'll also make the test a little more readable.
14:44:51 maybe rename runTest to prepareTest?
14:44:53 I'm conflicted how much code to write that's not actually tests.
14:45:07 what do you mean?
14:45:08 just to make tests more readable.
14:45:26 more readable means shorter.
14:45:27 well, right now, the first @Test annotation comes in line 515. :)
14:45:54 maybe, I should look at the new version of the test and talk then?
14:46:03 or write.
14:46:14 and I can see us simplifying things even more, but at the cost of the first @Test being in line 700 or 800.
14:46:24 oh no.
14:46:29 yes, I'd very much appreciate your review here.
14:46:42 ok.
14:46:57 having clean tests seems like a good goal, too.
14:47:05 yes.
14:47:10 especially if we want to re-use concepts for other parts of the code.
14:47:32 but, that's for 1.1.0.
14:47:41 feel free to prioritize 1.0.0 stuff.
14:47:54 yes or 1.2.0?
14:48:00 or that.
14:48:07 by the way, am I behind on any reviews?
14:48:15 there are actually quite some intermodule code duplications, too.
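Note on the test-class suggestion at 14:42:31 to 14:42:37: the sketch below illustrates how a configuration could be fed from a String through an InputStream in a test, with no file on disk. It uses `java.util.Properties` as a stand-in because the actual `Configuration` class is not shown in this log. The class `StringConfigSketch` and the keys `PeriodMinutes` and `OffsetMinutes` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Hypothetical test helper; java.util.Properties stands in for Configuration. */
class StringConfigSketch {

  /** Loads properties from in-memory text instead of a file. */
  static Properties load(String text) {
    // Properties.load(InputStream) expects ISO 8859-1 encoded input.
    byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
    try (InputStream in = new ByteArrayInputStream(bytes)) {
      Properties props = new Properties();
      props.load(in);
      return props;
    } catch (IOException e) {
      // Not expected for an in-memory stream; rethrown unchecked to keep tests short.
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Properties props = load("PeriodMinutes = 10\nOffsetMinutes = 35\n");
    System.out.println(props.getProperty("PeriodMinutes"));
  }
}
```

Keeping the configuration text inline in each test puts the input next to the assertions, which may help with the readability concern raised in the discussion.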
14:48:31 no, all up to date, i think.
14:48:35 ok.
14:48:48 hmmmm
14:48:54 in theory, there are only 2 real modules.
14:49:04 huh?
14:49:07 exit list stuff is tiny, torperf goes away.
14:49:14 ah, ok.
14:49:40 relaydescs and bridgedescs have code in common.
14:49:47 okay, we should look at that.
14:50:05 we could have a shared package,
14:50:10 or we could move things to metrics-lib.
14:50:18 That's why I wanted to add the tasks this week.
14:50:19 depending on how generic the code is.
14:50:43 yes, that's to be seen when refactoring.
14:50:50 did you look at
14:51:12 #19170
14:51:33 comment:7
14:52:43 looked, yes, but I don't know what's the right thing to do there.
14:53:23 you mean, what data to store?
14:53:24 I'll put it on my list.
14:53:29 yes.
14:53:32 fine.
14:53:43 it needs thinking.
14:53:50 yes.
14:53:54 of several brains :-)
14:53:59 ideally, yes!
14:54:16 ah, one question about milestones:
14:54:24 #18910.
14:54:36 that's what we promised for the MOSS award, right?
14:54:50 yes
14:54:59 would it make sense to include that in 1.0.0, just to lower the pressure of getting out 1.1.0 soon?
14:55:11 even if that delays 1.0.0 a bit.
14:55:12 ?
14:55:40 I'd first like to have a new instance running with a scheduler.
14:56:21 ok. how about we subdivide the current 1.1.0 into one part with that ticket and the rest?
14:56:27 and call the rest 1.2.0?
14:56:35 sure!
14:56:43 that's a good idea.
14:56:44 just to make it more realistic to get 1.1.0 out son.
14:56:46 soon*
14:56:52 august.
14:57:02 august would be great.
14:57:21 should I create a 1.2.0 milestone in trac?
14:57:28 please do.
14:57:43 oh, and should I define dates for 1.0.0 and 1.1.0?
14:58:02 not yet?
14:58:09 ok.
14:58:12 1.2.0 created.
14:58:14 without date.
14:58:17 great.
14:58:20 regarding
14:58:29 the sync
14:58:55 the meta-design needs to be done very soon.
14:59:13 i.e. @source tags if and for what.
14:59:22 ah ok.
14:59:27 I thought we gave up on those.
14:59:34 but I didn't look for a while.
14:59:39 adding to the list.
15:00:07 there is just a long discussion with no decision reached yet.
15:00:13 ok.
15:00:32 alright, we just crossed 15:00 UTC!
15:00:40 we could first assume only benevolent collectors.
15:00:42 and I have a looooong list of things.
15:00:46 ok.
15:00:54 then, back to work.
15:00:55 yes, I think that's a good assumption.
15:00:57 haha
15:00:58 :-)
15:01:19 more in tickets
15:01:20 alright, we could talk more on monday, or next thursday.
15:01:24 yes, and in tickets.
15:01:26 sure.
15:01:44 monday?
15:01:59 9utc
15:02:04 sounds good.
15:02:09 fine.
15:02:21 are we done?
15:02:22 great! thanks for taking the time.
15:02:26 yes! bye. :)
15:02:29 thanks
15:02:30 #endmeeting