14:29:33 #startmeeting metrics team
14:29:33 Meeting started Thu Jul 6 14:29:33 2017 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:29:33 Useful Commands: #action #agreed #help #info #idea #link #topic.
14:29:55 https://storm.torproject.org/shared/Ou-1QRctynWbF4yedi-MfDsjImFMFSIEP20fbVGCPRa <- agenda pad
14:31:26 shall we start?
14:31:32 yes.
14:31:49 https://trac.torproject.org/projects/tor/wiki/org/teams/MetricsTeam#ObjectivesandKeyResultsfortheMetricsTeaminQ22017
14:32:02 I did a quick evaluation of our OKRs.
14:32:08 we got 0.6125 of 1.0.
14:32:42 which, I think, means we overcommitted.
14:32:57 oh well ...
14:33:14 I wonder, should we ask isabela to join us on this conversation?
14:33:22 she's out this week, so maybe that would be next week.
14:33:44 well, if we have the goal to find
14:34:03 out if okrs are useful to us
14:34:45 we can do that now.
14:35:05 okay.
14:35:36 I found it useful to have a rough plan of the things we wanted to do.
14:35:50 even though we focused more on the 1.x goals there than the rest.
14:35:54 which totally made sense.
14:36:27 we could have reached 0.7 or 0.75 quite easily by focusing on specific tickets or sharing service operation.
14:36:46 but I don't think that would have helped us much now.
14:37:07 on the other hand I'm not sure whether we'd have achieved less without such a plan.
14:37:11 imho, the 'number magic' and added bookkeeping is an empty exercise.
14:37:22 I think I agree on that part. :)
14:37:44 it's a cheap one, fortunately.
14:38:29 so, something I'm not so sure about is what to do with Q3 and the uncertainty of suddenly getting funding.
14:38:43 this makes planning really hard.
14:38:50 I guess we can do w/o okrs.
14:39:52 or we could do okrs without magic numbers.
14:40:13 well, that's our usual tasklist or roadmap.
14:40:18 :-)
14:40:38 if we make a roadmap just for Q3, yes!
14:40:47 should we try that?
14:41:02 now?
14:41:20 maybe we can assign "estimated weeks" to items,
14:41:36 nay, rather not.
14:41:51 and if we learn that we'll get funding for something else, we can replace planned tasks with others.
14:42:35 I think, the current planning can only stretch out to
14:42:42 I think it's useful to do some estimating of effort.
14:42:51 the next decision about possible funding.
14:43:08 another thing we could do is make a plan for july only.
14:43:31 true, that seems to be the time range we have available.
14:43:45 before higher priority changes come up.
14:43:55 be it new proposals or tasks.
14:44:02 and even if they come up in mid-june, we can postpone them for two weeks.
14:44:13 mid-july
14:44:17 yes, that. :)
14:44:48 so, changing next item to "make plans for current month".
14:45:01 yes. Currently, I think CollecTor needs some work.
14:45:17 the webstats part. what else?
14:45:33 release.
14:45:41 #21759
14:46:14 does that need persistence so badly?
14:46:18 I noticed some 'non-standard' parts in the implementation.
14:46:22 persistence as in sync?
14:46:36 I'd like to have both.
14:46:48 Getting to a point where all
14:46:49 okay, let's look at that.
14:47:06 modules can easily switch and onionperf is a good start here.
14:47:19 The others can follow the example.
14:47:32 later, if more important tasks come up.
14:47:49 ok.
14:48:11 There could be a release earlier.
14:48:33 with the changes piled up already.
14:48:57 47 open tickets.
14:49:35 we should be able to reduce those.
14:49:51 and some closed ones?
14:50:06 hmm?
14:50:11 https://trac.torproject.org/projects/tor/query?status=closed&group=resolution&milestone=CollecTor+1.2.0
14:50:42 ah, you mean for the release.
14:50:43 https://trac.torproject.org/projects/tor/query?status=needs_information&status=needs_revision&status=merge_ready&status=reopened&status=needs_review&status=assigned&status=new&status=accepted&group=status&milestone=CollecTor+1.2.0
14:50:59 how about we put out two releases this month?
14:51:08 one quite soon and one towards the end of the month?
14:51:10 yep.
14:51:19 sounds fine.
14:51:45 ok.
14:52:08 the first only with done tickets.
14:52:11 ?
14:52:30 and, next week we decide what goes into the second?
14:52:32 plus urgent ones. like #22833.
14:52:46 urgent, because we might soon see those new bridge network statuses.
14:52:55 true.
14:52:58 maybe there are other urgent ones.
14:53:21 review-tickets are also candidates for the first.
14:53:29 maybe #22754 which is quite easy to fix.
14:54:15 okay, I think we mean the same thing here.
14:54:32 yes.
14:54:33 let me go through the whole list tomorrow and see if there are others that really need to go in.
14:54:50 fine
14:54:51 how do we test things before release/deployment?
14:55:11 can you set up a local collector to try out stuff?
14:55:31 I run CollecTor locally with
14:55:44 or should we deploy a pre-release tarball on the backup instance? (though that will only test relaydescs download.)
14:55:57 'extreme' settings and test different runtime options to trigger possible errors.
14:56:12 sounds good!
14:56:33 Just, no long-running tests
14:56:50 as that is cumbersome with my internet connection.
14:57:03 okay.
14:57:24 how is wednesday as release date?
14:57:24 And, as I'm looking at the source
14:57:54 here quite a lot currently I might find other topics.
14:58:00 wed. is fine.
14:58:18 if we have a list of on-review tomorrow?
14:58:54 evening, yes.
14:58:58 that is, a complete list of already merged and on-review tickets.
14:59:13 yep. that's fine.
14:59:27 ok.
14:59:38 another item for july:
14:59:44 - suggestion: bridge descriptor re-processing
15:00:05 ah?
15:00:05 I'd like to make changes to the sanitizing process,
15:00:13 like keeping contact lines in bridge descriptors
15:00:24 or adding fingerprint lines into past bridge network statuses.
15:00:28 ah, ok.
15:00:41 it makes a lot of sense to collect all changes and then do the reprocessing in one step.
15:00:49 sounds good.
15:01:00 some of these changes require preparation time. like discussing whether it's a good or bad idea to keep contact lines.
15:01:00 and modernize the module ;-)
15:01:09 sure, why not. :)
15:01:28 it would be great to be able to start the reprocessing by the end of july.
15:01:31 Good to do that while
15:01:38 it will take weeks.
15:01:52 adding webstats and extending onionperf persistence etc.
15:02:16 but, needs to be done at some point.
15:02:38 okay. how about I focus on the list of things that require discussion for now?
15:02:48 fine.
15:02:59 and we both look into code improvements that would also go into the end-of-month release?
15:03:12 yep.
15:03:46 That'll be 1.3.0?
15:03:56 I guess so.
15:04:29 changed on the pad.
15:05:40 I added the milestone CollecTor 1.3.0.
15:05:51 okay.
15:06:08 looks like a good list for this month.
15:06:25 right.
15:06:51 shall we talk about #22428 ?
15:07:13 how are 'real' access logs and the log meta data provided to CollecTor?
15:07:42 I think all we get is a directory tree.
15:07:56 direct file system access?
15:08:14 rsynced to the local file system.
15:08:16 probably.
15:08:26 that's how we get bridge descriptors.
15:08:42 ok.
15:09:13 regarding metadata in output files,
15:09:19 I don't have a good answer there,
15:09:20 and meta-data is in the path somewhere?
15:09:45 * iwakeh still discussing input to CollecTor
15:09:45 but we should try out processing our extended logs using typical tools before putting something in.
15:09:55 for input, metadata is in the path.
15:10:10 as on webstats.tp.o?
15:10:18 I assume that https://webstats.torproject.org/out/meronense.torproject.org/metrics.torproject.org-access.log-20160829.xz
15:10:26 who do I talk to regarding some talks on SHA2017? :)
15:10:42 was generated from an input file in/meronense.torproject.org/metrics.torproject.org-access.log-20160829.xz
15:11:02 sepr: no clue. maybe ask in #tor-project?
15:11:21 sepr: or come back in ~10 minutes when this meeting is over and others might notice your question better.
15:11:34 iwakeh: but we can ask Sebastian.
15:12:10 He's not the one providing the files to CollecTor, or will he?
15:12:14 by the way, I had this crazy idea of deploying a CollecTor instance that only sanitizes web logs on the current host that sanitizes web logs.
15:12:24 and have our primary collector sync files from there.
15:12:45 well, I have sync-webstats on the list.
15:12:51 rather than syncing more original files to colchicifolium.
15:13:11 yes, that all should be possible.
15:13:17 I think weasel would be the one providing files.
15:13:24 and makes sense to not spread the original files.
15:13:26 but Sebastian knows how they are currently provided.
15:13:48 well, currently that fits into the current way of processing.
15:14:09 I'd rather have it that the admins like it and collector has easy access.
15:14:22 :-)
15:14:36 I'm always in favor of writing tools that the admins like. :)
15:14:49 okay, but this means that sync needs to work, right?
15:14:51 yes, that's usually best :-)
15:15:05 any conceptual issues to solve there first?
15:15:05 sure, I intend to add that.
15:15:17 okay.
15:15:25 Just need the answer to the out and recent paths question.
15:15:40 and the meta-data
15:15:55 my last comment
15:15:59 on that ticket:
15:16:00 Shouldn't applications processing Apache access logs ignore 'funny' lines?
15:16:34 hi. I'm leaving to play soccer. Maybe there's a concrete question to answer or you can write an email?
15:16:43 hi Sebastian!
15:16:58 quick question: what's the file structure of input files on the webstats host?
15:17:06 15:10:49 <+karsten> I assume that
15:17:06 https://webstats.torproject.org/out/meronense.torproject.org/metrics.torproject.org-access.log-20160829.xz
15:17:09 15:11:13 <+karsten> was generated from an input file
15:17:09 in/meronense.torproject.org/metrics.torproject.org-access.log-20160829.xz
15:17:23 is that assumption correct?
15:17:31 (and what else do you want to know, iwakeh?)
15:18:22 okay :-)
15:18:29 If that's the standard way of providing these files?
15:19:03 The original files before cleaning.
15:19:32 should we move this to email?
15:19:36 * karsten needs to leave in ~5.
15:19:43 all fine, or the ticket?
15:19:54 then, sure.
15:20:08 we should cc Sebastian there.
15:20:30 and weasel?
15:20:52 email, yes, but don't cc the weasel on a ticket if you want to stay friends. ;)
15:21:10 (there's no way to stop receiving emails for that ticket afterwards.)
15:21:11 But, for the moment I rely on this type of path.
15:21:22 oh, I thought you referred to mail.
15:21:29 sure, mail works.
15:21:32 well, un-cc.
15:21:38 nope. unpossible.
15:21:45 oh, never tried.
15:21:57 okay, I'll respond to the other parts on the ticket.
15:22:07 all fine.
15:22:24 great! so, email, trac, and talk more next week?
15:22:31 ttynw
15:22:39 yes :-)
15:22:42 heh. great! thanks, and bye!
15:22:51 #endmeeting
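
Note on the path question discussed for #22428: the meeting left open whether the metadata of a web log really sits in its path. The sketch below only illustrates the assumption voiced above, namely that a file under out/ was generated from a file with the same relative path under in/, and that the path has the shape <physical host>/<virtual host>-access.log-<YYYYMMDD>.xz. The names LogFileMeta, parse_log_path and assumed_input_path are hypothetical and not part of CollecTor; the layout itself was still unconfirmed at the end of the meeting.

```python
import re
from datetime import date, datetime
from typing import NamedTuple


class LogFileMeta(NamedTuple):
    """Metadata assumed to be encoded in a webstats log path (hypothetical type)."""
    physical_host: str
    virtual_host: str
    log_date: date


# Assumed layout: <physical host>/<virtual host>-access.log-<YYYYMMDD>.xz
_LOG_PATH = re.compile(
    r"(?:^|/)(?P<physical>[^/]+)/(?P<virtual>[^/]+)-access\.log-(?P<date>\d{8})\.xz$"
)


def parse_log_path(path: str) -> LogFileMeta:
    """Extract host names and log date from a path or URL of the assumed shape."""
    match = _LOG_PATH.search(path)
    if match is None:
        raise ValueError(f"path does not follow the assumed layout: {path!r}")
    log_date = datetime.strptime(match["date"], "%Y%m%d").date()
    return LogFileMeta(match["physical"], match["virtual"], log_date)


def assumed_input_path(out_path: str) -> str:
    """Map an out/ path to the in/ path it is assumed to come from."""
    meta = parse_log_path(out_path)
    return (
        f"in/{meta.physical_host}/"
        f"{meta.virtual_host}-access.log-{meta.log_date:%Y%m%d}.xz"
    )


if __name__ == "__main__":
    url = (
        "https://webstats.torproject.org/out/meronense.torproject.org/"
        "metrics.torproject.org-access.log-20160829.xz"
    )
    print(parse_log_path(url))
    print(assumed_input_path(url))
```

Rejecting paths that do not match, rather than guessing, keeps the sketch honest while the real input layout is still being clarified by email and on the ticket.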