14:29:33 <karsten> #startmeeting metrics team
14:29:33 <MeetBot> Meeting started Thu Jul  6 14:29:33 2017 UTC.  The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:29:33 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:29:55 <karsten> https://storm.torproject.org/shared/Ou-1QRctynWbF4yedi-MfDsjImFMFSIEP20fbVGCPRa <- agenda pad
14:31:26 <karsten> shall we start?
14:31:32 <iwakeh> yes.
14:31:49 <karsten> https://trac.torproject.org/projects/tor/wiki/org/teams/MetricsTeam#ObjectivesandKeyResultsfortheMetricsTeaminQ22017
14:32:02 <karsten> I did a quick evaluation of our OKRs.
14:32:08 <karsten> we got 0.6125 of 1.0.
14:32:42 <karsten> which, I think, means we overcommitted.
14:32:57 <iwakeh> oh well ...
14:33:14 <karsten> I wonder, should we ask isabela to join us on this conversation?
14:33:22 <karsten> she's out this week, so maybe that would be next week.
14:33:44 <iwakeh> well, if we have the goal to find
14:34:03 <iwakeh> out if okrs are useful to us
14:34:45 <iwakeh> we can do that now.
14:35:05 <karsten> okay.
14:35:36 <karsten> I found it useful to have a rough plan of the things we wanted to do.
14:35:50 <karsten> even though we focused more on the 1.x goals there than the rest.
14:35:54 <karsten> which totally made sense.
14:36:27 <karsten> we could have reached 0.7 or 0.75 quite easily by focusing on specific tickets or sharing service operation.
14:36:46 <karsten> but I don't think that would have helped us much now.
14:37:07 <karsten> on the other hand I'm not sure whether we'd have achieved less without such a plan.
14:37:11 <iwakeh> imho, the 'number magic' and added bookkeeping is an empty exercise.
14:37:22 <karsten> I think I agree on that part. :)
14:37:44 <karsten> it's a cheap one, fortunately.
14:38:29 <karsten> so, something I'm not so sure about is what to do with Q3 and the uncertainty of suddenly getting funding.
14:38:43 <karsten> this makes planning really hard.
14:38:50 <iwakeh> I guess we can do w/o okrs.
14:39:52 <karsten> or we could do okrs without magic numbers.
14:40:13 <iwakeh> well, that's our usual tasklist or roadmap.
14:40:18 <iwakeh> :-)
14:40:38 <karsten> if we make a roadmap just for Q3, yes!
14:40:47 <karsten> should we try that?
14:41:02 <iwakeh> now?
14:41:20 <karsten> maybe we can assign "estimated weeks" to items,
14:41:36 <iwakeh> nay, rather not.
14:41:51 <karsten> and if we learn that we'll get funding for something else, we can replace planned tasks with others.
14:42:35 <iwakeh> I think, the current planning can only stretch out to
14:42:42 <karsten> I think it's useful to do some estimating of effort.
14:42:51 <iwakeh> the next decision about possible funding.
14:43:08 <karsten> another thing we could do is make a plan for july only.
14:43:31 <iwakeh> true, that seems to be the time range we have available.
14:43:45 <iwakeh> before higher priority changes come up.
14:43:55 <iwakeh> be it new proposals or tasks.
14:44:02 <karsten> and even if they come up in mid-june, we can postpone them for two weeks.
14:44:13 <iwakeh> mid-july
14:44:17 <karsten> yes, that. :)
14:44:48 <karsten> so, changing next item to "make plans for current month".
14:45:01 <iwakeh> yes. Currently, I think CollecTor needs some work.
14:45:17 <karsten> the webstats part. what else?
14:45:33 <karsten> release.
14:45:41 <iwakeh> #21759
14:46:14 <karsten> does that need persistence so badly?
14:46:18 <iwakeh> I noticed some 'non-standard' parts in the implementation.
14:46:22 <karsten> persistence as in sync?
14:46:36 <iwakeh> I'd like to have both.
14:46:48 <iwakeh> Getting to a point where all
14:46:49 <karsten> okay, let's look at that.
14:47:06 <iwakeh> modules can easily switch and onionperf is a good start here.
14:47:19 <iwakeh> The others can follow the example.
14:47:32 <iwakeh> later, if more important tasks come up.
14:47:49 <karsten> ok.
14:48:11 <iwakeh> There could be a release earlier.
14:48:33 <iwakeh> with the changes piled up already.
14:48:57 <karsten> 47 open tickets.
14:49:35 <karsten> we should be able to reduce those.
14:49:51 <iwakeh> and some closed ones?
14:50:06 <karsten> hmm?
14:50:11 <iwakeh> https://trac.torproject.org/projects/tor/query?status=closed&group=resolution&milestone=CollecTor+1.2.0
14:50:42 <karsten> ah, you mean for the release.
14:50:43 <iwakeh> https://trac.torproject.org/projects/tor/query?status=needs_information&status=needs_revision&status=merge_ready&status=reopened&status=needs_review&status=assigned&status=new&status=accepted&group=status&milestone=CollecTor+1.2.0
14:50:59 <karsten> how about we put out two releases this month?
14:51:08 <karsten> one quite soon and one towards the end of the month?
14:51:10 <iwakeh> yep.
14:51:19 <iwakeh> sounds fine.
14:51:45 <karsten> ok.
14:52:08 <iwakeh> the first only with done tickets.
14:52:11 <iwakeh> ?
14:52:30 <iwakeh> and, next week we decide what goes into the second?
14:52:32 <karsten> plus urgent ones. like #22833.
14:52:46 <karsten> urgent, because we might soon see those new bridge network statuses.
14:52:55 <iwakeh> true.
14:52:58 <karsten> maybe there are other urgent ones.
14:53:21 <iwakeh> review-tickets are also candidates for the first.
14:53:29 <karsten> maybe #22754 which is quite easy to fix.
14:54:15 <karsten> okay, I think we mean the same thing here.
14:54:32 <iwakeh> yes.
14:54:33 <karsten> let me go through the whole list tomorrow and see if there are others that really need to go in.
14:54:50 <iwakeh> fine
14:54:51 <karsten> how do we test things before release/deployment?
14:55:11 <karsten> can you set up a local collector to try out stuff?
14:55:31 <iwakeh> I run CollecTor locally with
14:55:44 <karsten> or should we deploy a pre-release tarball on the backup instance? (though that will only test relaydescs download.)
14:55:57 <iwakeh> 'extreme' settings and test different runtime options to trigger possible errors.
14:56:12 <karsten> sounds good!
14:56:33 <iwakeh> Just, no long-running tests
14:56:50 <iwakeh> as that is cumbersome with my internet connection.
14:57:03 <karsten> okay.
14:57:24 <karsten> how is wednesday as release date?
14:57:24 <iwakeh> And, as I'm looking at the source
14:57:54 <iwakeh> here quite a lot currently I might find other topics.
14:58:00 <iwakeh> wed. is fine.
14:58:18 <iwakeh> if we have a list of on-review tomorrow?
14:58:54 <karsten> evening, yes.
14:58:58 <iwakeh> that is, a complete list of already merged and on-review tickets.
14:59:13 <iwakeh> yep. that's fine.
14:59:27 <karsten> ok.
14:59:38 <karsten> another item for july:
14:59:44 <karsten> - suggestion: bridge descriptor re-processing
15:00:05 <iwakeh> ah?
15:00:05 <karsten> I'd like to make changes to the sanitizing process,
15:00:13 <karsten> like keeping contact lines in bridge descriptors
15:00:24 <karsten> or adding fingerprint lines into past bridge network statuses.
15:00:28 <iwakeh> ah, ok.
15:00:41 <karsten> it makes a lot of sense to collect all changes and then do the reprocessing in one step.
15:00:49 <iwakeh> sounds good.
15:01:00 <karsten> some of these changes require preparation time. like discussing whether it's a good or bad idea to keep contact lines.
15:01:00 <iwakeh> and modernize the module ;-)
15:01:09 <karsten> sure, why not. :)
15:01:28 <karsten> it would be great to be able to start the reprocessing by the end of july.
15:01:31 <iwakeh> Good to do that while
15:01:38 <karsten> it will take weeks.
15:01:52 <iwakeh> adding webstats and extending onionperf persistence etc.
15:02:16 <iwakeh> but, needs to be done at some point.
15:02:38 <karsten> okay. how about I focus on the list of things that require discussion for now?
15:02:48 <iwakeh> fine.
15:02:59 <karsten> and we both look into code improvements that would also go into the end-of-month release?
15:03:12 <iwakeh> yep.
15:03:46 <iwakeh> That'll be 1.3.0?
15:03:56 <karsten> I guess so.
15:04:29 <iwakeh> changed on the pad.
15:05:40 <iwakeh> I added the milestone CollecTor 1.3.0.
15:05:51 <karsten> okay.
15:06:08 <karsten> looks like a good list for this month.
15:06:25 <iwakeh> right.
15:06:51 <karsten> shall we talk about #22428 ?
15:07:13 <iwakeh> how are 'real' access logs and the log meta data provided to CollecTor?
15:07:42 <karsten> I think all we get is a directory tree.
15:07:56 <iwakeh> direct file system access?
15:08:14 <karsten> rsynced to the local file system.
15:08:16 <karsten> probably.
15:08:26 <karsten> that's how we get bridge descriptors.
15:08:42 <iwakeh> ok.
15:09:13 <karsten> regarding metadata in output files,
15:09:19 <karsten> I don't have a good answer there,
15:09:20 <iwakeh> and meta-data is in the path somewhere?
15:09:45 * iwakeh still discussing input to CollecTor
15:09:45 <karsten> but we should try out processing our extended logs using typical tools before putting something in.
15:09:55 <karsten> for input, metadata is in the path.
15:10:10 <iwakeh> as on webstats.tp.o?
15:10:18 <karsten> I assume that https://webstats.torproject.org/out/meronense.torproject.org/metrics.torproject.org-access.log-20160829.xz
15:10:26 <sepr> who do I talk to regarding some talks on SHA2017? :)
15:10:42 <karsten> was generated from an input file in/meronense.torproject.org/metrics.torproject.org-access.log-20160829.xz
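The "metadata is in the path" idea can be sketched as follows. This is a minimal illustration only, assuming the layout <physical host>/<virtual host>-access.log-<YYYYMMDD>.xz suggested by the example path above. The names parse_log_path and LogMeta are hypothetical and not part of CollecTor, and whether the input tree really follows this layout is still the open question in the discussion.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# Assumed layout (taken from the example path in the discussion):
#   <physical host>/<virtual host>-access.log-<YYYYMMDD>.xz
PATH_RE = re.compile(
    r"(?:^|/)(?P<physical>[^/]+)/(?P<virtual>[^/]+)"
    r"-access\.log-(?P<day>\d{8})\.xz$"
)


class LogMeta(NamedTuple):
    """Metadata recovered from a log file path (hypothetical type)."""
    physical_host: str
    virtual_host: str
    day: date


def parse_log_path(path: str) -> Optional[LogMeta]:
    """Return the metadata encoded in the path, or None if it does not fit."""
    match = PATH_RE.search(path)
    if match is None:
        return None
    try:
        day = datetime.strptime(match.group("day"), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a valid calendar date.
        return None
    return LogMeta(match.group("physical"), match.group("virtual"), day)
```

Only the last two path components are inspected, so the same function would accept both the in/ path and the published out/ URL.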
15:11:02 <karsten> sepr: no clue. maybe ask in #tor-project?
15:11:21 <karsten> sepr: or come back in ~10 minutes when this meeting is over and others might notice your question better.
15:11:34 <karsten> iwakeh: but we can ask Sebastian.
15:12:10 <iwakeh> He's not the one providing the files to CollecTor, or will he?
15:12:14 <karsten> by the way, I had this crazy idea of deploying a CollecTor instance that only sanitizes web logs on the current host that sanitizes web logs.
15:12:24 <karsten> and have our primary collector sync files from there.
15:12:45 <iwakeh> well, I have sync-webstats on the list.
15:12:51 <karsten> rather than syncing more original files to colchicifolium.
15:13:11 <iwakeh> yes, that all should be possible.
15:13:17 <karsten> I think weasel would be the one providing files.
15:13:24 <iwakeh> and makes sense to not spread the original files.
15:13:26 <karsten> but Sebastian knows how they are currently provided.
15:13:48 <iwakeh> well, currently that fits into the current way of processing.
15:14:09 <iwakeh> I'd rather have it that the admins like it and collector has easy access.
15:14:22 <iwakeh> :-)
15:14:36 <karsten> I'm always in favor of writing tools that the admins like. :)
15:14:49 <karsten> okay, but this means that sync needs to work, right?
15:14:51 <iwakeh> yes, that's usually best :-)
15:15:05 <karsten> any conceptual issues to solve there first?
15:15:05 <iwakeh> sure, I intend to add that.
15:15:17 <karsten> okay.
15:15:25 <iwakeh> Just need the answer to the out and recent paths question.
15:15:40 <iwakeh> and the meta-data
15:15:55 <iwakeh> my last comment
15:15:59 <iwakeh> on that ticket:
15:16:00 <iwakeh> Shouldn't applications processing Apache access logs ignore 'funny' lines?
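What "ignoring 'funny' lines" could look like in a consuming application, as a rough sketch: it assumes lines start with the fields of the Apache Common Log Format and silently skips anything that does not match. The exact format of the sanitized logs is not stated in this discussion, and the name parse_access_log is hypothetical.

```python
import re
from typing import Dict, Iterable, Iterator

# Leading fields of the Apache Common Log Format:
#   host ident user [time] "request" status size
# Trailing fields (e.g. referer, user agent) are tolerated but not parsed.
LINE_RE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_access_log(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per well-formed line; skip lines that do not match."""
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            # A 'funny' line, e.g. embedded metadata: ignore it.
            continue
        yield match.groupdict()
```

A tool written this way would keep working if metadata lines were added to output files, which is why trying the extended logs against typical tools first seems worthwhile.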
15:16:34 <Sebastian> hi. I'm leaving to play soccer. Maybe there's a concrete question to answer or you can write an email?
15:16:43 <karsten> hi Sebastian!
15:16:58 <karsten> quick question: what's the file structure of input files on the webstats host?
15:17:06 <karsten> 15:10:49 <+karsten> I assume that
15:17:06 <karsten> https://webstats.torproject.org/out/meronense.torproject.org/metrics.torproject.org-access.log-20160829.xz
15:17:09 <karsten> 15:11:13 <+karsten> was generated from an input file
15:17:09 <karsten> in/meronense.torproject.org/metrics.torproject.org-access.log-20160829.xz
15:17:23 <karsten> is that assumption correct?
15:17:31 <karsten> (and what else do you want to know, iwakeh?)
15:18:22 <sepr> okay :-)
15:18:29 <iwakeh> If that's the standard way of providing these files?
15:19:03 <iwakeh> The original files before cleaning.
15:19:32 <karsten> should we move this to email?
15:19:36 * karsten needs to leave in ~5.
15:19:43 <iwakeh> all fine, or the ticket?
15:19:54 <iwakeh> then, sure.
15:20:08 <karsten> we should cc Sebastian there.
15:20:30 <iwakeh> and weasel?
15:20:52 <karsten> email, yes, but don't cc the weasel on a ticket if you want to stay friends. ;)
15:21:10 <karsten> (there's no way to stop receiving emails for that ticket afterwards.)
15:21:11 <iwakeh> But, for the moment I rely on this type of path.
15:21:22 <iwakeh> oh, I thought you referred to mail.
15:21:29 <karsten> sure, mail works.
15:21:32 <iwakeh> well, un-cc.
15:21:38 <karsten> nope. unpossible.
15:21:45 <iwakeh> oh, never tried.
15:21:57 <karsten> okay, I'll respond to the other parts on the ticket.
15:22:07 <iwakeh> all fine.
15:22:24 <karsten> great! so, email, trac, and talk more next week?
15:22:31 <iwakeh> ttynw
15:22:39 <iwakeh> yes :-)
15:22:42 <karsten> heh. great! thanks, and bye!
15:22:51 <karsten> #endmeeting