14:00:22 <karsten> #startmeeting metrics team
14:00:22 <MeetBot> Meeting started Thu Sep 15 14:00:22 2016 UTC.  The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:22 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:26 <karsten> hi iwakeh!
14:00:35 <karsten> https://pad.riseup.net/p/3M7VyrTVgjlF
14:03:20 <karsten> iwakeh: let me know when you're done with the agenda pad.
14:03:33 <iwakeh> I think this is all now.
14:03:42 <karsten> okay, cool, let's start.
14:03:45 <karsten> * Shiny prototype (karsten)
14:03:52 <karsten> https://tor-metrics.shinyapps.io/webstats/
14:03:53 <iwakeh> looks really cool.
14:04:02 <karsten> and it was really easy to write.
14:04:09 <karsten> there are some open questions though.
14:04:21 <iwakeh> how easy? just R?
14:04:32 <karsten> right now, it's hosted by a third party, though we could host such a server ourselves.
14:04:40 <karsten> but this seemed fine for the prototype.
14:04:42 <karsten> just R.
14:04:46 <iwakeh> is it available as source?
14:04:52 <iwakeh> the shiny server?
14:05:00 <karsten> ah, yes.
14:05:32 <iwakeh> questions are?
14:05:41 <karsten> https://www.rstudio.com/products/shiny/shiny-server/
14:05:59 <karsten> I think the main question is requirements to clients.
14:06:10 <iwakeh> javascript
14:06:18 <karsten> this runs in Tor Browser, but I think only on medium-something level.
14:06:45 <iwakeh> well, I had to enable scripts.
14:06:50 <karsten> yes, javascript. so, that's the main question.
14:06:57 <karsten> not one we can answer today.
14:07:16 <karsten> but one we should answer before moving forward.
14:07:19 <iwakeh> we could take a look at the source
14:07:35 <karsten> to see whether we can work around that?
14:07:43 <iwakeh> and decide then. They might use familiar tools in the background.
14:07:45 <iwakeh> yes.
14:08:04 <iwakeh> or to get an idea how to provide the service.
14:08:14 <iwakeh> with other tools/servers.
14:08:37 <karsten> wow, ok.
14:08:42 <iwakeh> shall I add this to my list?
14:08:43 <karsten> that would be quite the project though.
14:08:55 <iwakeh> first just see what is used.
14:09:00 <karsten> feel free to do that, just don't put it right on number 1. ;)
14:09:12 <iwakeh> two and a half.
14:09:16 <iwakeh> ;-)
14:09:17 <karsten> heh
14:09:39 <karsten> okay, I mainly put it out there as a way to look at this particular data set and to make some first experiences with the tool.
14:10:02 <karsten> other questions: obtaining data.
14:10:09 <iwakeh> ok?
14:10:26 <karsten> this application comes with its own data exported from a database on my server, shipped with the application bundle.
14:10:43 <karsten> what we'd like to do is fetch data from a server somewhere. but don't fetch it every single time.
14:10:50 <karsten> or we might even want to use a database for this.
14:10:59 <karsten> which might be possible if we use our own shiny server.
14:11:04 <karsten> but I didn't look yet. prototype.
14:11:16 <iwakeh> I keep these
14:11:27 <iwakeh> questions in mind when looking at shiny.
14:11:44 <karsten> cool! otherwise, I think it's powerful enough for the things we want to do.
14:11:55 <iwakeh> yes, looks neat.
14:12:04 <karsten> and it would be really cool not to have to develop all that stuff.
14:12:10 <karsten> and instead focus on the R code.
14:12:19 <karsten> let me quickly upload that to give you an idea:
14:12:37 <iwakeh> cool.
14:13:02 <karsten> http://paste.debian.net/823844/
14:13:06 <karsten> two files.
14:13:20 <karsten> well, plus about.html, but that only contains the description part.
14:14:01 <iwakeh> less than 70 lines! nice.
14:14:15 <karsten> yep!
14:14:42 <karsten> okay, so much about shiny.
14:14:50 <karsten> moving on?
14:14:55 <iwakeh> sure.
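
The data question raised above (fetch data from a server somewhere, but not every single time) can be sketched as a small time-based cache. This is only an illustration in Python, not part of the Shiny prototype: the class name, the one-hour lifetime, and the fake download function are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


class CachedFetcher:
    """Return cached data until it is older than max_age_seconds."""

    def __init__(self, fetch: Callable[[], str], max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch              # function that really gets the data
        self._max_age = max_age_seconds  # how long a copy stays fresh
        self._clock = clock              # injectable clock, eases testing
        self._data: Optional[str] = None
        self._fetched_at: Optional[float] = None

    def get(self) -> str:
        now = self._clock()
        stale = (self._fetched_at is None
                 or now - self._fetched_at >= self._max_age)
        if stale:
            # Only contact the server when there is no fresh copy.
            self._data = self._fetch()
            self._fetched_at = now
        return self._data


# Hypothetical stand-in for an HTTP download; it records each call.
calls = []


def fake_download() -> str:
    calls.append(1)
    return "date,requests\n2016-09-14,123\n"


fake_time = [0.0]
fetcher = CachedFetcher(fake_download, max_age_seconds=3600,
                        clock=lambda: fake_time[0])
first = fetcher.get()    # no copy yet, downloads
second = fetcher.get()   # still fresh, served from the cache
fake_time[0] = 3600.0    # pretend an hour has passed
third = fetcher.get()    # copy expired, downloads again
```

Passing the clock in as a parameter keeps the sketch testable without waiting; a real deployment would swap fake_download for an actual request or a database query.
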
14:15:00 <karsten> * Bridge descriptors (karsten)
14:15:24 <karsten> bridge descriptors until 2016-05 will be ready later today or tomorrow at the latest, depending on how fast this `xz -9e` finishes the rest.
14:15:33 <karsten> how do we proceed?
14:15:50 <iwakeh> with a parallel instance?
14:16:08 <karsten> yes, with just the minimal patch to sanitize tcp ports?
14:16:33 <iwakeh> did you try that?
14:17:03 <karsten> well, I ran a version with that patch plus a few more to not run out of memory.
14:17:34 <iwakeh> so, a hotfix release for all of these?
14:18:05 <karsten> hmmmmm
14:18:10 <karsten> yes, we could do that.
14:18:22 <iwakeh> branch?
14:18:37 <karsten> oh, wait, those other patches are not ready to be merged yet.
14:18:44 <karsten> just the sanitize tcp ports patch is.
14:18:49 <iwakeh> ah, ok
14:19:04 <karsten> but we only need those other patches to batch-process months and years of tarballs.
14:19:14 <karsten> okay, let me prepare a branch for you to review.
14:19:21 <iwakeh> fine.
14:19:47 <karsten> then we release that, I set up a new instance and resume sanitizing descriptors from 2016-06 till today.
14:20:06 <iwakeh> sounds like a good plan.
14:20:08 <karsten> it might even be done before seattle.
14:20:13 <iwakeh> :-)
14:20:32 <karsten> here's something else: would you be able to verify the new tarballs and see if they contain the right descriptors?
14:20:37 <karsten> well, some samples.
14:20:44 <iwakeh> sure.
14:21:03 <karsten> cool. how about I encrypt and upload a few months of descriptors?
14:21:05 <iwakeh> You refer to the date issue?
14:21:11 <karsten> date issue?
14:21:35 <iwakeh> descriptors from a different month in tarball xy.
14:21:38 <karsten> with descriptors being sorted into the wrong month tarball? no, that was only an issue with relay descriptors.
14:21:55 <karsten> I'm just thinking of things I overlooked.
14:22:04 <karsten> things I left in, or something.
14:22:24 <iwakeh> so, basically I should look at the tar and find oddities?
14:22:31 <karsten> yes! :)
14:22:49 <karsten> keeping in mind that we used different secrets for sanitizing IP addresses than last time.
14:22:58 <karsten> so, the 10.x.y.z addresses are all different.
14:23:28 <iwakeh> I think, I didn't review the tars before.
14:23:41 <iwakeh> Or did I?
14:23:42 <karsten> that's perfect. no assumptions. :)
14:23:47 <karsten> not sure, maybe not.
14:24:00 <karsten> alright, let me upload something and you take a look.
14:24:08 <karsten> and I prepare the branch for the hotfix release.
14:24:09 <iwakeh> fine.
14:24:16 <karsten> great! moving on?
14:24:21 <iwakeh> yes.
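
The remark above that a different secret makes all 10.x.y.z addresses differ can be illustrated with a keyed hash. This is a minimal sketch under stated assumptions, not CollecTor's actual sanitizing procedure: the choice of SHA-256, the inputs to the hash, and the secrets are hypothetical.

```python
import hashlib
import ipaddress


def sanitize_ipv4(address: str, secret: bytes) -> str:
    """Map an IPv4 address into 10.0.0.0/8 using a secret-keyed hash.

    Assumption for this sketch: hash the 4 raw address bytes followed by
    the secret and use the first 3 digest bytes as x, y and z.
    """
    packed = ipaddress.IPv4Address(address).packed  # 4 raw bytes
    digest = hashlib.sha256(packed + secret).digest()
    x, y, z = digest[0], digest[1], digest[2]
    return f"10.{x}.{y}.{z}"


# Same input address, two hypothetical secrets.
old = sanitize_ipv4("198.51.100.7", b"hypothetical-secret-old")
new = sanitize_ipv4("198.51.100.7", b"hypothetical-secret-new")
again = sanitize_ipv4("198.51.100.7", b"hypothetical-secret-new")
```

With one secret the mapping is stable (new equals again), while changing the secret gives an unrelated address. A reviewer comparing old and new tarballs should therefore expect the sanitized addresses not to match.
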
14:24:24 <karsten> * wiki changes according to planning and discussions in Berlin (iwakeh)
14:24:40 <iwakeh> I updated a few pages; links follow:
14:24:57 <iwakeh> https://trac.torproject.org/projects/tor/wiki/org/teams/MetricsTeam#ReleasesandMilestones
14:25:20 <iwakeh> https://trac.torproject.org/projects/tor/wiki/org/teams/MetricsTeam/Documentation
14:26:16 <iwakeh> some formatting in faq and update to the road-map questions.
14:26:47 <iwakeh> and a draft for the new volunteers page discussed in Berlin
14:26:50 <iwakeh> https://trac.torproject.org/projects/tor/wiki/org/teams/MetricsTeam/Volunteers
14:26:55 <karsten> yes, thanks for that!
14:27:26 <iwakeh> idea: tag easy tasks with open timeframe as 'metrics-help'
14:27:46 <iwakeh> so they can be listed on the volunteering page.
14:29:07 <karsten> I think I could do something like that after seattle.
14:29:31 <karsten> but this and next week (and during the seattle week) I'll be mostly distracted by other stuff.
14:29:42 <iwakeh> And in Seattle maybe contribute this to the volunteer discussion?
14:29:57 <iwakeh> The ideas behind the page.
14:30:02 <karsten> yes, certainly!
14:30:41 <iwakeh> I'll link the volunteer page from the other documentation.
14:31:04 <karsten> sounds good!
14:31:26 <iwakeh> the other changes were just updates; so next topic?
14:31:43 <karsten> okay.
14:31:44 <karsten> * logging for operators concerns collector, onionoo, ... (iwakeh)
14:32:11 <iwakeh> I should have opened a new issue, but as the log-mailing was the reason for me to think about this I changed the
14:32:23 <iwakeh> description of #20128
14:32:59 <karsten> hmm? when did you change it?
14:33:00 <iwakeh> operators will have to think about choices like
14:33:00 <iwakeh> logging framework implementation
14:33:00 <iwakeh> log-level settings
14:33:00 <iwakeh> logging environment, e.g. path settings etc.
14:33:27 <iwakeh> i.e. no more commits, because of the log level etc.
14:33:40 <iwakeh> as in #20079
14:33:58 <karsten> well, we'll still have to give them reasonable defaults.
14:34:31 <karsten> which can simply be our choices.
14:34:39 <iwakeh> reasonable pointers. Of course, no default trace setting.
14:34:46 <karsten> yes. :)
14:34:51 <iwakeh> What I want to get to is
14:35:13 <iwakeh> to separate development and operation (even of the main instances).
14:35:27 <iwakeh> I do not use the default log for example
14:35:32 <iwakeh> that's why
14:35:47 <iwakeh> my mirror wasn't affected by the trace setting.
14:36:15 <iwakeh> operators could even prefer not to use logback, but some other slf4j implementation.
14:36:22 <karsten> and still, for development, we should make reasonable choices when and how often to use which level, for example.
14:36:35 <karsten> and be consistent between modules and products.
14:36:36 <iwakeh> yes, that's true.
14:36:46 <karsten> it was bad that the torperf module logged everything on trace.
14:36:53 <karsten> err, maybe it even still does.
14:36:56 <karsten> it is*.
14:37:06 <iwakeh> i think so.
14:37:19 <karsten> I'd be happy to change those things if we have reasonable guidelines.
14:37:26 <iwakeh> but, the ticket basically wants
14:37:46 <iwakeh> to reduce the comfort a little to enforce thinking.
14:38:00 <iwakeh> on the operator's part.
14:38:12 <karsten> hehe
14:38:17 <iwakeh> to avoid blind usage of new log-settings.
14:38:20 <iwakeh> :-)
14:38:32 <iwakeh> and I
14:38:44 <iwakeh> assume good operators with a decent
14:39:09 <iwakeh> knowledge of linux/other operating system and java would prefer that.
14:39:22 <karsten> I'm all for leaving choices.
14:39:35 <iwakeh> forcing choices.
14:39:58 <iwakeh> An operator will have to choose actively
14:40:11 <iwakeh> what logging implementation and setting to use.
14:40:22 <iwakeh> and, they can be sure it won't
14:40:27 <karsten> well, what if they don't want to make choices about logging? we could just use "no logging" as default choice.
14:40:31 <iwakeh> change on a simple jar update.
14:40:40 <karsten> or "stdout logging".
14:40:59 <iwakeh> no logging is default if no implementation is supplied.
14:41:09 <iwakeh> It's slf4j setting.
14:41:25 <karsten> yes. I noticed that it puts out a warning and stays silent.
14:41:42 <iwakeh> right. and we have the documentation
14:41:43 <karsten> and we could even provide a jar to shut off that warning.
14:42:00 <iwakeh> for our choice which is logback, at the moment.
14:42:03 <karsten> I don't know whether that's the right choice.
14:42:06 <karsten> default choice*.
14:42:31 <iwakeh> I would provide the jars for logback as extra lib in the release.
14:42:48 <iwakeh> The logging framework configuration should be decoupled from CollecTor, i.e.
14:42:48 <iwakeh> remove default logback.xml from collector-<version>.jar
14:42:48 <iwakeh> add an example of logback.xml to src/main/resources
14:42:48 <iwakeh> provide the two logback-{classic,core}.jars with a release, but remove them from collector-<version>.jar
14:42:48 <iwakeh> add more logging info to the operating guide
14:43:37 <karsten> ah, you replaced the whole description?
14:43:43 <iwakeh> That also for the first Onionoo release.
14:43:54 <karsten> it doesn't say so in the diff.
14:43:57 <iwakeh> yes, I did. should have been new ticket.
14:44:06 <iwakeh> now I know.
14:45:23 <karsten> okay. I'm not sure what's the right thing to do here. I should read the ticket in more detail and think about it.
14:45:37 <iwakeh> yes :-)
14:45:48 <iwakeh> regarding the mailing ..
14:46:38 <iwakeh> it's just an example that works for me as the server's settings are right.
14:47:07 <iwakeh> But, I'll respond with more detail to your comment in the ticket.
14:47:29 <karsten> it would be good to learn something from this regarding how we should be using ERROR log messages in the code.
14:47:43 <iwakeh> definitely!
14:47:44 <karsten> it's one data point how people are using logging.
14:48:02 <karsten> if you can provide some ideas for how we should be using logging, I'd be happy to adapt some code.
14:48:04 <iwakeh> I hardly get ERRORs; the only one is
14:48:32 <iwakeh> when I remove collector.properties. So, the level decision is a very important question.
14:48:39 <karsten> agreed!
14:48:49 <iwakeh> new ticket for levels?
14:48:54 <karsten> yes, please.
14:49:03 <karsten> here's something somewhat related:
14:49:10 <iwakeh> but, not much more elaboration on the mailing as it is only logback?
14:49:28 <karsten> should we think about packaging collector et al. in a more standard way, including start script?
14:49:48 <iwakeh> I could supply a start script.
14:49:51 <iwakeh> but
14:50:01 <karsten> I'm just thinking that such a package might determine how we should be doing logging and configuration and so on.
14:50:17 <karsten> and it would be sad to redo many things now and have to redo them again later.
14:50:38 <iwakeh> the configuration should be the operator's task.
14:51:08 <iwakeh> I think we might rather need a special small operating
14:51:18 <iwakeh> git repo for the main collector instance?
14:51:37 <karsten> for the config?
14:51:49 <karsten> or run from git? (not sure what you mean)
14:51:53 <iwakeh> well, the server admins use git for their configs.
14:52:06 <karsten> right.
14:52:09 <iwakeh> not run from git, but document your settings there.
14:52:14 <karsten> oh, sure.
14:52:26 <iwakeh> and that way separates operation.
14:52:31 <karsten> but still, as packagers, we'd have to make the decision where the config file lives.
14:52:39 <iwakeh> I can use a different setup on the mirror.
14:52:40 <karsten> or where log files go.
14:52:51 <karsten> or what the start script does.
14:53:10 <iwakeh> well, ideally we could use a simple java-packager for first
14:53:22 <iwakeh> install and ask questions on first setup.
14:53:55 <karsten> well, maybe? I never used one of those.
14:53:56 <iwakeh> this would write the first config according to the supplied answers.
14:54:11 <karsten> ah, or maybe I did. I don't remember what tool exactly I used.
14:54:22 <karsten> I tried packaging metrics-lib a while ago. well, years ago probably.
14:54:41 <iwakeh> that doesn't need a setup
14:54:50 <karsten> and I'm not saying we must do this now. I'm just thinking whether we should do that before changing many things regarding logging and configuration and so on.
14:54:59 <karsten> no, that's why I used that. it was really easy. ;)
14:55:40 <iwakeh> well, the ticket suggests decoupling of operational setup from what we package.
14:56:19 <iwakeh> first state all the operational questions in the operators guide
14:56:39 <iwakeh> and later we have the option of moving the question into a packager.
14:56:50 <karsten> ok!
14:56:57 <karsten> with that in mind, this sounds like a good process.
14:57:14 <iwakeh> and, no accidental full disks anymore!
14:57:21 <karsten> hehe
14:57:33 <karsten> trace logs did not fill disks entirely. not this time!
14:57:50 <iwakeh> eventually
14:57:51 <karsten> okay, I'll re-read that ticket.
14:58:00 <iwakeh> the disk would have been filled ...
14:58:07 <iwakeh> thanks.
14:58:08 <karsten> yes, but I receive warnings from nagios now.
14:58:11 <karsten> ;)
14:58:24 <iwakeh> I have a script
14:58:39 <iwakeh> for start, status and the like.
14:58:44 <karsten> oh, nice.
14:58:59 <iwakeh> could be added to resources?
14:59:04 <karsten> sure!
14:59:59 <karsten> alright.
15:00:00 <iwakeh> that's all for this topic.
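
The separation discussed above (code emits log messages, the operator chooses where they go and at what level) can be sketched with Python's standard logging module. This is an analogy only, not slf4j or logback: the logger name, the messages, and the mapping of messages to levels are assumptions made for the example.

```python
import io
import logging

# "Library" side: emit messages, but ship no output configuration.
log = logging.getLogger("collector_sketch")  # hypothetical logger name
log.addHandler(logging.NullHandler())        # silent unless an operator opts in
log.propagate = False                        # keep the sketch independent of root config


def run_module() -> None:
    log.debug("parsed one descriptor")        # per-item detail
    log.info("module run finished")           # routine progress
    log.warning("download failed, retrying")  # worth a look eventually
    log.error("configuration file missing")   # operator must act


# "Operator" side: an explicit choice of destination and level.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
log.addHandler(handler)
log.setLevel(logging.WARNING)

run_module()
output = stream.getvalue()
```

Because the handler and level are set outside run_module, updating the code alone cannot change how much is written. Only the operator's own settings decide that.
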
15:00:15 <karsten> let's have another meeting next week before seattle?
15:00:31 <iwakeh> is that ok timewise for you?
15:00:41 <karsten> yes, sure.
15:00:46 <karsten> flight is on sat.
15:00:52 <iwakeh> fine.
15:01:14 <karsten> and feel free to set priority of tickets to high if I should be looking sooner.
15:01:22 <iwakeh> and during the seattle meeting?
15:01:37 <iwakeh> you probably won't have time there?
15:02:07 <iwakeh> I don't know if things need to be discussed then?
15:02:27 <iwakeh> Topics that arise in Seattle?
15:02:29 <karsten> 7 am.
15:02:41 <iwakeh> 7am?
15:02:43 <karsten> is that right?
15:02:56 <karsten> 14 utc = 7 pacific time.
15:03:13 <iwakeh> ah, ok. meet later?
15:04:04 <karsten> maybe 14 utc?
15:04:16 <karsten> or, let's talk about that early in the seattle week via email.
15:04:25 <iwakeh> that's fine.
15:04:33 <iwakeh> by mail then.
15:04:34 <karsten> okay, cool! but next meeting next week.
15:04:44 <iwakeh> type to you then :-)
15:04:53 <karsten> heh, thanks for coming. bye! :)
15:04:58 <karsten> #endmeeting