14:30:15 <karsten> #startmeeting metrics team
14:30:15 <MeetBot> Meeting started Thu Mar 22 14:30:15 2018 UTC.  The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:30:15 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:30:32 <karsten> https://storm.torproject.org/shared/Oh4g0hNenh65QZNWRsIe5zxpB3e0axASeSgo5hKOp2A <- agenda pad
14:30:59 <karsten> still the same pad, but let's start with a new pad next week.
14:31:20 <iwakeh> a really long list
14:31:38 <karsten> yep. it's been a while since our last meeting.
14:31:41 <karsten> shall we start?
14:31:50 <iwakeh> fine
14:31:55 <karsten> * OONI Vanilla Tor and Bridge Reachability Testing (irl)
14:32:16 <irl> there are two things here and they both have the same name
14:32:37 <irl> the first is the simpler one, including data from OONI on Tor Metrics
14:33:02 <irl> for this one, I've spoken to hellais in Rome and we've come to a series of questions that we should answer in order to proceed
14:33:37 <irl> how did we envision that we would get data from OONI? are we expecting to have this data in collector or are we pulling it straight into metrics-web?
14:34:05 <karsten> when we discussed this in montreal, we said that ooni would preprocess the data for us.
14:34:19 <irl> but how would they give it to us?
14:34:26 <irl> they're ok to preprocess, but don't know how
14:34:40 <irl> Complete CSV file every day, or some pipeline where we get one line at a time?
14:34:44 <karsten> we can download an updated CSV file every day.
14:34:44 <irl> Should OONI perform pre-processing to remove values where there are not many reporting probes?
14:35:19 <karsten> if that is necessary for pre-processing, sure.
14:35:36 <irl> ok cool
14:35:53 <irl> so i should move forwards with this asking them to make available on an http endpoint a csv that we can download daily?
14:35:53 <karsten> I assume you'll work with them to discuss such things?
14:36:11 <karsten> that would work, yes.
14:36:40 <irl> #action irl follow up with OONI to ask them to make vanilla Tor reachability data available as a CSV on a web server that we will download daily
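A minimal sketch of the daily CSV handling discussed above, assuming OONI publishes one CSV per day and that rows backed by too few reporting probes should be dropped. The column names (`date`, `country`, `probes`, `reachable_fraction`), the threshold, and the function name `filter_low_probe_rows` are hypothetical; the actual format was still to be agreed with OONI. Fetching the file (for example with `urllib.request.urlopen`) is left out so the example runs without network access.

```python
import csv
import io


def filter_low_probe_rows(csv_text, min_probes=5):
    """Return rows (as dicts) whose probe count is at least min_probes.

    The "probes" column name is an assumption made for illustration.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = []
    for row in reader:
        if int(row["probes"]) >= min_probes:
            kept.append(row)
    return kept


# Made-up sample data using placeholder country codes.
SAMPLE = """date,country,probes,reachable_fraction
2018-03-21,aa,12,0.92
2018-03-21,bb,2,0.50
2018-03-21,cc,5,0.10
"""

rows = filter_low_probe_rows(SAMPLE, min_probes=5)
print([r["country"] for r in rows])
```

Whether this filtering happens on OONI's side or on the Metrics side was left open in the discussion; the same function would work in either place.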
14:36:55 <irl> there is also then the second meaning for this item
14:37:00 <isabela> \o/
14:37:02 <karsten> we can later move more parts to our side. but that would take us more time now, and we wanted to start somewhere.
14:37:08 <irl> https://trac.torproject.org/projects/tor/wiki/org/meetings/2018Rome/Notes/OneClickCensorshipCircumvention
14:37:47 <irl> for this we would be looking at combining OONI data with our passive observations of how many directly connecting users and users using pluggable transports there are to determine which transports will work in a particular country
14:38:07 <irl> this is a more involved analysis that would need to be coordinated between OONI and Metrics, and possibly also Applications
14:38:21 <irl> isabela: i'm not sure about the time scale for this though, can you add something on that?
14:38:38 <isabela> yes
14:38:54 <isabela> we want to apply for a research grant
14:39:30 <isabela> so one question i had for the teams irl mentioned (ooni, metrics and applications) is what set of people (or if this can be done by one person)
14:39:49 <isabela> we would need to answer the research questions we choose at the discussion in rome (link irl shared above)
14:40:10 <isabela> like 40% ooni 30% metrics 30% tbb ppl time for 12 months
14:40:25 <isabela> (a person time)
14:40:54 <isabela> maybe the best way to follow up on this request (and on the request of what could be the final deliverable of such research, a paper? a technical proposal?)
14:41:00 <isabela> via email thread on tor-internal list
14:41:01 <isabela> :)
14:41:18 <karsten> ok.
14:41:31 <karsten> timing is important here.
14:41:42 <isabela> deadline for proposal is march 31st
14:42:08 <iwakeh> tight
14:42:10 <isabela> it will probably take a few months to know if we got it or not (3 to 6 i would guess... hard to say since it is a new place we are applying to)
14:42:23 <isabela> yes, but it is not a complicated proposal
14:42:35 <karsten> and the 12 months start then, or now?
14:42:39 <isabela> it is like 2 pages with 5 or so questions
14:43:02 <isabela> 12 months starts when they get back to us, like 3 to 6 months after we submit i would guess
14:43:12 <isabela> it is hard for me to give a start date now
14:43:17 <isabela> i can give estimations
14:43:36 <karsten> I'm just thinking of our sponsor 13 commitment, which is for the next 6 months.
14:43:46 <karsten> and which is going to keep us busy.
14:44:03 <karsten> but it sounds like the two won't collide much.
14:44:03 <isabela> yes, that is why i am asking the capacity question
14:44:31 <isabela> and what will be the final deliverable
14:44:36 <isabela> because we should not go too crazy here :)
14:44:49 <isabela> it is not a lot of money for 12 months of work that might be split between a bunch of ppl
14:44:49 <karsten> okay.
14:45:04 <karsten> let's follow up via email then.
14:45:11 <irl> i would like to produce a system that generates events based on changes in the data, that then a human looks at and distills into a list of countries and what transports should work
14:45:40 <irl> the analysis of the data to generate the events should be achievable in the timeframe, but we don't go all out promising a completely automated solution
14:46:13 <karsten> let's be careful with what we promise.
14:46:34 <karsten> we still have this bunch of things we already have. and we have commitments for the next months.
14:46:59 <irl> ok, let's follow up on the email thread
14:47:01 <karsten> let's rather collect ideas and make plans when our current roadmap is over.
14:47:06 <karsten> and that, yes.
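One possible shape for the event generation irl describes, as a rough sketch only: flag a day when a per-country value changes by more than some relative threshold compared to the previous day, and leave the interpretation to a human. The function name `detect_change_events`, the 30% threshold, and the sample numbers are all assumptions made for illustration, not anything agreed in the meeting.

```python
def detect_change_events(series, threshold=0.3):
    """Emit an event for each day whose value moved by more than
    `threshold` (relative) compared to the previous day.

    `series` is a list of (date, value) pairs in chronological order.
    """
    events = []
    for (_, prev), (date, cur) in zip(series, series[1:]):
        if prev == 0:
            # A relative change is undefined here; skip rather than guess.
            continue
        change = (cur - prev) / prev
        if abs(change) > threshold:
            events.append({"date": date, "change": round(change, 2)})
    return events


# Made-up daily user estimates for one country.
series = [
    ("2018-03-19", 1000),
    ("2018-03-20", 980),
    ("2018-03-21", 400),
    ("2018-03-22", 420),
]
events = detect_change_events(series)
print(events)
```

The output is a short list of candidate events for a person to review, which matches the stated goal of not promising a completely automated solution.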
14:47:27 <karsten> next item is:
14:47:33 <karsten> * New Censorship Team (irl)
14:48:08 <irl> we're looking at creating a new team that would be responsible for pluggable transports and censorship circumvention
14:48:32 <irl> until now this has sort of been under the network team but will now be more clearly defined
14:49:06 <irl> there will be work happening on the snowflake pt and metrics about snowflake will be important to help inform and guide that work
14:49:32 <irl> bridgedb is not used for distributing snowflake bridges and so we may need to instrument the broker that hands out the bridges
14:49:57 <irl> (the censorship team person will write the actual code for instrumentation, but we will then get given the metrics from it)
14:50:31 <irl> for now, i believe the only thing we will need to do is let isabela know a person that will act as a liaison to this new team from metrics
14:50:45 <irl> the vanilla tor ooni data is also relevant, but we're already covering that
14:51:20 <irl> i am happy to be that person
14:51:21 <karsten> okay, so you already mentioned this via email.
14:51:24 <irl> yep
14:51:31 <karsten> I think that makes a lot of sense.
14:51:36 <iwakeh> yes.
14:51:43 <iwakeh> you started it already ;-)
14:51:56 <irl> heh (:
14:52:19 <karsten> and as you said, being the contact person doesn't mean doing all the work.
14:52:27 <isabela> hehe :)
14:52:44 <irl> isabela: should i reply to the thread or is it enough for you to read here?
14:53:06 <isabela> all good, it is on my list to reply to that thread, i will add this as part of it
14:53:18 <irl> ok cool thanks
14:53:26 <karsten> okay, great.
14:53:30 <irl> that's all for this topic then
14:53:37 <karsten> * Metrics that would be useful for the Network Team (irl)
14:54:18 <irl> measurements and metrics can be useful to inform and guide the development of the tor protocol and code
14:54:25 * isabela steps out for a bit
14:54:46 <irl> there were a whole bunch of things that people indicated that they would find useful, so i guess we can look at these as future roadmap tasks
14:55:12 <irl> the full list is on the pad, i don't think we need to go through these one at a time
14:55:22 <karsten> okay.
14:55:36 <karsten> I wonder where we should keep this list.
14:55:46 <iwakeh> ideas?
14:55:51 <karsten> Metrics/Ideas? yes, that.
14:55:51 <iwakeh> metrics/Ideas
14:56:04 <irl> ok, i can create tickets for each task
14:56:18 <iwakeh> maybe, one parent and many children?
14:56:31 <irl> #action irl create a ticket for each of the metrics that people requested at the rome meeting
14:56:35 <iwakeh> or a common tag?
14:56:36 <irl> iwakeh: yes, i'll group them
14:56:48 <iwakeh> fine :-)
14:56:53 <irl> and maybe add metrics-requested-2018-rome tag?
14:57:04 <karsten> sure.
14:57:42 <karsten> alright, thanks! moving on?
14:57:44 <irl> that's all for that topic, yes
14:57:49 <karsten> * Merging Relay Search patches (irl)
14:57:57 <irl> quick topic
14:58:01 <karsten> I think I just updated to latest relay search.
14:58:17 <irl> i fixed a couple of things, how do i tell you to update?
14:58:29 <karsten> send me a quick email.
14:58:37 <irl> ok cool
14:58:51 <irl> that's all for that topic.
14:58:56 <karsten> at some point we'll need to enable you and iwakeh to update metrics-web, too.
14:59:07 <karsten> but right now, that's the easiest.
14:59:10 <irl> ok
14:59:21 <karsten> * Atom feed (irl)
14:59:27 <karsten> I put that in.
14:59:34 <irl> i didn't have enough topics?
14:59:43 <karsten> just to point out that this is new. heh
14:59:59 <irl> http://metrics.torproject.org/news.atom - you can now subscribe to the news
15:00:11 <karsten> which is great!
15:00:24 <karsten> by the way, does that conclude our roadmap item?
15:00:42 <karsten> "    Provide metrics timeline events as both a table on Tor Metrics pages and as an RSS/Atom feed that is also syndicated via Twitter to increase community engagement (M; 60% done) "
15:00:44 <irl> i think our roadmap item said we had to tweet when there were new news entries
15:00:51 <irl> but it's like 90%
15:00:57 <karsten> ok.
15:01:06 <iwakeh> How to make it show up in the rss feed part of browser site info?
15:01:36 <irl> #25570
15:01:44 * iwakeh just subscribed successfully to the given link.
15:01:51 <iwakeh> ah, ok.
15:02:13 <karsten> alright. moving on?
15:02:20 <irl> ok
15:02:26 <karsten> * CollecTor webstats deployment (karsten)
15:02:26 <iwakeh> yep
15:02:37 <karsten> so, collector now runs the webstats module.
15:02:49 <karsten> it writes sanitized web logs to its recent/ and archive/ dirs.
15:03:06 <karsten> all in all, the collector side of this is done.
15:03:09 <iwakeh> and the second instance syncs
15:03:16 <karsten> oh, and that.
15:03:24 <karsten> what remains:
15:03:43 <karsten> we need to update the collector page on tor metrics.
15:04:02 <karsten> and we need to change the webstats module in metrics-web to process these files rather than those on webstats.tp.o.
15:04:35 <karsten> I have a patch for the second part, but I'll need another day or two to clean that up.
15:04:43 <karsten> it does produce the same output.
15:04:51 <iwakeh> good.
15:05:09 <karsten> okay, that was mostly an update. nothing to do here.
15:05:23 <karsten> * CollecTor switch colchicifolium <-> corsicum (karsten)
15:05:31 <karsten> we need to do something there.
15:05:38 <karsten> colchicifolium fails too often.
15:05:49 <karsten> or, the host it's located on fails too often.
15:05:51 <iwakeh> One thought, why isn't onionoo drawing data from
15:06:02 <iwakeh> collector2 (= corsicum)?
15:06:20 <karsten> rather than collector?
15:06:40 <iwakeh> This would immediately solve the missing updates.
15:06:43 <karsten> you mean, switch over, not fetch from both?
15:06:49 <iwakeh> on onionoo's side.
15:06:59 <karsten> I think that's another way to hot-fix this.
15:07:16 <karsten> does corsicum sync everything we need?
15:07:27 <karsten> (we don't need microdescriptors, for example, which are not synced.)
15:07:31 <iwakeh> it should synch all.
15:07:50 <karsten> we can try.
15:08:00 <iwakeh> that would indicate
15:08:07 <karsten> I'll run a local onionoo instance that fetches from collector2.
15:08:13 <iwakeh> if there might be a problem elsewhere.
15:08:18 <karsten> true.
15:08:21 <iwakeh> good.
15:08:38 <karsten> okay.
15:08:48 <karsten> * metrics-web clients censorship detector Java migration (karsten)
15:08:57 <karsten> this is another update.
15:09:16 <karsten> I started rewriting the python script that identifies possible censorship events in java.
15:09:36 <karsten> the only remaining part is scipy's norm.fit, for which there's no good equivalent yet.
15:09:56 <iwakeh> well, it's actually a python wrapper of minpack.
15:10:13 <iwakeh> http://www.netlib.org/minpack/
15:11:14 <iwakeh> hybrd and hybrdj.
15:11:14 <karsten> or that. :)
15:11:26 * isabela is back
15:11:31 <iwakeh> A fortran library for solving non-linear problems.
15:11:48 <karsten> is there really no java code to do this?
15:12:10 <iwakeh> Not really,
15:12:13 <irl> can you not fortran from java?
15:12:23 <iwakeh> commons-math stated they reimplemented
15:12:54 <iwakeh> (I think it was) lmstr
15:13:17 <iwakeh> from that same library, which seems to be the standard. and well tested.
15:13:38 <iwakeh> The latter referring to the entire fortran code from minpack.
15:13:58 <karsten> iwakeh: regarding our earlier email exchange, do you really need more test data for this?
15:14:04 <iwakeh> @irl I'll look into that,
15:14:19 <iwakeh> If you have any yes
15:14:32 <iwakeh> if you'd need to generate that.
15:14:37 <iwakeh> not so much.
15:14:45 <karsten> the example I sent you earlier would be good test data.
15:15:06 <iwakeh> we should make sure all
15:15:18 <iwakeh> tests we do now are conserved in unit tests.
15:15:31 <iwakeh> In order to keep the knowledge we gain now,
15:15:33 <karsten> okay, I have an idea what I can send you.
15:15:54 <iwakeh> and to be able to determine the usefulness of future changes.
15:16:11 <karsten> okay.
15:16:27 <karsten> that's all I have on this topic.
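On the norm.fit question: for a normal distribution the maximum-likelihood estimates have a closed form, namely the sample mean and the population standard deviation (dividing by n, not n-1). To my understanding this is also what `scipy.stats.norm.fit` returns when no parameters are fixed, though that should be verified against the scipy version the script actually uses before relying on it. If it holds, a port may not need a general non-linear solver for this step. A minimal sketch, with `fit_normal` as a hypothetical helper name:

```python
import math


def fit_normal(samples):
    """Closed-form maximum-likelihood fit of a normal distribution.

    Returns (mu, sigma), where sigma is the population standard
    deviation (divides by n, not n - 1).
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mu = sum(samples) / n
    variance = sum((x - mu) ** 2 for x in samples) / n
    return mu, math.sqrt(variance)


mu, sigma = fit_normal([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mu, sigma)
```

The same two formulas are only a few lines in Java, and results from the existing Python script on known inputs could serve as the unit-test fixtures iwakeh asks for.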
15:16:53 <karsten> do we have more topics that are not yet on the pad?
15:17:10 <iwakeh> not that I know.
15:17:14 <irl> not from me
15:17:23 <karsten> okay, great! long list indeed.
15:17:35 <iwakeh> and, we made it through :-)
15:17:40 <karsten> let's talk more next week! :)
15:17:47 <irl> oh
15:17:53 <karsten> oh!
15:17:53 <irl> no, there's another thing
15:18:15 <irl> we talked about having another roadmapping session in berlin
15:18:47 <karsten> yes, we should do that.
15:18:50 <irl> i think some priorities have changed in the last 6 months and we should look at what we might move around
15:19:05 <irl> especially to handle the new censorship team tasks and the ooni related tasks
15:19:40 <karsten> hmm, yes. that would mean that we should try to schedule the meeting really soon.
15:19:49 <karsten> well, really soon and *for* really soon.
15:20:30 <karsten> what do your next weeks look like?
15:20:37 * irl checks
15:20:45 <iwakeh> busy
15:21:52 <karsten> second week of april?
15:22:39 <iwakeh> I cannot really say right now.
15:22:46 <karsten> ok.
15:22:48 <karsten> email?
15:22:53 <irl> i think it will need to be email
15:23:00 <iwakeh> yep
15:23:01 <irl> there are things i know are happening that are not on my calendar
15:23:14 <iwakeh> same here
15:23:20 <karsten> okay.
15:23:44 <irl> ok, i'm really out of topics this time. (:
15:23:51 <karsten> heh
15:23:54 <iwakeh> oh
15:23:59 <karsten> oh!
15:24:04 * iwakeh kidding
15:24:08 <irl> heheh
15:24:08 <karsten> hehe
15:24:13 <iwakeh> sorry, couldn't resist
15:24:25 <karsten> alright, back to work. bye, bye! :)
15:24:32 <karsten> #endmeeting