14:30:15 #startmeeting metrics team
14:30:15 Meeting started Thu Mar 22 14:30:15 2018 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:30:15 Useful Commands: #action #agreed #help #info #idea #link #topic.
14:30:32 https://storm.torproject.org/shared/Oh4g0hNenh65QZNWRsIe5zxpB3e0axASeSgo5hKOp2A <- agenda pad
14:30:59 still the same pad, but let's start with a new pad next week.
14:31:20 a really long list
14:31:38 yep. it's been a while since our last meeting.
14:31:41 shall we start?
14:31:50 fine
14:31:55 * OONI Vanilla Tor and Bridge Reachability Testing (irl)
14:32:16 there are two things here and they both have the same name
14:32:37 the first is the simpler one, including data from OONI on Tor Metrics
14:33:02 for this one, I've spoken to hellais in Rome and we've come to a series of questions that we should answer in order to proceed
14:33:37 how did we envision that we would get data from OONI? are we expecting to have this data in collector or are we pulling it straight into metrics-web?
14:34:05 when we discussed this in montreal, we said that ooni would preprocess the data for us.
14:34:19 but how would they give it to us?
14:34:26 they're ok to preprocess, but don't know how
14:34:40 Complete CSV file every day, or some pipeline where we get one line at a time?
14:34:44 we can download an updated CSV file every day.
14:34:44 Should OONI perform pre-processing to remove values where there are not many reporting probes?
14:35:19 if that is necessary for pre-processing, sure.
14:35:36 ok cool
14:35:53 so i should move forwards with this asking them to make available on an http endpoint a csv that we can download daily?
14:35:53 I assume you'll work with them to discuss such things?
14:36:11 that would work, yes.
14:36:40 #action irl follow up with OONI to ask them to make vanilla Tor reachability data available as a CSV on a web server that we will download daily
14:36:55 there is also then the second meaning for this item
14:37:00 \o/
14:37:02 we can later move more parts to our side. but that would take us more time now, and we wanted to start somewhere.
14:37:08 https://trac.torproject.org/projects/tor/wiki/org/meetings/2018Rome/Notes/OneClickCensorshipCircumvention
14:37:47 for this we would be looking at combining OONI data with our passive observations of how many directly connecting users and users using pluggable transports there are to determine which transports will work in a particular country
14:38:07 this is a more involved analysis that would need to be coordinated between OONI and Metrics, and possibly also Applications
14:38:21 isabela: i'm not sure about the time scale for this though, can you add something on that?
14:38:38 yes
14:38:54 we want to apply for a research grant
14:39:30 so one question i had for the teams irl mentioned (ooni, metrics and applications) is what set of people (or if this can be done by a one person)
14:39:49 we would need to answer the research questions we choose at the discussion in rome (link irl shared above)
14:40:10 like 40% ooni 30% metrics 30% tbb ppl time for 12 months
14:40:25 (a person time)
14:40:54 maybe the best way to follow up on this request (and on the request of what could be the final deliverable of such research, a paper? a technical proposal?)
14:41:00 via email thread on tor-internal list
14:41:01 :)
14:41:18 ok.
14:41:31 timing is important here.
14:41:42 deadline for proposal is march 31st
14:42:08 tight
14:42:10 probably will take a few months to know if we got it or not (3 to 6 i would guess.. hard to say since is a new place we are applying)
14:42:23 yes, but is not a complicated proposal
14:42:35 and the 12 months start then, or now?
14:42:39 is like 2 pages with 5 or so questions
14:43:02 12 months starts when they get back to us, like 3 to 6 months after we submit i would guess
14:43:12 is hard for me to give a start date now
14:43:17 i can give estimations
14:43:36 I'm just thinking of our sponsor 13 commitment, which is for the next 6 months.
14:43:46 and which is going to keep us busy.
14:44:03 but it sounds like the two won't collide much.
14:44:03 yes, that is why i am asking the capacity question
14:44:31 and what will be the final deliverable
14:44:36 because we should not go too crazy here :)
14:44:49 is not a lot of money for 12months of work that might be split between a bunch of ppl
14:44:49 okay.
14:45:04 let's follow up via email then.
14:45:11 i would like to produce a system that generates events based on changes in the data, that then a human looks at and distills into a list of countries and what transports should work
14:45:40 the analysis of the data to generate the events should be achievable in the timeframe, but we don't go all out promising a completely automated solution
14:46:13 let's be careful with what we promise.
14:46:34 we still have this bunch of things we already have. and we have commitments for the next months.
14:46:59 ok, let's follow up on the email thread
14:47:01 let's rather collect ideas and make plans when our current roadmap is over.
14:47:06 and that, yes.
14:47:27 next item is:
14:47:33 * New Censorship Team (irl)
14:48:08 we're looking at creating a new team that would be responsible for pluggable transports and censorship circumvention
14:48:32 until now this has sort of been under the network team but will now be more clearly defined
14:49:06 there will be work happening on the snowflake pt and metrics about snowflake will be important to help inform and guide that work
14:49:32 bridgedb is not used for distributing snowflake bridges and so we may need to instrument the broker that hands out the bridges
14:49:57 (the censorship team person will write the actual code for instrumentation, but we will then get given the metrics from it)
14:50:31 for now, i believe the only thing we will need to do is let isabela know a person that will act as a liaison to this new team from metrics
14:50:45 the vanilla tor ooni data is also relevant, but we're already covering that
14:51:20 i am happy to be that person
14:51:21 okay, so you already mentioned this via email.
14:51:24 yep
14:51:31 I think that makes a lot of sense.
14:51:36 yes.
14:51:43 you started it already ;-)
14:51:56 heh (:
14:52:19 and as you said, being the contact person doesn't mean doing all the work.
14:52:27 hehe :)
14:52:44 isabela: should i reply to the thread or is it enough for you to read here?
14:53:06 all good is on my list to reply to that thread will add this as part of it
14:53:18 ok cool thanks
14:53:26 okay, great.
14:53:30 that's all for this topic then
14:53:37 * Metrics that would be useful for the Network Team (irl)
14:54:18 measurements and metrics can be useful to inform and guide the development of the tor protocol and code
14:54:25 * isabela steps out for a bit
14:54:46 there were a whole bunch of things that people indicated that they would find useful, so i guess we can look at these as future roadmap tasks
14:55:12 the full list is on the pad, i don't think we need to go through these one at a time
14:55:22 okay.
14:55:36 I wonder where we should keep this list.
14:55:46 ideas?
14:55:51 Metrics/Ideas? yes, that.
14:55:51 metrics/Ideas
14:56:04 ok, i can create tickets for each task
14:56:18 maybe, one parent and many children?
14:56:31 #action irl create a ticket for each of the metrics that people requested at the rome meeting
14:56:35 or a common tag?
14:56:36 iwakeh: yes, i'll group them
14:56:48 fine :-)
14:56:53 and maybe add metrics-requested-2018-rome tag?
14:57:04 sure.
14:57:42 alright, thanks! moving on?
14:57:44 that's all for that topic, yes
14:57:49 * Merging Relay Search patches (irl)
14:57:57 quick topic
14:58:01 I think I just updated to latest relay search.
14:58:17 i fixed a couple of things, how do i tell you to update?
14:58:29 send me a quick email.
14:58:37 ok cool
14:58:51 that's all for that topic.
14:58:56 at some point we'll need to enable you and iwakeh to update metrics-web, too.
14:59:07 but right now, that's the easiest.
14:59:10 ok
14:59:21 * Atom feed (irl)
14:59:27 I put that in.
14:59:34 i didn't have enough topics?
14:59:43 just to point out that this is new. heh
14:59:59 http://metrics.torproject.org/news.atom - you can now subscribe to the news
15:00:11 which is great!
15:00:24 by the way, does that conclude our roadmap item?
15:00:42 " Provide metrics timeline events as both a table on Tor Metrics pages and as an RSS/Atom feed that is also syndicated via Twitter to increase community engagement (M; 60% done) "
15:00:44 i think our roadmap item said we had to tweet when there were new news entries
15:00:51 but it's like 90%
15:00:57 ok.
15:01:06 How to make it show up in the rss feed part of browser site info?
15:01:36 #25570
15:01:44 * iwakeh just subscribed successfully to the given link.
15:01:51 ah, ok.
15:02:13 alright. moving on?
15:02:20 ok
15:02:26 * CollecTor webstats deployment (karsten)
15:02:26 yep
15:02:37 so, collector now runs the webstats module.
15:02:49 it writes sanitized web logs to its recent/ and archive/ dirs.
15:03:06 all in all, the collector side of this is done.
15:03:09 and, the second instance synch's
15:03:16 oh, and that.
15:03:24 what remains:
15:03:43 we need to update the collector page on tor metrics.
15:04:02 and we need to change the webstats module in metrics-web to process these files rather than those on webstats.tp.o.
15:04:35 I have a patch for the second part, but I'll need another day or two to clean that up.
15:04:43 it does produce the same output.
15:04:51 good.
15:05:09 okay, that was mostly an update. nothing to do here.
15:05:23 * CollecTor switch colchicifolium <-> corsicum (karsten)
15:05:31 we need to do something there.
15:05:38 colchicifolium fails too often.
15:05:49 or, the host it's located on fails too often.
15:05:51 One thought, why isn't onionoo drawing data from
15:06:02 collector2 (= corsicum)?
15:06:20 rather than collector?
15:06:40 This would immediately solve the missing updates.
15:06:43 you mean, switch over, not fetch from both?
15:06:49 on onionoo's side.
15:06:59 I think that's another way to hot-fix this.
15:07:16 does corsicum sync everything we need?
15:07:27 (we don't need microdescriptors, for example, which are not synced.)
15:07:31 it should synch all.
15:07:50 we can try.
15:08:00 that would indicate
15:08:07 I'll run a local onionoo instance that fetches from collector2.
15:08:13 if there might be a problem elsewhere.
15:08:18 true.
15:08:21 good.
15:08:38 okay.
15:08:48 * metrics-web clients censorship detector Java migration (karsten)
15:08:57 this is another update.
15:09:16 I started rewriting the python script that identifies possible censorship events in java.
15:09:36 the only remaining part is scipy's norm.fit, for which there's no good equivalent yet.
15:09:56 well, it's actually python wrapper of minpack.
15:10:13 http://www.netlib.org/minpack/
15:11:14 hybrd and hybrdj.
15:11:14 or that. :)
15:11:26 * isabela is back
15:11:31 A fortran library for solving non-linear problems.
15:11:48 is there really no java code to do this?
15:12:10 Not really,
15:12:13 can you not fortran from java?
15:12:23 commons-math stated they reimplemented
15:12:54 (I think it was) lmstr
15:13:17 from that same library, which seems to be the standard. and well tested.
15:13:38 The latter referring to the entire fortran code from minpack.
15:13:58 iwakeh: regarding our earlier email exchange, do you really need more test data for this?
15:14:04 @irl I look into that,
15:14:19 If you have any yes
15:14:32 if you'd need to generate that.
15:14:37 not so much.
15:14:45 the example I sent you earlier would be good test data.
15:15:06 we should make sure all
15:15:18 tests we do now are conserved in unit tests.
15:15:31 In order to keep the knowledge we gain now,
15:15:33 okay, I have an idea what I can send you.
15:15:54 and to be able to determine the usefulness of future changes.
15:16:11 okay.
15:16:27 that's all I have on this topic.
15:16:53 do we have more topics that are not yet on the pad?
15:17:10 not that I know.
15:17:14 not from me
15:17:23 okay, great! long list indeed.
15:17:35 and, we made it through :-)
15:17:40 let's talk more next week! :)
15:17:47 oh
15:17:53 oh!
15:17:53 no, there's another thing
15:18:15 we talked about having another roadmapping session in berlin
15:18:47 yes, we should do that.
15:18:50 i think some priorities have changed in the last 6 months and we should look at what we might move around
15:19:05 especially to handle the new censorship team tasks and the ooni related tasks
15:19:40 hmm, yes. that would mean that we should try to schedule the meeting really soon.
15:19:49 well, really soon and *for* really soon.
15:20:30 how do your next weeks look like?
15:20:37 * irl checks
15:20:45 busy
15:21:52 second week of april?
15:22:39 I cannot really say right now.
15:22:46 ok.
15:22:48 email?
15:22:53 i think it will need to be email
15:23:00 yep
15:23:01 there are things i know are happening that are not on my calendar
15:23:14 same here
15:23:20 okay.
15:23:44 ok, i'm really out of topics this time. (:
15:23:51 heh
15:23:54 oh
15:23:59 oh!
15:24:04 * iwakeh kidding
15:24:08 heheh
15:24:08 hehe
15:24:13 sorry, couldn't resist
15:24:25 alright, back to work. bye, bye! :)
15:24:32 #endmeeting
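
Note on the norm.fit question raised under the censorship detector topic: for a normal distribution, the maximum-likelihood fit has a closed form (the sample mean and the population standard deviation), so no iterative solver is needed for that particular step. The sketch below assumes the detector only needs the location and scale of a normal distribution; the log does not say whether the original script relies on anything more. The helper name `fit_normal` is hypothetical and is not taken from metrics-web. It is written in Python for brevity, and the same arithmetic ports directly to Java.

```python
import math
import statistics


def fit_normal(samples):
    """Return (mu, sigma), the maximum-likelihood estimates for a normal distribution.

    Hypothetical helper for illustration only, not the metrics-web code.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # Location estimate: the arithmetic mean of the samples.
    mu = statistics.fmean(samples)
    # Scale estimate: the maximum-likelihood form divides by n, not n - 1.
    variance = sum((x - mu) ** 2 for x in samples) / len(samples)
    return mu, math.sqrt(variance)


if __name__ == "__main__":
    mu, sigma = fit_normal([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
    print(mu, sigma)
```

If the Java rewrite needs only these two values, a unit test comparing them against the existing Python output on the test data mentioned above would be one way to check the port.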