15:59:42 #startmeeting OONI gathering 2017-04-10
15:59:42 Meeting started Mon Apr 10 15:59:42 2017 UTC. The chair is hellais. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:59:42 Useful Commands: #action #agreed #help #info #idea #link #topic.
15:59:57 hellos
16:00:08 hey
16:00:09 hello!
16:00:26 yo
16:00:57 hi
16:01:11 #topic 1. OONI pipeline status
16:01:59 ooni-pipeline 16.10 status is -- going to finalise ansible playbook to document its deployment
16:02:26 ok
16:02:27 it took more time than I originally expected, got two bugs in docker/ansible integration module :-/
16:02:40 anadahz: was there something in particular you wanted to discuss about this?
16:03:30 hi
16:03:34 are there any externally visible changes?
16:03:50 I assume, anadahz has two concerns: 1. when can we drop chameleon?, 2. when can we drop ooni-pipeline-db storing third copy of the data?
16:04:05 willscott: you can look at https://github.com/TheTorProject/ooni-pipeline/tree/pipeline-16.10
16:04:06 Yep I was asking the status of the pipeline with respect to the inclusion of 3rd (non OONI) data sources also.
16:04:30 willscott: well, there is "visible" airflow web-ui, but I consider that it should be password-protected as it has no anonymous read-only access
16:04:46 willscott: I think this is probably what you could find most useful: https://github.com/TheTorProject/ooni-pipeline/blob/pipeline-16.10/docs/pipeline-16.10.md
16:05:42 wrt 3rd party data inclusion I think we first should reach feature parity in terms of 1st party data inclusion
16:06:22 anadahz: please, be so kind to be less "meta" :) what data sources do you mean?
16:06:24 darkk: 1 and 2 are valid concerns, since we are running out of disk space quite often on these servers
16:06:25 i guess, does submission of measurements change at all?
16:07:03 willscott: submission is not changed, storage & processing is
16:07:10 willscott: no, submission of data is not changed.
16:07:47 darkk: 1st party data inclusion: a bunch of OONI data that are not publicly available.
16:08:47 anadahz: which OONI data is not publicly available?
16:08:52 darkk: 2nd party data inclusion: satellite sources you can find some previous work here https://github.com/TheTorProject/ooni-pipeline/pull/1
16:08:54 One thing that I was wondering - Ooni has the "news" blog on its front page, but it would be useful to have a calendar to see Ooni-related talks, research meet-ups etc, in some time in advance, and have the minutes of these sessions on the website etc. I'm not sure if these are put on the Tor blog calendar?
16:09:19 hellais: Some UK and VZ reports
16:10:18 * landers helo
16:10:28 the UK data that was previously unprocessed should work fine with the new pipeline (since the many small reports will be batched together)
16:10:59 If by VZ, you mean VE, then that will have to be ingested in the pipeline as well and I don't see issues with that either
16:11:27 yulax: like on the torproject's index webpage?
16:11:51 yulax: yeah, we do use the tor calendar (sometimes) to mention where we are at. WRT to talks we published a short blog post summarising the events we attended the past month, with links to the sessions and eventual slides or notes
16:11:56 yep I meant VE
16:12:01 https://ooni.torproject.org/post/ooni-iff-rightscon/
16:13:31 hellais: for events that are planned well in advance it would be good if you could put them up when you are confirmed
16:13:37 anadahz: thanks, I've not looked at satellite data yet and I'm not sure how to use it at the moment. Will's paper is still in my to-read list :)
16:13:46 @yulax we'll also be revamping the OONI website (hopefully sometime soon), and this could potentially include a section for such updates and info
16:14:13 to what extent will the website be redesigned?
16:14:58 @yulax we'll be re-writing the copy and we'll be restructuring how the information is presented.
and if budget allows it, we may also include some new design
16:16:08 @yulax any further suggestions from you would be appreciated
16:17:36 next(topic) ?
16:18:04 yes, let's proceed to the next topic
16:18:06 fwiw i find the grey text at the top on white background difficult to see as i have poor eyesight.. that's kind of a side issue maybe though
16:18:54 @yulax agreed. Please add your suggestions as tickets here: https://github.com/TheTorProject/ooni-web :)
16:20:30 I'm jumping on 3rd topic since we have discussed already the 2nd one: ...adding measurements from third party sources
16:20:58 3. Status of up-to-date Debian/Ubuntu ooniprobe packages
16:21:11 #topic 3. Status of up-to-date Debian/Ubuntu ooniprobe packages
16:21:23 is irl in 'da house?
16:21:36 I am not sure it makes sense to discuss this without him
16:21:50 idle: 7 days, 00 hours -- he was waiting for us a week ago
16:23:10 OK perhaps we can move to the next topic and maybe irl comes back later.
16:24:03 so going to the 4th topic -- this is not a public event, but rather limited to our partners
16:24:20 The event will take place during the last week of June
16:24:32 The most likely dates are 26th-27th June
16:24:42 okay, so let's keep that OTR, I was just wondering if the dates are already "set" in stone or not :)
16:24:53 But this hasn't been confirmed yet because there are other moving components which will determine the final date
16:25:35 ah, okay, let's assume that these dates are pieces of disinformation for anyone reading the logs :-P
16:25:37 @darkk They're not set in stone yet, but the most probable dates are 26th-27th June
16:25:44 It makes sense to do an open hackathon or similar before/after this event.
16:25:57 @darkk waiting to hear back from funders etc...
16:26:23 agrabeli: 27 june may be not so optimal for me
16:27:09 @sbs there is still some flexibility in terms of the dates...
I guess we can discuss this further OTR
16:27:53 agrabeli: okay, thank you
16:28:10 @anadahz I agree that it would be nice to also organize a hackathon. However, we have submitted funding proposals that do not take into account the extra costs for the extra day or 2 for a hackathon....
16:28:45 @agrabeli: The OONI hackathon doesn't need to be a funded event.
16:29:43 there are many people that live in the area of Berlin and flights/accommodation in EU can be fairly cheap.
16:29:48 @anadahz yes but if we have the hackathon right after the Partner Event, it would've been nice to be able to invite the partners to that too, rather than asking them to fly back one day before the hackathon
16:29:59 and the budget doesn't include an extra day for a hackathon
16:30:11 so I'd suggest that we organize a hackathon at different dates
16:30:33 *for different dates
16:30:43 @agrabeli It's not going to be easy for me to be around in different dates.
16:31:46 @anadahz to have an effective hackathon we'd need to do outreach at least a month in advance and have some budget to fly people in
16:32:03 @anadahz otherwise we risk having a hackathon with only 2 new people and very little (if any) diversity
16:32:14 @anadahz all of this requires $ and time
16:32:45 @anadahz and so I'd suggest that we plan this for the fall
16:33:16 @agrabeli It depends on the type of hackathon that we want to have, usually I would if I can stay with some people hack and come up with some nice ideas.
16:34:32 In terms of time announcement we are good, since it's 2+ months in advance.
16:35:59 ..also we have selected Berlin as many OONI people and developers are hanging around in Berlin
16:37:13 @anadahz sure. Please create a plan regarding the hackathon and we can explore the possibility of making it happen.
16:38:34 @agrabeli Great!
16:38:47 @anadahz I think we can easily try to book the onionspace and invite people to join us there.
But perhaps we want to plan beyond that so that it is an inclusive hackathon.
16:39:21 i.e. to ensure that it's not just "2 new people"
16:39:28 @agrabeli Absolutely! That's why I brought this discussion so early!
16:40:15 @anadahz cool
16:41:45 Do we have any more topics to discuss?
16:41:47 Going back to the 2nd topic (adding third party data sources) --- I think this is more of a long-term goal. We probably want to start integrating third party data sources (e.g. Satellite data) as part of the broader revamping of OONI Explorer and once we have released a stable pipeline.
16:42:15 @anadahz what did you specifically want to discuss in terms of integrating 3rd party data?
16:43:08 @agrabeli we have discussed this already during the 2nd topic ^
16:43:12 IMHO, every 3rd-party data should pass through human-in-the-loop before being added to pipeline, pipeline is just a scheduler for periodic task...
16:44:48 @anadahz it's not clear to me what you wanted to discuss specifically
16:46:09 what would human-in-the-loop mean? i guess, is there rate-limiting stuff that would be lifted? i'm not sure if there's anything else that would be expected from the ooni side
16:47:18 @agrabeli It was actually a combined topic (that later got split by someone/something) The status of the pipeline with respect to adding 3rd party sources to the OONI repository.
16:47:20 willscott: I mean, that IMHO we should have a clear idea (backed by some proof-of-concept code) regarding _usage_ of some 3rd-party data before ingesting it.
16:47:56 willscott: I think what darkk is saying, and he can correct me if I am wrong, is that in order for us to integrate third party data we should probably first have a use case for it, implement a PoC of the use case in explorer/measurements "manually" and then automate the ingestion of said data via the pipeline
16:48:01 anadahz: I'm sorry for splitting it, I misunderstood it when I've seen the title
16:48:57 sure, i guess i'm thinking about measurements that are a subset of existing ooni client measurements
16:53:34 willscott: do you have an example of these sorts of measurements?
16:53:54 satellite is the dns resolution component of web request
16:54:11 the augur/spooky-scan thing from princeton is the tcp connectivity component of web request
16:54:31 I think we would probably treat each of these as different "entities" and not mix them up with OONI measurements
16:59:10 Do we have anything more to discuss?
16:59:32 I don't think so
17:00:10 Thanks everyone for attending!
17:03:24 #endmeeting