16:02:23 <hellais> #startmeeting OONI gathering 2017-01-16
16:02:23 <MeetBot> Meeting started Mon Jan 16 16:02:23 2017 UTC.  The chair is hellais. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:02:23 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:02:25 <hellais> https://pad.riseup.net/p/ooni-irc-pad
16:02:28 <hellais> hellos!
16:02:43 <sbs> yo!
16:02:48 <hellais> do we have an agenda?
16:03:02 <anadahz> hi
16:04:16 <darkk> link to agenda… link never changes…
16:05:42 <hellais> Is there something somebody would like to discuss?
16:06:38 <hellais> <audio src=crickets.wav />
16:10:23 <anadahz> I have an agenda topic. :)
16:10:36 <hellais> anadahz: excellent, add it to the pad!
16:10:47 <sbs> maybe we can just hint that we're about to make a ooniprobe for mobile release and if somebody is interested we can provide a beta?
16:11:34 <hellais> sbs: great point!
16:11:42 <hellais> For the android version you can download the apk here: https://github.com/measurement-kit/ooniprobe-android/releases/tag/v0.1.0-rc.1
16:12:01 <hellais> For the iOS version send an email to contact@openobservatory.org and we will invite you to test-flight
16:12:29 <anadahz> agenda topic added
16:12:39 <hellais> #topic Canonical way of running ooniprobe tests on VPN services and proxy lists
16:13:14 * landers here
16:13:38 <hellais> I think we have discussed this in the past already and the conclusion was that VPNs, and even more so proxies, are fairly unreliable and don't give an accurate picture of censorship happening in a certain country
16:14:10 <hellais> moreover it's also the case that geolocating these endpoints is hard and oftentimes very inaccurate, so we may not even be measuring the thing we think we are measuring
16:14:32 <hellais> The fact that we don't rely on VPNs and proxies I think is also one of the key things that distinguishes us from other measurement projects out there
16:14:52 <anadahz> hellais: Agreed!
16:15:19 <anadahz> What would be the canonical way to do this with an ooniprobe private collector?
16:15:42 <hellais> anadahz: what do you mean by "canonical"?
16:16:01 <landers> update from me: still waiting to sign an actual contract from otf, but i'd maybe like to write some code this week. been cramming c++ and think i can hammer out one of the easier ooni tests without too much trouble (IM reachability probably)
16:16:11 <hellais> anadahz: you set up a VPN on the machine and you run ooniprobe?
16:16:41 <anadahz> hellais: By canonical I mean a way that will not "harm" OONI infrastructure.
16:16:49 <anadahz> I do still
16:17:17 <anadahz> believe that adding VPNs will add a significant amount of baseline measurements.
16:17:49 <sbs> landers: that would be epic!
16:17:51 <hellais> landers: ah, if you are to begin implementing the IM reachability test it would be best if we have a chat before you go ahead with that. I have some idea of how that can be done in a way that is a bit abstracted and generalised for other IM apps as well
16:18:18 <hellais> landers: we should probably also have a general sync-up call together with sbs sometime this week to discuss the plan for your fellowship
16:18:26 <anadahz> and obviously we can't stop anyone from just firing up a number of ooniprobe tests via VPN
16:19:14 <hellais> anadahz: maybe a good thing to do would be to annotate the reports with something that says it's a VPN, maybe also including the VPN provider name
16:19:19 <landers> sbs: hellais: sounds good
16:20:30 <anadahz> hellais: Have we decided which annotation format we are going to use?
16:22:04 <hellais> maybe {vpn: $VPN_NAME}?
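
The annotation idea above could look roughly like the following minimal Python sketch. It assumes a report is a plain dict with an optional `annotations` mapping; the helper names `annotate_report` and `is_vpn_measurement` and the provider name `example-vpn` are hypothetical and are not part of ooniprobe.

```python
import json


def annotate_report(report, vpn_name):
    """Return a copy of `report` tagged with the VPN provider name.

    Assumption: annotations live under an "annotations" key, following
    the {vpn: $VPN_NAME} suggestion from the discussion.
    """
    annotated = dict(report)
    annotations = dict(annotated.get("annotations", {}))
    annotations["vpn"] = vpn_name
    annotated["annotations"] = annotations
    return annotated


def is_vpn_measurement(report):
    """True if the report carries a "vpn" annotation."""
    return "vpn" in report.get("annotations", {})


# Hypothetical report header; the field values are illustrative only.
report = {"test_name": "web_connectivity", "probe_cc": "IT"}
tagged = annotate_report(report, "example-vpn")
print(json.dumps(tagged, sort_keys=True))
```

With a tag like this, a pipeline could filter VPN measurements out of (or into) a given analysis without touching the original, untagged report.
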
16:24:55 <anadahz> darkk: do you see any issues with adding annotations to the reports?
16:27:57 <darkk> anadahz: no, but I see other issues ranging from ethical (e.g., using vpngate to test if jihadonline is blocked) to data sanity ("usual" for-profit VPN provider should probably reside in a non-filtered location to be useful to user)
16:28:33 <darkk> anadahz: what is your use-case for VPN usage?
16:29:05 <darkk> I mean use-case of doing measurements over VPN
16:29:13 <darkk> I can imagine several of them
16:30:01 <darkk> a) it's easy to ask a source to run L2/L3 VPN server, much easier than install VM with ooniprobe and scary lunix konsole
16:31:20 <darkk> b) parasite on open-proxies deployed on hacked toasters & VPNs like vpngate to gather more data (VPNGate claims to have two Turkmenistan exit nodes right now, for example)
16:31:33 <anadahz> darkk: I was thinking of contacting some VPN providers and asking them if they are OK with us using their servers to run ooniprobe tests.
16:32:07 <darkk> c) test via "commercial" VPN providers...
16:32:18 <darkk> and I totally don't get the point of (c)
16:33:46 <darkk> anadahz: can you name some examples of providers you're going to engage? what sort of data do you expect to get?
16:35:03 <anadahz> The main reason for doing this is to have a baseline comparison with the current OONI measurements, some of which have also been gathered from VPN or VPS servers.
16:36:07 <darkk> anadahz: do you want to get better control measurements having diverse "control" vantage points?
16:36:36 <anadahz> darkk: Find out if there are using the network in the country that they suppose to be running.
16:37:31 <anadahz> darkk: Check if the VPN networks are doing phishy things
16:37:34 <hellais> anadahz: if the purpose is that of having baselines we should first actually have logic in the pipeline that uses such baselines, as we currently don't use other ooni measurements as baselines, nor have we thought about how we would do that
16:39:11 <hellais> anadahz: also if the goal is that of having a baseline there is probably some better strategy for running them that doesn't necessarily require a VPN (such as deploying them on VPSs we run) and storing only the data we need for the baseline.
16:40:03 <hellais> "Find out if there are using the network in the country that they suppose to be running." - to answer this question I think probably running ooniprobe will not yield the best results
16:40:32 <hellais> I think it would be more useful to run some latency measurements to a set of endpoints in known locations and triangulating based on that
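
The latency idea above can be sketched in Python as a simple feasibility check rather than a full triangulation. It assumes RTTs are measured from the VPN exit to landmarks at known coordinates, and that signals travel at most about 200 km per millisecond in fibre (roughly 2/3 of the speed of light). The landmark list, the RTT values and the function names are all hypothetical.

```python
import math

# Assumption: ~200 km/ms one-way in fibre, so each millisecond of
# round-trip time allows at most ~100 km of distance.
KM_PER_MS_RTT = 100.0
EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def inconsistent_landmarks(claimed, measurements):
    """Names of landmarks whose RTT is too low for the claimed location."""
    bad = []
    for name, coords, rtt_ms in measurements:
        max_km = rtt_ms * KM_PER_MS_RTT  # upper bound on the distance
        if haversine_km(claimed, coords) > max_km:
            bad.append(name)
    return bad


# Hypothetical RTTs (ms) measured from the VPN exit to each landmark.
measurements = [
    ("amsterdam", (52.37, 4.90), 5.0),
    ("frankfurt", (50.11, 8.68), 12.0),
    ("moscow", (55.75, 37.62), 60.0),
]
claimed_ashgabat = (37.95, 58.38)
print(inconsistent_landmarks(claimed_ashgabat, measurements))
```

A non-empty result only shows that the claimed location is physically implausible; an empty result does not prove the endpoint is where it claims to be, since high latency is compatible with any location.
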
16:40:50 <hellais> "Check if the VPN networks are doing phishy things" - what do you mean by "phishy"?
16:41:10 <anadahz> Back in the days when the http-requests test was there, there was a vast collection of measurements from Tor; in a similar way it will be useful to have VPN measurements.
16:42:11 <darkk> IMHO, "same country" test is different from ooni tests, specific censorship is not a strict attribute of country, but geographical borderline is. Well, it's possible to find something interesting in terms of network filtering by VPN providers, but IMHO it'll be mostly True Negative tests... I just don't know. AFAIK, it's the whole point of VPN providers -- to sell "uncensored" access to the Internetz (whatever it means).
16:43:10 <anadahz> hellais: Phishy things: deployed proxies or other intermediate services that should not be deployed.
16:43:33 <hellais> anadahz: well the key here is that you need to define the question you want to answer and then figure out how to get the data to answer it.
16:44:01 <hellais> on the latency measurement thing this is a possibly relevant paper: http://crpit.com/confpapers/CRPITV102Arif.pdf
16:45:36 <darkk> anadahz: what may be the reason for VPN provider to do that?
16:45:42 <hellais> anadahz: I guess the middle box tests could be interesting to run on VPN endpoints, though I am a bit uncertain if you will be able to claim based on the data collected that it "should not be there". After all if you run a VPN service it would make sense to run some caching proxies in front of it.
16:46:06 <hellais> Clodo: probably can tell us if that is indeed the case.
16:46:51 <anadahz> darkk: Exactly, since the VPN providers are selling uncensored access it will be good to find that out. Since many of these providers operate in multiple geographical locations it will be very difficult to control or find out if they end up triggering a filtering/censorship box somewhere in transit
16:48:05 <anadahz> hellais: For instance: there are a number of VPN providers that do data collection for advertisement purposes.
16:49:30 <darkk> data collection is usually 100%-passive
16:50:28 <hellais> yeah I also think in most cases it's going to be entirely passive. You would see evidence of it only if they really screwed up how they are doing it.
16:50:34 <darkk> anadahz: btw, do you mean IP-level VPN saying "VPN" or do you mean stuff like hola / luminati.io as well?
16:52:05 <anadahz> darkk: Having measurements from there will also be interesting.
16:53:35 <anadahz> We had this talk about VPNs in the past but we were mainly limited by the low disk space in our infrastructure.
16:54:16 <anadahz> It will be very interesting to have a view with results from XXX VPN providers in explorer.
16:54:58 <anadahz> If the VPN providers accept to run measurements from their network.
16:55:42 <darkk> Should we design [OONI Certified] badge? :-D
16:55:43 <anadahz> hellais: and yes many times services screw up as ISPs do.
16:57:15 <anadahz> I suspect that if we start running ooniprobe measurements from VPNs we will have significant processing and disk space requirements.
16:58:05 <sbs> anadahz: does it make sense to do that, i.e. to use significant space for something that does not quite seem like ooni's #1 objective (imo)
16:58:40 <sbs> ?
17:01:52 <anadahz> sbs: That's why we are having this talk anyway. If it's not worth anything I can set up a private collector/test-helpers/explorer/pipeline and these measurements will not appear in ooni-explorer/ooni-measurements or stress the ooni infrastructure.
17:05:28 <hellais> It seems like a lot of work to have measurements that it's unclear exactly what is going to be done with them, but if you really think it's worth it go for it I guess
17:05:29 <darkk> anadahz: well, hellais asks absolutely correct question. Running a measurement over VPN is a tool, it's not a goal, so I suggest to convert this discussion into slow asynchronous brainstorming regarding the goals that can be fulfilled by this tool.
17:06:16 <hellais> do we have more to be discussed?
17:06:43 <sbs> darkk: +1
17:06:57 <anadahz> Hm.. well the VPNs also run on a network and perhaps will be good to have measurements run there as well.
17:10:30 <anadahz> sbs: Is it possible to find out if android/ios users are using a VPN or VPN-like service?
17:11:38 <sbs> anadahz: probably nuke knows better
17:11:43 <hellais> anadahz: in some cases it's possible I believe, but we don't do this at the moment. I think we had a ticket open about this and other related mobile specific telemetry
17:12:39 <hellais> anyways we are 12 minutes overtime, so unless there is some last urgent thing to talk about, we should defer this to tickets and async discussion.
17:12:44 <hellais> sound good?
17:12:50 <sbs> yep
17:13:05 <darkk> I'd say, let's move VPN discussion to ooni-talk@
17:13:14 <anadahz> sure!
17:14:00 <hellais> sounds good
17:14:03 <hellais> #endmeeting