17:00:04 #startmeeting
17:00:04 Meeting started Mon Aug 22 17:00:04 2016 UTC. The chair is anadahz. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:04 Useful Commands: #action #agreed #help #info #idea #link #topic.
17:00:31 Hello everyone and welcome to OONI's weekly meeting.
17:00:43 Who is around?
17:00:45 Hello oonitarians! I'm de-lurking for the meeting this week. I have a couple of things to ask when it's appropriate to do so. I added them to the pad.
17:01:42 Hi graphiclunarkid, glad that you are around!
17:02:05 Good to see you too, anadahz :)
17:02:15 * darkk is still here :)
17:02:44 here
17:02:54 * r2r0 hellais here on 3g
17:03:15 I'm here
17:03:56 Nice! Let's start with the agenda topics.
17:04:18 #topic SWUpdate/OSTree news
17:05:42 It seems that OSTree, SWUpdate, and swupd all look like a good fit for our over-the-air update mechanism.
17:07:38 Do we consider filesystem damage a risk? Do we want to have two rootfs (ChromeOS and SWUpdate schema) to be able to distribute a raw image?
17:07:41 I have installed and used OSTree, and the process seems pretty straightforward, though we may need to change the way that we build the Lepidopter RasPi image.
17:09:08 darkk: Taking filesystem damage into account is indeed needed, though SWUpdate is specifically designed with NAND storage in mind rather than SD cards.
17:09:18 Have you found any stress tests for OSTree? AGL claims that OSTree does not survive a power cut; OSTree claims it does, but it relies on the filesystem to survive a power cut.
17:10:03 -Side note- I wasn't able to build and run SWUpdate https://github.com/sbabic/swupdate/issues/16
17:11:32 AFAIK, we don't need MTD support, but I don't know if it actually makes things easier
17:12:35 darkk: I haven't really looked into any tests (or stress tests). I mainly focused on finding out how each piece of software works and if there is support for what we want to achieve: OTA updates
17:13:28 Ah! got it.
Excuse me for being paranoid about last-mile consistency.
17:14:32 moving to ENOSPC?
17:14:39 I would actually like to test SWUpdate if I manage to build and run this software!
17:15:12 #topic ENOSPC
17:16:15 I'll try to blow some dust off my embedded skills to try to build SWUpdate. But I doubt I'll test it through.
17:17:16 OONI infrastructure is officially running low on disk space :)
17:17:39 anadahz: how low?
17:18:05 Coming ENOSPC - I'm checking data stats to see if we can easily store compressed json data. Current compression factor for the largest json files is ~0.2...0.3, that may give us plenty of space within current hardware. Plan B is to attach some storage from a nearby box in the same rack (NFS?).
17:18:26 r2r0: ~8..9 days given current data flow
17:19:20 IMHO, we can't have a meaningful migration to new infrastructure within this time bound and we have to add some ad-hoc crutch to stay alive :)
17:20:21 Plan C is to drop scratchpad data. Iff we're sure we can regenerate it and/or don't need it anywhere in the pipeline and it should be treated as temporary data.
17:20:30 Yes that seems reasonable. I think what we can do as the quick fix, which I have already been doing to give us a bit more mileage the past month, is to zap the intermediate normalised stage of the pipeline
17:21:34 r2r0: was there any reason to store intermediate data ±forever instead of marking it temporary?
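A rough way to reason about the compression idea above: the sketch below estimates how long the remaining disk lasts when incoming data is stored at a given compression factor (compressed size divided by raw size, so ~0.2...0.3 as quoted). The function names and the example figures are illustrative assumptions, not measurements from the OONI hosts.

```python
def days_until_full(free_gb, daily_growth_gb, compression_factor=1.0):
    """Days of headroom left if new data is stored at
    compression_factor (compressed size / raw size)."""
    if daily_growth_gb <= 0 or compression_factor <= 0:
        raise ValueError("growth and compression factor must be positive")
    return free_gb / (daily_growth_gb * compression_factor)


def space_freed_by_compression(stored_gb, compression_factor):
    """Space reclaimed by compressing data that is already on disk."""
    if not 0 < compression_factor <= 1:
        raise ValueError("compression factor must be in (0, 1]")
    return stored_gb * (1.0 - compression_factor)


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the arithmetic.
    print(days_until_full(100, 10))            # uncompressed intake
    print(days_until_full(100, 10, 0.25))      # intake stored at factor 0.25
    print(space_freed_by_compression(1000, 0.25))
```

With a factor of 0.25 the same free space lasts four times as long, which is why compressing looks attractive as a stopgap before any migration.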
17:22:04 Also we can consider deleting the historical original data from the pipeline machine, since it's backed up in S3
17:22:26 /dev/mapper/hammerhead--vg-root 265G 244G 21G 93% /
17:22:37 /dev/mapper/chameleon--vg-root 2.0T 1.9T 143G 93% /
17:22:47 And this is after reducing reserved filesystem blocks and reducing the size of the swap partition on hammerhead
17:22:56 It seems that people love OONI and submit many more reports than before, in fact 8 times more than 3 months ago :D
17:23:35 anadahz: IMHO, that's web_connectivity that produces rather big datapoints (uncompressed, base64-encoded web-pages)
17:24:05 but that's speculation, I have no numbers right now.
17:24:06 darkk: the reason for that was that the workflow of the pipeline considers the stage complete if it sees that a file in the previous stage exists; by deleting the intermediate stages we lose the ability to selectively rerun the pipeline on some slice with resume support
17:25:00 794M /data/ooni/sanitized-reports/2016-05-22
17:25:02 5.4G /data/ooni/sanitized-reports/2016-08-22
17:25:08 darkk: yeah web_connectivity and http_requests are the bulk of the data
17:26:06 We probably should store the published reports compressed, as it's also better for people that need to do bulk downloads
17:26:48 r2r0: the webserver probably compresses it, so we're burning CPU, but not bandwidth.
17:28:04 darkk: true
17:28:18 I've also seen some discussion regarding new hosting, they suggest using VMs of different shapes and avoiding huge disk images (makes perfect sense from the OpenStack point of view). The discussion is still ongoing.
17:28:45 So... Egypt?
17:28:57 \o/
17:29:04 #topic EGYPT. What the hell is going on there?
17:29:30 Would anyone like to say something or should I explain what's going on? (this will be long)
17:30:01 TheNavigat: please go ahead
17:30:42 Okay. For a few years now the government has been trying to block certain internet material.
For example, they're currently blocking almost all VoIP services including Skype, Facebook Calls, Whatsapp Calls, etc.
17:30:46 (others can always say something)
17:31:53 But since last January things have been a little bit different. They're blocking/throttling VPNs and certain websites. Last July they suddenly blocked a big number of HTTPS websites, and they're doing this again now. The current incident started about a couple of days ago and it's on almost the same scale.
17:32:29 When we debugged what happened last July we figured that while the TCP handshake went through, the SSL handshake itself didn't pass. The packets were dropped. However, what's happening now is even worse.
17:33:04 They once tested blocking porn sites by intercepting HTTP requests, MODIFYING them so that they return a redirection header to an advertisement website. However, they unblocked that the second day.
17:33:11 They reversed that *
17:33:50 Multiple reports have been sent to hosting companies including DigitalOcean and Linode. The response is always "check with your ISP", who are not very helpful
17:34:36 The interruptions affect SSL, SSH, and the dropped packets also include the not-very-normal ones. They also throttle SSL downstreams to throttle VPNs under SSL tunnels.
17:35:15 However, the blocking effects themselves are not stable and differ from one connection to another, but for most of the blocked stuff it's the same. For example, currently laracasts.com is down, for everyone.
17:36:06 However, not everyone's VPN connection is blocked
17:36:36 TheNavigat: is the http version of `laracasts.com` up?
17:36:50 Sometimes UDP's working, sometimes it isn't. Same goes for TCP as well.
The currently stable VPN connections are XOR TCP connections and SSL tunnels, but SSL tunnels are throttled
17:37:21 anadahz: http://laracasts.com itself works but it redirects to https, so yea :/
17:37:51 The problem is that this time it affected companies hosting their servers on Linode and such, who have literally asked us to switch them to HTTP so that users can access the site
17:38:28 It isn't only laracasts btw. wzo.org.il is also blocked, for example.
17:39:25 The way packets are dropped and reordered, routes are changed, and HTTP requests are intercepted makes it definite that the government /IS/ performing DPI and MITM
17:39:28 (what is an XOR TCP connection?)
17:39:40 landers: OpenVPN TCP with XOR patch
17:40:15 And this isn't the first time the government has performed something like this. A few years ago, Google caught the Egyptian government injecting an SSL certificate to perform MITM attacks.
17:40:30 (thx)
17:40:33 TheNavigat: is the throttling/blocking happening more consistently towards certain sites than others?
17:41:30 r2r0: Throttling is uniform, SSL connection dropping isn't, but for something like laracasts and WZO it is
17:42:29 Here's the article Google released regarding the SSL MITM attempt, https://security.googleblog.com/2015/03/maintaining-digital-certificate-security.html
17:42:34 My plan is to review the ooni-probe measurements we have and collaborate with TheNavigat on gathering the evidence with and without ooni-probe (if ooni-probe is not enough). I'll probably start with those claims that can be cross-validated via other means. I hope there is still some ISP diversity in Egypt.
17:43:21 darkk: There /was/.
TE-Data represents the government and 70% of the internet users inside Egypt are on TE-Data
17:43:47 They have monopolized fiber optics connections
17:44:15 And we also discovered a few weeks ago that other ISPs have to redirect their traffic through TE-Data before the packets leave the country
17:44:41 As in mtr literally shows a hop with a te-data IP when we're connected to another ISP
17:45:00 I'm not sure if all ISPs do that though, or if it's just the one I'm currently connected to
17:45:07 Redirecting all the external traffic through a single backbone ISP sounds familiar to me :-)
17:45:27 Yes that sounds good. If we have somebody that can run some tests in Egypt it would be useful to also enable pcap capture in ooniprobe to better understand at what point in the handshake the dropping is happening and how
17:45:38 The plan right now is to basically release all this on a trusted portal so that it can be covered by newspapers and such
17:46:12 r2r0: We've already set up ooniprobe to run twice a day here, I'll be sending the logs to anadahz (and everyone else interested) regularly
17:46:30 TheNavigat: ah great!
17:46:33 And regarding the newspapers bit, we already have people waiting for this to get released
17:46:56 I don't think it's blocking as much as inducing chaos among the developer community
17:47:26 And it's working
17:48:08 People are screaming on technical groups on Facebook and such. No one understands what's going on and no one understands how to fix it.
17:48:09 TheNavigat: the results will be sent to us automatically, no need to manually send logs to anadahz.
17:48:35 r2r0: I heard they take some time to be uploaded
17:48:39 r2r0: tor seems to be blocked, maybe the bouncer is blocked too
17:49:28 darkk: I tried using Tor personally without ooniprobe, it didn't work
17:49:38 TheNavigat: So you think that this is intended to make developers fall back to HTTP?
17:49:40 TheNavigat: they are uploaded immediately, they just take a while to get analysed automatically, but we can still retrieve them
17:50:19 I'm not a Tor guy (actually I've used it only once before) so I'm not exactly sure why, but Tails didn't show a working connection to the entry servers
17:50:22 entry relays *
17:50:33 anadahz: I don't exactly think so. Here's the case, this will get political.
17:50:36 * darkk remembers github being blocked in .ru, that was a mess :-)
17:50:55 The army is ruling everything here, and one of the things they don't control (yet) is internet business.
17:51:20 I do see reports coming in from Egypt so it should be working fine, though if it's not it's better to change ooniprobe to use the https or cloudfront collector
17:51:44 Let me make this clearer. When the army needs money in the buildings business, they take all the business from the private companies, and make a shitton of money out of it
17:51:50 Also without a working control channel the web-connectivity test will not work properly
17:51:57 They've recently started doing that to the internet guys, too.
17:52:32 r2r0: TheNavigat has been sending other logs and reports such as PCAPs
17:52:53 In my humble opinion I think it's rather destroying the development business itself. It might be to force them to fall back to HTTP, but I think that isn't the biggest problem they have right now.
17:53:18 r2r0: Well, that's the best we can do :/
17:54:49 I can't say for sure why this is happening and I can't just guess their intentions, but in the end the one thing I can say is that we're getting screwed
17:55:24 Yeah it seems like a pretty illogical move. Also many site owners that have an https-only default may not notice the fact they are losing EG users
17:56:08 What happens for big sites like facebook, twitter, google, etc.?
17:56:42 Do any of these exhibit the SSL handshake dropped behavior?
17:57:02 r2r0: A colleague once noticed a "your connection is not secure" on google
17:57:09 That happened once yesterday
17:57:18 Otherwise nope, those are working perfectly
17:57:25 That's sketchy
17:58:18 Anyways I am quite curious to look more into this once I return to a computer. When are you aiming to publish this analysis?
17:58:18 For some reason most of the targeted IPs seem to be digitalocean/Linode servers
17:58:30 r2r0: I'm waiting for you guys to publish it :)
17:59:07 r2r0: We have started collecting useful data for this!
17:59:44 I'm compiling a list of HTTPS websites
17:59:46 TheNavigat: ah great! (Been afk the past week :P)
18:00:08 r2r0: I got on the IRC less than 30 hours ago, you're good :)
18:00:49 anadahz: excellent
18:00:50 We can run some OpenVPN tests and use the WhatsApp test.
18:03:03 Right now UDP doesn't work, nothing after this
18:03:04 Mon Aug 22 20:02:28 2016 TLS: Initial packet from [AF_INET]188.166.23.247:1194, sid=89b7c209 cc4cab39
18:03:11 TheNavigat: are you aware of any censorship circumvention tools being blocked?
18:03:28 TCP works perfectly
18:03:39 Yesterday UDP worked perfectly, so yea
18:04:17 And on other days TCP doesn't work
18:06:41 ok, do we have anything valuable regarding Egypt right now? should we move to the next topic or postpone the rest till next meeting?
18:07:16 Next topic it is, Norwegian court :)
18:08:18 OK
18:08:28 #topic Adding URLs blocked by Norwegian court orders to the Citizenlab corpus
18:09:51 Hi - I added that one.
18:09:56 Sorry, was AFK.
18:10:42 I wanted to ask about adding sites blocked by court order in Norway to the corpus of URLs being used by ooniprobe - particularly lepidopter.
18:10:54 References: https://torrentfreak.com/pirate-sites-must-pay-legal-costs-of-own-blockade-court-rules-150902/ and https://torrentfreak.com/expanding-pirate-site-blocks-spark-censorship-fears-160714/
18:11:52 graphiclunarkid: There is an excellent guide to adding sites to the URL lists used by ooniprobe and lepidopter
18:12:01 (searching for the link)
18:12:07 I now have a RPi running anadahz's lepidopter image on my home ISP so I can check whether this blocking is happening in practice, at least to some extent, but should I just be doing the testing manually or is it best to add URLs to the corpus for general testing?
18:12:24 anadahz: Thanks :)
18:12:56 https://ooni.torproject.org/get-involved/contribute-test-lists/
18:13:03 graphiclunarkid: what is the size of the url list?
18:13:23 graphiclunarkid: Thanks for running ooniprobe and lepidopter :)
18:14:27 r2r0: Only about 15 URLs, but the order also mandates that all subdomains should be blocked. https://en.wikipedia.org/wiki/Internet_in_Norway#Internet_censorship
18:15:45 anadahz: I also wanted to ask how I check that my tests are being submitted correctly. I took a look at https://explorer.ooni.torproject.org/country/NO but I realise it might take a while for results to show up there. Is there somewhere else I can check?
18:15:55 graphiclunarkid: ah okay in that case they should for sure all be added
18:16:29 r2r0: OK.
18:17:50 r2r0: Are the test-lists scheme-sensitive i.e. do I have to specify http or https correctly?
18:18:05 graphiclunarkid: it takes ~1 day for the reports to be processed and show up in https://explorer.ooni.torproject.org and https://measurements.ooni.torproject.org/
18:19:23 graphiclunarkid: yes, if the site supports https you should specify it as such, otherwise http
18:19:36 anadahz: OK - well the RPi has been running for about a week now so something is wrong somewhere. Probably the reporting from my end. I will take a look.
18:20:08 r2r0: The blocking order doesn't specify, so should I just inspect each site and add the schemes it's using, or would it be reasonable to add both schemes for all sites just in case?
18:21:10 graphiclunarkid: please submit any bugs at https://github.com/thetorproject/lepidopter/issues
18:21:27 OK, I guess I just wanted to mention that you can expect some results from Norway to start trickling in from now on.
18:21:28 * graphiclunarkid doesn't want to take up much more of everyone's time with this - let's move on.
18:22:47 graphiclunarkid: you should just pick one. The default if you don't know should be http
18:23:05 graphiclunarkid: If they have only domains you can add http since by default the site will redirect.
18:23:55 In the Greek blocklist the https links were timing out
18:24:25 OK - I'll add them all as HTTP unless I know otherwise.
18:24:56 Though by default they should be showing a block page.
18:25:12 Not sure what to do about the subdomains - but I guess it's enough to track the blocking of the apex domains for now. They do all show a block page as far as I can tell.
18:25:13 OK next topic?
18:25:59 graphiclunarkid: Do you have the link to the blocked URLs?
18:26:44 anadahz: There are two PDFs of court decisions but no itemised list linked online. I have created a text-file list for use with ooniprobe though.
18:27:45 I would suggest adding the subdomains to the URL list as well.
18:28:27 It would be awesome to have a link to the PDF files in the pull request if the Court publishes verdicts online.
18:29:13 I can't list the banned subdomains. The order instructs that all subdomains be banned but does not specify any. It's effectively a block on example.com and *.example.com
18:30:02 I can include the PDFs in the PR - though they are not hosted by an official source. Amazingly the Oslo district court doesn't seem to publish its decisions online for public consumption any more!
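The convention agreed above (one entry per site, defaulting to http when the scheme is not known) could be sketched as follows. The helper names are hypothetical, the trailing-slash normalisation is an assumption rather than a documented test-list requirement, and example.com / example.org are placeholders, not domains from the court order.

```python
from urllib.parse import urlsplit, urlunsplit


def to_test_url(entry, default_scheme="http"):
    """Turn a bare domain or a URL into a single normalised URL.

    Entries without a scheme get default_scheme, mirroring the
    "default to http if you don't know" advice.
    """
    entry = entry.strip()
    if not entry:
        raise ValueError("empty entry")
    if "://" not in entry:
        entry = default_scheme + "://" + entry
    parts = urlsplit(entry)
    path = parts.path or "/"  # assumption: give apex domains a "/" path
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))


def build_test_list(entries):
    """Normalise entries, skip blanks, and drop duplicates in order."""
    seen = set()
    urls = []
    for entry in entries:
        if not entry.strip():
            continue
        url = to_test_url(entry)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


if __name__ == "__main__":
    for url in build_test_list(["example.com", "Example.com ", "https://example.org/a"]):
        print(url)
```

Wildcard blocks such as *.example.com cannot be enumerated this way; only the apex domain and any explicitly known subdomains end up in the list.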
18:31:43 Here's the most recent order, from June this year: http://svw.no/contentassets/3da8e8b86a93471586bb91d4331624d6/20160622-kjennelse-fra-oslo-tingrett.pdf (the appendix at the end contains the URL list)
18:33:02 ok, next topic?
18:33:59 It seems that more and more EU countries are issuing blocklists and filtering URLs... :/
18:34:24 #topic Current status of OONI explorer
18:34:48 anadahz: I think you just answered my question about that above :) Next topic?
18:34:55 OK
18:35:16 #topic Monitoring censorship in the Russian Federation
18:35:32 My last one :)
18:35:52 I see the Citizenlab URL list for Russia was last updated nearly 2 years ago.
18:36:25 There's a lot of censorship activity in Russia right now. The government is proactively blocking many websites. The blocks seem to be IP-based rather than domain-based.
18:36:42 I declare an interest: this is affecting the websites of some clients at my $DAYJOB.
18:37:17 rublacklist.net is a well-maintained database of sites censored by law in the Russian Federation.
18:37:22 graphiclunarkid: that depends on the ISP. There is SNI-based https-blocking & targeted URL-based HTTP-blocking.
18:37:59 other ISPs have to null-route whole IPs due to lack of "intelligent" blocking sw/hw
18:38:01 darkk: OK - so not all ISPs are blocking by IP - however there is some of that going on.
18:38:32 Clearly this is a problem when multiple sites are hosted on the same IPs - I read somewhere that ~ 30% of Cloudflare IPs are on the blocklist!
18:38:54 rublacklist.net has an API here: https://reestr.rublacklist.net/article/api
18:39:22 I'm wondering what we can do to get increased visibility of web censorship in Russia on the ooni side by updating the list of URLs we're testing.
18:39:31 there is also http://ozi-ru.org/ (sorry, in Russian) measuring cross-boundary connectivity using RIPE databases
18:39:40 (Leaving aside for the moment the fact that we don't seem to have any active probes in Russia right now...)
18:39:57 darkk: Ooh, useful.
Thanks.
18:40:10 IMHO rublacklist.net is exaggerating.
18:40:15 darkk: We have some Russian-speakers on our staff.
18:40:40 graphiclunarkid: send them my Privets :)
18:41:12 darkk: If our interest is in getting a list of URLs at risk of censorship so that we're conducting meaningful tests, though, might rublacklist.net not be a useful source?
18:41:32 yep, it's likely a good source of URLs
18:41:43 At worst we could be validating their blocklist data.
18:42:09 My agenda here is that I want to monitor connectivity for our clients without creating a test-list of just their URLs, thus outing them as our clients ;)
18:42:41 but 1) not every URL in the registry is actually blocked, as far as I know 2) it's interesting to verify rublacklist's claim regarding IP-based blocking
18:42:54 darkk: Indeed.
18:43:06 How many URLs are in rublacklist.net ?
18:43:19 I tried to promote ooni-probe to one of the active LUGs, they were a bit afraid of informed consent being too vague :-)
18:43:47 anadahz: ~37k
18:47:49 If we get some RU probes we can perhaps check the percentage of IP blocking versus domain/URL-based blocking
18:49:28 I can ask some people about establishing probes in Russia. No promises though.
18:49:58 graphiclunarkid: I'm actually in Russia at the moment, that's why it seems to me that rublacklist.net is exaggerating. But my ISP may be an "intelligent" one, so I don't want to make any broad claims.
18:50:47 IMHO, more probes @ .ru will be useful for sure.
18:51:17 darkk: I'm just learning about the details of the situation in Russia. Any information at all is interesting to me right now!
18:52:20 Of course our clients also lack this information. Some know their sites are on the blacklist. Others suspect they might be victims of "collateral censorship" due to sharing IPs with censored sites - however we have no data that can confirm that.
18:53:22 I guess I'm raising it here to say that we're interested in exploring this issue further, can possibly offer help in doing so, and to ask whether other oonitarians are keen.
18:54:40 I hope they do not have to move to hidden services due to being blocked =)
18:57:08 graphiclunarkid: Thank you for raising awareness! Updating the test list for RU will be the first step, since every ooniprobe in RU will be testing this list.
18:59:12 anadahz: Yes - that's why I'm interested in updating the central lists as opposed to using a custom one.
18:59:13 Perhaps collecting the URLs that people see as blocked will be the first step.
18:59:51 graphiclunarkid: are you comfortable with discussing your questions in public at #ooni or on the ooni-talk@ mailing list?
19:01:20 darkk: I think so, yes, in general terms. I would need to avoid revealing the identities of our clients though.
19:06:22 How about if we were to include the IP addresses of our CDN in a test-list for ooniprobe? We're more concerned about IP-based blocking than we are about domain-based right now. Would that tell us useful things about the extent of that in Russia?
19:07:41 IMHO that's going to be quite a long discussion (as we have near-to-zero probes in Russia right now), should we move to the mailing list or continue on IRC?
19:08:12 You're right. We should move it to the list.
19:08:51 I will discuss this again internally and then send an email. Which list is best for the discussion?
19:08:54 graphiclunarkid: AFAIK, most of the blocking happens at the last mile in Russia, I've seen null-routing only twice, so we have to place plenty of probes.
19:09:17 yes we can also resume this discussion in next week's meeting
19:09:43 graphiclunarkid: you may want to run several tests using RIPE Atlas HTTP and/or SSL checks, there are ~400 active RIPE Atlas probes in Russia
19:09:54 that may give some baseline
19:10:42 darkk: I'll pass on the link to ozi-ru.org internally. Thanks again for that.
19:11:50 graphiclunarkid: ooni-talk is a good fit, though most people read the ooni-dev mailing list
19:12:12 it was ~500 RIPE probes three months ago: https://twitter.com/mathemonkey/status/735951606571802624
19:14:53 anadahz: Thanks. I guess I'll aim for ooni-talk then.
19:16:05 darkk: Wow - that's quite a change!
19:16:27 * darkk hopes he's not the reason for the meeting taking twice as long the second time. We need more data to be sure :-)
19:17:29 Nah - that'll be me taking up too much of everyone's time. Thanks for your patience, folks!
19:19:28 so, unless anyone has anything else to tell......
19:19:37 OK it seems that we are a "bit" overtime :)
19:19:56 If no one has anything else to say we can end this meeting.
19:22:11 Thanks everyone for attending!
19:22:22 #endmeeting