16:59:56 #startmeeting OONI dev gathering 2016-04-18
16:59:56 Meeting started Mon Apr 18 16:59:56 2016 UTC. The chair is hellais. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:59:56 Useful Commands: #action #agreed #help #info #idea #link #topic.
17:00:02 greetings people!
17:00:08 who is here?
17:00:19 * sbs is here
17:01:10 * willscott waves
17:01:30 hello
17:04:31 excellent
17:04:36 what have you been up to last week?
17:05:09 I was offline last week. working on catching up on everything
17:08:02 I mainly worked on the web_connectivity branch, that is starting to take a pretty decent form at this point. It's quite frustrating though how accessing the same site multiple times in a brief period of time will yield inconsistent results.
17:08:35 what part is inconsistent?
17:09:59 willscott: like that you will try to do a TCP connect to the IP of the site on port 80 and 3 times you can establish it and once in 4 you get a connection refused
17:10:27 or you will do the same exact request multiple times and on occasion you will get a different response
17:10:47 and these behaviors are all due to server-side issues and not to tampering
17:11:50 so I am making the blocking heuristics a bit more lax and consider, for example, tcp/IP blocking to happen only if both the separate TCP connect + the http request (that includes a TCP connect) fail
17:14:22 here. got a WIP branch for adding an https endpoint to the oonibackend
17:14:23 eof
17:15:16 nuke: are there any updates from iOS land?
17:15:29 Focused on: Audit, rebuild and test OONI sysadmin ansible recipes and docker build images, integrate some tests to know when ooni-probe and ooni-backend fail to install, still working on a safe upgrade solution for live (canonical) ooni-backend --EOF
17:15:49 also i think hellais wants me to get oonibackend working under some other twisted process manager?
17:15:50 I have mered support for HTTPS minus certificate validation to MK.
This is a necessary step to support an OONI HTTPS collector. I have also done general MK improvements and iterated over the NDT prototype. EOF
17:16:02 s/mered/merged/
17:16:46 iOS App 0.1 is practically done. Now working on Android interface to make it like the iOS version
17:17:10 Btw, hi everyone :)
17:17:37 hellais: doing a TCP connect to the IP of the site on port 80 can introduce many false positives since most of the websites are running in shared hosting environments
17:18:37 landers: regarding the process manager, you should look to see how twistd is invoked here: https://github.com/TheTorProject/ooni-probe/blob/feature/webui/ooni/webui.py
17:19:59 anadahz: the purpose of doing the TCP connection is to verify if the blocking is just IP:port based or if there is knowledge of the HTTP protocol
17:20:26 no data is actually sent, I just connect and tear down the connection immediately after
17:21:44 hellais: so if the HTTP request returns a block page and a TCP connection to port 80 is successful on a given input what would be the result?
17:22:34 anadahz: it depends if the dns response is consistent or not with the control
17:22:49 if the dns responses are consistent then we would flag that as being blocked due to 'http'
17:23:05 if the dns responses are inconsistent then we would flag it as being blocked due to 'dns'
17:23:31 hellais: and if any of these checks fails?
17:24:01 anadahz: what do you mean if they fail?
17:24:09 like that you can't get a control measurement?
17:24:22 hellais: if tcp check fails but dns and http succeed what would be the outcome?
17:25:11 ^ from the probe side
17:25:23 hellais: is there a reason why the tcp measurement is not taken opportunistically while connecting for performing the http test?
17:25:35 anadahz: that blocking is happening due to tcp_ip based blocking
17:25:38 https://github.com/TheTorProject/ooni-probe/blob/feature/web_connectivity/ooni/nettests/blocking/web_connectivity.py#L283
17:25:50 ^^ here you can see the logic for determining blocking
17:27:21 hellais: maybe I'm confused with your explanation at willscott about the inconsistent parts
17:27:52 sbs: I was evaluating that possibility at the beginning, but for one it's hard to hook the call to socket.connect() inside of the twisted agent (it's buried very deeply inside of the code for the http Agent) and second I actually think it's more robust to do these checks twice, since I noticed that in some cases you will have the tcp connection failing, but in the end the http request actually works fine
17:27:57 (and this is due to server-side issues)
17:28:25 hellais: I see quite some complexity there so I guess we should be a bit careful of how we interpret web_connectivity reports
17:29:35 hellais: understood... from the point of view of not doing strange things to reduce fingerprintability, I think it would be better to perform one connection, but I understand the Twisted related issues...
17:30:14 hellais: since in MK it's doable to do connect and tcp without going mad, I think we should do just one connection when we implement web connectivity for MK
17:31:02 hellais:
17:31:51 sbs: yes I agree that if it's doable to do it cleanly then it makes sense to re-use the connection. It's just that I don't want to add another layer of hacks on top of the existing hacks for the twisted Agent.
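The classification rules described between 17:11:50 and 17:23:05 can be sketched roughly as follows. This is only an illustration of what was said in the meeting, not the code at the linked web_connectivity.py; the function name `classify_blocking` and its four boolean parameters are hypothetical.

```python
from typing import Optional


def classify_blocking(dns_consistent: bool,
                      tcp_connect_ok: bool,
                      http_request_ok: bool,
                      http_matches_control: bool) -> Optional[str]:
    """Return 'tcp_ip', 'dns', 'http', or None when nothing is flagged.

    Hypothetical sketch of the heuristics discussed in the meeting.
    """
    # Lax TCP/IP rule (17:11:50): flag only when BOTH the standalone TCP
    # connect and the HTTP request (which includes a TCP connect) fail.
    if not tcp_connect_ok and not http_request_ok:
        return "tcp_ip"

    # The HTTP request worked but the response differs from the control
    # (for example a block page): attribute it using DNS consistency.
    if http_request_ok and not http_matches_control:
        return "http" if dns_consistent else "dns"

    # Everything else, including a lone TCP connect failure followed by a
    # working HTTP request, is treated as a server-side inconsistency.
    return None
```

With these rules a single refused TCP connect followed by a normal HTTP response is not reported as blocking, which matches the server-side flakiness described at 17:09:59.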
17:32:58 hellais: yeah, I agree that with ooni it's better to avoid further complicating the agent
17:32:58 sbs: I think though we should perhaps have some retry for anomalous measurements, like consider blocking to be happening only once we have tried connecting to the site in question 2-3 times
17:34:20 anadahz: yes I think the interpretation of the derived 'blocking': 'XXX' key should be taken with care. I bet that as we start looking at real results though we will see if there are ways of improving the detection logic
17:36:06 hellais: returning to what you said earlier, is my understanding correct that you have seen a pattern where tcp is more likely to fail than http?
17:36:58 sbs: in my very limited testing yes
17:37:26 I have noticed inconsistencies with http response pages as well, but that is not very prevalent
17:37:34 and it fits within 2 types of categories
17:38:00 1) The site is running some bad code that will every X requests return 5xx status codes
17:38:40 2) The site is behind a mis-configured load balancer and you sometimes get a certain page and other times you get another (I saw this happening quite a bit with domains that are parked on go-daddy)
17:38:50 hellais: this is very interesting! off the top of my head I cannot immediately think of a reason why something like this could happen
17:45:38 relevant to our discussion a real case: https://paste.debian.net/439292/
17:46:51 anadahz: a real case of tcp vs http failure?
17:49:00 sbs: http "failure"
17:52:21 sbs: and the TCP connect test: https://paste.debian.net/439294/
17:53:41 there is no DNS hijacking happening
17:54:47 hellais: I would like to have this input URL running in web_connectivity test to do a comparison
17:55:44 anadahz: mmm, now /me is confused: does this tell that connect is successful and http is hijacked by a proxy?
17:56:00 so DNS: OK, TCP connection on port 80: ok, HTTP: 302 redirect
17:56:23 anadahz: do you also have a dns consistency test result?
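The retry idea raised at 17:32:58, combined with the connect-and-tear-down check described at 17:20:26, might look something like the sketch below. The helper names `tcp_connect_once` and `looks_blocked` and the default of 3 tries are assumptions for illustration; neither ooni-probe nor MK is claimed to implement it this way.

```python
import socket
from typing import Callable


def tcp_connect_once(host: str, port: int, timeout: float = 5.0) -> bool:
    """Open a TCP connection and close it right away; no data is sent."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def looks_blocked(attempt: Callable[[], bool], tries: int = 3) -> bool:
    """Report an anomaly only if every one of `tries` attempts fails.

    `attempt` is any zero-argument callable returning True on success.
    """
    for _ in range(tries):
        if attempt():
            return False  # One success is enough to rule out blocking.
    return True
```

A probe could call `looks_blocked(lambda: tcp_connect_once("example.org", 80))`, so that a site refusing one connection in four is unlikely to be flagged.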
17:57:22 anadahz: okay, but we can explain this with a transparent proxy, right? the other way round (tcp connect fails and http is ok) is more complex to explain
17:58:10 [ooni-backend] hellais pushed 1 new commit to feature/web_connectivity: https://git.io/vwYqK
17:58:10 ooni-backend/feature/web_connectivity dc52a39 Arturo Filastò: Add monkey patch for bug in twisted RedirectAgent:...
17:58:47 yeah I agree this is the "expected" behavior when a transparent proxy or some filtering technology that is looking at HTTP is present
17:59:04 hellais: sbs running a dns consistency test now
18:00:07 ^ https://paste.debian.net/439297/
18:01:05 tampering: false
18:01:51 yeah it looks ok
18:02:06 anyways I guess we should move into next steps since we are already overtime
18:02:23 so this would be a false positive for web connectivity ?
18:03:22 I will continue work on the web_connectivity test, work on the measurement kit test: https://github.com/measurement-kit/measurement-kit/issues/403, adding SSL cert validation and possibly also look into the JNI hooks
18:03:34 anadahz: no, this would return blocking: true
18:03:35 err
18:03:40 blocking: http
18:04:11 anadahz: it would be this case here: https://github.com/TheTorProject/ooni-probe/blob/feature/web_connectivity/ooni/nettests/blocking/web_connectivity.py#L306
18:05:02 hellais: so all HTTP 302 redirected input will be mentioned as 'blocking: http' ?
18:06:26 like all HTTP URLs that redirect to HTTPS
18:09:04 I will work on MK and specifically on NDT test and on adding support for retrieving IP address from Ubuntu and for performing geolocation using geoip
18:09:13 EOF
18:09:34 next steps: continue work on OONI infrastructure sysadmin and test tasks -EOF
18:09:45 anadahz: no, it follows redirects until the end, computes the body length and compares the body length from the client and backend
18:09:52 s/backend/test_helper/
18:13:29 rumor has it that by using as factor for body length difference 0.7 this leads to a true positive ratio of 95% (http://www3.cs.stonybrook.edu/~phillipa/papers/JLFG14.pdf)
18:15:17 interesting ^
18:16:42 if there are no more closing remarks
18:17:29 I would say we adjourn, thanks for attending!
18:17:32 #endmeeting
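The body length comparison mentioned at 18:09:45, with the 0.7 factor from 18:13:29, could be sketched as below. How the ratio is computed (smaller length divided by larger) and the direction of the threshold are assumptions; the log only states that the body lengths from the client and the test helper are compared after following redirects. The function name `body_length_match` is hypothetical.

```python
def body_length_match(experiment_len: int, control_len: int,
                      factor: float = 0.7) -> bool:
    """Return True when the two body lengths are close enough to match.

    Assumption: closeness is the ratio of the smaller length to the larger
    one, and the bodies match when that ratio exceeds `factor`.
    """
    if experiment_len == 0 and control_len == 0:
        # Two empty bodies are trivially the same length.
        return True
    proportion = min(experiment_len, control_len) / max(experiment_len, control_len)
    return proportion > factor
```

Under this reading, an HTTP URL that redirects to HTTPS and ends on the same page from both vantage points would match, while a short block page compared against a full site would not.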