15:58:59 <phw> #startmeeting anti-censorship team meeting
15:58:59 <MeetBot> Meeting started Thu Sep 10 15:58:59 2020 UTC.  The chair is phw. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:58:59 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
15:59:02 <phw> hi everyone!
15:59:06 <hanneloresx> hi!
15:59:09 <phw> here's our meeting pad with the agenda: https://pad.riseup.net/p/tor-anti-censorship-keep
15:59:39 <phw> let's start with the announcements
16:00:06 <phw> i started filing a bunch of "first contribution" issues for rdsys:
16:00:10 <phw> https://gitlab.torproject.org/tpo/anti-censorship/rdsys/-/issues?label_name%5B%5D=First+Contribution
16:00:24 <phw> these are easy-ish issues to get started
16:00:42 <hanneloresx> awesome
16:01:53 <phw> we also created a new "research" label that we can use for issues that have a research component
16:01:56 <phw> https://gitlab.torproject.org/tpo/anti-censorship/censorship-analysis/-/issues?label_name%5B%5D=Research
16:02:32 <phw> and finally: there will be a "pluggable transport implementers meeting", organised by internews, next month
16:02:52 <phw> here's how one can sign up: https://docs.google.com/forms/d/e/1FAIpQLSd778O0ChoPwvilhpSMUvufsfE-6HzkHtG6kO6aDWI8TcQiAg/viewform
16:03:29 <phw> (i got the link from karl, one of the organisers, and he asked me to forward it to folks who would find it interesting)
16:04:58 <phw> moving on. we have a discussion item about a "tech talk"-style blog post. i was thinking about summarising the technical design of rdsys and blog about it, in the hope that we get some feedback that way
16:05:28 <phw> even if there won't be useful feedback, at least we would get good documentation out of it
16:05:38 <phw> any thoughts on that?
16:06:31 <phw> *crickets* means "all good"!
16:06:35 <hanneloresx> that sounds useful to me.  That's also the kind of thing we can link to the chinese community on solidot
16:06:54 <phw> hanneloresx: yes, that's a great idea
16:07:12 <phw> i mostly want to hear from other projects if rdsys can be useful to them
16:08:55 <phw> ok, moving on to our 'needs review' section
16:09:58 <phw> https://gitlab.torproject.org/tpo/anti-censorship/bridgedb/-/issues/31871 can wait until cohosh is back but if anyone wants to look at pretty charts and suggest follow-up analyses: this is your ticket!
16:10:39 <phw> https://gitlab.torproject.org/tpo/anti-censorship/trac/-/issues/31874 also can wait until cohosh is back. unless anyone wants to review a pile of go code ;)
16:11:53 <phw> i'll wait for cohosh, then. let's move on to our reading group
16:12:16 <phw> today's paper is httpt and sf_, its author, joined us today!
16:12:48 <hanneloresx> hi sf_!
16:12:59 <phw> here's the paper: https://www.usenix.org/system/files/foci20-paper-frolov.pdf
16:13:06 <sf_> hi hi
16:13:12 <phw> thanks for coming sf_ :)
16:13:28 <phw> and here's a very short summary of the paper: httpt is a circumvention protocol that's based on a proxy that hides behind a web server. clients can reach this proxy by requesting a secret url or by deriving a special tls master secret that the proxy can recognise. to a censor, httpt looks like a long-lived http connection.
16:13:58 <sf_> > special tls master secret that the proxy can recognise
16:14:16 <sf_> clients reach the proxy by requesting a special link in the URL
16:15:00 <dcf1> so how the httpt design is different from the former httpsproxy design:
16:15:23 <dcf1> HTTP already has a built-in way to express a proxy: you send a request with a CONNECT method
16:15:39 <dcf1> You can even put it behind authentication in a standard HTTP way
16:16:21 <dcf1> The problem is that it's not probe resistant: if someone tries to use the server as a proxy, they get a 407 (I think) "Proxy Authentication Required" status code that reveals the presence of a proxy
16:16:22 <sf_> it's done that way because we hide the proxy behind a regular, unmodified Apache or nginx server, so that censors end up probing that server, and we simply configure the web server to proxy things to the httpt server behind it
16:17:02 <sf_> dcf1 is correct
16:17:07 <dcf1> sf_ wrote https://github.com/sergeyfrolov/httpsproxy, which is a probe-resistant version of a CONNECT proxy, but it requires custom code for each web server and only works for Caddy
16:17:42 <phw> i have a question related to the "switch tls master secrets" trick: the switch is observable to the censor, right? if so, how common is it for tls implementations to switch master secrets after the negotiation?
16:17:44 <dcf1> The advantage of HTTPT is that it works with any web server (also, you have more design freedom to do things like add padding, because the server-side proxy is not the same process as the web server)
16:19:13 <sf_> "switch tls master secrets" is an alternative design proposed in the paper; using a secret in the URL is a slightly better design. I am not aware of how the switch would be observable to a censor
16:20:13 <phw> oh, i see! i assumed that the switch is observable in tls's record layer but i may be wrong
16:20:38 <sf_> "switch tls master secrets" is not "TLS renegotiation", we just swap the secrets on both sides at some point in the middle of a live connection
16:20:43 <dcf1> Yeah I think it manifests as encrypting with a different key, which should not be observable
16:21:00 <phw> gotcha, thanks
16:22:52 <phw> while reading section 3.2 (on creating web site content) i was reminded of how people experimented with gpt-3
16:23:11 <dcf1> How would something like HTTPT integrate with rdsys?
16:23:45 <sf_> gpt-3-generated websites would be an exciting and very useful direction of future research if someone has the expertise and cycles
16:23:47 <phw> and how feasible it would be to tell a gpt-3 instance to "create a blog post that talks about golden retriever puppies" and then use the result as decoy content
16:23:50 <dcf1> HTTPT doesn't work well with BridgeDB assumptions because the web server proxy is not necessarily the same host / IP address as the actual Tor bridge, which is what BridgeDB wants to see
16:24:25 <phw> dcf1: as i understand it, there are several deployment models to integrate httpt in tor
16:24:43 <dcf1> You could even have all HTTPT proxies forwarding to one single Tor bridge, but the resource you want to distribute is web site URLs, not the bridge IP address.
16:25:04 <phw> rdsys is meant to expose an api that lets proxies register themselves, eg by sending an http request to bridges.torproject.org/register
16:26:55 <phw> dcf1: right, that would make httpt work similarly to how snowflake currently works: an httpt server knows what bridge to talk to but it's not started by a tor process
16:28:17 <phw> a tor client speaking httpt seems more straightforward: we could add sf_'s existing code to obfs4proxy
16:29:28 <dcf1> So a call to run HTTPT bridges would look like
16:29:47 <dcf1> (the simplest I can imagine)
16:29:52 <dcf1> Set up a centralized bridge
16:30:20 <dcf1> Produce documentation for generating a secret URL and setting up forwarding to the bridge for a variety of web servers
16:30:42 <dcf1> Get people to send their web server URLs and secret paths to us (this could be automated using rdsys)
16:30:55 <dcf1> EOL
16:33:00 <phw> right, we could teach the httpt server to take as input 1) the centralised bridge and 2) the secret URL, and it then automatically registers the url with rdsys
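The automatic registration step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the endpoint is the example phw mentioned (bridges.torproject.org/register), while the https scheme, the JSON body, the field names, and the function name build_registration are all hypothetical and not part of any rdsys API discussed in the meeting. The request is only constructed, not sent.

```python
import json
import urllib.request

# Endpoint mentioned as an example in the discussion; the https scheme is an assumption.
REGISTER_URL = "https://bridges.torproject.org/register"


def build_registration(server_url, secret_path, bridge):
    """Build (but do not send) a registration request for an httpt server.

    The payload layout and field names are hypothetical placeholders.
    """
    payload = {
        "url": server_url,           # public URL of the volunteer's web server
        "secret_path": secret_path,  # secret path that is forwarded to the httpt server
        "bridge": bridge,            # centralised bridge the httpt server talks to
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        REGISTER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_registration("https://example.com", "/super-secret-url", "192.0.2.1:443")
    print(req.get_method(), req.full_url)
```

Sending the request would be a separate step (for example with urllib.request.urlopen), once the actual registration interface is agreed on.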
16:34:35 <phw> for what it's worth, i think httpt is very promising and i'd like to eventually get it into tor browser. dcf1's deployment strategy may be a good way to get started with this
16:34:44 <phw> sf_: do you have any thoughts on that?
16:37:06 <sf_> yeah, I would be interested in that too
16:37:07 <sf_> so there were 2 unsolved questions wrt HTTPT, first one being "where do we get the content of the website, can we use gpt-3 or what", but with the deployment strategy of asking volunteers it is no longer a concern
16:38:05 <sf_> another one is how do I implement a TurboTunnel. I tried to roll my own implementation with no luck, as my productivity took a nose dive in the pandemic
16:38:29 <sf_> adding functionality to speak Tor PT is easy
16:39:21 <phw> so, if i have an existing web server (with "real" content), i simply need to tell the web server to forward all traffic to /super-secret-url to the httpt server that's running next to the web server, right?
16:40:00 <dcf1> Yeah HTTPT itself is mainly the probe-resistance design; there is still plenty of room for design of the internal protocol: padding, session continuity, whether to wrap in encryption between the web server and the bridge
16:40:20 <sf_> yeah, I already have instructions on what lines to add to nginx/apache configs to make that happen
16:40:27 <dcf1> phw: right
16:40:32 <phw> sf_: that's great, thanks
16:40:58 <dcf1> You configure it as if you were configuring WebSocket forwarding, but it turns out that web servers don't inspect the forwarded connection, so you don't even have to use WebSocket framing inside.
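The forwarding rule discussed above can be modelled as a small routing decision. This is a toy sketch, not the actual nginx/Apache configuration sf_ refers to: the path /super-secret-url is the placeholder from the conversation, the exact-match rule and the function name route are assumptions, and a real deployment would express this in the web server's own config with a long random path.

```python
import hmac

# Placeholder path taken from the discussion; a real deployment would use a long random value.
SECRET_PATH = "/super-secret-url"


def route(path):
    """Return "forward" for the secret path and "serve" for everything else.

    "forward" stands for handing the connection to the httpt server running
    next to the web server; "serve" stands for answering with the site's
    regular content, which is all a prober without the secret would see.
    """
    # Constant-time comparison so response timing does not leak how much of
    # the secret path a probe guessed correctly.
    if hmac.compare_digest(path.encode("utf-8"), SECRET_PATH.encode("utf-8")):
        return "forward"
    return "serve"


if __name__ == "__main__":
    for candidate in ("/super-secret-url", "/index.html", "/"):
        print(candidate, route(candidate))
```

The point of the model is that any request lacking the secret gets exactly the same treatment as ordinary web traffic, which is what makes the setup probe resistant.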
16:44:09 <sf_> It should not be difficult for me to add functionality to allow volunteers to automatically register themselves with whatever BridgeDB 2.0 you implement
16:44:49 <phw> sf_: yes, i'll create a gitlab issue in which we can talk about the details
16:46:22 <phw> another action item is to create an obfs4proxy patch that includes httpt -- ideally with turbo tunnel
16:47:36 <sf_> not very familiar with obfs4proxy: it is actually some sort of meta-PT and not necessarily obfs4, right?
16:47:57 <dcf1> right
16:48:01 <phw> sf_: yes, the name is misleading. it actually includes several pluggable transports
16:49:02 <phw> we already ship obfs4proxy in tor browser, and it's also written in go, so it seems natural to turn the httpt client into an obfs4proxy transport
16:52:36 <phw> ok, this has been a very productive discussion. i will create a bunch of gitlab issues that summarise the above and we can continue the discussion on gitlab
16:52:41 <phw> anything else before we wrap up?
16:54:43 <phw> ok, let's end today's meeting
16:54:48 <phw> thanks a lot for coming sf_!
16:54:51 <phw> i really appreciate it
16:54:54 <phw> #endmeeting