16:00:08 #startmeeting S27 10/08
16:00:08 Meeting started Tue Oct 8 16:00:08 2019 UTC. The chair is pili. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:08 Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:12 hola!
16:00:14 who's around? :)
16:00:19 Here is the pad: https://storm.torproject.org/shared/yTwqR1i28l2lvDc-qWVXqrMfXAPel-qVscj9Ioq0prc
16:00:20 hi
16:00:27 o/
16:00:31 \o
16:00:33 no need to add updates today (unless anyone really wants to :) )
16:00:39 hey tjr glad you could make it! :)
16:00:40 o/
16:00:46 tjr: thanks!
16:00:55 o/
16:01:23 \o
16:01:45 so the main agenda topic for today is to try to resolve any incompatibilities between prop 304 (socks5 for error codes) and 309 (optimistic socks)
16:02:16 and brainstorming other potential solutions for 304 if we can't resolve those
16:02:22 is everyone good with that?
16:02:25 yes
16:02:26 anything I'm missing? :)
16:02:32 And - to make sure I understand correctly and everyone is one the same page - the only incompatibility is for onion links. We can implement 309 with an exception for onion links 'today' (for you know, a less optimistic value of 'today')
16:02:35 or other solutions for 309, maybe
16:03:12 tjr: yes
16:03:27 I am not sure what the implications are of prop 309 for usability within the browser.
16:03:28 tjr: we can do that, but IMO we should not leave onions without that performance optimization.
16:03:46 (asuming that is as good performance optimization as i think it is)
16:03:54 I think users will not get distinct errors for “no such domain” vs. “server is not reachable” and so on (for http sites)
16:04:01 errr… non-onion
16:04:33 mcs: So we tested the behavior for this manually and the experience - which I dont' remember exactly - was that it wasn't all that bad.
16:04:33 mcs: right :)
16:04:48 tjr: that’s good to know :)
16:05:11 There was a slightly different error page but it was really subtle. And we could address it in the browser to make it better.
16:08:25 ok
16:08:26 ok, so, shall we start with a discussion on whether we can have both 304 and 309 co-existing peacefully? :)
16:08:52 So the goal of 304 is to communicate a particular error code about the connection to Tor Browser so it can handle it.
16:09:08 Let' try to think of all the ways that could happen.
16:09:17 SOCKS Error Codes - which is what's implemented
16:09:22 aha
16:09:31 It could make the data available on the control port to be polled by the Browser.
16:09:40 (I don't know if the control port can push data?)
16:09:55 right (it can)
16:09:58 there are events == push
16:10:26 If we reported a successful SOCKS connection, we could have Tor fake a reply from the server in a way that Tor Browser would recognize as coming from Tor. (This is an ugly layering violation.)
16:10:57 wow
16:11:10 that's ugly :D
16:11:35 Yes. Here's a patch of it in action: https://trac.torproject.org/projects/tor/attachment/ticket/5915/tor-optimistic-data.patch
16:12:06 * tjr is looking at the SOCKS protocol for other potential tricks
16:12:40 i admit i havent quite read the socks shortcut proposal
16:12:48 so im not sure how the trick works
16:13:22 One of the features of SOCKS is that the protocol “gets out of the way” once the initial negotiation is done… at least that is my understanding.
16:13:51 But that means there is not channel for errors once the client enters the data transfer phase.
16:14:00 asn: AIUI the basic answer is Tor replies to the browser saying "The SOCKS replay came back and it succeeded" right away. Instead of waiting for the actual reply
16:14:01 s/not channel/no channel/
16:14:15 And yeah - that prevents Tor from passing along an error.
16:14:48 When the browser receives the "Oh great you connected" it will send the HTTP Request or TLS ClientHello
16:14:51 i see
16:14:58 Which Tor will queue until the _real_ success comes in
16:15:07 And send when it does. Or discard if there's a failure.
16:15:14 I have not verified this discarding behavior.
16:15:29 so with optimistic data, the only way to report an error is to hang up right ?
16:15:34 A lot of this understanding comes from a cypherpunk who says "Yeah this works, I've been doing it for years."
16:15:50 so perhaps the error can indeed be a fake Tor->Browser HTTP error code
16:15:55 dgoulet: Probably?
16:16:00 yeah
16:16:06 that's how torsocks worked
16:16:09 (1.x)
16:16:29 the socks rfc says 'the client may now start passing data. If the selected authentication method supports encapsulation for the purposes of integrity, authentication and/or confidentiality, the data are encapsulated using the method-dependent encapsulation.'
16:16:30 but we can use the controlport for providing an actual reason for the hang up
16:16:31 error code are sync with SOCKS, that is for sure so
16:16:39 Tor Browser uses SOCKS 'authentication' for stream isolaiton I believe
16:16:52 sysrqb: originally we wanted that but it appears matching tab with the .onion requests was somehow hard?
16:16:55 mcs: ^ ?
16:17:00 So what is the encapsulation for that - and can we use it for passing control data too?
16:17:48 tjr: but the error is from tor to torbrowser, and only after the domain has been received. wheras the stream isolation auth is from torbrowser to tor IIUC
16:18:00 *the prop304 error
16:18:05 Correct
16:18:10 dgoulet: tor browser does match the request with a stream, so i think that shouldn't be too hard
16:18:19 it is a little trickier with subresources
16:18:28 ooook so lets recap here
16:18:29 What I'm wondering is if the encapsulation for the stream auth includes a place we can shove some extra data
16:18:30 but that is already a UX problem in browsers
16:18:40 with SOCKS it is not possible to return errors with optimistic data on
16:18:43 bottom line ^
16:18:54 unless maybe some trickery that imo we should avoid as much as possible
16:18:55 * tjr is not 100% certain about that yet
16:18:56 sysrqb: I think Tor Browser does not really match requests with streams, but maybe I am wrong.
16:19:15 There is isolation of course but that is not a 1:1 match.
16:19:16 so then the remaining option is control event which TB then needs to match ^
16:19:23 or HTTPCONNECT
16:19:31 what say you TB team!? :D
16:19:32 tjr: i think no, most are username+password auth, but there's no actual encap/decap involved
16:19:35 (LOTR voice)
16:19:41 http connect does not solve anything because we would need to do optimistic data there too
16:19:49 mcs: true
16:20:00 asn: but could return an HTTP error code at least?
16:20:07 dgoulet, asn: really what we need is a unique identifier per stream
16:20:19 and i think that isn't something we have available right now
16:20:32 sysrqb: https://trac.torproject.org/projects/tor/ticket/14389#comment:52
16:20:37 unique identifiers we can make
16:20:55 sysrqb: okay yeah https://tools.ietf.org/html/rfc1929 shows no further encapsulation =/
16:21:12 we did for circuits (HiddenServiceExportCircuitID) ... I guess we could for stream maybe?
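A minimal sketch of the optimistic-data behavior described above, written as a toy Python model rather than anything taken from Tor's code. The class and method names (`OptimisticStream`, `socks_reply`, `client_data`, `real_result`) are hypothetical. It assumes, as the discussion does (the discarding part was noted as unverified), that data is queued until the real result arrives and dropped on failure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OptimisticStream:
    """Toy model of one proxy-side stream (hypothetical, not Tor's code)."""

    queued: List[bytes] = field(default_factory=list)
    sent: List[bytes] = field(default_factory=list)
    connected: Optional[bool] = None  # None until the real result is known
    closed: bool = False

    def socks_reply(self) -> str:
        # Optimistic mode: report success before the real outcome is known,
        # so no SOCKS error code can be delivered afterwards.
        return "succeeded"

    def client_data(self, data: bytes) -> None:
        if self.closed:
            return  # connection already hung up; nothing to do
        if self.connected:
            self.sent.append(data)  # real connection exists, forward directly
        else:
            self.queued.append(data)  # hold until the real result arrives

    def real_result(self, ok: bool) -> None:
        self.connected = ok
        if ok:
            self.sent.extend(self.queued)  # flush queued data in order
        else:
            self.closed = True  # the only in-band signal left is hanging up
        self.queued.clear()
```

In this model the failure path carries no reason at all, which is the gap the control port events are meant to fill.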
16:21:25 The other problem with using the control port is timing: we cannot guarantee that control port info is received at the “right time.”
16:21:36 right
16:21:45 i feel like this is a problem
16:21:52 There is some complexity fitting this into Firefox’s networking code :)
16:21:55 how long does the browser wait for a control port answer?
16:22:19 i guess it keeps on loading the page, unless a control port event comes in, and at which point it displays the error page and stops the request
16:23:05 How does SOCKS 'hang up'? It's not a message inside the protocol... Does it tear down a TCP connection?
16:23:28 asn: dgoulet: hrm. i think we can use the u+p isolation values for the unique identifiers, because those are bound onto the circuit, and the circuit will fails if the onion address is bad or the onion service is down
16:23:33 right?
16:23:35 optimistic SOCKS will cause the request to end deep in the browser’s networking code, so we would need to deal with that correctly.
16:23:42 tjr: after negotiation phase, SOCKS doesn't specify anything to report conn. errors
16:24:13 sysrqb: as long as you can match multiple tabs for same .onion with the same u+p ?
16:24:13 asn: dgoulet: by
16:24:37 sysrqb: else we end up opening multiple RP circuits for the same .onion
16:24:46 by "circuit will fail" i mean "any part of the rend process will fails, descriptor lookup, intro pt, rend pt. timeout
16:25:08 i believe we reuse the same u+p for different tabs
16:25:18 unless the user explicitly requests a new circuit
16:25:23 I believe we do, yes
16:25:33 ok so would you then be OK with control port events? :)
16:25:41 async error reporting basically from the optimistic socks
16:26:29 tjr: tor just closes the tcp connection
16:26:36 (for socks hanging up)
16:27:21 And we certainly don't want to be stuffing data inside of TCP options....
16:27:22 Right.....
16:27:29 dgoulet: i think we can test how quickly the error is reported vs. how quickly firefox times out the connection
16:27:40 tjr: ha ha.
16:27:54 tjr: lol! hardcore
16:27:55 tjr: well..
16:27:58 Before we switch to control port events, we should do some experiments to see if it can be made robust.
16:28:09 yeah
16:28:17 I'm all about the layering violations. Fuck you OSI!
16:28:55 dgoulet: asn: i guess the worst part about this will the the onion address typo
16:29:03 detecting that is so simple and quick
16:29:12 sysrqb: it is really like few lines of code I bet in TB to do that validation :P
16:29:17 and we'd nee d to wait on an async control event...
16:29:27 yeah, i wonder if we should just do that in tor browser
16:29:30 sysrqb: well, that's fine
16:29:36 i mean, it would just be yet another error
16:29:43 yeah
16:29:50 i think having it in tor is useful
16:29:53 of course we could do it in the browser, but if we have the mechanism for all the other errors, that's just another one
16:30:00 it is already an error we defined in prop304 iirc
16:30:06 so you would get it for free :P
16:30:07 yeah
16:30:10 maybe we can see what real world performance is like?
16:30:11 it is not actually
16:30:18 mcs: yes
16:30:25 mcs: i think some testing is required to make this robust and nice
16:30:33 mcs: yeah, it just seems a little silly. but it's worth testing this first
16:30:43 before we go through the effort of reimplementing that in the browser
16:30:58 i shouldn't sidetrack this discussion :)
16:31:04 i think we're making progress
16:31:25 we have a plan it appears :)
16:31:33 the way i see it there is two ways to do it: either control port events, or fake HTTP responses from Tor to TB
16:31:42 i think The Right Way is the former
16:32:01 asn: agreed, if we can make it work :)
16:32:02 the latter is hard/impossible on tls connections
16:32:11 sysrqb: good point
16:32:35 but tor can guess if "this" connection can support fake responses
16:32:40 also there are protocols other than HTTP (but maybe no one uses FTP any more)
16:33:09 i think if we do it the control port way, it will be useful in the future too
16:33:11 * sysrqb gasps with surprise
16:33:17 this mechanism seems like a thing that will be useful in the long future too
16:33:44 +1
16:33:49 indeed
16:33:58 I would feel better about committing to control port events if we could experiment with it on the browser side first.
16:33:59 Fake HTTP responses also includes fake TLS responses
16:34:12 But I know that means work for the network team
16:34:21 wouldn't be that crazy ^
16:34:26 I originally did it for control port the patch :P
16:34:29 Sorry that was mentioned, but i think two types of fake requests is better than lookin for http on TLS
16:34:52 this is actually for us purgin the prop304 code from that s27 ticket and going control port which is easier for us... the spec side is a bit more work usually but for experimenting, this is easy
16:34:54 tjr: hm! tls does have an alerting
16:35:11 i wonder how nss would handle that
16:35:20 but this seems error pronbe
16:35:23 prone
16:35:34 Yeah; both HTTP and TLS have mechanims to do this in a not-that-ugly fashion. It's just a lyering violation and code in the browser in odd places.
16:35:43 also code in tor in odd places
16:35:43 And yeah that would be error prone and rebase-unhappy.
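For the onion address typo case raised earlier (16:28:55 to 16:29:27), a browser-side check could look roughly like the sketch below. It assumes the v3 address layout from Tor's rend-spec-v3 (base32 of a 32-byte public key, a 2-byte checksum and a version byte); the function names are hypothetical and this is not Tor Browser's implementation. It only checks the encoding, not whether the key is a valid ed25519 point or whether the service exists.

```python
import base64
import hashlib

ONION_V3_VERSION = b"\x03"
SUFFIX = ".onion"


def _checksum(pubkey: bytes) -> bytes:
    # First two bytes of SHA3-256(".onion checksum" || pubkey || version).
    digest = hashlib.sha3_256(b".onion checksum" + pubkey + ONION_V3_VERSION)
    return digest.digest()[:2]


def make_onion_v3(pubkey: bytes) -> str:
    """Encode a 32-byte public key as a v3 onion hostname (for testing)."""
    raw = pubkey + _checksum(pubkey) + ONION_V3_VERSION
    return base64.b32encode(raw).decode("ascii").lower() + SUFFIX


def is_valid_onion_v3(host: str) -> bool:
    """Return True if host is a well-formed v3 onion address."""
    host = host.lower()
    if not host.endswith(SUFFIX):
        return False
    # Ignore any subdomain labels in front of the 56-character address.
    label = host[: -len(SUFFIX)].split(".")[-1]
    if len(label) != 56:
        return False
    try:
        raw = base64.b32decode(label.upper())
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    return version == ONION_V3_VERSION and checksum == _checksum(pubkey)
```

A check like this is synchronous, so a mistyped address could be reported without waiting for any event from tor.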
16:35:54 My guess is that the fake responses approach would actually be easier to handle correctly on the browser side (but it is somewhat ugly).
16:36:00 because tor would have to ... complete an SSL handshake with TB?
16:36:29 asn: no, i think tor would inject an alert message into the stream
16:36:30 L)
16:36:35 :)
16:36:38 wow
16:36:43 random tls
16:36:48 u can do that?
16:36:54 as an MITM?
16:36:56 *shurgs* :)
16:37:16
16:37:19 yeah. i don't know if openssl gives you that option
16:37:20 alerts aren't part of an authenticated TLS connection
16:37:31 fun
16:37:56 Anyway yeah I think control port events are much more desirable
16:37:57 so little-t-tor would have to learn to do situational MITM on HTTP and TLS
16:38:22 yeah. i think using the control port for async is still the best option
16:38:25 agreed
16:38:26 ok
16:38:28 so what is needed there?
16:38:31 a synchronous message is definitely easier for the browser
16:38:32 So it seemed like where we ended up was mcs would like a POC to test with?
16:38:41 but we should try async first
16:38:59 yeah, that was my understanding
16:39:00 I can do that PoC branch from tor side quite easily.
16:39:09 well, there we would not need it before near the end of October but soon is great
16:39:16 ack
16:39:27 we have 3/5 deliverables depending on this situation
16:39:37 And I lost track - did we also need to inject a random random identifier on the browser side for tor to pass back for tab-matching?
16:39:38 what changes the game here given that we investgiated the control port option before and deemed it too problematic
16:39:44 ?
16:39:47 GeKo: ye good question
16:39:57 tjr: no, we think we can reuse the current username+password identifier
16:40:02 i mean we tried hard to get this working
16:40:18 because it obviously is the right thing to do
16:40:33 but we still thught it would be not really feasible
16:40:48 what extra information does us put back to square zero now?
16:41:15 (we will also have to trash prop304 and the #30382 code by doing so)
16:41:29 i mean i am all for that option
16:41:32 GeKo: do you know if anyone wrote the brokers somewhere?
16:41:37 *blockers
16:41:49 on the ticket mcs pointed out
16:41:51 yes it's on the ticket
16:41:54 ah.
16:42:02 https://trac.torproject.org/projects/tor/ticket/14389#comment:52
16:42:05 he might even have pointed you to the respective comment
16:42:18 (i have not checked)
16:42:35 I am not sure much has changed but sysrqb has a new take on the unique ID problem.
16:43:06 I think the problem of getting the error via the control port at the Right Time still could be an issue.
16:43:25 It will definitely collectively cost both teams some engineering time.
16:43:41 But it seems like people really want optimistic SOCKS for .onions
16:44:01 sure
16:44:19 It's also possible to punt this work down the road. and do optomistic socks for non-onion today, easier-ly
16:45:00 i think spending engineering time for quality things is OK, I'm just a bit sad that we had to wait for October to figure this out
16:45:03 I would personally prefer that to be Plan B ^ as in "control port event have a show stopper"
16:46:09 mcs: my thought is that each tab knows the username and password used for that socks request
16:46:37 mcs: and if the control port error message event contains that info, it shouldn't be hard to link these
16:46:54 mcs: but maybe i'm not considering something
16:47:37 the timing issue is definitely a concern, and something that needs testing
16:47:38 how would you put the error to the correct tab
16:47:47 if mutliple tabs use the same username and password
16:47:50 ?
16:47:53 GeKo: any tabs with that as the first party
16:47:57 they all fail
16:48:58 right?
16:49:28 hm
16:50:21 GeKo makes a good point… we don’t want to present several client auth prompts at the same time, for example.
16:50:48 ah, right, in that case we should only prompt once :)
16:50:55 i was only thinking about the error pages
16:51:08 btw, I think we can create unique identifiers for specific streams and expose them, if that's useful in any way. It's easy for us to make unique identifiers.
16:51:47 sysrqb: well, i don't know. let#s say i have a.com open in tab2
16:51:53 aand am readin stuff
16:52:49 and i would load a.com/something in tab3
16:52:56 but switch back to reading in tab2
16:53:04 and now the erro happens in tab3
16:53:21 it seems you suggest it's fine to interrupt the reading in tab2
16:53:26 and show an error page
16:53:32 because it's any tab
16:53:33 do we remember which u+p we used when loading tab2?
16:53:37 but that's weird
16:53:42 if tab3 uses a different u+p?
16:53:53 well the same as for tab3
16:54:01 because it uses the same u+p
16:54:08 but the first load succeeded
16:54:14 and later on the circuit faile
16:54:15 d
16:54:40 Could that happen?
16:54:41 hrm.
16:54:49 so why should my reading in tab2 be interruppted and i would get an error page instead
16:54:58 even though nothing is loading on tab2 anymore
16:54:59 right, it shouldn't :)
16:55:15 but the error comes from laoding stuff in tab3
16:55:27 where the cricuit with the same u+p is failing now
16:55:28 i was thinking this should only happen for tabs with in-progress requests
16:55:39 we have about 5 minutes left for the hour, do we want to continue this discussion here?
16:55:44 but it is possible part of tab2 is loaded
16:55:50 enough for reading
16:55:57 and that shouldn't be replaced with an error page
16:56:01 yep
16:56:10 It is still a tricky to handle all cases with frames, subresources, etc.
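The tab2/tab3 scenario above can be sketched as a small registry keyed by the SOCKS username+password, where an asynchronous error only marks requests that are still in progress. All names here (`Request`, `RequestRegistry`, the reason string) are hypothetical illustrations, not Tor Browser code, and the sketch deliberately ignores the timing, subresource and frame problems raised in the discussion.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# SOCKS username + password used for stream isolation (shared across tabs
# with the same first party, per the discussion).
Credentials = Tuple[str, str]


@dataclass
class Request:
    tab: str
    url: str
    in_progress: bool = True
    error: str = ""


class RequestRegistry:
    """Hypothetical browser-side bookkeeping for async error reports."""

    def __init__(self) -> None:
        self._by_creds: Dict[Credentials, List[Request]] = {}

    def start(self, creds: Credentials, tab: str, url: str) -> Request:
        req = Request(tab=tab, url=url)
        self._by_creds.setdefault(creds, []).append(req)
        return req

    def finish(self, req: Request) -> None:
        req.in_progress = False  # loaded fine; later errors must not touch it

    def on_error(self, creds: Credentials, reason: str) -> List[Request]:
        # Only requests still loading are failed; finished ones are left alone.
        failed = [r for r in self._by_creds.get(creds, []) if r.in_progress]
        for req in failed:
            req.in_progress = False
            req.error = reason
        return failed
```

With the same credentials, a finished load in tab2 stays untouched while the in-progress load in tab3 receives the error. Two simultaneous in-progress loads would still both fail, which is where a per-stream or per-tab identifier would help.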
16:56:19 yes, that additionally
16:56:29 That’s one of the reasons I like SOCKS errors :)
16:56:58 (i am actually not really her, sorry for interrupting :) )
16:57:04 *here
16:57:12 But the consensus seems to be to give control port events another go
16:57:13 pili: i think we have more we can discuss
16:57:20 sysrqb: I agree :)
16:57:27 It seems like SOCKS would have the same problem though?
16:57:31 but maybe these are edge cases we can discuss a little later
16:57:33 I was just wondering if everyone has time to continue this conversation
16:57:41 If I have a.com open in two tabs - they use the same SOCKS stream don't they? Same First Party domain.
16:58:25 they'll probably use the same tor circuit, but different streams on that circuit
16:58:39 and they'll each send their own requests
16:58:46 But what about the SOCKs connection? Different SOCKs connections?
16:58:52 yeah
16:58:57 Oooh.
16:59:10 Poop on a poop.
16:59:13 synchronous error handling is much easier, that is for sure
16:59:16 Wait is this meeting logged?
16:59:27 tjr: yes
16:59:48 ¯\_(ツ)_/¯
17:00:37 (meeting after the hour mark!)
17:00:42 so how do we approach this folks?
17:00:51 So then maybe we do need that random identifier per tab
17:00:56 We can avoid some of GeKo’s issues if we add a new unique ID.
17:01:13 right. now we need an ID for every resource
17:01:22 so we know which ones failed
17:01:25 is that equal to a stream?
17:01:30 When you say resource you mean....?
17:01:42 we need to unify the TB and Tor terminology if possible
17:02:09 any request that could result in a new tcp connect being established
17:02:25 *connectoin
17:02:54 but maybe one-per-tab is enough
17:03:21 I think we could unify this behavior with how the browser treats HTTP Auth requests.
17:03:30 Also, I am not 100% sure, but I think the browser needs to create and assign the unique IDs like it does the stream isolation u+p info
17:03:40 If you hit a top level page that requests HTTP auth - you get a username/password prompt
17:04:01 But if you request and that wants auth - I am pretty sure you do not get that prompt.
17:04:11 (It would be a phishing extravaganza for example.)
17:04:18 right
17:04:22 I'm not sure about iframes
17:04:35 But preuming only top level page loads get the auth prompt, you only need a unique id per tab
17:05:23 It is still useful to know that a subresource failed to load vs. the top level page within a tab (this is not just for auth prompts)
17:05:33 yeah
17:06:33 So... it's useful but what can you do with it?
17:06:57 Can you reach across and go 'This image: you're never getting a response so I'm going to cancel the load' ?
17:08:12 tjr: what do you mean by “reach across?”
17:08:40 From whever you have this knowledge (tor-button code?) to the page.
17:09:01 There must be some web api to cancel a load of, say, an image - but I don't know what it would be...
17:09:22 Nor how you would find it in the page. e.g. if it was a random AJAX request you'd never be able to find that variable and cancel it.
17:11:32 The image load will be cancelled when the SOCKS connection is killed, right? It is more a question of looking at the control port error data and retrieving a precise error.
17:12:16 Okay yeah you're right.
17:12:47 So you now know that the load of a thing failed because of auth but again... what do you do with that? (I mean sure you can log it to the console which is useful but...)
17:13:12 Ideally, we would keep the browser model where errors are associated wiith channels.
17:13:42 I don’t actually remember what our current onion v3 client auth implementation does with respect to images or other subresources. Will need to check.
17:13:53 Not prompting would be good :)
17:14:27 Other errors might be worth passing to the channel, e.g., an AJAX request could expect useful error info
17:15:32 But that's going to be a race we'll never win. The SOCKS connection will close, the channel with get its status and then you come along and fix up the status?
17:16:18 And that’s what makes the control port approach difficult to fit into Firefox’s network stack / channel model.
17:16:27 Right
17:17:59 So what are the next steps?
17:18:24 I guess play with the POC and see if these limitations are detrimental enough to re-evaluate the strategy?
17:18:43 at this point, i agree
17:18:51 it's worth experimenting
17:21:07 is everyone okay with ending the meeting here?
17:21:08 ok
17:21:09 should we set up a date for a next meeting? will the browser team will continue working on that?
17:21:23 the next meeting is planned for the 15th october already
17:21:32 cool, next week
17:21:33 at least the next S27 meeting
17:21:41 antonela: yes, but i don't think we'll have much time until 9.0 is releases
17:21:46 *released
17:21:51 so we may not make much progress
17:22:01 +1
17:22:02 right
17:22:31 the plan was to pick up the S27 work end of october/start of november
17:22:38 sysrqb: yep
17:23:21 the network team can resurrect their control port event implementation and maybe make a proposal for unique IDs
17:23:28 (if they have time)
17:23:41 we can
17:23:42 we have time
17:23:58 thanks!
17:24:00 okay, great. thanks everyone
17:24:31 #endmeeting