18:01:40 #startmeeting trac migration
18:01:40 Meeting started Tue Sep 17 18:01:40 2019 UTC. The chair is gaba. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:01:40 Useful Commands: #action #agreed #help #info #idea #link #topic.
18:02:01 First, we can check if everybody is ok with agenda and/or wants to add anything else.
18:02:22 i don't have anything to add
18:02:40 * catalyst is here
18:02:45 Does migration encompass "data preservation"? if not, let's add that
18:03:10 what is the migration point, actually?
18:03:11 let's add it
18:03:31 the actual process on migrating tickets
18:03:41 i added user registration, is that the same as anonymous users?
18:03:51 i was thinking it is the same
18:04:10 okay, i'll rename 'migration' 'data migration process' to clarify it's the process and it's more than tickets
18:04:15 unless we want to stick to tickets?
18:04:26 not, it is about data
18:04:31 good
18:04:51 * GeKo is somewhat here
18:05:52 ok. The update on this is that we are revisiting plan for migration (https://nc.riseup.net/s/SnQy3yMJewRBwA7) and ahf has been looking at tickets migrations. There are a few things that still need to be resolved and decided.
18:06:23 We can start with the list of things that still need to be resolved.
18:06:57 The first one (that we mentioned in the document) is the IRC ticket number bot and is similar to redirection to the gitlab ticket.
18:07:28 What is the question exactly?
18:07:32 anarcat has some ideas on how to do this
18:07:33 one is irc bot the other is https://bugs.torproject.org/XXX right?
18:07:52 right
18:07:58 the question is how we are going to do it
18:08:01 zwiebelfreunde uses the short URLs, i think?
18:08:10 in trac there is a unique ticket for all projects
18:08:11 #12345
18:08:15 in gitlab we have tickets per project
18:08:18 the question is how to keep ticket numbers consistent across the migration if we want to have tickets split between multiple projects
18:08:25 sorry gaba go ahead
18:08:30 i have a plausible solution for the latter and the former i think i have a solution for too that involves coding a plugin to zwiebelbot
18:08:56 yes, there are a few things here. One is across the migration with tickets that are already here and the other will be with new tickets related with bugs.torproject.org/XXXX
18:09:20 let's start with the solution for the irc bot
18:10:17 i think we need to write a plug-in for the bot that is running zwiebelbot to fetch the data via gitlab's api. we will lose the short hand ticket id syntax (#xxx) and have to use project-name#xxx instead with that
18:10:39 so there is a social part of it that needs to change there, but technically it is a half a day task i think to write the plug-in
18:11:04 can we retain #xxx for old tickets?
18:11:17 and we can do some aliases there, so you can write tor#xxx instead of core/tor#xxx
18:11:22 i also believe it should be fairly simple to hack zwiebelbot to do the right thing
18:11:29 if we know what the right thing is
18:11:30 nickm: i think we can, with the plan for bugs.torproject.org that i was planning on mentioning
18:11:38 ack
18:12:16 so my plan for bugs.torproject.org is that a part of the migration from tickets from trac to all the different gitlab project, we save a mapping between the old ticket ID's and their new project paths, which would allow us to update the redirection service with a tool that looks at this mapping and does the redirect
18:12:36 as nickm suggested at one point, we then make that all tickets above #50000 are gitlab tickets
18:12:37 seems like the right way to do it
18:12:46 i have an alternative proposal for this
18:12:55 this allows us to handle this mapping up to some N and then dont map those above that N
18:13:01 anarcat: go!
18:13:16 so i think such a mapping will be hard to maintain in the long term
18:13:24 it would be a 40k lines long mapping file
18:13:29 it's kind of annoying to carry that around forever
18:13:51 plus it means we can't easily find those bugs in gitlab without that magic oracle
18:13:59 what i'm proposing is that all tickets first get migrated to a single project
18:14:08 where there's a one to one mapping between ticket IDs in trac and gitlab
18:14:20 so bugs.tpo/N point to dip.tpo/legacy/N or whatever
18:14:41 once that's done, you split the legacy tickets in multiple projects, by *moving* each ticket to the right project
18:15:01 when we move each ticket to the right project, will it retain its number? we really need to retain numbers
18:15:02 when you move a ticket in gitlab from legacy/N to (say) tor/X, some sort of link is kept between the two tickets
18:15:03 anarcat: does gitlab do something like redirects for issues that have been moved?
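A minimal sketch of the reference parsing the bot plug-in discussed above would need, assuming bare #NNN keeps meaning an old Trac ticket and project-name#NNN is scoped. The alias table, the `legacy` project name, and the function name are illustrative assumptions, not zwiebelbot's actual code.

```python
import re

# Hypothetical alias table: short names typed on IRC -> full GitLab project path.
ALIASES = {"tor": "core/tor"}

# Matches "#123", "tor#123" and "core/tor#123"; the project part is optional.
REF_RE = re.compile(
    r"(?<![\w/#.-])"
    r"(?:(?P<project>[a-z0-9][a-z0-9_.-]*(?:/[a-z0-9][a-z0-9_.-]*)*))?"
    r"#(?P<num>\d+)\b",
    re.IGNORECASE,
)


def expand_refs(message, legacy_project="legacy"):
    """Return (project, number) pairs for every ticket reference in a message."""
    refs = []
    for match in REF_RE.finditer(message):
        project = match.group("project")
        if project is None:
            # Unscoped reference: assume it points at an old Trac ticket.
            project = legacy_project
        else:
            project = project.lower()
            project = ALIASES.get(project, project)
        refs.append((project, int(match.group("num"))))
    return refs
```

For example, `expand_refs("see #31107 and tor#28930")` yields one reference scoped to the assumed `legacy` project and one expanded to `core/tor`. Fetching the ticket title through GitLab's API would be a separate step.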
18:15:06 yes
18:15:11 anarcat: that means we have to migrate with the components as a label in the issue and then manually move issues to the right project
18:15:16 redirects are kept
18:15:27 we might be able to keep the ticket numbers in the move as well, but i'm not certain of that part
18:15:36 number retention is essential to us
18:15:43 my hope is that it would be sufficient to have ticket numbers in the legacy project to redirect to the new one
18:15:49 no, that's not enough
18:15:50 isnt this just a question of where the mapping lies? if it lies at gitlab or if it lies at the redirection service and the bot?
18:15:51 yes, i understand ticket retention is essential
18:16:05 but my argument is that you don't get ticket retention across project with the approach ahf is suggesting
18:16:10 We need every tor bug to have exactly one bug number. Not one old number and one new number.
18:16:16 i don't personally think a 40k lookup table is that much of a concern since it is static
18:16:17 because you need a magic oracle outside of gitlab to tell you where tickets are
18:16:38 oh
18:16:39 anarcat: +1 yeah that means gitlab keeps that lookup table for us
18:16:46 ahf: as the person who will have to carry one more static site around forever, i disagree :)
18:16:47 (we also need every number to have at most one bug)
18:16:49 you do get to preserve the numbers?
18:17:02 the trac ID's are the same in the gitlab projects, they just have holes between them
18:17:07 nickm: my hope is that renaming the tickets might allow us to preserve ticket numbers for those projects who it matters a lot to
18:17:13 but that needs to be confirmed
18:17:40 what i am certain of is that moving all tickets in a single project at first will preserve ticket numbers there and redirections will ensure a transition
18:17:51 I don't feel strongly about "legacy" vs "remapping table" -- but I do feel strongly about ticket number preservation.
18:18:00 nickm: that wont hold for newer tickets for each project though. #50001 wont be the same ticket for tor and tor-browser
18:18:07 that is fine
18:18:14 good
18:18:29 I mean, every number should have one bug per project
18:18:29 I just tried moving an issue from one project to the other and it did not keep the same number. But the redirection is permanent.
18:18:51 gaba: i am hoping that we can hack that through the API
18:18:54 https://dip.torproject.org/ahf-admin/tor-trac/issues here is a list of migrated core tor/tor trac tickets to a gitlab project. you should be able to see that there is a hole between the ticket ID's here
18:18:54 err, at most one bug per project
18:19:03 gaba: after all, if ahf can do it on import, i don't see why it can't be done in move
18:19:39 anarcat's solution does seem more elegant, if it works.
18:19:54 it would also make the bot easier
18:19:56 i cannot set the IID when i move a ticket with gitlab's api
18:19:59 because it wouldn't need to talk to bugs.tpo
18:20:08 ahf: "IID"?
18:20:26 the visible ID in the ticket list
18:20:32 the one we want to migrate. not the internal db ID
18:20:43 for example, I mean that #31107 must eventually be tor#31107. It is okay if it is also legacy#31107, but it can't be legacy#31107 and tor#12345
18:20:45 the ones that becomes #28930 instead of, say, 4
18:20:48 ahf: so you're saying it's not possible to preserve the ticket number on move?
18:21:03 yeah, i cannot specify the ID it gets when it gets moved
18:21:14 how do you specify the id on import?
18:21:17 https://docs.gitlab.com/ee/api/issues.html#move-an-issue
18:21:20 yes, it gets moved to the next available number
18:21:25 i have an admin token that is allowed to set the IID
18:21:33 on import you can create an issue with a specific number, right?
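The difference between import and move described above can be sketched as two request builders. This only constructs the requests and sends nothing. It assumes GitLab's v4 REST API, where creating an issue accepts an `iid` field for sufficiently privileged tokens and the move endpoint takes only `to_project_id`; the helper names and the numeric project ID in the usage note are hypothetical.

```python
from urllib.parse import quote

# Host taken from the log; "/api/v4" is GitLab's REST API prefix.
API = "https://dip.torproject.org/api/v4"


def create_issue_request(project, iid, title):
    """Build the POST that creates an issue with a chosen IID.

    Setting "iid" requires administrator or project owner rights.
    """
    url = f"{API}/projects/{quote(project, safe='')}/issues"
    return ("POST", url, {"iid": iid, "title": title})


def move_issue_request(project, iid, to_project_id):
    """Build the POST that moves an issue to another project.

    There is no field for the target IID here, which is why the moved
    ticket gets the next available number in the destination project.
    """
    url = f"{API}/projects/{quote(project, safe='')}/issues/{iid}/move"
    return ("POST", url, {"to_project_id": to_project_id})
```

For instance, `create_issue_request("ahf-admin/tor-trac", 28930, "some title")` keeps the Trac number on import, while `move_issue_request("ahf-admin/tor-trac", 28930, 42)` has nowhere to say which number the ticket should get.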
18:21:40 sigh
18:21:41 yeah
18:21:55 i guess that breaks my elegant solution if we want the final projects to have the original ticket numbers
18:22:09 i thought that part didn't matter as much if we had the original ticket numbers
18:22:12 i think we should decide here that i experiment with your idea?
18:22:15 the mapping for me is the same
18:22:25 if i need to output to a json file after or if i need to move based on the mapping is both fine with me
18:22:30 i'd love if you could dig a little more into this, but i'm ready to give up on the idea if there's no other way
18:22:31 i think doing the move would be more elegant too
18:22:40 nickm: you think the final issue should have the same number so it can use with the irc bot?
18:22:43 yeah, i also want to try it out, i think that could be nicer
18:22:49 gaba: not just for the irc bot
18:22:54 this is important, so please let me explain
18:22:57 it also means i can migrate ALL tickets and then focus on moving the ones that needs to be moved
18:23:06 For almost 20 years now we have been using this namespace of ticekts
18:23:09 *tickets
18:23:40 All of our changelogs use this numbering to explain what changed and why
18:23:45 so do all of our commit messages
18:24:10 ahf: yeah, that's one thing i'm concerned about as well... with my strategy, some tickets could just stay in legacy forever
18:24:16 Our approach to remembering "what happened when" with our code relies on "ticket #98" meaning exactly one ticket, and having exactly one name.
18:24:17 anarcat: yeah, i like it
18:24:30 with anarcat's solution we would keep the number in bugs.torproject.org/#NNNN and that will get redirect to the new issue.
18:24:58 It would mean that our entire history got renumbered
18:25:07 All of our old changelogs would refer to one namespace,
18:25:14 and all of our new changelogs would refer to another namespace,
18:25:15 with the gitlab transition, #NN becomes ambiguous unless you scope it
18:25:20 i disagree
18:25:28 you end up with two main namespaces
18:25:29 oh. I see
18:25:30 nickm: if we partition the number space, old references can remain valid
18:25:30 #NN
18:25:31 i think you are talking past each other. we need to make sure the ID's in legacy are the same as the ones we get when we move to core/tor/#nnn
18:25:32 and foo#NN
18:25:43 #NN refers (implicitly) to legacy#NN
18:25:54 and foo#NN is just what it is, new ticket numbers
18:25:55 also what anarcat said
18:26:20 ahf: i am not sure it's necessary to fulfill nickm's requirement of historical consistency
18:26:26 I think it is.
18:26:31 i think it does
18:26:36 okay :)
18:26:42 :)
18:26:55 also we can change future practices for references in changelogs
18:27:04 (and even have CI make sure they're correct)
18:27:08 yes
18:27:11 cool, so i am gonna experiment with doing the move approach on friday when i have my gitlab hack day, i think it sounds like a good idea and i think if we can make it work it would make our lives easier
18:27:12 it's true that if you commit something in gitlab with "Closes: #NNN", it will implicitly refer to "core/tor#NNN", not "legacy#NNN"
18:27:20 so it would be important to have that working
18:27:33 and then all projects after they get moved will get a ticket created at 50k so their next ticket begins from that ID and on
18:27:34 i was going under the assertion such a transition would be possible with ticket moves
18:27:37 but maybe that's not possible
18:27:51 I do not want this situation:
18:28:16 I do not want us to close tor#MM and then have to look for old discussion of legacy#NN.
18:28:22 for NN != MM
18:28:43 if legacy#NN is a core tor/tor ticket, then #NN in tor#NN will be the same as legacy#NN
18:28:46 Let's assume that Tor is still under development in 10 years' time
18:28:54 ahf: good.
18:29:09 suppose some bug from today is still relevant in 10 years' time
18:29:27 We don't want to force people then to have to remember that bugs before 2020 can have two different numbers
18:30:13 right
18:30:19 i agree with that
18:30:34 * anarcat as well
18:30:45 i was hoping scoping would fix this issue, but it's true there are implicit scoping things going on
18:30:56 so if there exists legacy#NN, it should either be the same as tor#NN, or there should be no tor#NN
18:31:15 similarly, if there exists tor#NN, its legacy ticket should be legacy#NN, or it should have no legacy ticket
18:31:17 i think legacy#NN will be a redirect to tor#NN nce the move has happened
18:31:22 once*
18:31:28 ok. does everybody agree on the next step that ahf will experiment on moving tickets from legacy project into its own? If that does not work then we will look into that mapping for redirection that ahf was proposing first.
18:31:31 I hope so
18:31:41 gaba: that would be okay w me
18:31:52 gaba: yeah, or come up with something nicer than having the state file. i can understand anarcats concern here
18:32:11 ok. This is for legacy tickets. How are we going to manage bugs.torproject.org/#NNN for new tickets in gitlab?
18:32:14 gaba: agreed
18:32:19 ok
18:32:33 it will have to be bugs.torproject.org/tor/50001
18:32:41 or something like it
18:32:45 gaba: my hope for bugs.tpo/X is that it will be either #NNN that points to legacy/NNN or scope#XXX that points to scope/XXX
18:32:52 bugs.torproject.org/obfs4/50001
18:32:57 yes
18:32:57 that's cool with me
18:33:09 do we have a point about the hierarchy of projects?
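The numbering rule stated above (legacy#NN is either the same ticket as tor#NN, or there is no tor#NN) can be checked mechanically after a batch of moves. A minimal sketch, assuming the move tool records where each legacy ticket ended up; the record format and function name are hypothetical.

```python
def numbering_conflicts(moved):
    """Return the moves that break the "one bug, one number" rule.

    `moved` maps a legacy ticket number to (project, number_after_move).
    A move is fine only when the number is unchanged, so that legacy#NN
    and project#NN name the same bug.
    """
    return {
        old: (project, new)
        for old, (project, new) in moved.items()
        if new != old
    }
```

An empty result means every moved ticket kept its number. Any entry in the result is a ticket that would have two numbers, which is the situation the discussion wants to rule out.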
18:33:15 because that's a thing here as well
18:33:27 adding structure of projects to the agenda
18:33:27 right now everything (but webpages/) is under torproject/
18:33:32 we need to build some aliases for commonly used products i think, but in gitlab we have the $namespace/$project/xxx i think
18:33:33 oh ok
18:33:54 no, wait, $namespace/$group/$product i think
18:33:55 not sure why webpages is not under torprojects. I was thinking everything will be under torproject
18:34:09 i think web was the first one we added to gitlab?
18:34:18 like before we were thinking of any structure at all
18:34:19 tails may use our gitlab in the future (or other related project)
18:34:25 yep
18:34:38 my concern with having a toplevel torproject/ is that it makes things very long
18:35:02 you'd have to do bugs.tpo/torproject/tpa/dns/auto-dns#123 for one extreme case here
18:35:07 that's quite a mouthful
18:35:09 fwiw, we have never had two projects with the same name under different top-level namespaces
18:35:32 like, we have never had "core tor/website" and "torproject/website"
18:35:33 while i could flatten my stuff to be tpa/auto-dns#123, it still means bugs.torproject.org/torproject/tpa/auto-dns#123
18:35:34 and I hope we wouldn't
18:35:37 the issue about having a group torproject is to have a place where we can see issues for all projects and groups
18:36:05 i would rather have tpa/auto-dns#123
18:36:32 gaba: i am not sure i like the tradeoffs this implies :)
18:36:39 can't we make the redirection service "smarter" here using some aliases? bugs.torproject.org don't have to worry about tails, so we can strip out "torproject/" for all our things
18:36:46 the tradeoff is only the long of the url, right?
18:36:48 gaba: i understand where you're coming from, but are you sure it's going to make sense to have all those tickets at once?
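A "smarter" redirection service along the lines suggested above might look roughly like this. It is a sketch under several assumptions: the `legacy` project name, the `torproject` umbrella group, and placing `legacy` under that umbrella are all illustrative, and the issue URL shape is copied from the dip.torproject.org links in the log.

```python
BASE = "https://dip.torproject.org"  # test instance named in the log


def resolve(path, umbrella="torproject", legacy="legacy"):
    """Map a bugs.tpo path to a GitLab issue URL.

    "/31107"            -> old Trac number, looked up in the legacy project
    "/tpa/auto-dns/123" -> scoped ticket; the umbrella group is added if
                           the caller left it out
    """
    parts = [p for p in path.split("/") if p]
    if not parts or not parts[-1].isdigit():
        raise ValueError(f"no ticket number in {path!r}")
    number = int(parts[-1])
    scope = [p.lower() for p in parts[:-1]]
    if not scope:
        scope = [legacy]
    if scope[0] != umbrella:
        scope.insert(0, umbrella)
    return f"{BASE}/{'/'.join(scope)}/issues/{number}"
```

With this shape, the short form `tpa/auto-dns/123` and the fully qualified `torproject/tpa/auto-dns/123` resolve to the same URL, so the long prefix only has to be typed when someone wants to.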
18:36:54 +1 to what ahf said
18:37:03 ahf: it's not just bugs.tpo, it's also within issues
18:37:08 anarcat: it is what we have now in trac
18:37:15 ahf: in gitlab, you can refer to project#123 and it auto-links to the other thing
18:37:20 we can have a board for all Tor
18:37:37 yeah, but that feature is pretty smart isn't? i wont have to write the fully qualified name there most of the time
18:37:40 ahf: now i'd have to type torproject/project#123 to link to another ticket
18:37:52 anarcat: i do not totally understand your concern
18:37:53 ahf: i think you would have to write a FQN there :)
18:37:58 project#123 if project is in the same namespace as your project is?
18:38:11 ahf: i'm not sure about that at all, but i'm ready to believe you :)
18:38:22 i am not 100%, but that could be tested
18:38:25 sure
18:38:44 if we do have one single namespace for everything, can we call it "tor" or "tpo" or something shorter than "torproject" at least? :)
18:39:18 hehe, sure
18:39:25 +1 to tpo or tp; -1 to tor
18:39:36 https://dip.torproject.org/ahf-admin/obfs4-trac/issues/29287
18:39:37 yes. i prefer tpo
18:39:40 seems to do the smart thing here
18:39:53 it found out that my tor-trac project was also under ahf-admin/
18:40:07 * nickm suggests a rule: do not have two elements in the project naming tree with the same name, even if they are not ambiguous
18:40:20 ahf: cool
18:40:21 that is, let us never make "websites/tor" and "core tor/tor"
18:40:28 ok
18:40:30 it even rewrote ahf-admin/tor#28930 to tor#28930
18:40:31 and "tor/projects/tor"
18:40:34 nickm: case sensitive? ;)
18:40:42 core/Tor core/tor :p
18:40:49 argh
18:40:50 XD
18:40:55 It seems that we are moving into structure.
18:40:59 * nickm suggests that nothing be case sensitive ;)
18:41:07 cool!
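The naming rule suggested above (no two elements of the project tree share a name, compared without regard to case) is easy to check against a list of project paths. A minimal sketch; the function name and the sample paths in the usage note are illustrative.

```python
from collections import defaultdict


def duplicate_names(paths):
    """Find names used by more than one element of the project tree.

    Every group and project is identified by its full path; its name is
    the last path component. Comparison is case-insensitive. Returns
    {name: sorted list of the full paths that use it}.
    """
    seen = defaultdict(set)
    for path in paths:
        parts = path.lower().split("/")
        for depth in range(1, len(parts) + 1):
            seen[parts[depth - 1]].add("/".join(parts[:depth]))
    return {name: sorted(where) for name, where in seen.items() if len(where) > 1}
```

Given `["websites/tor", "core/tor"]` this reports `tor` as used in two places, and it also catches the `tor/projects/tor` case where a group and a project nested under it share a name.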
18:41:08 or, everything be lowercase
18:41:09 either own
18:41:13 *either one
18:41:16 * anarcat likes lowercase
18:41:31 gaba: sorry, kind of hijacked that :/
18:41:32 structure is maybe the part i have thought the least about, but i have the feeling both gaba and pili have made some thoughts here
18:41:33 In the document we are proposing a structure for projects under the tor project group
18:41:44 Group TEAM X - all team members have ownership on this group.
18:41:44 Project X1 - the ones related to repositories. Example: snowflake or little-t tor.
18:41:47 Group XX1 - example pluggable transports for anti-censorship team
18:41:49 Project Y - at organization level, example scalability that may touch all teams.
18:41:52 that is the structure we propose
18:42:05 Are people ok if we move to structure?
18:42:10 yep
18:42:15 yep
18:42:58 ok
18:43:05 now the question if everybody is ok with that structure?
18:43:26 and if people are ok to rename 'the tor project' to 'tpo' for the umbrella group
18:43:30 so let's see if i get this right... a project like snowflake is in tor/anti-censorship/snowflake right?
18:43:39 * ahf prefers tpo and things without spaces
18:43:40 can we get more examples of what the paths would look like?
18:43:55 i suspect we can't put spaces in project names in the first place
18:44:08 right
18:44:09 yeah, everything has a slug i think too
18:44:12 Let's seek short names, if we'll be typing these a lot.
18:44:12 i am not sure what Project Y is
18:44:18 nickm: +1
18:44:46 example could be scalability that touches many different teams
18:44:59 +1 to short names
18:45:17 gaba: what would be the full path to the scalability project, and what would be in it?
18:45:34 like tor/scalability?
18:45:57 ok, maybe is not the best example actually.
But this is for projects that may be outside of the scope of the team
18:46:02 tpo/scaling
18:46:03 https://dip.torproject.org/torproject/scalability is the path
18:46:06 yes, tpo
18:46:26 or tp/scaling
18:46:59 could those be /torproject/-level milestones instead? or tags?
18:47:16 it could be labels
18:47:22 I think it is better to use labels if possible
18:47:25 i'm wondering what would be in such a project... if it's a ticket, can i close it from (say) torproject/tpa/tor-puppet.git?
18:47:26 rather than projects
18:47:31 but we may need a project outside teams
18:47:41 (can we just rename /torproject/ /tpo/ for the purpose of today's meeting? :)
18:47:53 everybody ok with tpo instead of torproject?
18:48:08 i guess i would just be jealous of the scalability project
18:48:20 not having to type their team name! :)
18:48:33 we will have most of the labels at a tpo group level
18:48:38 tpo +1
18:48:40 and then teams apply those labels
18:48:41 in other words, i'm not sure we should have an exception, otherwise everyone will want to get that exception
18:48:42 tpo +1
18:48:55 anarcat: why would people want that exception?
18:49:01 if they do then there may be a good reason :)
18:49:33 ok. Can we move to the next one? We have been meeting for 49 min already :)
18:49:34 gaba: less typing?
18:49:38 sure
18:49:47 yep
18:50:02 next one is 'anonymous users/ user registration'
18:50:06 a thing to think about then: what if a project changes teams, as happened when PTs moved from network team to the newly created anticensorship team?
18:50:19 no need to answer now, let's just think about it
18:50:30 i am personally completely OK with not having our anonymous user, but i know there are many different views in the project
18:50:38 in Stockholm we mentioned salsa.debian.org as something we may want to have. But this is a next step and not a requirement for a migration, right?
18:51:00 i think having a shared anonymous account is a negative in terms of creating an inclusive environment
18:51:01 salsa have a special registration form for account creation that we might want
18:51:01 i thought we had a strong consensus in stockholm about not keeping the cypherpunks user
18:51:06 I think it is important to have a low-overhead process so that people can report bugs without a lot of trouble.
18:51:12 It does not need to be a shared anonymous account
18:51:19 anarcat: i think so too, but i also think it has been debated afterwards
18:51:48 i think our options are (1) cypherpunks (2) salsa custom signup form (3) akismet + recaptcha (4) write a custom captcha plugin
18:51:57 I don't think that we would have created a cpunks account if we had a better low-effort way for people to report bugs.
18:51:58 (3) is not really an option, of course
18:52:01 just stating it for the record
18:52:06 +1 to having a way for people to report bugs that do not involve a lot of process
18:52:08 nickm: i'm ok with taking community feedback from github.com etc. to avoid the moderation pain of allowing general public signups
18:52:41 one thing that i have been thinking about around what nickm just said was how our ideal bug creation setup looks like in the future. we suddenly have a system with a mighty nice API with confidential bugs and such. we could with some development have forms for people to submit bugs with where we prompt for various questions (like please post output of tor --version here) and have tickets created based on
18:52:48 that
18:52:49 with more friendly wizard like setups
18:52:52 specifically about friction, one option that was discussed in STO as well is the idea of shoving that to a forum outside of the bugtracker
18:52:53 catalyst : i'm only okay doing that if we can autosync. It might not be a great idea if we can't
18:53:08 i like the idea of (ab)using the gitlab API
18:53:23 I wonder if there is a possibility of a moderation queue?
18:53:34 i think there is the possibility of a moderation queue
18:53:36 you can build it with confidential tickets i think
18:53:39 the problem with this is we either need to create a GitLab user for the submitter on the fly or the submitter doesn't have feedback on progress
18:54:01 can we make it easy to create new users in a "heavily moderated" group?
18:54:05 anarcat: we can create a user for them just with an email and a random password when they submit a ticket there
18:54:09 nickm: i don't know
18:54:10 if we leave the gitlab API open to outside registration, that moderation queue will be hell
18:54:19 anarcat: +1
18:54:44 and i don't mean cypherpunks hell, which is a special kind of hell, i mean spam hell, which is another special kind of hell
18:54:56 the issue is what we can compromise for this with the capacity we have right now for this migration. What is an acceptable first step to use in gitlab?
18:55:01 (and for the record, i'm a metalhead, so hell also has positive connotations to me ;)
18:55:21 gaba: that's a good question
18:55:34 gaba: i'm not willing to compromise by allowing a shared anonymous account
18:55:53 catalyst: that is what we have right now
18:56:04 We need to have some way for users to report bugs that works okay.
18:56:30 It shouldn't be frustrating, horrible, or high-effort.
18:56:31 gaba: i've never been ok with it and it's one of the reasons i initially had misgivings about joining Tor
18:56:34 I think that's the initial MVP
18:56:51 nickm: "gitlab" is already going to be an incredible improvement over "trac" in that matter, fwiw :p
18:57:02 anarcat: only if you have an account on it
18:57:05 we don't have user sign up at all right now
18:57:07 nickm: sure
18:57:20 14:52:20 <+anarcat> i think our options are (1) cypherpunks (2) salsa custom signup form (3) akismet + recaptcha (4) write a custom captcha plugin
18:57:23 i think that might be more interesting to discuss than the anonymous account?
if user account creation is easy, is this a concern?
18:57:27 i heard strong objections on (1)
18:57:32 i assume (3) is out of the question
18:57:40 i don't think we can remove enough rights from a user to not make the cypherpunks account a mess
18:57:41 so we're left with the salsa thing and custom captcha, right?
18:57:46 like i think the first user can just login and change the pw
18:58:00 and i don't think we get gitlab to implement support for us having something like the cpunks account
18:58:08 i agree
18:58:17 Maybe our first step is to just allow user registration in gitlab
18:58:30 yes, but the question is how
18:58:31 fwiw, i personally find captchas to be yet another very specific hell
18:58:35 i also fail them very often
18:58:37 for sure
18:58:44 i don't mean recaptcha, by (4)
18:58:51 that's (3) :)
18:58:52 gaba: i'm ok with allowing user registration in gitlab if we commit the resources to do adequate moderation
18:59:04 but yeah, i hear you, it blocks people with certain disabilities as well, for example
18:59:08 in my **ideal** world, we'd do something like this:
18:59:14 i seem to be very bad at any type of captcha i have encountered ><
18:59:29 1. we allow user registration, with some mechanism to avoid too many spam accounts getting created.
18:59:32 brabo: i'm sorry to hear that... :/
18:59:53 brabo: have you encountered good moderation systems out there that we could implement? :)
18:59:56 2. new accounts get all their tickets moderated until after their first valid bug is submitted
19:00:12 can we get tickets moderated for new accounts in gitlab?
19:00:17 not sure if 2 is a feature or not
19:00:17 3. admins can turn moderation on or off on any account, and on new accounts, at any time.
19:00:20 That would be my ideal.
19:00:26 i don't think moderation is an answer
19:00:30 anarcat: that's the problem no, if no captcha must moderate to allow new accounts?
19:00:37 [We also need folks handling the moderation queue.]
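The three-step "ideal world" above can be written down as a tiny state model. This is only a sketch of the policy as described in the discussion; it is not a GitLab feature (the participants were unsure one exists), and every class and method name here is hypothetical. Step 1, spam limiting at registration, is out of scope for the model.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    name: str
    moderated: bool = True


@dataclass
class Tracker:
    # Step 3: admins choose whether new accounts start out moderated.
    moderate_new_accounts: bool = True
    accounts: dict = field(default_factory=dict)
    queue: list = field(default_factory=list)   # waiting for a moderator
    public: list = field(default_factory=list)  # visible tickets

    def register(self, name):
        self.accounts[name] = Account(name, self.moderate_new_accounts)

    def submit(self, name, title):
        # Step 2: tickets from moderated accounts wait in the queue.
        target = self.queue if self.accounts[name].moderated else self.public
        target.append((name, title))

    def approve(self, index):
        # The first valid bug publishes the ticket and lifts moderation.
        name, title = self.queue.pop(index)
        self.public.append((name, title))
        self.accounts[name].moderated = False

    def set_moderation(self, name, on):
        # Step 3: admins can flip moderation on any account at any time.
        self.accounts[name].moderated = on
```

The model makes the cost visible as well: nothing leaves `queue` unless someone calls `approve`, which is the "folks handling the moderation queue" requirement.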
19:00:44 you would just get an immense pile of spam in the moderation queue, and you'd be incapable of finding the good stuff in there
19:00:47 I think that moderation is not an answer to spam, but it is an answer to jerks.
19:00:48 it wouldn't be visible to the public
19:00:57 but it would not be usable
19:00:58 similarly, captchas don't handle jerks, but they do handle spam
19:01:12 my concern with gitlab registration is not jerks, it's spam
19:01:29 both jerks and spam make our job hard to do
19:01:36 we need a plan for both
19:01:39 jerks are easy to solve, you kick them out and they generally don't have much incentive to create sock puppets
19:01:48 those are two very different problems
19:02:00 * ahf is trying to figure out what we *can* do with gitlab right now
19:02:01 anarcat: we have had significant sock-puppetry and abuse in the past
19:02:12 nickm: through cypherpunks? or random accounts?
19:02:17 both
19:02:38 some folks are willing to click through 1000 captchas if they know that they get to annoy people every time
19:02:51 anarcat: yeah, both, especially when we do something that some close-minded people find politically distasteful
19:03:17 so a moderation queue is great for jerks, but a catastrophe for spam
19:03:33 my primary concern is spam, so i don't think we should work towards a moderation queue at first
19:03:41 but i'm new here, and maybe i'm missing the big picture
19:04:00 uh. I'm having a hard time seeing what would be the solution for this that we can work on
19:04:06 but i will warn that any gitlab instance that opens up registration unprotected will get totally hammered by spammers, to the point of being completely unusable
19:04:16 i thought we'd deploy the salsa registration form as a first step
19:04:19 it doesn't deal with jerks
19:04:23 but it might deal with spam
19:04:25 what people propose that we can do first?
19:04:28 gaba: i suggest we manually register gitlab accounts for known good contributors
19:04:39 catalyst: what about people that want to just report a bug?
19:04:40 and open up contributions from github issues
19:04:47 i would take into consideration that regarding captchas, automated tools can outperform humans, and development in that field will not stop
19:05:19 catalyst: how the contributions from github would work?
19:05:28 how do they get moved into gitlab
19:05:39 I'm okay with doing whatever form of protection on the account creation seems necessary at first, and experimenting with ways to open it up.
19:05:40 we could manually create gitlab issues, or eventually automate it
19:06:01 brabo: i used to think that, but i found this article very inspiring in that regard https://kevv.net/you-probably-dont-need-recaptcha/
19:06:06 when we had issues open on github.com/torproject/tor, we didn't get that many issues opened
19:06:09 [It sucks to have a procedure that only 95% of people can use to create accounts, but that is better than having a process that 0% can use, it seems to me]
19:06:20 1 hour since we started the meeting. Are people ok going 30 min more? We can try to wrap up this item and then go through the other quickly
19:06:27 catalyst: I think we would have more if we told people that was the official way to create tickets
19:06:39 gaba: i am ok, i find it very productive still
19:06:45 brabo: when i mean "custom captcha", i don't necessarily mean "some gobbledygook that no one can read", it can also mean topical questions like "what is the favorite vegetable of the tor community"
19:07:06 nickm: you may be right
19:07:08 gaba: okay either way
19:07:13 ok, let's try to focus on proposals that people are giving
19:07:40 1) (catalyst proposal) only accounts in gitlab for contributors that we bring - open github issues for anybody else. We move things manually at first...
19:07:56 For moderation: I want to know that having users be moderated is possible, and have a plan to do it if necessary.
19:08:10 I think we need both moderation and spam-limiting, and they need to be separate
19:08:52 i have just asked that question in #gitlab nickm
19:09:05 it's always possible to kick people out after, but what i understand is it's a cumbersome process
19:09:20 that's kind of a "post" moderation system
19:09:22 i am not aware of an existing "pre" moderation system
19:09:32 anarcat: i think questions are a lot nicer than images. it may also be a lot nicer in regards to accessibility.
19:09:32 nickm: so you are saying you have a 2nd proposal that is open registration and moderate accounts. right?
19:09:51 anarcat: i understood (from mike) that removing accounts in gitlab was not an easy process
19:09:54 brabo: yeah! it was a nice article, thinking outside the box
19:09:58 from micah... sorry
19:10:01 anarcat: i think the problem with kicking people out *after* is that then people have to see all the stupid crap they write first
19:10:04 gaba: that's right
19:10:11 ahf: agreed
19:10:14 I'm saying "open registration with some spam-limiter, and moderate accounts"
19:10:15 i was just stating how things work now
19:10:29 and trying to clarify what "moderate" means (pre or post?)
19:10:42 with spam-limiter would mean having salsa custom signup form 19:10:49 we need post-moderation in any case, and it needs to be fast and easy 19:10:50 we have the option of the akismet thing, but i don't think that will help against our crazy user, but only against spam and it involves sending the user's IP to a central service which probably flags all tor users as bad 19:10:52 because if it's "pre" moderation, it hinders bug submissions, because people can't submit stuff until we approve them 19:11:01 no 19:11:08 nickm: i don't think gitlab does fast post-moderation 19:11:13 and i am not sure it does pre-moderation at all 19:11:25 they submit a bug, the bug goes in the queue, the bug gets approved and public 19:11:48 (that's how to do pre-moderation with allowing people to submit stuff before they are approved) 19:12:04 nickm: well that's a nice idea in theory, but it probably means a lot of coding in Rails if you actually want it 19:12:08 i don't see how we can do this without using the API and build a small webservice for it 19:12:13 because i'm pretty sure that doesn't exist right now 19:12:16 right 19:12:18 i see benefits to that, a lot, at the expense of developer time 19:12:22 it could be a separate thing 19:12:32 i do see a lot of benefits to that. 
we can do bug creation wizards and such 19:12:37 for first time submitters 19:12:38 people have suggested using RT for this, but that's just dodging the "who will moderate all that crap" problem 19:12:43 once we approve them they can do everything though 19:12:44 yeah 19:12:52 we can also take bug reports from TBB (for example) 19:12:59 once approved they can only write issues and make comments 19:13:04 that is the guest account 19:13:09 yep gaba, but they can write whatever they want 19:13:11 and have those submissions private by default 19:13:14 as for "fast" post-moderation: it needs to be less effort to block a user, log them out, and remove their last 24 hours of activity than it is to create an account and abuse it. 19:13:21 they can start by submitting something good, and then go full loco afterwards 19:13:37 spammers are so creative 19:13:42 nickm: right 19:13:56 ok. Let's check with gitlab if there is any other way they deal with spam. 19:13:57 i think that is also an infrastructure thing we need to solve 19:14:08 nickm: that would be great, but it's not how gitlab works right now, as far as i know 19:14:12 like if 3 people in #tor-internal says "block user X" to some irc bot, then we do it 19:14:13 I can send a mail asking about this and seeing if there is any other solution. 19:14:17 at least that's the experience that micah related 19:14:20 instead of having to prod the right admin who might be on vacation right now to do it 19:14:38 ahf: make more people admins then 19:15:01 sure 19:15:19 so more research needed? 19:15:23 seems so 19:15:24 or what's the next step 19:15:25 i think so 19:15:32 ok 19:15:37 let's leave this on standby then 19:15:42 i mean i am not sure there's much to find in gitlab 19:15:53 in itself, what can also help to stop bots creating accounts, is to just ask an email invite on irc. 
it raises the bar, and that might turn away legit users 19:15:54 there's no good moderation system, they assume you setup akismet 19:15:57 and recaptcha 19:16:11 i think the salsa signup form could provide a stopgap measure to including new users without enabling spammers 19:16:34 brabo: irc is also a huge barrier to entry :) 19:16:38 Salsa signup was what we thought it was the best option in Stockholm. 19:16:47 yep 19:17:02 anarcat: it can be, but if you add a link to a webchat page to the information, it helps 19:17:04 i think we could give it a shot 19:17:05 we had one additional concern in stockholm that made it more attractive there: we wanted to postfix all new usernames with -guest 19:17:21 The issue is that we need to resolve this before migrating. We need to take a decision on something 19:17:24 to avoid future collisions between ldap users and gitlab registered users 19:17:33 yes 19:17:40 not saying it would work for a project this size and user base, but i've seen it used before. and bots have a hard time :) 19:18:08 ok. Next one is data migration process 19:18:31 would it be helpful to discuss preservation requirements before migration process? 19:18:38 gaba: let's say "research and salsa signup" for now? 19:18:47 ok anarcat 19:18:52 nickm: ok 19:19:07 "Preserving existing trac data" next 19:19:24 so the important thing for me is that nothing is lost. 19:19:39 Earlier versions of this proposal didn't include that, and I want to make sure we're on the same page now. 19:19:48 When we go to gitlab, it will be our 4th bugtracker 19:20:00 tickets #1 through #40 started life on bugzilla 19:20:05 There is a place in the plan document that maps fields from trac into gitlab. I think we will not lose anything. 19:20:14 tickets #41 through #1369 started on flyspray 19:20:22 and #1370 through present started on trac 19:20:25 is nothing actually ***nothing*** or is it nothing'ish? 
For example: do you want comments that have been modified in trac to have their whole history over too? 19:20:37 ahf: good question 19:20:38 usually these are typo fixes 19:20:45 * catalyst thinks it's more realistic to outline what data losses are acceptable 19:20:48 ahf: I think we should enumerate everything we will lose, and make sure we are okay with it. 19:20:49 after last week's talk about this, i have moved to the strategy of moving EVERYTHING 19:20:55 but everything might not be what we want 19:21:00 right 19:21:03 that is what i have started doing 19:21:07 sounds good 19:21:13 so my end goal here is to show some list of "here are things you won't get" 19:21:22 I think edit history is ok to drop in the long run. 19:21:29 and then at some point i will ask people to find their most wicked tickets and tell me if the history is preserved probably 19:21:30 we shouldn't lose closed tickets or comments 19:21:34 and then we will have to do some sampling there 19:21:44 have people seen the data we migrate right now? 19:21:59 https://dip.torproject.org/ahf-admin/tor-trac/issues is how tickets look now 19:22:00 not in the last ~10 days 19:22:07 some things from trac are moved over to labels 19:22:17 such as (task|project|enhancement|issue) and such 19:22:30 this is some 500 tickets migrated as an experiment the other night 19:22:40 ignore the username being all me, that is to not spam people with users 19:23:16 https://dip.torproject.org/ahf-admin/tor-trac/issues/29607#note_9352 is an example of the thing i was talking about with comments that have been modified in trac 19:23:23 notice the _comment0, _comment1, etc. 19:24:22 Are people ok with this? 
19:24:27 if people spot anything they don't like here, please create issues at https://dip.torproject.org/ahf/trac-migration so gaba and i can do something smart 19:25:20 and another thing: there is this tool called tracboat which i think we might want to use for the wiki migration 19:25:25 i have not looked at wiki migration at all 19:25:26 next step here is to come with a list of things that you will not see in gitlab from trac. right? 19:25:47 gaba: yes 19:25:48 i need to do that list 19:25:51 We did not talk much about wiki migration but I was thinking it could be by group/team 19:25:57 and main wiki be the one from the tpo group 19:26:00 and then i suspect we need to discuss it on tor-dev@ or something 19:26:09 ok 19:26:16 gaba: i haven't spent more than 10 seconds thinking about the wiki 19:26:27 my head is mostly full of what to do with tickets 19:26:27 the wiki is another "legacy" thing, IMHO 19:26:36 it could be just moved to the legacy project, and then split up by hand 19:26:41 yes, but we may be able to migrate it easily to gitlab 19:26:42 i don't think it's realistic to do it any other way 19:26:49 yes 19:26:50 yeah anarcat, i think so right now too 19:26:55 we will have to manually figure out where it should go 19:26:55 redirections would work the same way and all 19:27:12 it's going to be a huge change, that thing, and hard to do 19:27:13 anarcat: legacy project OR to the group tpo 19:27:14 * catalyst needs to be afk now 19:27:15 but it can be punted to later 19:27:21 o/ catalyst 19:27:24 gaba: either works i guess 19:27:31 catalyst: thanks for joining in on this and the useful feedback here! 19:27:37 if we move it to tpo group then we may only have to move stuff related with teams 19:27:49 catalyst: see you later! 
19:28:11 Let's talk about 'data migration process' last and then it would be great if people can go through the notes to be sure everything is reflected and next steps are clear 19:28:44 unless there is any other comment/question on preserving data 19:29:52 * ahf cannot think of any that haven't already been up 19:30:36 Data migration. So far ahf is working on it and the process will depend on whether we move everything to legacy project first or we just go directly to each project 19:30:48 yeah 19:30:59 but the data content will be the same no matter which one we do 19:31:05 just a matter of where it goes first 19:31:27 Teor was bringing a few questions in their last mail in tor-project@ 19:31:32 mostly about testing if it worked 19:31:49 We will not remove trac right away but only have it in read only 19:32:20 there is going to be a testing phase at some point 19:32:27 The idea for the migration is to remove the project and import all tickets. If something does not work then we remove project and re-import. 19:32:31 and at some point we also need to decide on when the migration should happen if we get to that 19:32:42 like it must be a weekend where some of us will be around and see that things are going OK 19:32:49 i think we will need to "freeze" gitlab if not entirely, at least some projects that will be affected by the migration 19:32:50 to avoid messing too much with people's workday 19:32:54 otherwise ahf's will be misery 19:33:07 freeze it? 19:33:10 ahf: i recommend against weekend migrations 19:33:16 that means no weekends 19:33:21 no weekends = bad 19:33:22 my script starts by deleting the old repos to make sure i start from a clean slate when i migrate 19:33:26 you do migrations on a monday 19:33:31 ahf: well that's one way to go about it :p 19:33:43 ahf: what if someone was using that repo before? you just destroyed their work! 
:) 19:33:47 my monday is so PACKED already :s 19:33:48 that's what i mean by freezing 19:33:56 well tuesday then 19:33:56 ah, right 19:34:07 ok, we can find a date later when we have found a solution to all the unknowns 19:34:14 this will mean people will have to continue working with a ticketing system 19:34:41 and we are guessing that this will be 1 week work when things will be fine to be switched to gitlab after that 19:34:45 moving 500 tickets took me 20 min. with the script i think, then you can calculate how long it will take for 32k tickets 19:35:16 but that can be made faster, not something we need to worry on now 19:35:23 ahf: i will go through all the projects and add them to your .ini file. There are some projects that we need to see if people are already doing work in gitlab (like gitlab) 19:35:30 s/gitlab/gettor 19:35:32 thanks gaba! 19:35:35 that would be awesome 19:36:02 If there are projects that people are already working on we may need to import into some other project and then manually move tickets later 19:36:07 i think we'll have a flag day when people will have to live without a bugtracker for 12-24h 19:36:11 i think that's a fair ask 19:36:14 yes 19:36:18 Any other consideration for this last step? We may need to write down a better plan for it 19:36:23 that was what i wanted to be in a weekend though :S 19:36:24 yes 19:36:41 so people show up monday morning to y2k problems on gitlab 19:36:51 one possibility we may want to consider: assume we will screw up the first migration, and plan to wipe it and do it again. 19:36:58 if we are lucky, we won't have to. 
19:37:06 right, that is what the tool does now 19:37:07 yes 19:37:10 it deletes the repos 19:37:13 creates them again 19:37:16 enables issues on them 19:37:18 and starts migrating 19:37:21 Right, but I'm talking more about planning 19:37:24 yes 19:37:25 ah, yep 19:37:28 we need to have a better plan for this 19:37:35 yeah, a more concrete plan 19:37:40 If we have to wipe 24 hours after our migration, we will be glad that we told people in advance "we will maybe wipe after 24 hours" 19:37:41 write down what we need to be looking for when testing 19:38:00 good point, nickm 19:38:03 yeah, make a plan, check it twice, and send it with a date when we're sure 19:38:16 ahf: let's talk later about a plan and add something more concrete to that big nc document :) 19:38:27 i sent a decommission plan for jabber like this to tor-project@ that can serve as a good template 19:38:31 and those of us involved in this should probably prepare for having a week where we do quite a bit of support when we do this 19:38:37 i can help review this 19:38:53 yep gaba 19:38:55 ok 19:39:08 also, is there a way to first import all the repos, while trac and such is still online, test it is all good, and then freeze and update everything? 19:39:24 brabo: i don't think it's incremental 19:39:26 it's one shot 19:39:27 if tested before, that'd be close to flipping a switch 19:39:43 ok. Anything else about this meeting? 19:39:50 please, take a look at the notes: https://pad.riseup.net/p/e-q1GP43W4gsY_tYUNxf 19:40:18 thanks for the notes gaba!! 19:40:20 the notes i have seen so far look good 19:40:30 I should take off now; I'll have another note before I go 19:40:34 thanks, everybody! 19:40:39 thanks! 19:40:43 #endmeeting
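Note on the migration time estimate: at 19:34:45 it was mentioned that moving 500 tickets took about 20 minutes with the script, and that the full set is about 32k tickets. A rough sketch of that calculation follows. It assumes the time scales linearly with the number of tickets and takes "32k" as exactly 32,000; both are assumptions for illustration, not measurements, and the function name is hypothetical. The result lands inside the 12-24h flag-day window discussed at 19:36:07, before any speed-ups to the script.

```python
def estimate_minutes(total_tickets: int, sample_tickets: int, sample_minutes: float) -> float:
    """Linearly extrapolate the migration time from a timed sample run.

    Assumption: the per-ticket cost stays constant as the ticket count grows.
    """
    if sample_tickets <= 0:
        raise ValueError("sample_tickets must be positive")
    return total_tickets * sample_minutes / sample_tickets


# Figures quoted in the meeting: 500 tickets in about 20 minutes, about 32k tickets total.
minutes = estimate_minutes(total_tickets=32_000, sample_tickets=500, sample_minutes=20)
hours = minutes / 60

print(f"estimated migration time: {minutes:.0f} minutes (about {hours:.1f} hours)")
```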