15:00:43 <karsten> #startmeeting metrics team
15:00:43 <MeetBot> Meeting started Thu Aug  1 15:00:43 2019 UTC.  The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:43 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
15:00:47 <karsten> https://storm.torproject.org/shared/5h1Goax5eNusxjXJ_Ty5Wl7hFR1uqCReUiN8xdlBG8T <- agenda pad
15:00:56 <acute> hi
15:01:02 <gaba> o/
15:01:11 <karsten> hi acute and gaba!
15:01:40 <karsten> please add topics to the agenda.
15:01:45 <gaba> sorry that I'm just going back to Tor life this week. hope you all are doing fine
15:01:55 <karsten> welcome back! :)
15:02:00 <acute> welcome back!
15:03:03 <karsten> shall we start with what's on the agenda?
15:03:26 <karsten> ah, gaba is adding more.
15:03:33 <gaba> yes
15:03:36 <gaba> i'm fine
15:03:50 <karsten> ok.
15:03:59 <karsten> let's start then.
15:04:04 <karsten> Update on dependency management (karsten)
15:04:23 <karsten> this is an update to our discussion about dependency management last week.
15:04:45 <karsten> I looked at ant ivy, and I'm aware that maven and gradle exist.
15:05:11 <karsten> I think that in our current situation the quickest solution would be to just add ant ivy to our build processes.
15:05:21 * irl has no idea what this is
15:05:37 <gaba> "Using an alternative build system for Metrics Java codebases"
15:05:40 <karsten> it fetches jar files from the maven repository and provides them in a local lib/ directory in the code repository.
15:05:50 <irl> ah ok
15:05:59 <karsten> it doesn't require changes to our project structure, for example.
15:06:15 <irl> this sounds ok for an intermediate step to solve the urgent problem
15:06:26 <karsten> we should still plan to switch over to maven or gradle at some point, but that will take more time.
15:06:28 <irl> it doesn't look like a very active project though
15:06:33 <irl> which is a little scary
15:06:37 <irl> 1 release in 15 years
15:06:42 <irl> oh, 5 years
15:06:46 <irl> still that's not very active
15:06:56 <karsten> well, the good thing is that we don't need to rely on them for dependencies themselves.
15:07:00 <irl> heh, and it's a release candidate
15:07:01 <karsten> that's all in maven.
15:07:04 <irl> right
15:07:11 <irl> which should make the switch to maven easier later
15:07:30 <karsten> I have read good things about gradle, so I think I'd want to look into that as time permits.
15:07:37 <karsten> but probably not in the next months.
15:07:56 <irl> ok cool
15:08:30 <karsten> yes, or look at maven, too.
15:08:33 <irl> have you already tried adding ivy to any codebases
15:08:39 <karsten> yes.
15:09:15 <karsten> it's not many lines of code to add to our build.xml, plus a new ivy.xml.
15:09:28 <irl> that sounds easy enough then
15:09:35 <karsten> I stopped at the point where I got jar files in the local lib/ directory, so I did not *successfully* try it out yet.
15:09:46 <irl> heh
15:09:58 <karsten> okay, I'd continue with ivy then.
15:10:11 <karsten> and create a ticket for looking at gradle and maven later.
15:10:19 <irl> sounds good to me
15:11:07 <karsten> okay, great!
15:11:23 <karsten> gaba: what did you mean with that quote above?
15:11:48 <karsten> is this already on a roadmap somewhere?
15:11:49 <gaba> I was referring to this discussion last week
15:11:50 <gaba> for irl...
15:11:54 <karsten> ah!
15:11:59 <karsten> okay.
15:12:16 <gaba> not much progress on bug fixing https://issues.apache.org/jira/projects/IVY/issues/IVY-1586?filter=allopenissues
15:13:10 <karsten> well,
15:13:23 <karsten> what we could also do is stick to the libs we have at the moment.
15:13:43 <karsten> and kill the buster and stretch jenkins builds of metrics-lib.
15:13:49 <gaba> you both make the decision on this. I was just looking at their issues
15:13:58 <karsten> and then look into gradle and maven a bit earlier.
15:14:05 <karsten> well, you have a point.
15:14:12 <karsten> and irl.
15:14:34 <karsten> we don't *have* to do anything here now.
15:15:11 <karsten> this is also a plausible way forward for me.
15:15:14 <gaba> is it a lot of work to do gradle and maven? I understand that this is the preference
15:15:49 <irl> i think that doing ivy is worthwhile
15:16:05 <irl> not just for CI but also to see how the latest versions of libs actually work together
15:16:06 <karsten> hard to say how much time it would take to do gradle or maven.
15:16:53 <karsten> okay.
15:16:58 <gaba> ok
15:17:35 <karsten> I'll give that a try with a time limit for all the servlet+JSP stuff that I ran into last time.
15:18:24 <karsten> alright. moving on to the next topic?
15:18:38 <gaba> ok
15:18:41 <irl> ok
15:18:46 <karsten> Plan for hosting BGP updates from Counter-RAPTOR paper (karsten)
15:18:58 <karsten> a while ago we said we'd host contributed data on tor metrics.
15:19:04 <karsten> well, collector, to be precise.
15:19:23 <karsten> last week or so we have been asked how this is going, for something that we committed to in the past.
15:19:43 <karsten> this isn't on my list at the moment.
15:19:57 <karsten> and I'm considering saying that we might not get to that in the near future.
15:20:16 <irl> this would go in the data portal
15:20:21 <gaba> ^
15:20:31 <irl> although perhaps it wouldn't even go anywhere if we're running low on disk space
15:20:44 <gaba> I was going to say that this could be something for the data portal
15:20:51 <irl> how much data is it?
15:20:57 <karsten> hmmmm.
15:21:10 <karsten> a fine question.
15:21:23 <karsten> do you want to ask them?
15:21:49 <irl> i don't think it would change the answer from "we're not going to do this soon"
15:22:06 <karsten> what's the rough time frame there?
15:22:09 <gaba> karsten: is this in a ticket?
15:22:44 <karsten> maybe? the part I refer to is an email thread.
15:22:46 <irl> it may even be that it is not this year, and not until after july next year
15:23:08 <irl> we ideally want some reliable, backed up, high availability storage
15:23:48 <karsten> still, even if this still takes a while, this is a more solid plan than "we're thinking about adding it to tor metrics somewhere, sometime".
15:23:57 <gaba> yep
15:24:20 <karsten> do you want to reply to them, or should I do that? (and cc gaba)
15:24:24 <irl> ok, well the plan would be that we store it in a web-accessible location and index the metadata in our dataset catalog
15:24:33 <irl> but so far neither the storage location nor catalog exist
15:24:48 <irl> i'll reply to them after the meeting
15:25:01 <karsten> okay, thanks!
15:25:42 <karsten> speaking of emails:
15:25:46 <karsten> Gaba's Tor metrics data portal email (karsten)
15:25:49 <gaba> :)
15:26:07 <gaba> this is what they are presenting for funding. It is not for us
15:26:26 <gaba> Everything seems fine to me in that proposal (I made only a few comments/changes)
15:26:43 <gaba> but was wondering if you could just read it and flag anything that may not be what you were thinking about
15:26:58 <irl> what is wrong with publishing tor's financial records?
15:27:13 <irl> https://www.torproject.org/about/reports/
15:27:14 <gaba> I think they are already published somewhere else
15:27:28 <irl> they are data
15:27:31 <gaba> there is nothing wrong about publishing them. I was not thinking this data portal could be the place for that
15:27:38 <gaba> mmm
15:27:47 <irl> the portal is more an index than it is storage
15:27:47 <gaba> so you are thinking that anything data related could go there?
15:27:50 <irl> yeah
15:27:59 <irl> even if it doesn't live there it can reference it somewhere else
15:28:35 <irl> we also don't have to link just to the pdfs that are published; the IRS publishes electronic versions as open data that we can link to as well
15:28:54 <gaba> well, I guess it could. I didn't think it would be all data related to Tor
15:29:35 <irl> i think the scope includes anything that a journalist might want to know about Tor in order to represent it accurately in stories they write
15:29:40 <irl> which would include financials
15:29:57 <irl> if there is some management decision that it doesn't though, then i'll go along with it
15:29:57 <gaba> oh, ok. In this case it is not only a metrics data portal but a data portal for Tor
15:30:01 <irl> yeah!
15:30:08 <gaba> no, no decision so far
15:30:15 <irl> just like the research portal is also about ux and usability
15:30:22 <gaba> Looks good to me if the scope is for a "data portal for Tor"
15:30:36 <gaba> karsten: any thoughts on this?
15:30:49 <karsten> sounds good to me.
15:30:54 <gaba> ok
15:31:09 <gaba> I will roll back my changes on their proposal then :)
15:31:34 <karsten> I mean, we'll see how much financial records stand out from the rest as soon as the data portal exists.
15:32:16 <karsten> the PDF vs. electronic version argument is a very good one, though.
15:32:44 <karsten> irl had a diagram somewhere explaining how well a file can be processed, from PDF to CSV to etc.
15:33:01 <irl> the onion of open data
15:33:10 <karsten> it's even an onion!
15:33:12 <gaba> nice :)
15:33:15 <antonela> :)
15:33:36 * gaba in her previous life worked on a system to OCR pdfs into csvs...
15:33:49 <karsten> nice!
15:34:35 <gaba> we can move on, right?
15:34:41 <karsten> I'm wondering:
15:34:42 <irl> yes
15:34:53 <karsten> do you need more input on that document you sent in your email?
15:34:58 <karsten> gaba: ^
15:35:08 <gaba> karsten: only if there is anything there that is not right
15:35:21 <irl> i have not yet looked at it at all
15:35:24 <irl> how long do i have?
15:35:34 * karsten just requested access.
15:35:44 <gaba> ah, ok. Let me reply to them and cc you
15:35:51 <irl> i have to request access
15:35:57 <irl> so i can't actually see it just yet
15:36:20 <karsten> ah, should I not have requested access?
15:36:25 <karsten> if so, oops.
15:37:48 <gaba> They will give you access
15:38:06 <karsten> okay, great.
15:38:32 <karsten> I think we can move on now.
15:39:05 <irl> ok
15:39:13 <karsten> Last roadmap - how are we doing? (gaba)
15:39:22 <karsten> yay, we're using trello!
15:39:35 <karsten> I made quite a few changes to the trello roadmap.
15:39:48 <gaba> thanks irl for setting up trello.
15:40:57 <karsten> and there's a trac query that I didn't see before.
15:41:11 <gaba> Yes. Right after the Tor meeting I digitized the roadmap
15:41:26 <karsten> what is the canonical place for our current roadmap?
15:41:27 <gaba> I sent you both a mail with the spreadsheet, right?
15:41:46 <irl> i'm making progress on debian#932901 and should be finished within 1 point, then i'll be moving on to #28322
15:41:47 <karsten> yes, I saw a spreadsheet.
15:41:52 <gaba> that is what I want to agree with you all. Clearly storm didn't work.
15:42:11 <irl> so the roadmap does accurately reflect what i'm doing
15:42:20 <gaba> I'm setting up gitlab for the network team to try.
15:42:24 <gaba> ok
15:42:33 <gaba> and irl was suggesting to use trello for metrics
15:42:46 <irl> works well with ipad
15:42:49 <karsten> trello works very well for me.
15:42:50 <gaba> It still has to be manually synced with trac
15:42:56 <acute> same here
15:42:57 <gaba> but for sure it works better than storm
15:43:02 <karsten> yes, it does.
15:43:05 <gaba> ok
15:43:09 <gaba> let's do trello then
15:43:16 <karsten> so, are we going to use trello as central point for now?
15:43:19 <gaba> and we sync it in every meeting
15:43:27 <irl> sounds good
15:43:34 <gaba> to get the big picture of what metrics is doing
15:43:49 <irl> acute: are you working on the onionperf ansible scripts?
15:43:58 <acute> yes
15:44:07 <irl> ok moved that to in progress
15:44:26 <acute> there are also a bunch of fixes/small improvements that are not documented in the roadmap
15:44:33 <karsten> added the ivy thing for me.
15:44:35 <irl> are they in gitlab tickets?
15:44:42 <acute> that I would like to add to onionperf
15:44:55 <irl> or did we not move the tickets yet?
15:45:04 <acute> they will be
15:45:10 <acute> and we can see how that works out
15:45:18 <irl> ok cool
15:45:40 <irl> acute: you can probably fix this one easily https://dip.torproject.org/torproject/metrics/onionperf/issues/1
15:45:47 <gaba> Let's still create tickets in a public place where people can make comments to it (trac or gitlab depending on the project)
15:45:53 <acute> haha
15:45:58 <karsten> gaba: yes, agreed.
15:46:09 <irl> yes, sounds good to me
15:46:17 <acute> yes, sounds good
15:46:27 <karsten> how is this related to our earlier plan to do sprints?
15:46:48 <irl> looks like currently we are doing 1 week sprints
15:46:51 <irl> if we review each week
15:47:04 <gaba> yes.
15:47:07 <irl> i'd have to read a book on agile or something to know if i'm talking nonsense
15:47:10 <karsten> with the goal to finish everything that's under In Progress?
15:47:33 <karsten> if so, I wonder if we need to phrase things more clearly regarding when we can consider something done.
15:47:34 <gaba> the goal of sprints is that at the beginning you add only things to your sprint that you will actually finish
15:47:51 <karsten> in the past we have had cards stuck in In Progress for weeks.
15:47:54 <gaba> or get smaller tickets/issues that are part of a story
15:47:58 <karsten> and we did work on them for weeks, so that was correct.
15:47:59 <gaba> yes
15:48:02 <gaba> that is not so good
15:48:14 <karsten> right, it doesn't work so well for sprints, is my understanding.
15:48:23 <karsten> we could divide them up to what's doable in 1 week.
15:48:32 <acute> agreed
15:48:37 <gaba> agree
15:48:59 * irl swapped a ticket for a smaller set of tickets
15:49:10 <irl> (and also more specific)
15:50:03 <karsten> and is there a way to see all progress since last week?
15:50:03 <gaba> If we have a trac/gitlab ticket that is too big, creating smaller tickets as children and then doing them one at a time per week may be a good idea.
15:50:21 <gaba> karsten: what do you mean?
15:50:48 <karsten> gaba: I mean, how do we see what has changed since last week on trello?
15:51:05 <karsten> or are we mainly interested in whether In Progress is empty by thursday?
15:51:12 <gaba> we are using trello as a kanban board
15:51:13 <karsten> (and everything has moved to Done?)
15:51:20 <gaba> done or review
15:51:30 <karsten> ah, okay.
15:51:35 <gaba> if we think that 1 week is too short we can do 2 weeks
15:51:44 <gaba> and review the roadmap every two weeks
15:51:56 <gaba> review the kanban board*
15:52:07 <karsten> okay, in that case I'd want to add something else to In Progress.
15:52:28 <gaba> ok
15:52:35 <karsten> how about we add things until the end of today, if we want to, for the current 2 weeks period?
15:53:05 <gaba> oh, you want to do 2 weeks instead of 1
15:53:19 <karsten> ah, I thought you just suggested that.
15:53:23 <karsten> we can also start with 1.
15:53:40 <gaba> I said that if 1 week is not good for us then we can do 2 weeks. I think it is ok to start with 1
15:53:44 <karsten> great!
15:53:51 <karsten> works for me.
15:54:08 <irl> works for me too
15:54:35 <karsten> alright.
15:54:58 <karsten> the last item on the roadmap is an update by djackson who says he's offline today.
15:55:16 <karsten> sounds like we'll get an update from him next week.
15:55:25 <irl> do we know how well tor works from gcp?
15:55:51 * karsten doesn't know.
15:56:12 <irl> and if this is going to generate substantial load on the network, was any thinking about safety considerations done?
15:56:30 <irl> the second question is more important, but the first is also important
15:57:07 <karsten> hmm, do you think such a measurement could do harm to the network?
15:57:08 <irl> i just remember setting up a study to test for tcp explicit congestion notification support from azure to 2 million webservers but didn't realise until afterwards that azure bleaches the ecn bits out of the header and the whole thing was a waste of time
15:57:23 <irl> i think that spinning up 2 million tor clients wouldn't be great
15:57:41 <karsten> I really hope that he has a different order of magnitude in mind.
15:58:06 <irl> if tor is started fresh each time, you get a new consensus each time, that's a lot of directory bandwidth
15:58:41 <irl> i don't think it needs to go to the safety board but some back-of-the-envelope calculations on how much bandwidth would be used would be good
15:58:50 <irl> just to make sure that it's around what you expected
15:59:05 <karsten> do you want to drop him a quick email about that?
15:59:15 <irl> ok yes
15:59:32 <karsten> okay!
15:59:39 <karsten> 60 minutes mark reached.
15:59:44 <karsten> let's talk more next week!
15:59:49 <irl> we got access to the SOW document, so that and two emails i'll do after dinner
15:59:55 <gaba> :)
16:00:06 <gaba> o/
16:00:14 <karsten> thanks, everyone. bye! o/
16:00:17 <acute> bye!
16:00:18 <irl> bye!
16:00:21 <karsten> #endmeeting