13:59:21 <karsten> #startmeeting metrics team
13:59:21 <MeetBot> Meeting started Thu Jan 21 13:59:21 2016 UTC.  The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:59:21 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
13:59:37 <karsten> welcome back, MeetBot
13:59:44 <thms> hi meeting!
13:59:45 <karsten> who else is here for the meeting?
13:59:48 <karsten> hi thms!
13:59:48 * ww hello
13:59:54 <karsten> hi ww!
13:59:55 * qiv is lurking around too :-)
14:00:00 <karsten> and qiv
14:00:07 <qiv> hi everyone
14:00:08 <karsten> https://pad.riseup.net/p/zUNzEIFRq5S4 <- agenda pad
14:00:42 <themoep> hey everyone!
14:00:43 <karsten> let's collect meeting topics there.
14:00:46 <karsten> hi themoep!
14:04:01 <karsten> okay, looks like we have four topics. please add more if something else comes up, but let's start with the first.
14:04:09 <karsten> * Easy Metrics website fixes (karsten)
14:04:14 <karsten> https://pad.riseup.net/p/5c0cfMhCAJYx
14:04:27 <karsten> that's a list of fixes that I started working on.
14:04:38 <karsten> in particular, I tried out iframes.
14:04:47 <karsten> compare https://metrics.torproject.org/hidserv-rend-relayed-cells.html
14:04:52 <karsten> to https://metrics.torproject.org/hidserv-frac-reporting.html
14:05:25 <karsten> the second uses an iframe for the graph, the first does not.
14:05:56 <thms> feels about the same…
14:06:46 <karsten> why is that? do you have an idea?
14:07:06 <thms> there is so little text and embedded scripts that there’s hardly any difference in download time, and only two more trips to the server for html and css files
14:07:23 <karsten> yeah, plus one downside:
14:07:27 <thms> it would change if there were more graphs on the page or heavier layout, images etc
14:07:37 <karsten> we need to include a "Permanent link", because the browser url stays the same.
14:07:49 <thms> right
14:08:20 <karsten> ok. I think I'll leave it up for the moment, just in case we realize it has other advantages.
14:08:20 <thms> there are workarounds for that but i fear they all rely on javascript. would have to check though
14:08:36 <karsten> but it might be that iframes are not the solution for interactivity we have been looking for.
14:09:03 <karsten> sounds good.
14:09:11 <karsten> if you come up with something, let me know.
14:09:24 <thms> yes, but don’t put much hope in it
14:09:31 <karsten> yep.
14:09:40 <karsten> so, regarding the other items on the list,
14:09:58 <karsten> if people here would pick one or two of them and work on them, that would be neat.
14:10:11 <karsten> and it would be great if people came up with more items.
14:10:30 <karsten> easy fixes, things that can be done in a few hours and that make the metrics website better.
14:11:01 <karsten> I'll keep working on the metrics website in the next few days.
14:11:07 <thms> "Move all page content to non-Java/non-JSP files for better extensibility." <- that would profit from some guidance I assume
14:11:27 <thms> ah, you started on it already?
14:11:45 <karsten> yep, I think that one is something I should be involved with. but you could help.
14:12:29 <thms> that would be my topic "javascript"
14:12:49 <karsten> okay, let's move on to that.
14:13:02 <karsten> just scribble on the pad for easy metrics website fixes, everybody.
14:13:06 <karsten> moving on..
14:13:13 <karsten> * javascript (thms)
14:13:45 <thms> okay. i wrote a few mails and you said you’d rather prototype a little before discussing them further. how did prototyping go?
14:14:28 <karsten> not far enough to have something to show here, though I have an idea how to get everything running.
14:14:56 <thms> ready to share that idea?
14:15:07 <karsten> the prototype would be a node server that accepts requests, draws svg graphs using d3, and returns them.
14:15:36 <karsten> maybe similar to the Rserve thing we use to draw graphs right now.
14:15:56 <karsten> my plan was to use this prototype to draw the bubble graphs we already have on the website,
14:16:02 <karsten> just on the server not on the client.
14:16:13 <thms> node+D3 sounds sensible. and interactivity could be added later, or not.
14:16:15 <thms> yes
14:16:18 <karsten> however, one issue I was thinking of:
14:17:09 <karsten> node wants its code to run really fast,
14:17:29 <karsten> because it's single-threaded, using an event-based model.
14:17:45 <karsten> what if d3 visualizations are complex and take seconds to be completed.
14:17:54 <thms> node is not blocking
14:17:59 <thms> the magic of callbacks
14:18:18 <thms> so node doesn’t care about D3 taking seconds, minutes…
14:18:45 <ww> not blocking or not blocking?
14:18:56 <thms> hm?
14:19:02 <ww> as in: code should be written to return very fast to give the illusion of not blocking
14:19:11 <ww> or: actually not blocking
14:19:54 <thms> as in: node doesn’t idle until the command returns.
14:20:08 <thms> so: actually
14:20:55 <phw> oops, late for the meeting.  i messed up the time zone math.
14:21:02 <karsten> hi phw!
14:21:05 <thms> karsten: I’m sure there’s no problem there. and choosing node+D3 leaves all possibilities for interactivity open. so: good choice imho
14:21:06 <phw> hi karsten
14:21:25 <thms> karsten: Express as web framework?
14:21:46 <karsten> thms: what I meant is that there's a potential problem. I didn't run into it, it's just one thing I noted down to keep in mind.
14:22:16 <karsten> thms: possibly, yes. for this prototype it might be sufficient to leave out express and do everything manually.
14:22:18 <thms> karsten: and what I meant is that there isn’t ;-)
14:22:26 <karsten> we'll find out.
14:23:00 <karsten> (phw: agenda pad is here: https://pad.riseup.net/p/zUNzEIFRq5S4 -- feel free to add topics)
14:23:16 <karsten> okay, what else should we talk about wrt. javascript?
14:23:16 <thms> Express seems to be the de facto standard for sites that are not heavy on user-side JavaScript
14:23:31 <thms> nothing more
14:23:39 <karsten> yep, and I think using express would work just fine.
14:23:52 <karsten> it's just one more thing to include in this prototype, and maybe it's not necessary yet.
14:24:01 <karsten> but other than that I don't see any issues in using it.
14:24:14 <karsten> ok.
14:24:17 <karsten> moving on.
14:24:22 <karsten> * analytics project (thms)
14:24:22 <thms> just wanted to report that the JSON-converter is now ready for use (in case someone was waiting for it ;-)
14:24:49 <karsten> oh!
14:24:50 <thms> and work on the Avro-converter is progressing nicely. if that works out as expected it will replace the JSON-converter (since it can switch from Avro to Parquet to JSON on demand)
14:25:06 <karsten> do you have sample data in json format?
14:25:29 <thms> ehm, I would have to check what’s in the repo right now
14:25:42 <thms> but i can generate something if you want :)
14:25:56 <karsten> I'd be curious, yes.
14:26:12 <thms> okay, will put something on github tonight
14:26:31 <karsten> great!
14:26:58 <karsten> so, would you want to get feedback on the json converter and/or output?
14:26:58 <Broya> Broya is here, sorry I am late.
14:27:01 <karsten> hi Broya!
14:27:21 <Broya> Hi Karsten and all.
14:27:22 <karsten> thms: or are you busy working on the next converter?
14:27:33 <karsten> Broya: https://pad.riseup.net/p/zUNzEIFRq5S4 <- agenda pad
14:27:33 <thms> oh, of course! the output is like i envision it to be with the next converter too. so please comment!
14:27:53 <karsten> cool!
14:28:05 <karsten> anything else on this topic, or should we move on?
14:28:11 <thms> ok, that’s it from me
14:28:21 <karsten> great!
14:28:24 <karsten> * Graph stats reported by little-t-tor that we're currently not graphing (karsten)
14:28:51 <karsten> rob jansen started a discussion about stats we're collecting that we might not want to keep collecting.
14:29:15 <karsten> I suggested going crazy on evaluating the existing data, because that's already out there, and then deciding how to proceed.
14:29:22 <karsten> here's the posting: https://lists.torproject.org/pipermail/tor-dev/2016-January/010258.html
14:29:22 <karsten> would people here want to look at exit-stats or one of the other stats and see what useful things we could be doing with this data,
14:30:02 <karsten> and what harmful things others could be doing with it?
14:30:28 <karsten> entry-ips and other *-ips would be other stats worth looking into.
14:30:40 <karsten> also, cell-stats.
14:31:11 <karsten> this could be a fun small project for a few days up to a week.
14:31:40 <themoep> i'd like to take a look, but I won't be able to start until late next week
14:32:06 <karsten> great! can you take a look earlier today and pick one that sounds most interesting to you?
14:32:09 <themoep> also FYI, I'm pretty new to this :)
14:32:20 <karsten> err, earlier than late next week, e.g., today.
14:32:30 <themoep> yeah, that I can do
14:32:36 <karsten> cool!
14:33:15 <karsten> okay, if others want to pick something, please let me know. we should avoid duplicating efforts here,
14:33:34 <karsten> .
14:34:01 <karsten> okay, moving on:
14:34:04 <qiv> i am also following that discussion with interest, but i am still trying to find time to dig deeper into the theory ....
14:35:03 <karsten> qiv: okay, maybe take a look at the various stats that are out there and *if* one of them sounds interesting and doable, let me know.
14:35:31 <karsten> in theory, the mailing list posting should contain the relevant links.
14:35:37 <qiv> sure, did not want to interrupt :-)
14:35:50 <karsten> ah, no worries. :)
14:35:55 <karsten> okay, moving on:
14:36:00 <karsten> * Getting sybilhunter's output on metrics.tpo (philipp)
14:36:35 <karsten> phw: ^
14:36:36 <phw> i think it would be nice to have two visualisations on metrics.  the churn rate and the uptime images.
14:36:54 <karsten> yes, I agree. how do we want to do that?
14:37:05 <phw> the churn rate stuff outputs a CSV that can then be plotted.  the uptime images output JPGs.
14:37:40 <karsten> can we include the data-processing parts in metrics-web?
14:37:45 <phw> we just have to run the sybilhunter binary on collector data, and then put the results on the web somehow.
14:38:08 <phw> i would have to read up on metrics-web to answer that.
14:38:11 <karsten> this is go, right?
14:38:14 <phw> yes.
14:38:47 <karsten> we should be able to do this.
14:38:58 <karsten> that would work as follows:
14:39:17 <karsten> once per day, we'd call a shell script that calls your go stuff which produces numbers and images.
14:39:37 <karsten> the .csv should probably be documented similar to the other .csv files:
14:39:52 <karsten> e.g., https://metrics.torproject.org/servers-data.html
14:40:19 <karsten> and we'd probably plot them using R/ggplot2, e.g.,
14:40:23 <karsten> https://metrics.torproject.org/networksize.html
14:40:51 <phw> a "date" field i have, but apart from that, it's basically just numbers, for every consensus.
14:41:09 <karsten> I'll have to think about putting the images on metrics, but I think we can make that work, too.
14:41:28 <karsten> so, yes, sounds doable. should we move this to email?
14:41:43 <phw> sounds good!  i'm happy to help wherever i can, including changing the code to make it easier to work with metrics.
14:42:05 <karsten> great! thanks. :)
14:42:14 <karsten> okay, moving on.
14:42:19 <karsten> * Scanning bridge reachability via side channels (Broya)
14:43:09 <Broya> So I am still in the process of perfecting the technique
14:44:16 <Broya> But it turns out we have more than 90,000 IPs that I can possibly use to measure reachability of Tor Bridges.
14:45:35 <karsten> sounds great! what do you think, how can people here help out with that?
14:45:55 <ww> ripe atlas is good for general reachability testing from multiple perspectives
14:46:41 <Broya> As of now, I don't need help.
14:47:22 <Broya> Atlas might not be ethical to use in some countries, and might not have vantage points in many places
14:48:02 <karsten> what's your projected timeframe for getting first results that you can publish?
14:48:22 <karsten> well, I mean share here.
14:49:03 <Broya> I hope in 2 months I have the system ready to go. Then it will be data collecting, and visualizing stuff. That I suffer at.
14:49:32 <karsten> ah, I'm sure we'll find people to help with visualizing fine new data!
14:49:32 <Broya> Also, I need to find a university that would be eager to let us use their network for a long time
14:50:52 <karsten> and of course I don't know the exact things you're trying to improve, but 2 months is a long time to work on something before getting feedback. release early, release often.
14:51:20 <isabela> oi
14:51:20 <Broya> I have options in mind and I am confident they are onboard. I just don't know whether we have to formally get agreement, or whether the faculty's eagerness is enough
14:51:51 <karsten> huh, fine questions.
14:52:27 <thms> sorry, have to go. bye!
14:52:37 <karsten> should we move that to email maybe? I could imagine that other tor folks have better insight into legal questions.
14:52:42 <karsten> thms: thanks for coming. bye!
14:52:43 <Broya> yes.
14:53:00 <karsten> ok, great!
14:53:00 <Broya> I will send email sometime next week
14:53:40 <karsten> please do. and let us know when you need help with anything, including visualizing results.
14:54:07 <karsten> we only have 5 minutes left. should we move on?
14:54:45 <ww> "use their network" how?
14:54:46 <Broya> sure. Thanks
14:55:08 <karsten> Broya: see ww's question, and then we'll move on.
14:55:11 <karsten> okay?
14:55:11 <Broya> my measurements need to run from a server in a university
14:55:51 <ww> would tardis.ed.ac.uk work?
14:56:14 <ww> it's a student/alumni run half rack at school of informatics
14:57:05 <Broya> I need some firewall rules to get relaxed, but any university can do it. We should talk offline.
14:57:16 <ww> yes.
14:57:19 <karsten> sounds great! :)
14:57:24 <karsten> * 3 proposals put in at edinburgh, now waiting for students (ww)
14:57:27 <karsten> yay!
14:57:41 <karsten> ww: do you have an idea how long we're waiting? mostly curious.
14:57:51 <karsten> and will students contact us for details?
14:58:02 <ww> i had email that says:
14:58:11 <ww> By 28 January, you will need to submit your preferences for which
14:58:13 <ww> students you are willing to supervise on MSc projects.
14:58:35 <karsten> sounds great.
14:58:38 <ww> so, quickly. i expect they will tend to contact the informatics staff
14:58:51 <ww> they may cc you if they're observant :)
14:58:56 <karsten> also great. just let me know if you need my help.
14:59:04 <karsten> or that. cool!
14:59:15 <ww> will do.
14:59:34 <karsten> thanks again for helping with all this. curious to see where this is going.
14:59:43 * ww too!
14:59:55 <karsten> okay, looks like we ran out of time and topics.
15:00:03 <ww> perfect timing :)
15:00:10 <karsten> unless there's anything else we should talk about veeeery quickly?
15:00:54 <karsten> ok. we can always discuss on the mailing list or in two weeks from now.
15:01:01 <karsten> thanks for attending, everyone!
15:01:09 <karsten> #endmeeting