13:59:21 #startmeeting metrics team
13:59:21 Meeting started Thu Jan 21 13:59:21 2016 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:59:21 Useful Commands: #action #agreed #help #info #idea #link #topic.
13:59:37 welcome back, MeetBot
13:59:44 hi meeting!
13:59:45 who else is here for the meeting?
13:59:48 hi thms!
13:59:48 * ww hello
13:59:54 hi ww!
13:59:55 * qiv is lurking around too :-)
14:00:00 and qiv
14:00:07 hi everyone
14:00:08 https://pad.riseup.net/p/zUNzEIFRq5S4 <- agenda pad
14:00:42 hey everyone!
14:00:43 let's collect meeting topics there.
14:00:46 hi themoep!
14:04:01 okay, looks like we have four topics. please add more if something else comes up, but let's start with the first.
14:04:09 * Easy Metrics website fixes (karsten)
14:04:14 https://pad.riseup.net/p/5c0cfMhCAJYx
14:04:27 that's a list of fixes that I started working on.
14:04:38 in particular, I tried out iframes.
14:04:47 compare https://metrics.torproject.org/hidserv-rend-relayed-cells.html
14:04:52 to https://metrics.torproject.org/hidserv-frac-reporting.html
14:05:25 the second uses an iframe for the graph, the first does not.
14:05:56 feels about the same…
14:06:46 why is that? do you have an idea?
14:07:06 there is so little text and embedded scripts that there’s hardly any difference in download time, only two more trips to the server for html and css files
14:07:23 yeah, plus one downside:
14:07:27 it would change if there were more graphs on the page or heavier layout, images etc
14:07:37 we need to include a "Permanent link", because the browser url stays the same.
14:07:49 right
14:08:20 ok. I think I'll leave it up for the moment, just in case we realize it has other advantages.
14:08:29 there are workarounds for that but i fear they all rely on javascript. would have to check though
14:08:36 but it might be that iframes are not the solution for interactivity we have been looking for.
14:09:03 sounds good.
14:09:11 if you come up with something, let me know.
14:09:24 yes, but don’t put much hope in it
14:09:31 yep.
14:09:40 so, regarding the other items on the list,
14:09:58 if people here would pick one or two of them and work on them, that would be neat.
14:10:11 and it would be great if people came up with more items.
14:10:30 easy fixes, things that can be done in a few hours and that make the metrics website better.
14:11:01 I'll keep working on the metrics website in the next few days.
14:11:07 "Move all page content to non-Java/non-JSP files for better extensibility." <- that would profit from some guidance I assume
14:11:27 ah, you started on it already?
14:11:45 yep, I think that one is something I should be involved with. but you could help.
14:12:29 that would be my topic "javascript"
14:12:49 okay, let's move on to that.
14:13:02 just scribble on the pad for easy metrics website fixes, everybody.
14:13:06 moving on..
14:13:13 * javascript (thms)
14:13:45 okay. i wrote a few mails and you said you’d rather prototype a little before discussing them further. how did prototyping go?
14:14:28 not far enough to have something to show here, though I have an idea how to get everything running.
14:14:56 ready to share that idea?
14:15:07 the prototype would be a node server that accepts requests, draws svg graphs using d3, and returns them.
14:15:36 maybe similar to the Rserve thing we use to draw graphs right now.
14:15:56 my plan was to use this prototype to draw the bubble graphs we already have on the website,
14:16:02 just on the server not on the client.
14:16:13 node+D3 sounds sensible. and interactivity could be added later, or not.
14:16:15 yes
14:16:18 however, one issue I was thinking of:
14:17:09 node wants its code to run really fast,
14:17:29 because it's single-threaded, using an event-based model.
14:17:45 what if d3 visualizations are complex and take seconds to be completed.
14:17:54 node is not blocking
14:17:59 the magic of callbacks
14:18:18 so node doesn’t care about D3 taking seconds, minutes…
14:18:45 not blocking or not blocking?
14:18:56 hm?
14:19:02 as in: code should be written to return very fast to give the illusion of not blocking
14:19:11 or: actually not blocking
14:19:54 as in: node doesn’t idle until the command returns.
14:20:08 so: actually
14:20:55 oops, late for the meeting. i messed up the time zone math.
14:21:02 hi phw!
14:21:05 karsten: I’m sure there’s no problem there. and choosing node+D3 leaves all possibilities for interactivity open. so: good choice imho
14:21:06 hi karsten
14:21:25 karsten: Express as web framework?
14:21:46 thms: what I meant is that there's a potential problem. I didn't run into it, it's just one thing I noted down to keep in mind.
14:22:16 thms: possibly, yes. for this prototype it might be sufficient to leave out express and do everything manually.
14:22:18 karsten: and what I meant is that there isn’t ;-)
14:22:26 we'll find out.
14:23:00 (phw: agenda pad is here: https://pad.riseup.net/p/zUNzEIFRq5S4 -- feel free to add topics)
14:23:16 okay, what else should we talk about wrt. javascript?
14:23:22 Express seems to be the de facto standard for sites that are not heavy on user-side JavaScript
14:23:31 nothing more
14:23:39 yep, and I think using express would work just fine.
14:23:52 it's just one more thing to include in this prototype, and maybe it's not necessary yet.
14:24:01 but other than that I don't see any issues in using it.
14:24:14 ok.
14:24:17 moving on.
14:24:22 * analytics project (thms)
14:24:39 just wanted to report that the JSON-converter is now ready for use (in case someone was waiting for it ;-)
14:24:49 oh!
14:24:50 and work on the Avro-converter is progressing nicely. if that works out as expected it will replace the JSON-converter (since it can switch from Avro to Parquet to JSON on demand)
14:25:06 do you have sample data in json format?
14:25:29 ehm, I would have to check what’s in the repo right now
14:25:42 but i can generate something if you want :)
14:25:56 I'd be curious, yes.
14:26:12 okay, will put something on github tonight
14:26:31 great!
14:26:58 so, would you want to get feedback on the json converter and/or output?
14:26:58 Broya is here, sorry I am late.
14:27:01 hi Broya!
14:27:21 Hi Karsten and all.
14:27:22 thms: or are you busy working on the next converter?
14:27:33 Broya: https://pad.riseup.net/p/zUNzEIFRq5S4 <- agenda pad
14:27:47 oh, of course! the output is like i envision it to be with the next converter too. so please comment!
14:27:53 cool!
14:28:05 anything else on this topic, or should we move on?
14:28:11 ok, that’s it from me
14:28:21 great!
14:28:24 * Graph stats reported by little-t-tor that we're currently not graphing (karsten)
14:28:51 rob jansen started a discussion about stats we're collecting that we might not want to keep collecting.
14:29:15 I suggested going crazy on evaluating the existing data, because that's already out there, and then deciding how to proceed.
14:29:22 here's the posting: https://lists.torproject.org/pipermail/tor-dev/2016-January/010258.html
14:29:55 would people here want to look at exit-stats or one of the other stats and see what useful things we could be doing with this data,
14:30:02 and what harmful things others could be doing with it?
14:30:28 entry-ips and other *-ips would be other stats worth looking into.
14:30:40 also, cell-stats.
14:31:11 this could be a fun small project for a few days up to a week.
14:31:40 i'd like to take a look, but I won't be able to start until late next week
14:32:06 great! can you take a look earlier today and pick one that sounds most interesting to you?
14:32:09 also FYI, I'm pretty new to this :)
14:32:20 err, earlier than late next week, e.g., today.
14:32:30 yeah, that I can do
14:32:36 cool!
14:33:15 okay, if others want to pick something, please let me know. we should avoid duplicating efforts here,
14:33:34 .
14:34:01 okay, moving on:
14:34:04 i am also following that discussion with interest, but i am still trying to find time to dig deeper into the theory ....
14:35:03 qiv: okay, maybe take a look at the various stats that are out there and *if* one of them sounds interesting and doable, let me know.
14:35:31 in theory, the mailing list posting should contain the relevant links.
14:35:37 sure, did not want to interrupt :-)
14:35:50 ah, no worries. :)
14:35:55 okay, moving on:
14:36:00 * Getting sybilhunter's output on metrics.tpo (philipp)
14:36:35 phw: ^
14:36:36 i think it would be nice to have two visualisations on metrics. the churn rate and the uptime images.
14:36:54 yes, I agree. how do we want to do that?
14:37:05 the churn rate stuff outputs a CSV that can then be plotted. the uptime images output JPGs.
14:37:40 can we include the data-processing parts in metrics-web?
14:37:45 we just have to run the sybilhunter binary on collector data, and then put the results on the web somehow.
14:38:08 i would have to read up on metrics-web to answer that.
14:38:11 this is go, right?
14:38:14 yes.
14:38:47 we should be able to do this.
14:38:58 that would work as follows:
14:39:17 once per day, we'd call a shell script that calls your go stuff which produces numbers and images.
14:39:37 the .csv should probably be documented similar to the other .csv files:
14:39:52 e.g., https://metrics.torproject.org/servers-data.html
14:40:19 and we'd probably plot them using R/ggplot2, e.g.,
14:40:23 https://metrics.torproject.org/networksize.html
14:40:51 a "date" field i have, but apart from that, it's basically just numbers, for every consensus.
14:41:09 I'll have to think about putting the images on metrics, but I think we can make that work, too.
14:41:28 so, yes, sounds doable. should we move this to email?
14:41:43 sounds good! i'm happy to help wherever i can, including changing the code to make it easier to work with metrics.
14:42:05 great! thanks. :)
14:42:14 okay, moving on.
14:42:19 * Scanning bridge reachability via side channels (Broya)
14:43:09 So I am still in the process of perfecting the technique
14:44:16 But it turns out we have more than 90,000 IPs that I can possibly use to measure reachability of Tor Bridges.
14:45:35 sounds great! what do you think, how can people here help out with that?
14:45:55 ripe atlas is good for general reachability testing from multiple perspectives
14:46:41 As of now, I don't need help.
14:47:22 Atlas might not be ethical to use in some countries, and might not have vantage points in many places
14:48:02 what's your projected timeframe for getting first results that you can publish?
14:48:22 well, I mean share here.
14:49:03 I hope in 2 months I have the system ready to go. Then it will be data collecting, and visualizing stuff. That I suffer at.
14:49:32 ah, I'm sure we'll find people to help with visualizing fine new data!
14:50:19 Also, I need to find a university who would be eager to let us use their network for a long time
14:50:52 and of course I don't know the exact things you're trying to improve, but 2 months is a long time to work on something before getting feedback. release early, release often.
14:51:20 oi
14:51:21 I have options in mind and I am confident they are onboard. I just don't know whether we have to formally get agreement, or the faculty's eagerness is enough
14:51:51 huh, fine questions.
14:52:27 sorry, have to go. bye!
14:52:37 should we move that to email maybe? I could imagine that other tor folks have better insight into legal questions.
14:52:42 thms: thanks for coming. bye!
14:52:43 yes.
14:53:00 ok, great!
14:53:13 I will send email sometime next week
14:53:40 please do. and let us know when you need help with anything, including visualizing results.
14:54:07 we only have 5 minutes left. should we move on?
14:54:45 "use their network" how?
14:54:46 sure. Thanks
14:55:08 Broya: see ww's question, and then we'll move on.
14:55:11 okay?
14:55:31 my measurements need to run from a server in a university
14:55:51 would tardis.ed.ac.uk work?
14:56:14 it's a student/alumni run half rack at school of informatics
14:57:05 I need some firewall rules to get relaxed, but any university can do it. We should talk offline.
14:57:16 yes.
14:57:19 sounds great! :)
14:57:24 * 3 proposals put in at edinburgh, now waiting for students (ww)
14:57:27 yay!
14:57:41 ww: do you have an idea how long we're waiting? mostly curious.
14:57:51 and will students contact us for details?
14:58:02 i had email that says:
14:58:11 By 28 January, you will need to submit your preferences for which
14:58:13 students you are willing to supervise on MSc projects.
14:58:35 sounds great.
14:58:38 so, quickly. i expect they will tend to contact the informatics staff
14:58:51 they may cc you if they're observant :)
14:58:56 also great. just let me know if you need my help.
14:59:04 or that. cool!
14:59:15 will do.
14:59:34 thanks again for helping with all this. curious to see where this is going.
14:59:43 * ww too!
14:59:55 okay, looks like we ran out of time and topics.
15:00:03 perfect timing :)
15:00:10 unless there's anything else we should talk about veeeery quickly?
15:00:54 ok. we can always discuss on the mailing list or in two weeks from now.
15:01:01 thanks for attending, everyone!
15:01:09 #endmeeting