13:59:41 #startmeeting metrics team
13:59:41 Meeting started Thu Feb 4 13:59:41 2016 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:59:41 Useful Commands: #action #agreed #help #info #idea #link #topic.
13:59:59 hello. we're meeting here today, because there's another meeting in #tor-dev.
14:00:07 who's here for the metrics team meeting?
14:00:12 I exist.
14:00:18 hi virgil!
14:00:24 i am
14:00:27 https://pad.riseup.net/p/zUNzEIFRq5S4 <- agenda pad
14:00:30 hi thms!
14:00:40 please add topics to the pad.
14:00:46 heja
14:00:56 hi themoep!
14:03:57 maybe make a second announcement in tor-dev?
14:04:24 yep.
14:05:01 let's wait another minute or two..
14:05:10 but, please add more topics to the pad everyone!
14:06:41 okay. shall we start?
14:06:55 * analytics project (thms)
14:07:05 working on the unified avro/parquet/json converter
14:07:34 that seems like a jungle sometimes
14:07:43 but making progress
14:07:46 anything to see?
14:07:54 had hoped to be ready today, but…
14:08:25 i made a new repo on github: tomlurge/convertor
14:08:29 sounds like there will be something to see within a day or two?
14:08:32 ah, /me looks
14:08:51 right now you can only see building material
14:09:12 and i didn’t push for 2 days i guess
14:09:30 but i hope for 5 days (fear for another 2 weeks)
14:09:56 that’s it from me
14:10:26 ok. want to let us know on metrics-team@ when there's something to look at?
14:10:30 * karsten is curious
14:10:41 ideally with a pointer where to start looking.
14:10:46 (so many files....)
14:10:50 will do
14:11:02 cool!
14:11:27 okay, next agenda item is:
14:11:32 * exit-kibibytes-* (themoep)
14:11:52 i was looking at this from the extra-info, regarding the mail about if we need to collect the data
14:11:55 * karsten will add an update on *-ips which is related to that.
14:12:42 and it seems it is not used anywhere, but I can make some graphs to see if it's interesting/useful
14:12:55 true, it's not used anywhere.
14:13:07 I think we used it a while back to write a tech report,
14:13:23 but we're not plotting these stats on Metrics or anywhere else.
14:13:45 so, yes, making some graphs would be cool. just to see if there's anything we can learn from the data that we didn't expect.
14:14:01 will do!
14:14:08 relating a bit, a question:
14:14:15 also, what fraction of relays is reporting these stats anyway?
14:14:24 please ask.
14:14:26 is there anywhere a timeline that logs interesting events?
14:14:52 hmmmmm. maybe a censorship timeline. let me look..
14:14:55 because sometimes I see spikes in graphs that might be explained really quickly if there were
14:15:47 a very small, maybe 1% but I can give you a better estimate once I've worked through more datasets
14:15:59 sounds great!
14:16:09 so, I didn't find a good wiki page for events.
14:16:13 would you want to start one?
14:16:57 it could start with spikes in graphs that you don't have an explanation for,
14:17:06 and maybe others have those explanations.
14:17:07 sure, I'll see to get a couple events together to get it started
14:17:43 be sure to tell metrics-team@ about it. there are others, like phw, who might be interested.
14:18:09 just to be sure; which wiki do you mean?
14:18:19 https://trac.torproject.org/projects/tor/wiki/TitleIndex
14:18:35 just create a new page under doc/ .
14:18:44 ok!
14:18:58 by pretending to read that page, and then trac tells you it doesn't exist yet but you can create it.
14:19:07 great!
14:19:43 also, before sinking too much time into it, maybe ask on metrics-team@ whether there are other lists that I didn't find just now.
14:20:14 okay, moving on.
14:20:19 ah,
14:20:25 I wanted to add something to this topic.
14:20:43 I started looking into the other statistics that we might want to remove in the future.
14:21:04 namely, bridge-ips, entry-ips, dirreq-v3-ips, etc.
14:21:31 I didn't find a convincing reason to keep them, so I started hacking on a patch that removes them.
14:22:03 turns out arma2 would like to keep them, so I hope for more discussion on tor-dev@ soon.
14:22:46 ok, moving on.
14:22:57 * Roster update (virgil)
14:23:00 oh hai
14:23:09 what it says in the descriptions on the pad is basically it.
14:23:19 want to paste it here?
14:23:37 -- new badges for: country-rarity for guard and exit nodes.
14:23:38 -- new badges for: organization-rarity for guard and exit nodes.
14:23:38 -- To determine the "organization", using data from http://as-rank.caida.org/ to learn the single organizations underlying many different ASs.
14:23:39 -- Hired a contractor for pleasantness/UX.
14:23:42 -- **Looking for companies/orgs that would like to give free stuff (say free year of Github?) to high-performing relay operators**
14:24:15 not much technical stuff. I've pulled myself out of the hole of researching the proper way to calculate BGP diversity. After the Roster grant is over, OTF can pay me to figure out how to do the BGP diversity measure "correctly"
14:24:39 didn't know about http://as-rank.caida.org/ -- interesting stuff
14:24:44 as-rank.caida.org is AWESOME
14:24:47 SO AWESOME
14:25:01 it by far has the most complete graphs of the internet I've seen
14:25:20 as-is, what we're doing is (1) getting the AS-number for each relay. (2) Using the data from as-rank.caida.org to determine the "organization" underneath. Then you get extra nerd-points for being a rare organization.
14:25:41 I consider is a necessary but not sufficient condition for BGP-diversity.
14:25:55 do they offer a free database that we could include in onionoo?
14:26:01 they do!
14:26:09 to resolve IP to org name or org ID?
14:26:16 data.caida.org/datasets/as-relationships/
14:26:23 data.caida.org/datasets/as-organizations/
14:26:26 whoops
14:26:37 yeah okay that's correct.
14:26:55 there is also a free as/org database in geoip-database-extra. it's broken in debian but I submitted a patch fixing it to the maintainer
14:27:16 interesting, aagbsn.
14:27:36 I don't mind so much whether it's in debian, because I need to fetch new files from maxmind once per month anyway.
14:28:02 virgil: would it help you to have this in onionoo?
14:28:16 I guess the other question is, would it help other onionoo client devs?
14:28:24 I highly endorse using any data from as-rank
14:28:26 it is well-maintained.
14:28:30 karsten: yes. Just the simply as-organization would be nice.
14:28:31 also blockfinder can scrape/export routeviews.org table snapshots and export to the csv format used to build the database
14:28:32 it's just a flat dictionary
14:28:34 that doesn't change much
14:28:46 i wanted to build something that looks at rib updates though
14:28:51 routeviews.org is more complicated---the data changes depends on which point you use, etc.
14:29:13 if you want the *raw* routeviews data, I suggesthttps://bgpstream.caida.org/
14:29:16 well, origin as shouldn't change
14:29:17 I suggest https://bgpstream.caida.org/
14:29:22 cool
14:29:25 i will take a look
14:29:33 it will greatly simplify your life
14:29:42 virgil: would you mind creating an onionoo trac ticket for this?
14:29:54 However, if you want to do fancy analysis on the BGP data, I recommend using http://data.caida.org/datasets/as-relationships/
14:30:02 karsten: will do.
14:30:08 thanks!
14:30:24 fwiw i think you still need traceroute to understand how the packets actually move
14:30:35 aagbsn, you can use the Sibyl system for that.
14:31:04 aagbsn, if you're into doing it raw, the RIPE ATLAS probes which I've been trying to get widely used within Tor.
14:31:59 I've been banned from moving on that valuable project, so talk to Moritz about the Atlas probes and their credits. I have about 1M credits you're welcome to use though.
14:32:58 virgil: I also hope to have a response on that two-nodes-on-same-IP-should-be-in-a-family thread.
14:33:06 +soon
14:33:16 karsten: that's just your choice on the two-nodes-sholud-be-in-same-family.
14:33:48 yeah, I just haven't had the chance to page that in.
14:33:49 karsten: IMHO it makes sense to me, but I have no strong feelings on it. Whatever other people feel.
14:34:14 just saying that it's on my radar.
14:34:19 oh sure. thank you!
14:34:26 your choice on it. I just think it makes sense.
14:34:27 SeanSaito: hello! anything you want to add to the Roster topic?
14:35:18 Hi karsten! Sorry I'm late. Nope, that's all for now.
14:35:47 okay. :)
14:36:35 looks like we ran out of topics earlier than usual. well, or people who could give updates on topics.
14:36:58 what's the status on Roster superseding Tor Weather ?
14:37:25 is it ready?
14:37:39 The email function is not implemented yet.
14:38:13 ok. I'd say as soon as it's ready, start a beta phase on tor-relays@, and if it provides reliable notifications, let's just shut down Weather.
14:38:34 Sounds good. Is there a preferred deadline?
14:38:37 okay. Sounds like permission to me.
14:38:51 that's way better than just shutting down Weather without having a replacement, though I have been thinking about that, too.
14:38:56 agreed.
14:39:13 no preferred deadline. as soon as it's ready.
14:39:19 Sure.
14:39:32 happy to beta test before you announce it on tor-relays@.
14:39:43 great, thanks
14:39:50 with a random subset of relays who I don't run.
14:39:57 err, which*
14:40:23 for example, the other day I would have been glad to receive a notification of Tonga being down.
14:41:03 Right, Tor Weather doesn't do that already?
14:42:01 in theory, yes.
14:42:08 now, am I still subscribed.....
14:43:42 what I know is that it did not notify me.
14:43:42 but, meh, I'll not mess with weather anymore. I'll wait for Roster.
14:43:42 okay, cool. what else is easier to discuss now than on the mailing list?
14:43:42 karsten: hearing you say that makes me happy.
14:45:31 okay then, looks like we're done early today. thanks for attending, everyone!
14:45:36 not for now! thanks karsten
14:45:54 #endmeeting
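
Note on the Roster update (14:25:20): the two steps described there, looking up each relay's AS number and then mapping that AS to its underlying organization, could be sketched roughly as below. This is only an illustration under stated assumptions: the function name, the input dictionaries, and the scoring rule (1 divided by the number of relays in the same organization) are hypothetical and are not taken from Roster. The log does not say how Roster actually scores rarity, and the AS-to-organization mapping is assumed to have been loaded already from a dataset such as the CAIDA AS-organizations files.

```python
from collections import Counter


def organization_rarity(relay_as, as_to_org):
    """Score each relay by how rare its organization is among all relays.

    relay_as:  dict mapping a relay identifier to its AS number (hypothetical input).
    as_to_org: dict mapping an AS number to an organization ID (hypothetical input,
               e.g. derived from an AS-organizations dataset).

    The score 1 / (relays in the same organization) is an assumed rule,
    not Roster's documented formula.
    """
    # Step 1 and 2: resolve each relay's AS to an organization.
    # An AS without a known organization is treated as its own organization.
    relay_org = {relay: as_to_org.get(asn, asn) for relay, asn in relay_as.items()}

    # Count how many relays each organization hosts.
    org_counts = Counter(relay_org.values())

    # Fewer relays in the organization means a higher rarity score.
    return {relay: 1.0 / org_counts[org] for relay, org in relay_org.items()}


# Made-up example data: two ASes belong to the same organization.
example_relays = {"relayA": "AS1", "relayB": "AS2", "relayC": "AS3"}
example_orgs = {"AS1": "org-x", "AS2": "org-x", "AS3": "org-y"}
example_scores = organization_rarity(example_relays, example_orgs)
```

Grouping by organization rather than by AS number is what makes relayA and relayB count as the same operator here, which matches the point made in the meeting that one organization can sit behind many ASs.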