17:01:09 <hellais> #startmeeting
17:01:09 <MeetBot> Meeting started Mon Jun  1 17:01:09 2015 UTC.  The chair is hellais. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:01:09 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
17:01:15 <hellais> ok so let's do this
17:01:18 <hellais> who is here?
17:07:52 <hellais> crickets.wav?
17:09:47 <hellais> well I guess I shall start
17:10:01 <hellais> so these past couple of weeks have been mostly dedicated to the new data pipeline
17:10:49 <hellais> since we don't yet have the server set up in Berlin, and I expect that to take at least another month, I resorted to making the batch data processing task something that can easily be run on AWS
17:11:27 <hellais> it turns out that it's not that expensive to spin up four 32-core boxes if you just need them for a couple of hours
17:12:39 <hellais> so now https://github.com/TheTorProject/ooni-pipeline-ng has a task that creates 1 AWS 8xlarge instance and runs the batch task that will sanitise the reports, convert them to JSON (vertically partitioned by date for placement in HDFS), publish the sanitised YAML files and add the report headers to a Postgres database
17:13:02 <hellais> this database is then read by the ooni-api that can be seen here: ooni-api.herokuapp.com
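A minimal sketch of the "convert to JSON, partitioned by date" step described above, assuming each sanitised report is a dict with a `start_time` field holding a UNIX timestamp in UTC. The function name `partition_by_date` and the field name are illustrative assumptions, not taken from ooni-pipeline-ng, and the sanitisation and HDFS placement steps are not shown.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone


def partition_by_date(reports):
    """Group report dicts by UTC day, one JSON line per report.

    Returns a dict mapping "YYYY-MM-DD" to a list of JSON strings.
    Assumption: report["start_time"] is a UNIX timestamp in seconds.
    """
    partitions = defaultdict(list)
    for report in reports:
        started = datetime.fromtimestamp(report["start_time"], tz=timezone.utc)
        day = started.strftime("%Y-%m-%d")
        # sort_keys keeps the serialised output stable across runs.
        partitions[day].append(json.dumps(report, sort_keys=True))
    return dict(partitions)


# Hypothetical report headers, for illustration only.
example_reports = [
    {"start_time": 1433116800, "test_name": "http_requests"},
    {"start_time": 1433120400, "test_name": "dns_consistency"},
    {"start_time": 1433203200, "test_name": "http_requests"},
]
partitions = partition_by_date(example_reports)
```

Each key could then map to one file per day (for example a hypothetical `2015-06-01.json`), so a batch job only needs to read the days it cares about.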
17:13:33 <hellais> I am now just missing the step of uploading the reports to s3 from the various collectors
17:13:48 <hellais> I will implement it in two ways: 1) for legacy support 2) for future support
17:14:12 <hellais> the legacy way is to ssh into the boxes and rsync the reports over. This will be done using a luigi RemoteTarget.
17:14:56 <hellais> the pro way will be to install s3cmd on the collectors; once a collector has closed a report it will invoke s3cmd to upload it to an incoming bucket on Amazon S3 that it has PUT access to
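The two upload paths above could look roughly like the following sketch. It builds plain `rsync` and `s3cmd put` command lines rather than using a luigi RemoteTarget, so it is a stand-in for the legacy approach, not the planned implementation. The host, user, directory, and bucket names are all hypothetical.

```python
import subprocess


def rsync_pull_cmd(host, remote_dir, local_dir, user="ooni"):
    """Legacy path: pull reports from a collector over ssh with rsync."""
    # Trailing slashes make rsync copy the directory contents.
    return ["rsync", "-az", "-e", "ssh",
            f"{user}@{host}:{remote_dir}/", f"{local_dir}/"]


def s3cmd_put_cmd(report_path, bucket, prefix="incoming"):
    """Future path: the collector pushes a closed report to S3."""
    # A trailing slash on the destination keeps the original file name.
    return ["s3cmd", "put", report_path, f"s3://{bucket}/{prefix}/"]


def run(cmd, dry_run=True):
    """Run the command, or only return it as a string when dry_run is set."""
    if not dry_run:
        subprocess.run(cmd, check=True)
    return " ".join(cmd)


# Hypothetical names, for illustration only.
legacy = rsync_pull_cmd("collector.example.org", "/data/reports", "/tmp/reports")
future = s3cmd_put_cmd("/data/reports/report.yamloo", "example-ooni-bucket")
```

With write-only (PUT) credentials on the collectors, a compromised collector could add reports to the incoming bucket but not read or delete those uploaded by others.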
17:15:21 <hellais> I expect the new pipeline to be online by the end of this week
17:16:01 <hellais> in other news we have received a legal memo on the risk of running ooniprobe in six countries
17:16:20 <hellais> if you are interested in looking at it let me know
17:19:39 <hellais> saw_: ping?
17:21:25 <hellais> poly: hey
17:21:44 <poly> hellais: hello
17:21:51 <poly> sorry for being a bit late for the meeting
17:22:56 <hellais> ah no worries. Here is the backlog: http://paste.debian.net/193111/
17:23:19 <hellais> poly: do you have anything you would like to talk about?
17:23:49 <poly> sadly, I have been quite busy this week and haven't been able to work much on anything OONI-related
17:24:05 <poly> I should be free tomorrow though, I'll resume working on network-meter then
17:28:05 <saw_> hellais: Hi, I looked at the ticket here https://trac.torproject.org/projects/tor/ticket/15686 and it looks good
17:28:51 <saw_> I'll start familiarising myself with the code base
17:29:02 <hellais> poly: ah ok. Let me know if you have any questions or comments.
17:29:19 <hellais> saw_: great! Is the ticket specified clearly?
17:30:12 <hellais> a somewhat bigger thing that would need more eyes is this: https://github.com/TheTorProject/ooni-probe/pull/395
17:30:17 <hellais> https://github.com/openrightsgroup/cmp-issues/issues/78
17:30:34 <hellais> it's related to extending ooniprobe to be suitable for the openrightsgroup use case
17:31:13 <hellais> if done properly this could help us also with: https://trac.torproject.org/projects/tor/ticket/11975
17:35:17 <saw_> sounds good, I'll start looking into the 'removing incomplete reports' ticket and then the second one. Would it be alright if I ask you questions if I get stuck?
17:36:37 <hellais> saw_: yes of course, please do.
17:43:01 <hellais> are there any more things to discuss?
17:46:34 <hellais> well if there are none I will close this meeting. Thanks for attending!
17:46:38 <hellais> #endmeeting