What we’ve been reading: February

Our society members are university students and therefore do a lot of reading!

Here are some of the most interesting things we’ve read and shared for the month of February (which was quite a bookish month). You can see January’s picks here.

Iowa Caucus vote
Results via Google.

Like many of us looking nervously across the pond at the prospect of an electoral run by someone with worse hair than Boris Johnson, Adam posted New Scientist’s article about how Ted Cruz won the Iowa caucus using Facebook data-driven microtargeting, borrowing techniques from President Obama’s reelection strategy. Often hailed as the future of campaigning, microtargeting campaigns still have significant hurdles to overcome, such as boors with a proverbial megaphone and no concept of the phrase ‘bad press’.


Analytics Vidhya header
Graphic header, from Analyticsvidhya.com.

Myrian posted an article from Analytics Vidhya by Kunal Jain, who made a great graphic summarising 20 lessons a data scientist needs to master. Jain learnt these over a space of 10 years, so do not think you need to grasp them all over the summer. Particularly interesting is their division into two categories: the technical side of data science is only half the battle, and the other half is made up of non-technical lessons such as knowing your business and remembering to practise your techniques and learn new ones as they emerge.


Taxi pickups and drop offs, Todd Schneider
Todd Schneider’s map of Pickups and Dropoffs in New York.

Matt posted Todd Schneider’s wonderfully detailed investigation into taxi and Uber usage in New York. Schneider examines the different subcultures of the city and gathers insight on their lives simply through this single aspect (taxi rides), such as most bankers getting to work between 7 and 8am, and understanding whether or not that chase in Die Hard with a Vengeance was really such a nightmare. Schneider’s other work with big data sets includes a diverse range of subjects, from mortgages to marriages to gambling, all presented in a clear and often humorous way that carries even newcomers to data right through to the end of the analyses.


Earth Weather Map
Current wind conditions at 500 hPa, or roughly 5,000 m altitude.

Lastly, Louisa posted a link to a real-time interactive weather map of Earth, created by Cameron Beccario. Beccario’s map is one of the most impressive weather visualisations available online, knitting together a vast array of data sources (e.g. NASA’s Goddard centre data on chemicals and particulates; NOAA data on global weather) and presenting them as a mesmerising map. Its many filters let you change the altitude, the focus (such as wind speed, ocean currents or particulate density) or even the type of map projection to use.


Performing Data Visualisation on Cycling Data

One of our more learned members on all things cycling and data science, Matt Whittle (Twitter, WordPress), presented a working seminar/tutorial on mapping data using R, QGIS and JS. Matt’s project is to measure the cycling rates in various counties around the UK, and identify cycling hot spots in contrast to air pollution levels. Matt’s project has also garnered interest from Parliament, so our event was like a little preview (or maybe a practice run).

During the session, Matt guided members through the process of creating an interactive map using R, QGIS and JS, from the initial setting up of site structures through to creating the map itself. It’s based on work he did for Road Safety Week, who are switching some of their focus to the air pollution caused by congested roads, and not just the singing hedgehogs who are surely a fixture in most British kids’ memories from the last thirty years. Many of our members are familiar with R but not so much with QGIS, so it was an interesting experience all round to use some specialist data mapping software.

The Herd Digital Jobs Fair, Leeds

HERD flyer

Herd are a technology-focused recruitment agency who held their first digital jobs fair in the cake tin, the First Direct Arena, which several members of data soc looked in on. It seemed quite busy around the halls, showing a good turnout from people from both Leeds University and Leeds Beckett, the latter of which sponsored the event. There were interesting talks on personal branding from Google and on how to optimise your web presence when starting up your own business. Amongst others, companies featuring at the event included Call Credit, Sky, Plus Net, Unilever and William Hill. While data soc members found it interesting, it seemed quite heavily weighted towards the developer cross-section of digital jobs, and so it would be nice next year if they could have a few more stands/companies interested in data scientists and analysts.

Big Data Surveillance: Snowden, Everyday Practices and the Digital Future

Leeds University School of Law hosted Professor David Lyon for its annual CCJS lecture, which several members of the Data Science Society were pleased to be able to attend. Professor Lyon is a pioneering figure in the study of surveillance and his lecture centred on how the rise of Big Data has affected the practice of surveillance by various authorities, most notably by the NSA whose activities were revealed by Edward Snowden, and how its development will continue to change surveillance practices.

Professor Lyon is a sociologist by background, and his perspectives were particularly helpful to data soc members whose interests in public policy and health often intersect with questions about privacy and decision making, as well as the nature of making predictions from aggregated data sets.