What we’ve been reading: March

Our society members are (mostly) university students and therefore do a lot of reading!

Here are some of the most interesting things we’ve read and shared for the month of March. You can see posts for January and February by clicking on the ‘What we’ve been reading’ category.

We hosted our first speaker event with Professor Bill Gerrard in March. To show people some of the ideas Bill would be discussing, Lawrence posted an article from the Guardian discussing Bill’s work with the Saracens rugby union team in creating a statistically based performance management system. He also posted the trailer for Moneyball, the 2011 film based on the work of Billy Beane with the Oakland A’s baseball team.

Matt attended a bike hackathon where his group’s objective was to find a way to reduce the use of cars as the primary means of transport around the Lake District (now that they can stop using boats), which he wrote about in his blog here. The image on the right is one produced at the hackathon, where the team mapped the origins and destinations of 8000 visitors to the Lake District National Park. Following on from that, he introduced us to the CycleStreets open source project’s initiatives for developers, which include a wishlist of challenges that anyone can try to work on.

 

 

Father-son occupation pairs from Facebook research

Lastly, Louisa posted research from Facebook which used a few million of their users to examine whether there is any pattern to the jobs that parents and children have. It’s interesting research as it examines quite a common trope that we often hear in descriptions of others (a military family, or that they come from a long line of lawyers), and there is an underlying assumption in common discourse that certain jobs run in families. The graphics produced by the Facebook team are comprehensive and interactive, and thus well worth a look. This picture is taken from the father-son pairings where the father’s profession is military. Before your mind thinks it has spotted a pattern, there is a thicker line between father-son pairings for management, but some families do have an unbreakable pattern of dull jobs.

What we’ve been reading: February

Our society members are university students and therefore do a lot of reading!

Here are some of the most interesting things we’ve read and shared for the month of February (which was quite a bookish month). You can see January’s picks here.

Iowa Caucus vote
Results via Google.

Like many of us looking nervously across the pond at the prospect of an electoral run by someone with worse hair than Boris Johnson, Adam posted New Scientist’s article about how Ted Cruz won the Iowa caucus using microtargeting driven by Facebook data, borrowing techniques from President Obama’s reelection strategy. Often hailed as the future of campaigning, microtargeting campaigns still have significant hurdles to overcome, such as boors with a proverbial megaphone and no concept of the phrase ‘bad press’.

 

Analytics Vidhya header
Graphic header, from Analyticsvidhya.com.

Myrian posted an article from Analytics Vidhya by Kunal Jain, who made a great graphic summarising 20 lessons a data scientist needs to master. Jain learnt these over a space of 10 years, so do not think you need to grasp them over the summer. Particularly interesting is their division into two categories: data science is only half the battle, and the other half is non-technical lessons such as knowing your business and remembering to practise your techniques and learn new ones as they emerge.

 

Taxi pickups and drop offs, Todd Schneider
Todd Schneider’s map of Pickups and Dropoffs in New York.

Matt posted Todd Schneider’s wonderfully detailed investigation into taxi and Uber usage in New York. Schneider examines the different subcultures of the city and gathers insight on their lives simply through this single aspect (taxi rides), such as most bankers getting to work between 7 and 8am, and understanding whether or not that chase in Die Hard with a Vengeance was really such a nightmare. Schneider’s other work with big data sets includes a diverse range of subjects from mortgages to marriages to gambling, all presented in a clear and often humorous way that carries even newcomers to data right through to the end of the analyses.

 

Earth Weather Map
Current wind conditions at 500 hPa, or roughly 5,000 m altitude.

Lastly, Louisa posted a link to a real-time interactive weather map of Earth, created by Cameron Beccario. Beccario’s map is one of the most impressive weather visualisations available online, knitting together a vast array of data sources (e.g. NASA’s Goddard centre data on chemicals and particulates; NOAA data on global weather) and presenting them as a mesmerising map. Its many filters let you change the altitude, the focus (such as wind speed, ocean currents or particulate density) or even the type of map projection to use.

What we’ve been reading: January

Our society members are university students and therefore do a lot of reading!

Here are some of the most interesting things we’ve read and shared for the month of (December and) January.

Commuters using Oyster - TfL Press
Commuters using Oyster Cards. Credit: TfL’s Press photos on Flickr.

Matt posted Transport for London’s blog about getting customer data from TfL’s API systems out to developers. TfL are trying to simplify the outputs from their APIs to ensure that developers can get good use out of the information provided – the volume of data that TfL produces means that it can be easy to drown in a sea of Oyster journeys. Loading 1.5bn journeys to grind through a year’s worth of data is probably a great way to crash your machine!

 

R fiddle interface
Loading R-Fiddle.

“Those who are eager to get started and work on R on the go over the holidays (yay!) should check out R-fiddle!” This was Karen’s suggestion for some holiday coding. R-Fiddle is an in-browser console from the people behind DataCamp. It was designed to provide a rough and ready environment for quick coding needs without needing to sign in or sign up to online services, or wait for R to load on computers. It’s mobile friendly too, just in case that spark of inspiration hits you on the bus.

 

CDRC Masters Dissertation Header

Lawrence posted news from his research centre regarding Masters’ dissertation projects that the Consumer Data Research Centre (CDRC) are advertising. The CDRC is led by the University of Leeds and UCL, in partnership with Oxford and Liverpool. The Centre works together with industry partners to gain insight from the wealth of consumer data available in the UK. CDRC projects are sponsored by companies as diverse as Sainsbury’s, Boots and E.ON, and you can check them out here.