Perth Big Data Week 2014

The week started with a practical hands-on workshop at the University of Western Australia.  Dirk Slawinski (CSIRO) showed great courage in performing a live demo of how to extract data from the IMOS portal (http://imos.aodn.org.au/imos123/).  Focusing on mooring data (http://www.waimos.csiro.au/), he demonstrated how to use MATLAB to extract and work with IMOS data.

Scitech was a great supporter of BDW 14, with Jeff Harris (Scitech) hosting a Monday evening event at the Scitech Planetarium.  Steven Tingay (Curtin University) talked about the Murchison Widefield Array – Big Data Precursor to the Square Kilometre Array, and Andreas Wicenec (UWA) spoke about The SKA Data Challenge and the Australian side of the SKA1. Big Data themes were also presented at special ‘Deep Space’ planetarium sessions throughout the week and at CoderDojo sessions for school kids on the Saturday.

Tuesday morning started with excellent talks on storage from Rob Mollard (SGI) and Jamie Sunderland (AARNET), with Jamie outlining the storage options available through AARNET. In the evening, the Mega Data Cluster event was well attended and consisted of a “Rabble” that gave each presenter a maximum of 10 minutes to talk. Presenters included:

  • Winthrop Professor Andy Whiteley, School of Earth and Environment (UWA)
  • Professor Grant Morahan, Centre for Diabetes Research (Harry Perkins Institute)
  • Peter Winn (Velrada)
  • Professor Andrew Rohl, NanoChemistry Research Institute (Curtin University)
  • Professor Jenni Harrison, Head of Data Team (iVEC)

Highlights included Winthrop Professor Andy Whiteley’s talk on microbes, and how there are more microbes on earth than stars in the universe, as well as his citizen science project (http://www.microblitz.com.au/).  Professor Andrew Rohl also spoke about the importance of researchers making the best use of advances in supercomputing by ensuring that research questions and code are suitably ‘parallelizable’ to use the large number of cores available.

Braving Perth’s wettest 24 hours in 9 years, an enthusiastic crowd turned out on Wednesday night at the Edith Cowan University Mt Lawley campus to listen to Rob Mollard (SGI) break down data storage jargon and Maria Albertsen (ECU) talk about the analysis of genome-wide sequencing data using iVEC’s supercomputing resources.

On Thursday, Francis Mitrou (Telethon Kids Institute) gave an in-depth talk on ‘Statistical data integration using administrative data for health and social research’.  Judging by the number of questions from the audience it was definitely thought-provoking and I learnt about the great data linkage work done in Western Australia. Some useful links include:

Travis Kelleher and Mollie Hewitt from FORM (http://www.form.net.au/) treated their audience to a discussion about some of the questions that arose from their award-winning Canning Stock Route Project – http://mira.canningstockrouteproject.com/.  The process of developing Mira has involved retrospectively adding metadata to over 40,000 items of unique cultural content, and this process has raised a number of ongoing research questions.

To close the week, an iVEC panel, made up of:

  • Andrew McGee: Chief Technology Officer – Australia & New Zealand (Hitachi Data Systems)
  • David Wilde: Network Architect (AARNET)
  • Trent O’Callaghan: Sales Engineer (Nextgen Group)
  • Robert Mollard: Sr. Storage Specialist, Asia Pacific (SGI)
  • Dr George Beckett: Deputy Director and Head of the Supercomputing Team (iVEC)

and moderated by David Satterthwaite (iVEC), had a wide-ranging discussion.  Some of the discussion focussed on these predictions from The Guardian newspaper – http://www.theguardian.com/technology/datablog/2014/jan/14/big-data-4-predictions-for-2014

  1. In 2014, people will finally start to understand the term big data. Because, as it stands, many do not.
  2. Consumers will begin to (voluntarily) give up certain elements of privacy for personalisation.
  3. Big data-as-a-service will become a big deal
  4. And finally… remember how Hadoop is an open-source software? Expect a lot more of that.

Data privacy and data linkage were among the main themes of the discussion.  These topics came up across many of the events, and at the moment many people are not aware that they are giving up ‘big data’.  One example was how private companies, especially overseas, are now scanning car number plates in car parks to enforce parking time limits.  All agreed that open source and standards for ‘Big Data’ are the future, and that the companies that embrace them will be the ones to prosper.

Finally, the ‘Big Data – Big Community Sundowner’ event by Dell was a great way to finish BDW 14 with Peter Jung (Dell – Snr. Business Developer and Solutions Architect) flying in especially for this event. He provided a great presentation on ‘Big Data’ with a focus on Hadoop. Paul Newman (iVEC) also gave an overview of the iVEC Pawsey centre to the crowd.

The variety of topics and events for BDW 14 in Perth was the highlight for me.  Although ‘Big Data’ as a term is often misused, the underlying technologies and problems it represents will continue to grow in importance and spread throughout society.  For example, data was raised in the recent National Commission of Audit recommendations – http://www.ncoa.gov.au/report/phase-one/part-b/10-5-data.html

iVEC was the city partner for Perth and we thank all who presented, attended and made BDW 14 a success.