London. The Nexus of Big Data and Data Science

Over the past few years I’ve been working more and more with the large volumes of data that come from M2M and the Internet of Things. It wasn’t that long ago that “Big Data” was a novelty that was largely a vision of the future – more talked about than done. In a few short years it’s morphed into the “next big thing” that everyone needs to have and which will save our planet and our health systems. Of course, Big Data itself is of limited use. What changes the game is the insight which can be extracted from it. That’s why the headline description of big data can be unhelpful. By concentrating on the “big”, it places the spotlight on the mechanics of database structures, diverting attention from the real skills that the industry needs to make it valuable.

I’d like to share some things I’ve learnt from my experience working in this area. The first is the continuing hype. When I put together a conference on the use of big data at the Cabinet Office last year I was hard pressed to find anyone really doing it commercially – the hype was still far greater than the practice. I don’t think that much has changed since then. We’re still on the lower, gentle slope of the Gartner hype cycle. My guess is that the only companies making significant money from big data at the moment are conference organisers and consultants. But attention is being paid.

The second is the type of skills we need to cultivate.  We talk about Data Scientists as the new breed of practitioner, but that’s largely a self-invented title from data analysts who want more recognition.  Extracting value from big data, or broad data if you want to be more accurate, is more than that.  The best definition I’ve heard is that it’s about telling stories with Matlab.  It’s not about Hadoop or Cassandra – they’re just the mechanics. The reality is that Big Data needs to be about Data Storytellers if it is going to be transformational.

The third thing is that this is something we do exceedingly well in London.  Other places may collect more data, build bigger server farms or invent more capable database structures.  But we tell better stories.  So if you want to generate value from big data, London’s the place to set up your business.

The reason why London is so good is interesting, and is the reason I use the word “nexus”. For better or worse, Britain has grown into a country with a very dominant capital city, possibly more so than any other country. It’s often pointed out that London seems to operate as a country in its own right. As a result, over the years it has sucked in enormous information- and data-based industries. It’s probably best known for financial services, but it has equal stature in mobile applications (UK companies pioneered much of the digital mobile world), TV and media, retail and Internet. Deloitte have recently calculated that London has the greatest number of professional workers of any city in the world: 25% more than New York and double the number in Los Angeles. It has more universities in the global top 40 than any other individual city. With Cambridge and Oxford less than an hour’s travel away, it has commutable access to more high-class academia than any other city, something which spawns a symbiotic ecosystem of commercial and Government research facilities.

That’s a very different situation to any other country.  If you look at the US you find that the different information sectors are geographically separate.  Whilst each may be equally expert in what they do, data scientists tend to get siloed.  They may move from company to company and back and forth to academia, but for the most part they stay within one discipline.

One of the first things you notice when recruiting in London is that, given the opportunity, data scientists favour flexibility. I’ve been recruiting for people to work on advanced analytics and machine learning in energy and smart homes. Some of our team have previous experience in these areas, but they also have experience in retail, finance, string theory, social media, gambling, genetics, meteorology and most data disciplines you could think of, as well as a few I’d not realised were into hard-core data. That diversity of opportunity is great at exposing them to multiple different analytical methods and leads to some very fruitful innovation.

It’s interesting when talking to candidates to see how they value the different scales of challenge that a variety of career opportunities can bring. Jeff Hammerbacher, one of Facebook’s early employees, summed up the constraint of working within that particular industry sector when he observed that “The best minds of my generation are thinking about how to make people click ads, and that sucks.” I hear that reflected in people’s reasons for moving between sectors. Some do it to escape the daily targets and get their teeth into meatier research, while others want to take on the demands of the short, sharp wins in analytics.

London entered the information age earlier than most other places. There’s a good argument for claiming it’s the birthplace of data science, given that Charles Babbage was instrumental in founding the Royal Statistical Society. Where else can claim a statistical institution with Royal patronage, brought into existence by a “father of computing”? Data Science in London has been feeding on the cross-pollination of data-centric industry ever since. We’ve some strong trump cards to enhance that. The NHS is the world’s largest health service, with the largest set of medical data in the world, feeding a vibrant medical research and pharmaceutical sector. The BBC has pioneered most new broadcast technologies, is a global leader in data-driven news and a major commissioner of new creative products. The Government has backed the general availability of data through its Open Data Initiative and the creation of the Open Data Institute, with its mantra of Knowledge for Everyone. It’s also been supporting interoperability of data access in the Internet of Things. The Technology Strategy Board’s set of eight demonstrator projects has just made its first public announcement of progress with the publication of an interoperability specification. In comparison, the US Government is only just thinking about IoT interoperability. And in recent years, London has been investing in new digital centres for innovation, the most significant of which is TechCity at Shoreditch. When one of the TSB demonstrator projects recently organised a hackathon to test its prototypes, almost 300 Data Scientists turned up. That’s a major confirmation of the vibrant data science community we have.

It confirms the view that data analytics in London is booming. Not just as a poster child for data analysts reinventing their careers, but as a discipline that more and more CEOs and entrepreneurs realise is the key ingredient for transforming business. I stress its importance in business transformation: Big Data is not just a reporting tool for business evolution; it has the power to change whole industry sectors. What we need, as McKinsey pointed out in their seminal Big Data report, are more data scientists to do it, and more managers to understand it.

Which is why London feels so exciting for any data-related business. Data scientists are flocking here because of the range of opportunities. The perceived early successes of data, not just in new start-ups but in transforming existing businesses, are being noticed and emulated, leading to what has all the trappings of a virtuous circle. We’re not just seeing analysts changing their job title, but the start of a maturity of job function as companies and individuals realise that “big data” is actually “broad data”. Alongside the storytelling magicians I’m seeing evolving specialist roles in data curation and data engineering, both key roles in building the infrastructure that this new discipline needs. I may be shot for saying that constructing “big databases” is the easy bit, but we do have enough options and experience to do the job. Talking to London-based companies, I see a growing realisation that as the discipline matures it needs to pay more attention to where data comes from, its integrity and provenance (not that data lacking either is without value), as well as its accessibility. Once again, I hear these requirements being talked about more in the London data circles than elsewhere.

And there’s a lot else to attract data scientists here. We have more theatres in London than any other city in the world. According to our flamboyant Mayor we have more Michelin-starred restaurants than anywhere else. (In this case he’s wrong – we’re number three behind Paris and Tokyo, but neither is a great place for data. And we’re catching them up fast in the Michelin Guide.) It’s a great city to live in and for data scientists to spend their money in. So if data’s important to your future, either personally or for your company, there’s really only one place to be.