Everyone on the Civil Service Digital and Technology Fast Stream gets the opportunity to spend six months in the private, charitable or wider public sectors. This helps broaden our experience, so that we can bring valuable skills back to our day-to-day jobs.
After an enjoyable six months working as a data scientist at BAE Systems Applied Intelligence, I am preparing to return to government. This was an opportunity I particularly wanted for two main reasons: the company’s big data experience aligns with a lot of emerging cross-government work in this sphere, and BAE AI have collaborated on government projects in the past so I was keen to see what things looked like ‘from the other side’. I was not disappointed.
The phrase "big data" is thrown around a lot these days - often cited as the answer to any number of problems. Between 2012 and 2014 it was floating around the Peak of Inflated Expectations on Gartner's Hype Cycle, but has the concept now come of age?
At BAE AI I worked on a project using disparate open datasets to provide "smart" insights into ecosystems. The data covered all four big data ‘Vs’:
- Volume - millions of records a day.
- Variety - unstructured and structured, clean JSON and messy scrapes.
- Velocity - peaks were tens of thousands of messages a minute at any time of the day or night.
- Veracity - it was open data...
I hit the jackpot with this project as it was still in its infancy, so I was one of two guinea pigs on new open-source cloud infrastructure and had a remit to explore, discover and challenge it.
One tool I specifically want to mention is Apache Spark, which I found really useful. Once I had got my head around Resilient Distributed Datasets (RDDs), I could use SparkR, Spark SQL and PySpark to access Spark's speed and power immediately, in programming languages I already knew and felt comfortable writing. And the speed is jaw-dropping, as the bar chart on Spark’s homepage suggests.
This speed, combined with simple access to streaming and powerful analytics (including the built-in MLlib machine learning library), allowed for quick experiments and results.
So why does this matter for a civil servant? Well, it means that government could have cheap and simple ways to trial ideas by quickly experimenting with data. This could provide near real-time information to policy and operational decision makers, which could in turn save money and result in better choices - that's good for everyone! Bad ideas could go in the bin more quickly, and good ones could reach citizens in a more accessible form than a lengthy report. With jobs like ‘Director of Data Science’ appearing in departments, it seems this might become reality soon too.
The other aspect of my secondment was gaining an understanding of being on the "other" side. BAE Systems means different things to different people, from the historic British Aerospace to new submarines and cyber security.
Government spends millions with private companies, and BAE Systems has significant defence contracts with the MoD. However, it also delivers digital and data-related projects across Whitehall, from the Home Office to HMRC, as well as for clients such as TfL, Vodafone and even Sony PlayStation.
I was not engaged with any government department in my work, but I was still watching, listening and questioning the private sector machine around me to better understand the relationship from the private perspective. Talking to the bid teams, for example, was obviously going to be commercially insightful, but there were far wider cultural insights too. Challenges that we face in government, such as recruitment or procurement, were also evident, so hearing BAE AI’s approaches to tackling them was interesting.
But perhaps the two most valuable insights for me were these: the quicker you give staff appropriate training, the quicker they can add value to their work and yours; and spending time and energy on ensuring the best match between staff and projects quickly reaps rewards. I look forward to getting back to government and putting into practice what I've learnt!