Case Study: Onzo
Onzo operates an energy analytics platform that helps utilities make the most of their customer data.
I was hired as the lead architect, responsible for the complete platform. The existing platform was unreliable and did not scale well, so we decided to build a new one that was stable, reliable and scalable, and that supported both streaming and batch analytics. I was also partly involved in a funding round, in which we successfully raised capital from a business partner and two private equity firms.
First, we needed to migrate customers from the legacy platform to the new one. This proved challenging, as our customers were able to make few, if any, changes on their side. I didn't want to carry the legacy of previous design decisions with us, so we needed to strike a balance between starting with a clean sheet and not breaking existing deployments. Another challenge was the rapidly changing business environment: the functional requirements were changing on an almost daily basis, requiring a very fast turnaround.
In the end, I built two "platforms" to support the batch and streaming analytics, having discovered that there was very little overlap between the two in terms of functional or non-functional requirements. The streaming platform ran on the Akka stack: core Akka, Akka Streams, Akka Persistence and Akka Cluster. The batch platform was based on Apache Spark. Both systems used Apache Cassandra as their data store. The machine learning elements of the system were built primarily in Python and TensorFlow.
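To make the split concrete, here is a toy sketch of the essential difference between the two paths. This is an illustration only, not Onzo's actual code: the names, numbers and aggregation are hypothetical, and the real systems used Akka Streams and Spark rather than plain Python.

```python
# Toy contrast between the streaming and batch paths.
# All names and data here are hypothetical.

from dataclasses import dataclass


@dataclass
class StreamingAggregator:
    """Streaming path: update state incrementally as each meter reading
    arrives, the way a stage in a streaming pipeline folds over events."""
    total_kwh: float = 0.0
    count: int = 0

    def on_reading(self, kwh: float) -> float:
        self.total_kwh += kwh
        self.count += 1
        # The running average is available immediately, per event.
        return self.total_kwh / self.count


def batch_average(readings: list[float]) -> float:
    """Batch path: recompute over the full history, the way a batch job
    would scan everything persisted in the data store."""
    return sum(readings) / len(readings)


readings = [1.2, 0.8, 1.5, 0.5]

agg = StreamingAggregator()
running = [agg.on_reading(r) for r in readings]

# Both paths agree on the final value; they differ in *when* it is
# available and in the shape of the computation, which is why the two
# platforms shared so few requirements.
assert abs(running[-1] - batch_average(readings)) < 1e-9
```

The point of the sketch is the non-overlap: the streaming side is state plus a per-event update, while the batch side is a full scan, and little of one design carries over to the other.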
What did I learn?
When the direction of the business is unclear, it’s best to get something up and running quickly and to accept that software developed this way will have to be rewritten later. This is usually a hard sell to CFOs and investors, but there is tremendous value in writing “throw-away” code: the business can validate which ideas are actually viable, and the tech teams can discover the challenges and pitfalls of a particular domain at relatively low cost.
This isn’t the first time I’ve discovered this principle. Over the years I’ve worked on many software projects, and I have to admit that we rarely get it right the first time. My time at Onzo really reinforced this belief.
The data scientists need to "own" their models. When I joined Onzo, the data scientists created models that were essentially proofs of concept, often developed in Python notebooks. These POCs were then handed to Scala developers, who were responsible for turning them into Scala code. This process didn't work well. We moved to a position in which the data scientists owned their models all the way to production, which was far more sustainable.
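One way to make that ownership work is to give models a small, stable contract so the same Python object a data scientist prototypes in a notebook can be deployed unchanged, with no Scala rewrite in between. The sketch below is hypothetical (the `Model` protocol, `DisaggregationModel` and `serve` are my own illustrative names, not Onzo's API), but it captures the shape of the idea.

```python
# Hypothetical sketch: a minimal contract that lets data scientists own
# a model from notebook through to production. Not Onzo's actual code.

from typing import Protocol


class Model(Protocol):
    """The contract the production harness expects: anything with a
    predict(features) -> float method can be deployed as-is."""
    def predict(self, features: dict[str, float]) -> float: ...


class DisaggregationModel:
    """Toy model a data scientist might prototype in a notebook:
    estimate heating load as a fixed share of total consumption."""
    def __init__(self, heating_share: float = 0.4):
        self.heating_share = heating_share

    def predict(self, features: dict[str, float]) -> float:
        return features["total_kwh"] * self.heating_share


def serve(model: Model, features: dict[str, float]) -> float:
    """Stand-in for the production harness. Because it depends only on
    the protocol, no hand-off or translation step is needed."""
    return model.predict(features)


print(serve(DisaggregationModel(), {"total_kwh": 10.0}))  # prints 4.0
```

The design choice is that the hand-off disappears: the deployment side depends on an interface, not on a reimplementation, so the people who understand the model keep responsibility for its behaviour in production.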