We started building mabl, our ML-driven automated end-to-end testing service, in March 2017. Along the way, we've made many decisions about the technology used to build the service. Late in 2017, we began showing the service to alpha customers, and since then, many have been curious about the technology that sits behind the UI. So I thought I'd share the details.
Beginning with the infrastructure layer, we had to decide where to run mabl. Our early engineering team researched this decision, comparing AWS and GCP with a focus on the core components we knew the application would use on either provider.
We knew we wanted to be serverless as much as possible, so we'd be heavy users of containers, including a container orchestration service as well as a compute-on-demand service. In addition, since we were designing an application that would use several different analytics layers, we knew we needed a data pipeline, a database/query service, and machine learning capabilities. Ultimately, we decided that because Google offered a managed version of open-source Kubernetes, Google Kubernetes Engine, it gave us less lock-in to the cloud provider if we ever wanted to switch - unlike AWS, whose container orchestration service at the time was proprietary.
On the analytics front, we did bake-offs of Dataflow vs. Kinesis and BigQuery vs. DynamoDB, and pitted the ML services against one another. In all of these bake-offs, GCP's analytics services were a much better fit for our use case than AWS's, for reasons we've documented. We've now been on GCP for over a year, and we couldn't be happier. We use many of their services, including Cloud Functions, Cloud Storage, Key Management Service, Stackdriver, Firebase, App Engine, and of course the core services that drove our decision.
Moving from back end to front end, we evaluated several front-end frameworks to support mabl, which we knew would be a single-page app. The candidates included Angular, Angular 2, React, Ember, Backbone, and the newest trendy sensation, Vue.
For this research project, the engineer with the most front-end experience led the analysis, looking at variables including ecosystem support, the ability to get up and running quickly, how quickly new features were moving from beta to GA, and whether there was a major contributor behind the project - for example, Facebook with React or Google with Angular. The shortlist came down to Angular 2 vs. React, and we ultimately decided on React for a few reasons.
The first was the community and its satisfaction. During the gap between Angular 1 and Angular 2, a large community of engineers emerged - using, developing, and supporting React - and their feedback was quite positive. Many developers we spoke to praised how quickly they were able to add new features in React, albeit with a steeper learning curve than other frameworks they'd worked with.
We also appreciated the performance we'd get from React, due in large part to how it renders DOM elements in the browser. When paired with Redux, we can manage application state centrally and better optimize performance by controlling re-rendering based on state changes.
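To make the Redux idea concrete, here is a minimal sketch of the pattern - a single store, a pure reducer, and change-driven notifications. The counter state and actions are hypothetical examples for illustration, not mabl's actual store or reducers:

```typescript
// A minimal sketch of the Redux pattern: one store holds all application
// state, a pure reducer computes the next state for each action, and
// subscribers (e.g. React components) are notified only when state changes.

type State = { count: number };
type Action = { type: "increment" } | { type: "reset" };

// Pure reducer: (state, action) -> new state; never mutates in place.
function reducer(state: State, action: Action): State {
  switch (action.type) {
    case "increment":
      return { count: state.count + 1 };
    case "reset":
      // Return the same reference when nothing changed.
      return state.count === 0 ? state : { count: 0 };
  }
}

function createStore(initial: State) {
  let state = initial;
  const listeners: Array<() => void> = [];
  return {
    getState: () => state,
    subscribe: (listener: () => void) => {
      listeners.push(listener);
    },
    dispatch: (action: Action) => {
      const next = reducer(state, action);
      if (next !== state) {
        // Notify subscribers only on real state changes,
        // which is what avoids unnecessary re-renders.
        state = next;
        listeners.forEach((l) => l());
      }
    },
  };
}

// Usage: a "reset" on an already-zero counter triggers no notification.
const store = createStore({ count: 0 });
let renders = 0;
store.subscribe(() => { renders += 1; });
store.dispatch({ type: "increment" }); // renders -> 1
store.dispatch({ type: "reset" });     // renders -> 2
store.dispatch({ type: "reset" });     // state unchanged, renders stays 2
```

In a real React app, a binding library such as react-redux wires the subscription to component rendering, so components update only when the state they read actually changes.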
Overall, what we learned while researching front-end frameworks has indeed held true - the learning curve is steep, but we're able to add new features very quickly, much faster than we did at my prior company, where we were using Angular.
Next came the decision of which language to write the application in. Our engineering team had a broad range of skill sets, but we wanted to make an objective decision, not one that simply pushed us toward the language most of our engineers already knew.
Choosing between Python and Java for the backend turned out to be our second-easiest decision. We wanted a language that would let us move fast. We felt Java would be easier to debug - being statically typed and compiled makes analysis by humans simpler. Given that we didn't have many advocates for Python and we thought Java would allow us to move faster, we landed on Java.
Lastly, we had to decide which languages to use for the machine learning and analytics components of the application. We looked at both the different types of machine learning models we were training (performance anomaly detection and visual change detection) and the data pipeline.
As we dug deeper, we found that the best support and libraries for our machine learning models (including TensorFlow) were in Python, so we chose Python for the training side. On the analysis front, we wanted to use GCP Dataflow (Apache Beam), whose Python SDK supported only batch processing, not streaming, so we had to use Java for our streaming pipelines.
Although we spent a lot of time on each of these decisions in the early days of mabl, that's not to say we won't revisit them over time. Our requirements may change, we may have been wrong about one of our decisions, or any number of other reasons may prompt us to revisit and switch. If you feel there's an area of our application that could benefit from different technology, we'd love to hear your perspective in the comments below.
Try mabl for free here.