In this series of posts I will discuss the evolution of machine learning algorithms with regard to scaling and performance. We will start with a naive implementation and progress to more advanced solutions, finally reaching state-of-the-art implementations similar to what companies like Google, Netflix and others use for their data pipelines, recommendation systems or machine learning. A variety of topics will be discussed: the basics of ML, different programming models, the impact of a distributed environment, the specifics of machine learning algorithms as compared to common business applications, and much more. For those not particularly interested in machine learning, the concepts discussed have been chosen carefully to apply to a wide range of applications; ML itself simply serves as a good example.
In my previous blog post we looked into neural networks and their training, and investigated a trivial single-threaded, object-oriented implementation. The result was a working example that was, however, not useful in many real-world scenarios because of its poor performance. With large amounts of data such an approach is extremely wasteful, and we can achieve vastly better performance through parallelization.
The release of Jenkins 2 put a huge emphasis on pipelines, a feature that Jenkins always had but that wasn't a key highlight. This all changes with pipelines built in by default, which let you write your pipelines in Groovy as code, check them into SCM and visualise them from a single pipeline job type. They are a huge improvement over the "workflow" view that used to exist, but the documentation doesn't come with a rich set of examples.
In this post I'm going to show you the key building blocks using Groovy, complete with code snippets, and hopefully guide you on your way to true DevOps-style pipelines - your pipeline as code.
Troy is an open source, macro-based Cassandra driver that provides type-safe, compile-time checking of database queries without imposing a DSL for expressing the queries in Scala. Instead, it allows developers to write plain Cassandra Query Language (CQL) queries within Scala code, complete with schema validation.