Posted by Martin Zapletal
Sun, Mar 8, 2015

Concepts such as event sourcing and CQRS allow an application to store all events that happen in the system using a persistence mechanism. The events cannot be mutated, and the state of the system at any point in history can be reconstructed by replaying all the events up to that point. For performance reasons, the state can be cached using a snapshot. The indisputable advantage of this approach is that the whole history of events (including user actions, behaviour or system messages, anything we decide to store) is available to us rather than just the current state. Event sourcing was thoroughly discussed before, for instance in [1] or [2], and CQRS in [3], [4] or [5].
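
To make the replay idea concrete, here is a minimal sketch in plain Python. It is not Akka Persistence code. The `Deposited` and `Withdrawn` events, the `apply_event` and `replay` functions and the in-memory `journal` list are all hypothetical names chosen for illustration; a real system would keep the journal in a store such as Cassandra.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple, Union


# Hypothetical immutable events for a bank account (frozen = cannot be mutated).
@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


Event = Union[Deposited, Withdrawn]


def apply_event(balance: int, event: Event) -> int:
    """Pure state transition: the current state plus one event gives the next state."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise TypeError(f"unknown event: {event!r}")


def replay(
    events: Sequence[Event],
    upto: Optional[int] = None,
    snapshot: Tuple[int, int] = (0, 0),
) -> int:
    """Rebuild the state after the first `upto` events (all events if None).

    `snapshot` is (sequence_nr, state): the cached state after the first
    `sequence_nr` events, so only the later events need to be replayed.
    """
    start, state = snapshot
    end = len(events) if upto is None else upto
    if end < start:
        raise ValueError("snapshot is newer than the requested point in history")
    for event in events[start:end]:
        state = apply_event(state, event)
    return state


# Append-only journal; here just an in-memory list (an assumption for the sketch).
journal = [Deposited(100), Withdrawn(30), Deposited(5), Withdrawn(50)]

current = replay(journal)                        # state after all events
after_two = replay(journal, upto=2)              # state at an earlier point in history
snap = (2, after_two)                            # cache that state as a snapshot
from_snapshot = replay(journal, snapshot=snap)   # replays only events 3 and 4
```

Replaying from the snapshot gives the same result as replaying the full journal, which is what makes snapshots a safe performance optimisation rather than a second source of truth.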

In this post we will discuss how we can store and further use this data by connecting Akka, Cassandra and Spark, focusing mostly on the configuration, Akka serialization and the Akka-analytics project. Later I will follow up with another blog post that builds on this one, with an example of using machine learning techniques to obtain insights that help optimize future decisions and application workflow.

Posted by Martin Zapletal
Sun, Nov 9, 2014

Apache Spark has been receiving a lot of deserved attention lately [1]. This is understandable given the huge importance of distributed data processing for many companies and the pursuit of faster, cheaper and easier to use technologies aiming to replace or complement the widely adopted Hadoop ecosystem and its MapReduce paradigm.
