Purity and dirty secrets

I left a dirty secret lurking in my previous post about two-phase authentication with text messages. I hope many of you felt uneasy reading the val secret = generateSecret line. The generateSecret thing has no place in purely functional code. It does some weird side-effect wizardry and computes values that are always different. Read on to find out how to keep tabs on the dirty secrets (and other IO) in your code.

The problem again

Recall the bit of code that generated the authentication token and the secret to be delivered to the user.

val token = UUID.randomUUID()
val secret = generateSecret

// save the token
create(AuthenticationToken(UUID.randomUUID(), token, 
	new Date(), true, 2, Some(secret)))

// deliver the secret to the user
val deliveryAddress = DeliveryAddress(Some("447*******"), None)
second ! DeliverSecret(deliveryAddress, secret)

// we're logged in partially
sender ! Right(LoggedInPartially(token))

The lines that offend my sense of proper code are val token = UUID.randomUUID() and val secret = generateSecret. Let’s pick one of them and dissect it. It says that secret is equal to generateSecret; therefore, it should be possible to replace secret with generateSecret in the code below.

Oh wait. We can’t do that. The function generateSecret is not really a function at all. It does something and spouts out secrets. Just like the Daily Mail-o-matic. If we replace both occurrences of secret with generateSecret in the code above, it will fail: the secret saved with the token and the secret delivered to the user will be two different values. We must contain the side-effects.
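Here is a small sketch of that failure, boiled down so it can be run on its own. The object name and the two helper methods are made up for illustration, and the body of generateSecret is an assumption borrowed from the UUID trick used later in this post; the original generateSecret is not shown here.

```scala
import java.util.UUID

object SubstitutionDemo {
  // Assumed stand-in for generateSecret: every call yields a fresh value.
  def generateSecret: String = UUID.randomUUID().toString.substring(0, 5)

  // The original shape: the secret is generated once and the name is used twice.
  // Returns (what we would store, what we would deliver).
  def namedOnce(): (String, String) = {
    val secret = generateSecret
    (secret, secret)
  }

  // The "substituted" shape: each use of the name is replaced by its right-hand side.
  // generateSecret now runs twice, so the two values are generated independently.
  def substituted(): (String, String) =
    (generateSecret, generateSecret)
}
```

With namedOnce the stored and the delivered secret are always the same value. With substituted they are two independent random strings, so they will almost never match, and the user could never complete the second login step.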

The containment

We still need to perform these weird side-effects, but we need to contain them. We are going to use the type system with the assistance of the compiler to make sure we don’t do anything funky. Instead of having a function that returns secrets as Strings, we’ll have a function that returns boxes that carry the secret Strings. We will also notice that we can compose these boxes together to form bigger boxes.

Right-ho.
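To make the box idea concrete before reaching for a library, here is a minimal sketch of such a box. The Box class below is hypothetical and exists only for illustration; the real code in this post uses scalaz’s IO, which is considerably richer. The counter is just a visible side effect that lets us see when something actually runs.

```scala
import java.util.UUID

object BoxSketch {
  // A box holds a deferred computation. Nothing runs until unsafeRun is called.
  final class Box[A](val unsafeRun: () => A) {
    // Transform the value that the box will eventually produce.
    def map[B](f: A => B): Box[B] =
      new Box(() => f(unsafeRun()))

    // Wire two boxes together into a bigger box.
    def flatMap[B](f: A => Box[B]): Box[B] =
      new Box(() => f(unsafeRun()).unsafeRun())
  }

  object Box {
    // The argument is passed by name, so building a box does not evaluate it.
    def apply[A](a: => A): Box[A] = new Box(() => a)
  }

  // A visible side effect: counts how many times the boxed code has run.
  var runs = 0

  // A box that carries a secret String; creating it generates nothing yet.
  val secretBox: Box[String] = Box {
    runs += 1
    UUID.randomUUID().toString.substring(0, 5)
  }

  // A bigger box assembled from the smaller one.
  val labelled: Box[String] = secretBox.flatMap(s => Box("secret:" + s))
}
```

Defining secretBox and labelled leaves runs at 0. Only a call such as BoxSketch.labelled.unsafeRun() generates a secret, and each such call generates a new one.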

Thinking about this further, it turns out that it’s not just the secret that’s randomly generated, it is also the token number. In fact, we can expand the randomness to the entire AuthenticationToken. And so, we’ll have the AuthenticationTokenGenerator.


private[authentication] trait AuthenticationTokenGenerator {

  def generateAuthenticationToken(userRef: UUID): 
    IO[AuthenticationToken] =

    IO(AuthenticationToken(
    	userRef, UUID.randomUUID(), new Date(), false, 0, None))

  def generateSecret(token: AuthenticationToken): 
    IO[AuthenticationToken] =

    IO(token.copy(
        retries = 2,
        secret = Some(UUID.randomUUID().toString.substring(0, 5)),
        partial = true))

}

Notice the return types of generateAuthenticationToken and generateSecret. They are no longer the dirty, random AuthenticationToken values themselves, but boxes that carry the random values. This is rather important. Whenever we call generateAuthenticationToken, we get a box that carries the generated AuthenticationToken, so we cannot use it where we need just the AuthenticationToken. This leads us quite nicely to being able to wire these boxes together. I will show you the LoginActor again in its entirety:

/**
 * Login actor that supervises the actors in the login process.
 */
class LoginActor(secretDelivery: ActorRef) extends 
  Actor with AuthenticationTokenGenerator with TokenOperations {

  import scalaz.syntax.monad._

  /**
   * Saves the token in some persistent store
   * 
   * @param at the token to be saved
   * @return the IO of the saved token
   */
  def createToken(at: AuthenticationToken): IO[AuthenticationToken] =
    IO(create(at))

  /**
   * Sends the secret to the user
   * 
   * @param at the authentication token
   * @return the IO of the token
   */
  def deliverSecret(at: AuthenticationToken): IO[AuthenticationToken] = IO {
    val deliveryAddress = DeliveryAddress(None, Some("some@email.com"))

    // sending the message is itself a side effect, so it belongs inside the IO
    secretDelivery ! DeliverSecret(deliveryAddress, at.secret.get)

    at
  }

  def receive = {
    case FirstLogin(username, password, clientSignature) =>
      // check that username & password are OK
      if (username == "root" && password == "p4ssw0rd") {
        val userRef = UUID.fromString("a3372060-2b3b-11e2-81c1-0800200c9a66")
        // the account is there, and needs 2nd phase auth

        val action = generateAuthenticationToken(userRef) >>=
                     generateSecret >>=
                     createToken >>=
                     deliverSecret >>=
                     { at => IO(sender ! Right(LoggedInPartially(at.token))) }
        
        action.unsafePerformIO()
      } else {
        // not hardcoded username or password, so... 
        sender ! Left(BadUsernameOrPassword())
      }
    case SecondLogin(token, secret) =>
      find(token) match {
        case None =>
          sender ! Left(BadPartialToken())
        case Some(at) if !at.isValid(secret) && at.retries == 0 =>
          // no more retries
          delete(at.token)
          sender ! Left(TooManyBadSecrets())
        case Some(at) if !at.isValid(secret) && at.retries > 0 =>
          // bad secret, but retries still allowed
          update(at.copy(retries = at.retries - 1))
          sender ! Left(BadPartialToken())
        case Some(at) if at.isValid(secret) =>
          // delete the old one
          delete(at.token)
          
          val action = generateAuthenticationToken(at.userRef) >>=
                       createToken >>=
                       { at => IO(sender ! Right(LoggedInFully(at.token))) }
          
          action.unsafePerformIO()
      }
  }

}

So, there you have it. We have carefully packaged up the side-effects into boxes and assembled the boxes together to get a bigger box. Together with the type checking we get from the compiler, we can ensure that we do not let dirty secrets escape into the rest of our codebase. I’m also happy that the = symbol once again means just what it says: I can replace the name on the left with whatever is on the right-hand side. And the world is a slightly nicer place.
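The point about the = symbol can be checked with a toy version of the boxes. This is a sketch under assumptions: Box is a hypothetical stand-in for scalaz’s IO, and a counter replaces the random generator so that the outcome is deterministic and easy to compare. The for comprehensions are just another way of writing the flatMap and map calls.

```scala
object SubstitutionWithBoxes {
  // Hypothetical minimal box, standing in for scalaz's IO.
  final class Box[A](val unsafeRun: () => A) {
    def map[B](f: A => B): Box[B] =
      new Box(() => f(unsafeRun()))
    def flatMap[B](f: A => Box[B]): Box[B] =
      new Box(() => f(unsafeRun()).unsafeRun())
  }

  object Box {
    def apply[A](a: => A): Box[A] = new Box(() => a)
  }

  // Deterministic stand-in for a side effect: each run bumps the counter.
  var counter = 0
  def resetCounter(): Unit = { counter = 0 }

  // The effect is given a name once...
  val next: Box[Int] = Box { counter += 1; counter }

  // ...and the name is used twice.
  val viaName: Box[(Int, Int)] =
    for { a <- next; b <- next } yield (a, b)

  // The same program with the name replaced by its right-hand side.
  val viaDefinition: Box[(Int, Int)] =
    for {
      a <- Box { counter += 1; counter }
      b <- Box { counter += 1; counter }
    } yield (a, b)
}
```

Starting from a fresh counter, running viaName and running viaDefinition give the same pair, because next is only a description of the effect and not its result. Replacing the name with its definition changes nothing, which is exactly what failed for the plain generateSecret.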


7 Responses to Purity and dirty secrets

  1. Julien Letrouit says:

    So is all IO doing just chaining closures together instead of executing the code directly? In other words, yes, your code is pure, but as soon as you _really_ want the value, it is unsafe? In particular, unit testing will need to execute the unsafe code to test anything useful. If you rely on the substitution model to prove predicates, you can only do it on the unexecuted code, and therefore this is of limited value. I am having trouble seeing any benefit here. Would you mind explaining a bit further?

    By the way, great blog, continue the good work!

  2. Jan Machacek says:

    Thanks for the comments & encouragement–I’ll get back to you tomorrow morning!

  3. Toby says:

    Julien,

    This approach is well established in Haskell for I/O and other side effecting operations. You might enjoy reading the paper “How to declare an imperative” or “Monads for functional programming” by Philip Wadler (http://homepages.inf.ed.ac.uk/wadler/topics/monads.html) which explain the monadic style. The topic is also covered gently in Simon Thompson’s book, “Haskell—The Craft of Functional Programming” (chapter 18).

    For more Scala-oriented explanations, see http://apocalisp.wordpress.com/2011/12/19/towards-an-effect-system-in-scala-part-2-io-monad/ , http://blog.stackmob.com/2011/12/scalaz-post-part-2/

  4. Roger Turnau says:

    Monadic IO is well-established for Haskell, but it’s a bit of an uphill struggle to motivate people to embrace it in languages that are not both pure and lazy. If you’re working in Haskell, the programmer next to you will never be able to add something like putStrLn "Uh, problem here!" into the middle of your authorization code, because unless you’re already in an IO monad, the attempt will fail to compile. Not so with Scala. You can add a println literally anywhere. And many Scala programmers who started out with Java will have to actively stifle the desire to litter the code with logging statements, because that’s what they’re used to.

    So how do you convince people that monadic purity is important? Simple functional purity is an easier sell: if you’re creating a binary operator for a new class, you soon learn to value immutability and referential transparency as a simple defensive technique. But monadic purity? I’m not sure how best to explain how a function can be impure in its effects but still pure in its definition. I once tried to explain the motivation behind the Arrow typeclass to a smart, well-educated co-worker whose sole exposure to them was through the Scalaz library. After I was done, he gave me a look of unadulterated perplexity and skepticism, saying, “But why would you bother with all that?” Why indeed?

    So, backing up a step, and putting the argument into a shop-worn meme, here’s the argument for using monadic IO in Scala (and, more generally, why Scalaz is necessary, important, and not at all an attempt to shoehorn concepts native to Haskell into contexts where they don’t entirely apply):

    1. Lock all side-effecting code into its own monadic layer (and pray your co-workers will know enough to follow suit),
    2. ???
    3. Profit!

    Number two still needs, in Scala at least, better justification than I have seen for it so far. The whole point of the IO monad in Haskell is that nothing escapes from it (unless you resort to unsafePerformIO, a hack that advertises its dangers). Since Scala lacks this sort of guarantee, it seems to me that the abstraction is necessarily leaky.

    P.S.: Great post.

  5. Jan Machacek says:

    I do miss Haskell’s do notation. You can sort of replace it with Scala’s for comprehensions, but it’s still not bl**dy cricket :( By that I mean the unsafePerformIO(); at least having to write that sort of code is ugly and serves as a kick to remind me that something unexpected is about to happen.

  6. I understand how purity works. I still don’t get the benefits. In its pure form, it is not testable. You need to perform the unsafe operation if you want to test the result of the random secret generator for example. In the end, testing the operations that return an IO is like testing nothing, since nothing has been done when the function returns. What is the benefit of packaging all the side effects in a single call (which is not a single call, since all those closures will be called at that moment anyway)?

  7. Pingback: This week in #Scala (16/11/2012) | Cake Solutions Team Blog
