Understand Functional Programming in 20 Minutes

g/christensen · June 29, 2022

Functional programming has a reputation for being notoriously hard to understand. In this post, we examine its basic principles, abstractions, and building blocks. Hopefully, after reading this, you will be able to decide for yourself whether you really want to embark on a trip into this wonderland.

Down the Rabbit Hole: The Foundational Principles

The principles discussed below have a profound impact on how functional programs are designed. Traditional object-oriented design focuses on entities, their behaviors, and the relations between them. In the functional world, by contrast, data is completely separated from behavior. Programs written with the functional approach are designed as workflow pipelines of data transformations. Function composition is probably the central principle here, the one on which most functional abstractions are based.

Immutability and Referential Transparency

Despite a widespread belief, it is not the presence of lambdas, or even lambda calculus, that makes programming functional. The truth is embarrassingly simple. In a purely functional program, all data structures are immutable, and functions are referentially transparent (free of side effects). A referentially transparent expression can always be replaced with its value without changing the result of the program. This means that, like mathematical functions, such functions always return the same value when called with the same arguments. But there is a little quirk: to participate in functional composition, a function should take only one argument. We will see below how currying deals with this.
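
As a minimal sketch (the names here are invented for the illustration), a pure function can always be swapped for its value, while an input-reading action cannot:

circleArea :: Double -> Double
circleArea r = pi * r * r   -- referentially transparent: circleArea 2 may always be
                            -- replaced with its value without changing the program
-- by contrast, an action such as readLn :: IO Int may produce a different value on
-- every call, so it is confined to the IO context instead of living in pure code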

First-class functions

Having first-class functions means that functions can be returned as values or passed as arguments. This is where lambdas, or anonymous functions, come into play. The existence of lambda functions that can be defined anywhere implies the existence of lexical closures. In functional languages with side effects, lexical closures may serve as a poor man’s encapsulation. But more often, first-class functions are employed as the mechanism behind another kind of abstraction: higher-order functions.
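
A tiny sketch (with invented names): adder returns a function as its value, and the returned lambda is a lexical closure over n.

adder :: Int -> (Int -> Int)
adder n = \x -> x + n   -- the lambda captures n from the enclosing scope

addFive :: Int -> Int
addFive = adder 5       -- addFive 10 evaluates to 15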

Recursion and Corecursion

Recursion is a way of solving a problem by representing it as a combination of a base case and a smaller version of the same problem. Usually, recursion is implemented through self-referencing function calls or through mutual recursion. While recursion breaks the problem down until the base case is reached, corecursion starts with the base case and builds the result up from it. Depth-first and breadth-first tree traversal algorithms are examples of recursion and corecursion, respectively.
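
A small sketch of the difference (the function names are made up for the example): sumList recursively consumes a list down to the base case, while countFrom corecursively produces a potentially infinite list starting from its base case.

-- recursion: shrink the problem until the base case (the empty list) is reached
sumList :: [Int] -> Int
sumList []     = 0
sumList (x:xs) = x + sumList xs

-- corecursion: start from the base case and keep producing the next element
countFrom :: Int -> [Int]
countFrom n = n : countFrom (n + 1)   -- take 3 (countFrom 1) evaluates to [1, 2, 3]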

Higher-order functions

Higher-order functions are functions that accept other functions and use them as part of their algorithm. They may be considered an abstraction device similar to the template method in OOP. map is the usual example.
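
For instance, here is a hand-rolled sketch of map (called map' to avoid clashing with the standard one); the function passed in decides what happens to each element:

map' :: (a -> b) -> [a] -> [b]
map' _ []     = []
map' f (x:xs) = f x : map' f xs
-- map' (+ 1) [1, 2, 3] evaluates to [2, 3, 4]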

In the context of functional programming, combinators are higher-order functions that do not depend on the outer context (they have no free variables) and use only function application to produce results. Y is a curious combinator that makes self-recursion possible in languages that do not support referring to a function by name from its own body. In general, combinators are used to organize abstractions hierarchically, reducing code complexity.
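
In Haskell the same trick is usually played with the fix combinator from Data.Function; the sketch below defines factorial without the function ever referring to itself by name:

import Data.Function (fix)   -- fix f = let x = f x in x

factorial :: Integer -> Integer
factorial = fix (\rec n -> if n <= 1 then 1 else n * rec (n - 1))
-- factorial 5 evaluates to 120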

Folding

In purely functional languages, mutation is prohibited, and we cannot create a loop that modifies a variable across its iterations. To iterate, we can only rely on recursion. A higher-order function called fold (also known as reduce) generalizes this recursive approach to iteration. It takes a recursive data structure, such as a list, along with a combining function, and returns a value that combines all the elements of the data structure. The recursion may be implemented in two ways: by combining the first or the last element with the result of combining the rest. There are two corresponding kinds of fold: the right fold (foldr, for example 1 + [2 + [3 + [4 + 5]]]) and the left fold (foldl, for example [[[1 + 2] + 3] + 4] + 5). The direction of folding may be important if the supplied combining function is not associative. Tail-call optimization is required to avoid stack overflow on large inputs.
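
A short sketch of both directions (sum' is an invented name):

sum' :: [Int] -> Int
sum' = foldr (+) 0
-- foldr (+) 0 [1..5] combines as 1 + (2 + (3 + (4 + (5 + 0)))), giving 15
-- foldl (+) 0 [1..5] combines as ((((0 + 1) + 2) + 3) + 4) + 5, also 15
-- with an operator that is not associative the direction matters:
-- foldr (-) 0 [1, 2, 3] evaluates to 2, while foldl (-) 0 [1, 2, 3] evaluates to -6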

Functional composition

Because functions should not produce side effects, there is little sense in sequential function evaluation. Indeed, functional languages generally allow only one statement per function, although, if pattern matching is involved, this statement may contain more than one path of execution. One way to achieve sequential evaluation in a single statement is to compose functions: while imperative programmers write f1(); f2();, functional programmers write f2(f1());. Thus function composition becomes the crucial primitive used to build abstractions in the functional world.

Functional programmers have invented many ways to compose various things with each other. For example, Haskell has the following compositional operators: . $ <$ <$> $> <> <* <*> *> >> >>= =<< >=> <=< <|>. Using them, you can conveniently make an abstraction by saying f3 = f2 . f1, which is the point-free equivalent of f3 x = f2 (f1 x). And if this is not enough, you can always create your own operators. What a splendid language!
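
A small sketch of such an abstraction (trim and shout are invented for the example):

import Data.Char (toUpper)

trim :: String -> String
trim = unwords . words        -- squeeze the extra whitespace out of a string

shout :: String -> String
shout = map toUpper . trim    -- shout "  hello   world " evaluates to "HELLO WORLD"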

Currying and Partial Application

To become a subject for composition, a function should accept only one argument. Currying is the operation that turns a multi-parameter function into a single-parameter one. A curried function is a single-argument function that simply returns another function whose number of arguments is decreased by one. Partial application (not to be confused with partial functions), on the other hand, binds predefined values to some of a function’s parameters, which also yields a function with fewer arguments.
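
A quick sketch (volume and baseArea are invented names): a “three-argument” Haskell function is really a chain of one-argument functions, so fixing one argument simply stops the chain earlier.

volume :: Double -> Double -> Double -> Double
volume l w h = l * w * h   -- the same as volume = \l -> (\w -> (\h -> l * w * h))

baseArea :: Double -> Double -> Double
baseArea = volume 1        -- partial application: the first argument is bound to 1
-- baseArea 2 3 evaluates to 6.0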

Lazy and Strict Evaluation

When working with recursive algorithms and data structures, it makes sense to defer the evaluation of the recursive part until it is actually needed. Such deferral is called lazy evaluation and allows, for example, the creation of potentially infinite sequences. Such tricks are impossible in strict languages, where every expression is evaluated completely as soon as it is encountered.
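
The classic illustration is the lazy definition of the Fibonacci sequence, which only works because the tail of the list is not evaluated until someone asks for it:

fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)   -- a potentially infinite list
-- take 6 fibs evaluates to [0, 1, 1, 2, 3, 5]; only the demanded prefix is computed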

Through the Looking-Glass: Where We Meet Category Theory

The design of purely functional programs relies heavily on category theory and on the underlying algebraic laws of the functional primitives employed. We will only scratch the surface here to give a glimpse of what purely functional design may look like.

Algebraic Data Types

Although functional languages may be dynamically typed, there are plenty of statically-typed ones. At first glance, functional type systems with their typeclasses (templates that define possible operations on types) may seem complex and unwieldy. It is because they may really be complex and unwieldy.

There are two kinds of algebraic data types (ADTs): sum types and product types. They differ in how their number of inhabiting values is calculated. For example, the number of possible values inhabiting a 2-tuple of bytes, which is a product type, is 256 × 256 = 65536; it includes all possible combinations from (0, 0) to (255, 255). On the other hand, the number of inhabitants of Boolean, a sum type, is 2: true and false. Product types and sum types are collectively called “algebraic” because they have algebraic properties similar to those of ordinary integers.
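
A quick sketch of both kinds (the type names are invented for the example):

import Data.Word (Word8)

data PixelPair = PixelPair Word8 Word8   -- a product type: 256 * 256 = 65536 inhabitants
data Answer    = Yes | No | Unknown      -- a sum type: 1 + 1 + 1 = 3 inhabitants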

In practice, sum types are often employed as containers or tags that wrap values which are then processed with pattern matching. Such wrappers are called “contexts”. For example, the following listing defines the Maybe sum type with two value constructors used as tags: Just takes a parameter, and Nothing appears as is.

data Maybe a = Just a | Nothing
x = Just 10 -- wrap an Int into the Maybe container 

Similar to a null reference, the Maybe type may be used to distinguish between the absent and the fulfilled result of a computation. Because the compiler will force you to handle the Nothing branch in pattern matching, it is generally safer than the use of null, dubbed the billion-dollar mistake.

Pattern matching

Pattern matching offers syntactic constructs needed to destructure and recombine values stored in ADTs.

add :: Int -> Int -> Int        -- Hindley-Milner signature declaration
add x y = x + y                 -- add takes two Ints and returns an Int 

addmb :: Maybe Int -> Maybe Int -- addmb takes an Int packed into a Maybe 
addmb x =                       -- and also returns an Int packed into a Maybe
  case x of                     -- the pattern matching takes place here
    Just x' -> Just (add x' 2)  -- x' is the Int destructured from the Maybe wrapper
    Nothing -> Nothing          -- Nothing is left as it is

It is important to note that each branch of a pattern-matching expression spawns its own path of execution. Because exceptions disrupt the control flow of a program, functional languages use pattern matching on contexts to process errors. When Nothing is returned somewhere in the middle of a chain of composed functions that use pattern matching, it short-circuits the computation: only the branch with Nothing will be chosen. Such a change of the computational path or of the context is called an “effect” (not to be confused with side effects). More precisely, an effect is a computation that transforms something into something else (possibly of another type) wrapped in a context ADT. Such ADTs are called effect types.

The two following sections contain some highly condensed abstract stuff, but it is important. Please do not switch the channel.

Some Category Theory

A category consists of a set of objects and a set of morphisms between the objects. A morphism f between objects a and b is written as f: a -> b. Morphisms compose. In the context of functional programming, objects are types and morphisms are functions. Morphisms that map a type to itself (f: a -> a) are called endomorphisms. Morphisms may map not only between types, but also between categories, for example:

-- a function that takes type a and returns type b is mapped to a function
-- that takes and returns these types wrapped into the Maybe container
F :: (a -> b) -> (Maybe a -> Maybe b)

Such higher-order morphisms are called functors, and mapping into another category (context) is called lifting. Morphisms that preserve the algebraic structure of the objects, as in the definition above, are called homomorphisms. There are also morphisms between functors. Nothing complex, just arrows everywhere.

Algebras

An algebra may be considered a set of functions operating over some ADTs, along with a set of laws or constraints specifying the relationships between these functions. For example, a magma is defined as a set that is closed under a binary operator. Functional programmers create algebras to formalize and verify the design of their programs.

All this “complex” algebraic stuff is actually what you learned in elementary school:

  • A semigroup is a magma where the binary operator is associative.
  • A monoid is a semigroup where the set has an identity element.
  • A group is a monoid where every element of the set has a unique inverse element.
  • An Abelian group is a group where the binary operator is also commutative.
  • A semiring is an algebraic structure with set R and 2 binary operators, + and ⋅, called addition and multiplication. It has the following laws:
    • (R, +) is a commutative monoid with identity of 0.
    • (R, ⋅) is a monoid with identity of 1.
    • Multiplication is left- and right-distributive.
    • Multiplication with 0 annihilates R.
  • A ring is a semiring where every element of the set also has an additive inverse.

Successful functional design relies on a proper understanding of the algebraic properties of the corresponding constructs, such as monoid, monad, or functor. For instance, a monoid has the following algebraic properties:

  1. Closure: <set element> + <set element> = <element of the same set>
  2. Associativity: (1 + 2) + 3 = 1 + (2 + 3)
  3. Identity: 0 + <a number> = <the number> + 0 = <the number>

Addition of integers forms a monoid, so it can be used, for example, in folding without fear of an incorrect result.
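
A sketch of this connection using the standard Sum wrapper from Data.Monoid, whose combining operation is addition and whose identity is 0:

import Data.Monoid (Sum (..))

total :: Int
total = getSum (foldMap Sum [1, 2, 3, 4])   -- 10; associativity and the 0 identity
                                            -- make the grouping of additions irrelevant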

Abstractions Based on ADTs

Let’s imagine that you decided to create some abstractions by composing functions that accept and return some ADTs. In the course of execution, they may produce effects. Here you are entering the territory of functional plumbing with morphisms. Unfortunately, the constraints imposed by static typing cannot be satisfied automatically. Your task is to choose the right morphisms to make the chain of composed functions work as intended. The morphisms are usually implemented as methods of the typeclass instance for the particular context type.

Let’s assume that mayb wraps the result of some computation in Maybe (an Int in the example below).

mayb x = Just x -- here we just wrap x of any type into Maybe

Let’s create an abstraction, addcomp, that adds an Int to the result of mayb.

addcomp :: Int -> Maybe Int
addcomp y = y + mayb 1 -- this does not work

You cannot write y + mayb 1 because there is no + operator that adds an integer and a Maybe. In OOP you would probably implement such an operator or a visitor. In FP you use a functor (<$>) to perform the addition inside the Maybe context.

addcomp y = (+ y) <$> mayb 1 -- here + is partially applied

To use a functor or another morphism with your own wrapper type, it is necessary to implement the corresponding typeclass by providing instances of its methods, such as fmap, mappend, and others. For example, this is how fmap of the Functor typeclass instance for Maybe is implemented:

instance Functor Maybe where    -- <$> seen above is the infix synonym for fmap
  fmap f Nothing = Nothing      -- nothing is done with Nothing
  fmap f (Just x) = Just (f x)  -- x is destructured from Maybe, applied to f,
                                -- and placed back

fmap takes a bare function, the partially applied + in our case, along with a value wrapped in a context, and applies the function inside the same context.

Sometimes computations end up with functions packed inside a context. In this case, we use the applicative functor (<*>) for plumbing. This appliance accepts a function inside a context and a value inside a context, takes them out, applies the function to the value, and puts the result back. For the Maybe container, pure 1 returns Just 1. Watch the hands:

addcomp y = mayb (+ y) <*> pure 1 -- this implementation of addcomp produces 
                                  -- the same result as above 
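
For reference, this is roughly how the Applicative instance for Maybe can be written (essentially what the standard library does):

instance Applicative Maybe where
  pure = Just                   -- wrap a bare value into the context
  Just f  <*> m = fmap f m      -- unwrap the function and map it over the value
  Nothing <*> _ = Nothing       -- no function, no result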

All the compositional operators in Haskell are intended for such plumbing. To become a skilled functional plumber, it is necessary to understand the bottomless crevasse of meaning behind such magical words as functor, bifunctor, profunctor, applicative functor, invariant, contravariant, lens, Kleisli arrow, monad, and, Lord save us, comonad. In theory, all plumbing should be checked for consistency with the corresponding algebraic laws mentioned above. Do the composed functions commute? Does the identity law hold? There is even some advanced literature on this topic that promises a path to functional nirvana.

Monads

No text on functional programming can avoid the topic of monads. Monads are truly magical. They allow chaining sequential computations that produce effects. They even allow performing side effects and input/output, which are considered impure operations.

What are monads? Monads are monoids in the category of endofunctors, and monoids, as you remember, are semigroups with identity. This means that monads are also functors; particularly, applicative ones. If this does not make things clear, monads simply allow composing functions of the form f: a -> m b, which are called monadic or effectful functions. Here m is a context for which you need to implement a Monad typeclass instance, and, as the signature suggests, the type b may be completely different from the type a. As an example, let’s create a monadic function that returns the reciprocal of a given number:

reciprocal :: Int -> Maybe Double
reciprocal 0 = Nothing
reciprocal x = Just $ 1 / (fromIntegral x)

Because the addcomp function from the example above is also monadic, we can compose them using the monadic bind (>>=):

moncomp x = addcomp x >>= reciprocal  -- or, point-free, with the Kleisli arrow
                                      -- from Control.Monad: addcomp >=> reciprocal

The call moncomp 1 will result in Just 0.5.

All the magic happens in the context’s implementation of bind. This is how bind for Maybe is implemented:

instance Monad Maybe where
  ...
  -- (>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
  Just x  >>= f = f x
  Nothing >>= _ = Nothing   -- Nothing short-circuits the rest of the chain

It takes a value in a context and a monadic function, so the result remains in the context. Monads may bring their own problems: the contexts may stack like onion petals, and additional plumbing may be required to peel them off.

What is so fancy about all this?

When looking at combinators modelled as a workflow that represents some business rules, some people compare them with prose, poetry, or haiku:

prepareForm templateURL = do
  reportErrors
  downloadTemplate
  insertCurrentDate
  trimFields

When composition is used along with pattern matching, we also get error processing almost for free, and there are no ugly try/catch blocks around every function call: an error simply short-circuits the happy execution path. Tiny functions with focused responsibilities make code malleable. However, since functions need to be curried and the plumbing logic may be scattered across typeclasses, there are limits.

Afterword

OOP programmers just declare properties in their classes to maintain state. In functional programming, state is avoided whenever possible. When that is impossible, functional programmers are bound to pass monads in and out, usually stashing state in tuples along with the results of their computations. Does this help to create clear, comprehensible, and maintainable designs? It is for you to decide. And users of dynamically typed functional languages do not learn category theory at all; they just use the comp function to compose. It is barely imaginable how they manage to write working programs this way.
