How to Design Software Systems?

g/christensen · July 7, 2021

The notion of software design may cover a multitude of aspects. For example, it may be concerned with the quality of user experience or with the subtle intricacies of project management. If you are asking yourself more mundane questions, such as “How should I assign responsibilities to these classes?” or “How should I implement the API of the service layer?”, this post is for you. We begin with well-known basic concepts that you may have encountered elsewhere without suspecting that they are all parts of a larger picture, and finish with rarely-read graduate-level literature. Maybe in the end you will find your own answer to the question: how to design software systems?

In its essence, software engineering is the art of creating abstractions. The quality and elaboration of the abstractions used separate failed software from successfully designed software. The code of a well-designed system is easily comprehensible; such systems are easy to extend and maintain, and have a minimal number of defects. So, how should we design quality systems? As in other engineering disciplines, modeling, prototyping, measurement, and iteration are important elements of the development cycle. We can approach the question of design from the several following perspectives:

  • Software modeling
  • Software design in general
  • Domain-driven design
  • Design for testability
  • Design for resilience
  • Algorithm design
  • Modularity and separation of concerns
  • Managing complexity
  • Design imposed by programming paradigms
  • Design imposed by the principles of discipline
  • Design patterns
  • Software quality attributes
  • Software architecture

As the introductory paragraph suggests, there are many more facets of the trade. By taking all the discussed principles seriously enough, you will undoubtedly be able to design quality software.

Software Modeling

To build quality systems, it is customary to model software before its implementation. Modeling begins with the formulation of functional system requirements, which is a science in its own right. Because we focus on software design, we assume that the requirements are already subdivided into the relevant verbs and nouns. Such a division corresponds to the behavior and data of the system being modeled.

Modeling is usually performed in UML by decomposing the system into subsystems, components, or modules and by defining detailed component interfaces. To achieve this, all dependencies between system components and their interactions, including possible error handling, are thoroughly examined. A throwaway prototype may be created to evaluate the model.

In an ideal world, where software is coded on punchcards, the developed model would become a project artifact, and changes in it would drive the corresponding changes in the implementation of the system. But you surely know what happens in these agile projects, where code is valued over documentation.

In reality, under the now-widespread agile development methodologies, a whole discipline of agile modeling has emerged. Modeling often takes place in the first iterations of the development cycle, in parallel with writing code. This helps to define the requirements, which may be too abstract initially, more precisely and in close collaboration with the customer (or without it). Also, no code is usually thrown away, confirming the proverb that the best permanent solution is a temporary one. Instead of prototyping, some authors (Andy Hunt and Dave Thomas, “The Pragmatic Programmer”) recommend writing “tracer code” (called evolutionary prototyping by other authors), where the most important aspects of the system are implemented and integrated first to test the validity of the developed model, initial knowledge of which may be sparse.

It is recommended to select the riskiest aspects of the system for the initial modeling and implementation. It is also often advised that modeling should be performed with the vision of the future evolution of the system. This may help to detect potential sources of variation, for which good domain knowledge may be necessary.

Many excellent books have been written on this topic. Please check the recommended literature below.

Software Design in General

To explain the essence of good design, authors usually insert an image of a craftsman with a chisel and tell how important it is to name variables in a consistent and obvious way, write literary-inspired comments, create small functions, and so on. Although code quality is a significant matter, design is concerned with grander goals. Namely: how to prevent system rot under the pressure of the omnipresent second law of thermodynamics, which strives to turn everything into an incorrigible mess? How to minimize the impact of possible changes? How to make the system easily extensible and comprehensible at the same time?

At the level of design, we have all its basic elements at our full disposal: functions, methods, classes, packages, and modules. The trick is to arrange them just in the right way under a hierarchy of maximally-decoupled layers of abstractions. Several design principles discussed below, accompanied by a range of well-established design patterns, will greatly help with this task. The sad truth is that these aspects of design still remain more art than science and require a profound knowledge of the practice of pattern application for any success.

It is a sign of skill if, after a look at the model, you say: “To properly separate concerns, I should implement the core functionality as a set of fine-grained classes and use decorators for optional features” (see the sketch below). But the task of design is so cognitively daunting that it is rarely done right from the start. Usually, design is gradually improved over the course of development, so an image of a craftsman working with clay is more appropriate.
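
For example, a minimal Kotlin sketch of that decorator-based arrangement (the report-rendering domain and all names here are hypothetical):

```kotlin
// Core functionality as a fine-grained class; optional features as decorators.
interface Report {
    fun render(): String
}

class PlainReport(private val body: String) : Report {
    override fun render() = body
}

// An optional feature layered on through composition, not subclassing.
class TimestampedReport(private val inner: Report) : Report {
    override fun render() = "[${java.time.Instant.now()}] ${inner.render()}"
}

fun main() {
    // Features are stacked only where needed.
    val report: Report = TimestampedReport(PlainReport("quarterly numbers"))
    println(report.render())
}
```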

The process of design is inseparable from decisions that imply explicit or implicit tradeoffs. For example, on the one hand, they say that YAGNI. On the other, you also need to account for system extensibility, which may require some extra features. Successful designers are guided by the most salient features in the context of the system’s evolution, and some good books on thinking may help to discover them, such as “Blink: The Power of Thinking Without Thinking” by Malcolm Gladwell, or “Think Again: The Power of Knowing What You Don’t Know” by Adam Grant. More advanced cognitive techniques, such as sleeping on it or insight meditation, may significantly boost the success rate if used properly. At this point, many authors insert an image of a psychic with a crystal ball.

Needless to say, a straightforward ad-hoc design without any foresight brings the most suffering, along with loads of unneeded work. Use-case-based modeling allows you to estimate what subsystems and layers of functionality your software may have, and what internal APIs, facades, and adapters you may need to think about.

Domain-Driven Design

The approach of domain-driven design puts the value of domain modeling in the first place and has an idiosyncratic way of arranging domain objects into bounded contexts that define units of transactional consistency. Folks in this camp have a whole special jargon for what other people call a jargon, a model, a package, a subsystem, an entity, an object, a persistence framework, and so on. In the DDD paradigm, you say that a ubiquitous language defines a domain, which is separated into bounded contexts that consist of aggregates, which consist of entities and value objects stored in repositories, and so on. Each term here implies its own profound semantics. Who knows, maybe this helps to think better about domain models. The key principle here is that the units of abstraction should closely correspond to the entities of the domain, which makes it easier to keep the gap between the domain model and the implementation minimal.

The domain-driven design methodology is described in the book “Domain-Driven Design: Tackling Complexity in the Heart of Software” by Eric Evans. Here it is worth listing several lessons that could be derived from this approach:

  1. A good understanding of the problem-area jargon and its utilization in the system may be important for the creation and maintenance of an adequate domain model.
  2. The domain code (also known as business logic or business rules elsewhere) should be kept clean of other types of application logic to minimize the gap between the domain model and its implementation. This is one of the reasons MVC was invented. Non-domain application logic is usually placed into a separate layer of controllers or services.
  3. To save Houston from problems, different units of measurement and other such value objects should be implemented as separate types to prevent meaningless operations between them, as sketched below.
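
A minimal sketch of such unit types in Kotlin (the units and conversion factor are the standard ones; everything else is hypothetical):

```kotlin
// Distinct types make mixed-unit arithmetic a compile-time error.
@JvmInline value class Meters(val value: Double) {
    operator fun plus(other: Meters) = Meters(value + other.value)
}

@JvmInline value class Feet(val value: Double) {
    fun toMeters() = Meters(value * 0.3048) // the single, explicit conversion point
}

fun main() {
    val altitude = Meters(120.0) + Feet(50.0).toMeters()
    // Meters(120.0) + Feet(50.0)  // does not compile: meaningless operation
    println(altitude.value)
}
```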

Design for Testability

Test-driven development is considered an invaluable engineering practice that greatly helps to mitigate the impact of regressions when making software changes. According to it, tests are written before the functionality they cover, so the design becomes testable by default. But not all testable design is neat. It is still a matter of art to choose which aspects of system behavior and paths of execution should be tested, which external systems to mock, which layers of the system should be covered by the test pyramid, and what tradeoffs should be introduced into the design of the system to improve test coverage. In general, it is recommended to maximize unit-test coverage of the business logic only, which should be separated into a functional (referentially transparent) core, as in the sketch below. Integration or end-to-end tests should be used sparingly for the wider aspects of the system. The book “Unit Testing: Principles, Practices, and Patterns” by Vladimir Khorikov provides some good advice on design for testability.
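
A toy illustration of that separation in Kotlin (the pricing rule is invented for the example):

```kotlin
// Functional core: pure, referentially transparent, trivial to unit-test.
fun discountedTotal(prices: List<Double>, loyaltyYears: Int): Double {
    val discount = if (loyaltyYears >= 5) 0.10 else 0.0
    return prices.sum() * (1.0 - discount)
}

// Imperative shell: I/O stays at the edges, covered by a few integration tests.
fun main() {
    val prices = listOf(9.99, 45.50) // in a real system, loaded from storage
    println(discountedTotal(prices, loyaltyYears = 6))
}
```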

Design for Resilience

When we neglect important edge cases, for example, what happens if a socket timeout is undefined, the system may not attain the desired quality attributes. The network may get clogged or fail, and our software will fail miserably along with it. The book “Release It!: Design and Deploy Production-Ready Software” by Michael Nygard describes a set of patterns and practices for resilient software design, such as circuit breakers and bulkheads, which prevent failures from spreading all over the system. The main principle to follow here is: fail fast.
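
To make the idea concrete, here is a deliberately simplified circuit breaker sketch in Kotlin; production implementations (see Nygard’s book) also need timeouts and a half-open state:

```kotlin
// After `threshold` consecutive failures the breaker opens and fails fast,
// instead of letting every caller wait on a sick dependency.
class CircuitBreaker(private val threshold: Int = 3) {
    private var failures = 0

    fun <T> call(operation: () -> T): T {
        check(failures < threshold) { "circuit open: failing fast" }
        return try {
            operation().also { failures = 0 } // success resets the counter
        } catch (e: Exception) {
            failures++
            throw e
        }
    }
}
```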

Algorithm Design

Sometimes developers are faced with the task of designing an algorithm that suits their problem. Usually, they take some well-known algorithm, model their problem in terms of its primitives, for example, graphs, and adapt the chosen algorithm to their needs. They may also thoroughly examine whether they could eliminate existing bottlenecks with some advanced data structures, such as Fibonacci heaps or tries, which have lower computational complexity for some operations. If they are unlucky, they may face the issue of computational intractability. Developers who know their heuristics (methods that allow obtaining a faster or approximate solution), such as greedy algorithms, divide and conquer, dynamic programming, or simulated annealing, are typically able to solve their problem successfully; a tiny illustration follows below. More details on algorithm design can be found in the book “The Algorithm Design Manual” by Steven Skiena.
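
As a small taste of one such technique, here is dynamic programming in Kotlin: memoizing overlapping subproblems turns an exponential recursion into a linear one (Fibonacci is used only because it fits in ten lines):

```kotlin
fun main() {
    val memo = HashMap<Int, Long>()
    // Each subproblem is computed once and cached.
    fun fib(n: Int): Long = memo.getOrPut(n) {
        if (n < 2) n.toLong() else fib(n - 1) + fib(n - 2)
    }
    println(fib(90)) // returns instantly; the naive version would run for ages
}
```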

Modularity and Separation of Concerns

Separation of concerns is the cornerstone of quality software. By keeping each part of the functionality in a dedicated module, you make possible changes easier and increase the maintainability of the code. Well-separated concerns are also immune to variation: changes in some parts of the system produce a minimal impact on its other parts.

Several principles help to write well-modularized code:

  • Do not repeat yourself (also known as DRY). This principle is as simple as it is powerful. By not duplicating code you minimize the area for potential changes and errors.

  • Single responsibility principle - a component should have only one reason to change.

  • Decoupling - this concept implies that modules, for example, classes in object-oriented languages, should depend on each other only minimally. Two source code metrics are often used in this context: cohesion, the degree to which the elements inside a module belong together, and coupling, the degree of interdependence between modules. A designer should strive to maximize cohesion and minimize coupling. Several approaches facilitate decoupling, for example, dependency inversion, which is, in essence, programming to abstractions (see the sketch after this list), or the GoF mediator pattern.

  • Packaging and maintaining component boundaries. On the more coarse-grained level of packages and components, which may contain multiple classes, software package metrics are used to reason about coupling and the impact of changes. In short, system components are divided into unstable (nothing depends on them), flexible (little depends on them), and stable (many other components depend on them). The stable abstractions principle states that a stable component should also be abstract, to make system extension easy.

  • Layers of abstraction - by isolating related modules, components, or services into layers, and by using lower-level layers only from the higher-level ones, you minimize the impact of changes in any given layer on its client layers. Although a good horizontal layered architecture makes it possible to easily swap underlying libraries and technologies, it is not always optimal, so several layering schemes exist, such as hexagonal (with the closely associated ports and adapters pattern) and onion layering.
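
A minimal Kotlin sketch of dependency inversion as a decoupling device (the ordering domain is hypothetical):

```kotlin
// The high-level policy owns the abstraction...
interface OrderStore {
    fun save(orderId: String)
}

class CheckoutService(private val store: OrderStore) { // depends only on the interface
    fun placeOrder(orderId: String) = store.save(orderId)
}

// ...while the low-level detail lives behind it and can be swapped freely.
class InMemoryOrderStore : OrderStore {
    private val orders = mutableListOf<String>()
    override fun save(orderId: String) { orders += orderId }
}

fun main() {
    CheckoutService(InMemoryOrderStore()).placeOrder("A-42")
}
```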

Anti-patterns

  • “Smart UI” - the tendency to implement all application logic in UI event handlers, which may violate all the principles listed above. This inevitably results in unmaintainable, error-prone code and hinders system evolution. The MVC design pattern or one of its variations should be used instead.

Managing Complexity

The quality of the abstractions used is important in the management of software complexity. In a broad sense, complexity is characterized by the cognitive effort necessary to understand the program. The more formal notion of cyclomatic complexity has a number of practical applications, such as computing test coverage.

Carefully designed, modularized, and layered abstractions help to prevent the system from becoming an incomprehensible mess. A quality abstraction may be distinguished from a bad one by its depth (this is a technical term: John Ousterhout, “A Philosophy of Software Design”). Depth is measured by the ratio of the functionality an abstraction provides to the elaboration of its interface. A good abstraction has a succinct, well-defined interface with meaningful names and covers vast functionality. The opposite is true for a bad abstraction: it has a vast interface with poorly named members that, possibly, just delegate functionality to another layer. Sometimes, though, this may be a necessary evil, as in the GoF adapter pattern. A good abstraction also hides unimportant details and reveals only important ones, minimizing the associated cognitive noise, for example, by providing reasonable defaults.

Yet another simple and powerful principle also helps to keep complexity at bay: Occam’s razor (also known as KISS). By not multiplying entities without necessity, you keep the system comprehensible.

Anti-patterns

  • Abstraction leak - happens when some aspects of the system that should be encapsulated propagate through unrelated modules and components. An abstraction may leak in subtle ways, for example, through implicit dependencies. If a user of a dictionary library depends on the order of items stored in a dictionary, the code may regress when the container hashing algorithm changes in the next version of the library. It is recommended to pin such dependencies down with unit tests, as in the sketch below.
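
A sketch of such a pinning test in Kotlin (the tag-processing code is invented for the example):

```kotlin
// The function silently depends on set iteration order...
fun firstTag(tags: Set<String>): String = tags.first()

fun main() {
    // ...so a characterization test makes the assumption explicit and loud.
    val tags = linkedSetOf("urgent", "billing") // insertion-ordered on purpose
    check(firstTag(tags) == "urgent") { "iteration-order assumption broken" }
    println("order assumption still holds")
}
```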

Design Imposed by Programming Paradigms

A chosen programming paradigm may impose its own concerns in the areas of design and complexity management. Generally, a paradigm allows managing complexity to the extent that it constrains what you can and cannot do. Practice suggests that too much or too little freedom results in lamentable consequences.

Before E.D. (Edsger W. Dijkstra)

If you have never seen a program in FORTRAN 66, please take a look. There you may notice the extensive use of the “GO TO” statement without any hesitation.

In his book “Structured Computer Organization” Andrew S. Tanenbaum also offers a good description of what programming looked like at the dawn of the computer era:

In these early years, most computers were "open shop," which meant that the programmer had to operate the machine personally. Next to each machine was a sign-up sheet. A programmer wanting to run a program signed up for a block of time, say Wednesday morning 3 to 5 A.M. (many programmers liked to work when it was quiet in the machine room). When the time arrived, the programmer headed for the machine room with a deck of 80-column punched cards (an early input medium) in one hand and a sharpened pencil in the other. Upon arriving in the computer room, he or she gently nudged the previous programmer toward the door and took over the computer.

If the programmer wanted to run a FORTRAN program, the following steps were necessary:

  1. He went over to the cabinet where the program library was kept, took out the big green deck labeled FORTRAN compiler, put it in the card reader, and pushed the START button.
  2. He put his FORTRAN program in the card reader and pushed the CONTINUE button. The program was read in.
  3. When the computer stopped, he read his FORTRAN program in a second time. Although some compilers required only one pass over the input, many required two or more. For each pass, a large card deck had to be read in.
  4. Finally, the translation neared completion. The programmer often became nervous near the end because if the compiler found an error in the program, he had to correct it and start the entire process all over again. If there were no errors, the compiler punched out the translated machine language program on cards.
  5. The programmer then put the machine language program in the card reader along with the subroutine library deck and read them both in.
  6. The program began executing. More often than not it did not work and unexpectedly stopped in the middle. Generally, the programmer fiddled with the console switches and looked at the console lights for a while. If lucky, he figured out the problem, corrected the error, and went back to the cabinet containing the big green FORTRAN compiler to start over again. If less fortunate, he made a printout of the contents of memory, called a core dump†, and took it home to study.

This procedure, with minor variations, was normal at many computer centers for years. It forced the programmers to learn how to operate the machine and to know what to do when it broke down, which was often. The machine was frequently idle while people were carrying cards around the room or scratching their heads trying to find out why their programs were not working properly.

† Reference to magnetic-core memory.

At least in the early versions of FORTRAN, you were not too restricted and could do anything by any existing means; there were not so many of them, either. This resulted in barely intelligible code and frequent errors.

Structured Programming: GOTO Considered Harmful

To remedy this gloomy state of affairs, Edsger Dijkstra wholeheartedly promoted structured programming: a discipline of adhering only to structured control-flow constructs. This was the first major step in the taming of complexity, but it had its own major drawback: the global program state was still exposed, which does not help to minimize coupling. Because this is also undoubtedly harmful, a technique that addresses this problem was necessary.

Object-Oriented Programming

Object-oriented programming introduced three new concepts that greatly aid in the minimization of coupling (see the sketch after this list):

  • Encapsulation - also known under the more general name of information hiding; it allows hiding a module’s internal state, facilitating decoupling.
  • Subclassing - class inheritance. Subclassing facilitates code reuse while maintaining encapsulation.
  • Polymorphism - the ability to call methods of a derived class through a reference of a superclass type, which, for example, enables dependency inversion.
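
All three concepts in a few lines of Kotlin (the logging example is invented):

```kotlin
open class Logger {
    private var lines = 0                       // encapsulation: hidden state
    open fun log(message: String) {
        lines++
        println(message)
    }
}

class TimestampLogger : Logger() {              // subclassing: code reuse
    override fun log(message: String) =
        super.log("${System.currentTimeMillis()} $message")
}

fun report(logger: Logger) = logger.log("done") // polymorphism: base-type reference

fun main() {
    report(Logger())
    report(TimestampLogger())
}
```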

Although OOP has had tremendous success and remains a dominant programming paradigm to this day, it is not free of its own problems inherited from the procedural style, such as uncontrollable side effects, which, for example, make parallelism a non-trivial task. Because of that, adepts of functional programming consider object-oriented programming harmful, probably not without reason. Being a great and simple tool for managing complexity, OOP requires equally great experience and discipline for its proper use.

Generic Programming

In strongly-typed languages, generic programming allows generating multiple type-parametrized versions of code from a single template. This helps to observe the DRY principle or even achieve static polymorphism, as the sketch below shows. Being a simple approach by itself, generic programming requires mastering several related concepts, such as type bounds and variance, to unleash its full potential.
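
A minimal Kotlin example of a single generic template serving multiple types:

```kotlin
// One bounded type parameter instead of one function per element type.
fun <T : Comparable<T>> largest(items: List<T>): T =
    items.reduce { a, b -> if (a >= b) a else b }

fun main() {
    println(largest(listOf(3, 1, 4)))       // resolved for Int
    println(largest(listOf("ant", "bee")))  // and for String, from the same source
}
```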

Because extensive use of generics may result in uncontrollable binary code bloat, this approach is often considered harmful by byte-counting programmers.

Aspect-Oriented Programming

Something is still rotten in your kingdom, say the proponents of aspect-oriented programming. You have cross-cutting concerns (this is, obviously, a technical term), such as logging or transaction management, that dangle here and there and pollute your code. Let’s gather them in a single place using advice (a technical term that may be familiar to Lisp programmers), which will result in a better separation of concerns. Other programmers shrug and consider this paradigm, if not harmful, then of limited use, since pointcuts (yet another technical term, denoting the places where advice is applied) often do not provide the same granularity as the direct use of the functionality being abstracted away. So, it may be appropriate primarily in monstrous enterprise systems with a large number of coarse-grained cross-cutting concerns.

Metaprogramming

Metaprogramming allows generating code programmatically. Such code may be template-based, dynamically evaluated, or implemented through syntactic macros. It tends to produce abstractions of such depth and eloquence that it often results in the apparent elimination of complexity through the introduction of obscurity, the opacity of which is limited only by the creativity of the author. This happens both at the places where the abstraction is used and where it is implemented. Because of this, metaprogramming is usually considered harmful by programmers who are not accustomed to it.

Functional Programming

Although functional programming was not developed as a direct response to the problems of OOP and takes its roots in category theory and the lambda calculus of Alonzo Church, it can successfully eliminate some hurdles of OOP by severely restricting programmers at the level of the language. For example, side effects are prohibited in most places, and looping constructs are replaced by recursion, which is usually packaged as folding; a small example follows below. Abstractions are built using higher-order functions and functional composition. To successfully use the functional approach, a programmer needs to learn how to bypass the imposed restrictions (often referenced by the technical term “purity”) by utilizing and getting accustomed to a dozen mind-boggling concepts with gut-wrenching names, such as functors and monads.
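
A small taste of the style in Kotlin: folding instead of looping, and abstractions built by composing functions:

```kotlin
fun main() {
    // A loop packaged as a fold over an immutable range.
    val sum = (1..5).fold(0) { acc, n -> acc + n }

    // Functional composition of two tiny pure functions.
    val shout = { s: String -> s.uppercase() }
    val exclaim = { s: String -> "$s!" }
    val loud = { s: String -> exclaim(shout(s)) }

    println("$sum ${loud("done")}") // 15 DONE!
}
```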

If you already have extensive OOP experience, the only way to master FP is to use only (and I mean only) languages that enforce purity, such as Haskell or ML. But because the ability of pure FP to reduce cognitive effort is questionable for any real-life application, OOP programmers often consider it harmful, probably not without reason. In moderate amounts, though, it can produce truly elegant solutions.

You may try to convince yourself of this by performing the following steps exactly in the order listed (the list may be a little opinionated):

  • Read “The Joy of Kotlin” by Pierre-Yves Saumont, which thoroughly describes the basics of functional programming and does a tremendous job of demonstrating how ugly it may look when practiced in an imperative programming language.

  • Read “Functional Programming in Scala” by Paul Chiusano and Runar Bjarnason, which tries to teach more advanced functional patterns and shows how the ambiguous, unintuitive syntax of a hybrid programming language may obscure clear enough concepts.

  • Read “Real World Haskell” by Bryan O’Sullivan, Don Stewart, and John Goerzen, which tells how painful and futile it is to build something more complex than “Hello, World!” in a purely functional way without industry-grade tooling and frameworks. In this sense, the book has lost absolutely nothing in almost 15 years of its existence.

  • “Functional Programming Made Easier” by Charles Scalfani adds yet another dimension of despair. In two thousand pages of fine print, it explains that you cannot skillfully build functional abstractions without knowledge of category theory, and attempts to teach you some of it along the way.

  • If you are still struggling, read Professor Frisby’s “Mostly Adequate Guide to Functional Programming”, which demonstrates how to do things in a purely functional way in JavaScript.

  • If all this does not help, try reading this, and start over again, if necessary.

Now, after you have learned how to compose comonads and write interpreters for domain-specific languages based on algebraic types, you are ready to obtain the black belt of functional programming. But what if there are tools that allow achieving the same with much less headache and much more fun, such as Clojure?

Contract Programming

Design by contract, elements of which can be found in Eiffel, Clojure, and Scheme, aims to eliminate implicit properties of module interfaces and the related obscurity by explicitly defining them in interface contracts. This may be achieved, for example, in the form of programmatically verifiable pre- and post-conditions of function calls, as sketched below. Being a good academic concept by itself, it is often considered harmful by industrial programmers, who face permanently changing requirements and also have unit tests to maintain.
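
In languages without first-class contracts, the same idea is often approximated with assertion helpers. A Kotlin sketch using its standard require/check functions (the banking example is invented):

```kotlin
fun withdraw(balance: Long, amount: Long): Long {
    require(amount > 0) { "precondition: amount must be positive" }
    require(amount <= balance) { "precondition: insufficient funds" }
    val newBalance = balance - amount
    check(newBalance >= 0) { "postcondition: balance may not go negative" }
    return newBalance
}

fun main() {
    println(withdraw(100, 30)) // 70; withdraw(100, -5) fails fast with a clear message
}
```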

Reactive Programming

Originally, this approach was a sprinkle of programming-language magic, where change-observation event handlers for selected variables were created and maintained automatically by the programming environment. In modern days, reactive programming is essentially a message-passing architecture maintained by a framework, where whole event streams can be functionally composed, throttled, and harnessed by back-pressure.

Because it often employs parallelism, and the lion’s share of parallelism-related complexity is hidden beneath libraries or language constructs, reactive programming is often sold as a separate paradigm on the complexity-management and related markets, since it allows for improving overall application responsiveness.

In parallelized reactive programming, you offload multiple complex tasks from the main thread of execution to a thread pool (or a multiprocessing environment) and wait for their completion by using futures and promises, or a subscriber/publisher framework that utilizes a message queue called a reactive event stream. You can filter, modify, or combine event streams together; a minimal sketch follows below.
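
A minimal future-based sketch of that offloading in Kotlin, using the JDK’s CompletableFuture (the price/tax tasks are invented placeholders for “complex tasks”):

```kotlin
import java.util.concurrent.CompletableFuture

fun main() {
    val price = CompletableFuture.supplyAsync { 42 }      // runs on a pool thread
    val tax = CompletableFuture.supplyAsync { 8 }         // runs concurrently
    val total = price.thenCombine(tax) { p, t -> p + t }  // composed, not yet awaited
    println(total.join())                                 // main thread blocks only here
}
```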

While events are being processed, it is possible to display beautiful ads to the user in full HD at 60 FPS until the results are ready. Unfortunately, if your business is ad-based, this approach may harm it, since a parallelized set of tasks usually executes faster than the same set run sequentially. So, there are places where reactive programming may be considered harmful. In other areas, it may be considered harmful because of the unnecessary bloat.

Domain-Specific Languages

Domain-specific languages allow the creation of reasonably deep and clear abstractions of such quality that sometimes they can be used even by non-specialists in computer science. However, the development of a good DSL, based on metaprogramming or special tools such as Xtext or Spoofax, usually requires an amount of effort that is an order of magnitude higher than that of a regular OOP solution. So, more often than not, a DSL results in semantics that only its authors can understand. Thus, more often than not, DSLs are considered harmful.

Anti-patterns

  • Optional types are often offered as a solution to the billion-dollar mistake. Such types wrap a value that may contain a null reference. Since they heavily rely on functional programming concepts, their use may look obscure and ugly, especially in languages without built-in pattern matching, such as Java. A clever solution, implemented, for example, in Kotlin, is simply to prohibit the use of null references for non-nullable types and to provide nullable ones with null-coalescing operators, as the sketch below shows.
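
The Kotlin approach in a few lines:

```kotlin
// Nullability lives in the type; the compiler rejects unguarded dereferences.
fun greet(name: String?): String {
    val trimmed = name?.trim()                 // safe call: null just propagates
    return "Hello, ${trimmed ?: "anonymous"}"  // Elvis operator supplies a default
}

fun main() {
    println(greet("  Ada "))  // Hello, Ada
    println(greet(null))      // Hello, anonymous
}
```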

Design Imposed by the Principles of Discipline

SOLID Principles

These five related principles are considered a crucial discipline in the building of abstractions that are clear, flexible, and maintainable:

  • The Single-responsibility principle - “A module should have only one reason to change.” This principle facilitates the separation of concerns and low coupling.

  • The Open–closed principle - “A module should be open for extension, but closed for modification.” This principle facilitates the stability of module interfaces to minimize the impact of changes. It is not surprising that stable interfaces are usually well-thought-out.

  • The Liskov substitution principle - “A module of a given type should be interchangeable with a module of its subtype without any unexpected side effects.” This principle promotes the consistency of behavior in type hierarchies, which results in simple and clear abstraction semantics.

  • The Interface segregation principle - “No client of a module should be forced to depend on methods it does not use.” This principle facilitates the separation of concerns (interfaces) and high cohesion, increasing code maintainability.

  • The Dependency inversion principle - “A client module should not depend on other modules directly, but through abstract interfaces.” The use of the term “inversion” in the name of this principle is usually explained by the fact that dependencies between modules are customarily made on concretions. The aim of this principle is to facilitate dependencies on abstractions instead. It makes sense in the case of a concretion that is highly volatile and may propagate undesired changes. An abstract interface is created that hides such a concretion, which also eliminates the direct dependency on the referred module’s source code, increasing decoupling. Abstract factories are usually used to create instances of such volatile objects, as in the sketch below.
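
A minimal Kotlin sketch of the inverted dependency, with a factory hiding the volatile concretion (all names are hypothetical):

```kotlin
interface Notifier { fun send(msg: String) }      // the stable abstraction

class SmsNotifier : Notifier {                    // the volatile concretion
    override fun send(msg: String) = println("SMS: $msg")
}

object NotifierFactory {                          // the only place naming the concretion
    fun create(): Notifier = SmsNotifier()
}

fun main() {
    val notifier: Notifier = NotifierFactory.create() // client sees only the interface
    notifier.send("build finished")
}
```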

All these principles are described in more detail in the books by Robert C. Martin, for example, in “Clean Architecture: A Craftsman’s Guide to Software Structure and Design”. The name of this book may be misleading because this is a book not about architecture but about tidiness.

The Law of Demeter

The Law of Demeter discourages dependence on the inner structure of the classes and components you use. If there is such a dependence, it is better reimplemented as an encapsulated method of the class whose internals are being accessed, as the sketch below shows.
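
A Kotlin sketch of the difference (the customer/wallet example is a common textbook one):

```kotlin
class Wallet(var balance: Long)

class Customer(private val wallet: Wallet) {
    // Encapsulated method instead of exposing the wallet to callers.
    fun pay(amount: Long): Boolean {
        if (wallet.balance < amount) return false
        wallet.balance -= amount
        return true
    }
}

fun main() {
    val customer = Customer(Wallet(100))
    println(customer.pay(30)) // true; no customer.wallet.balance "train wreck"
}
```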

Command Query Separation

The command-query separation principle states that every method should either be a command that performs an action or a query that returns data to the caller, but not both. CQS is believed to have a simplifying effect on programs, making their states (via queries) and state changes (via commands) more comprehensible; see the sketch below.
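
A tiny Kotlin illustration:

```kotlin
class Counter {
    private var count = 0
    fun increment() { count++ }   // command: mutates state, returns nothing
    fun current(): Int = count    // query: returns data, no side effects
}

fun main() {
    val counter = Counter()
    counter.increment()
    println(counter.current()) // 1
}
```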

GRASP Principles

These are nine lesser-known principles focused on assigning responsibilities to objects (General Responsibility Assignment Software Patterns). They are also worth considering, since responsibilities are a common currency in the object-oriented world.

  • Information expert - a responsibility is preferably assigned to the class that has all the necessary information to fulfill it.

  • Creator - this pattern determines which class should have the responsibility to create some other class. Several possible concerns apply here:
    • Instances of the creator contain or compositely aggregate instances of the created class.
    • Instances of the creator persist instances of the created class.
    • Instances of the creator closely use instances of the created class.
    • Instances of the creator have the initializing information for instances of the created class and pass it on creation.
  • Controller - in this pattern, some objects are assigned the responsibility to process non-UI events of the application. This allows separating application-specific logic from the presentation and business logic.

  • Indirection - this is a more general pattern, similar to the GoF mediator design pattern. For example, the controller in MVC acts as a mediator between the presentation and model layers that do not communicate directly and hence are decoupled.

  • Low coupling and High cohesion patterns are equivalent to the source code metrics discussed above, which should be invoked when considering the assignment of responsibilities (behaviors).

  • Polymorphism - this pattern facilitates the use of built-in polymorphic constructs over hardcoded switch/case tables when assigning responsibilities, for example, as in the GoF visitor pattern, to ease possible refactoring (although visitor is probably the most complex pattern in the GoF catalog).

  • Protected variations - this pattern facilitates isolation of possible sources of variation under the abstract interfaces, which can represent different (changeable) polymorphic classes, so instability does not propagate to the other parts of the system.

  • Pure fabrication - an artificial module that does not appear in the domain model and is introduced solely to maintain abstraction and decoupling.

GRASP principles are described in more detail in the book “Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development” by Craig Larman.

Design Patterns

First published in the book “Design Patterns: Elements of Reusable Object-Oriented Software” (1994) by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, also known as the Gang of Four (GoF), these patterns comprise a clever set of abstractions that eliminate many undesirable side effects of naive software design. Since many excellent books have been written on this topic, I will not discuss the patterns themselves. For example, here you may find a good brief description of each pattern, the problem it solves, and the corresponding code examples.

Anti-patterns

  • Excessive use of inheritance instead of composition. This anti-pattern is a classical example of an undesirable side effect of straightforward software design: combinatorial breeding of derived classes. For example, suppose your goal is to draw N different shapes with M different algorithms applicable to each shape. If you want to utilize polymorphism to draw the shapes, you may create a base class Shape and derive a Cartesian product (M×N) of shape subclasses from it, each drawn by its own algorithm, for example: Shape1Alg1, Shape1Alg2, Shape2Alg1, Shape2Alg2, etc. However, if you utilize the GoF bridge pattern, where algorithms are invoked through composition, you will need only M+N classes for your implementation, while still retaining separation of concerns and polymorphism on the Shape subclasses; see the sketch below.
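
A condensed Kotlin sketch of the bridge (one shape and two drawing algorithms stand in for M and N):

```kotlin
interface Renderer { fun draw(shape: String) }   // N renderer implementations

class VectorRenderer : Renderer {
    override fun draw(shape: String) = println("vector $shape")
}

class RasterRenderer : Renderer {
    override fun draw(shape: String) = println("raster $shape")
}

abstract class Shape(protected val renderer: Renderer) { // M shape subclasses
    abstract fun draw()
}

class Circle(renderer: Renderer) : Shape(renderer) {
    override fun draw() = renderer.draw("circle") // algorithm plugged in by composition
}

fun main() {
    Circle(VectorRenderer()).draw() // any of the M shapes pairs with any renderer
    Circle(RasterRenderer()).draw()
}
```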

Software Quality Attributes

Earlier, we implicitly invoked some software quality attributes, such as simplicity, reusability, maintainability, clarity, flexibility, correctness, and evolvability. To understand what may act as the major considerations in the choice of a software architecture, and also by which aspects it may be constrained or evaluated, we need more examples:

  • Granularity - the number of physical nodes needed to deploy an architectural solution.
  • Performance - implies system performance characteristics such as throughput and response times.
  • Scalability - the extent to which the system is capable of growing after its initial deployment.
  • Elasticity - the ability of a system to add and remove capacity based on demand.
  • Availability - addresses system failure and its impact on users or other systems.
  • Resilience - the ability to provide and maintain an acceptable level of service in the face of faults and challenges to normal operation.

Software Architecture

Software architecture is often described as a high-level system design, which operates on the level of system layers and subsystems, rather than on the level of separate modules or classes.

In the narrowest sense, software architecture is the art of maintaining dependencies and boundaries between coarse-grained system components to keep the system testable, maintainable, and evolvable. To do this you need to learn a set of architectural patterns, described, for example, in the books: “Java Application Architecture” by Kirk Knoernschild, or now mostly obsolete “Patterns of Enterprise Application Architecture” by Martin Fowler. If you are destined to read only one book on practical software architecture, please read “Architecture Patterns with Python” by Harry Percival and Bob Gregory.

In the broader sense, software architecture is an engineering discipline that is concerned with keeping the system in accordance with various criteria and requirements (mostly non-functional). At this particular level, the trick is to choose the right method or tool, that is the most appropriate to the problem at hand. To master this you need to learn a set of architectural styles, which are described in the books: “Fundamentals of Software Architecture” by Mark Richards and Neal Ford, or also in AOSA.

Although many divisions are possible, a system architecture may be viewed from five different standpoints:

  • View of use-cases - this is a functional requirements view, which is an input to develop the software architecture. Each use case describes the sequence of interactions between one or more actors (external users) and the system.
  • Static view - the architecture is depicted in structural terms of subsystems or components and relationships between them.
  • Dynamic view - this view describes the dynamic interactions between the components comprising the system.
  • Deployment view - this view depicts a specific configuration of the distributed architecture with components assigned to hardware nodes.
  • Non-functional requirements view - evaluates non-functional architecture properties such as performance or resilience.

Design on this level generally requires a level of knowledge in technology, soft skills, and modeling that differs from the skill set of a regular software developer. There is, though, an alternative point of view.

Architectural Patterns

The domain of software architecture has its own repository of patterns which is discussed in the “Pattern-Oriented Software Architecture” book series, and also in “Architectural Patterns” by Pethuru Raj, Anupama Raman, and Harihara Subramanian.

Architecture Styles

Monolithic and distributed architectures represent two opposite approaches, the choice between which is often driven by the non-functional requirements. Distributed architectures may employ centralized or decentralized coordination, which is referred to by the technical terms “orchestration” and “choreography” respectively. Architectures also may be technology- or domain-partitioned, in which decomposition to components is performed according to the corresponding paradigm. The most common architectural styles used nowadays are listed below.

  • Layered architecture - components within the layered architecture are organized into logical horizontal layers, with each layer performing a specific role within the application. Most layered architectures consist of four standard layers: presentation, business logic, persistence, and database. This is an example of a technologically-partitioned architecture.

  • Pipeline architecture - components within this architecture perform only one task and are sequentially connected by mediator components (“pipes”) which form a single pipeline with input and output ends. Widely used in applications that require simple one-way data processing.

  • Microkernel architecture (also known as plug-in architecture) - is a monolithic architecture that consists of the core that provides minimal functionality necessary to run the whole system and of multiple plugin components that implement the main application functionality. Eclipse IDE is a good example of such an application. Often used in software product lines.

  • Service-based architecture - is a distributed architecture that may consist of a separately deployed presentation node, a set of coarse-grained domain-partitioned service nodes, and a monolithic database. A very popular choice for many business-related applications.

  • Event-driven architecture - is made up of decoupled event processing components that asynchronously receive and process events. The components may be deployed as a clustered domain-partitioned set of nodes with centralized or decentralized coordination. This architecture is used in highly-scalable responsive and resilient applications.

  • Space-based architecture - consists of many processing units that contain the application logic, virtualized controlling middleware, an asynchronous data cache, and a database that receives updates from users asynchronously, possibly with a delay. The main aim of this architecture is to overcome the bottlenecks that emerge when traditional web-server deployments try to synchronously process requests from many users. A typical use case is a concert ticket ordering system, which is idle most of the time, but elastically handles high loads before concerts by deploying as many processing units as necessary.

  • Microservices architecture - this architecture takes its roots in the domain-driven design and consists of a large set of maximally decoupled domain-partitioned fine-grained services, each of which implements a bounded context, or an individual aggregate from the domain model. Each service is meant to be independent, which implies that every service usually contains its own database. Because this architecture is scalable, elastic, and maintainable, it is a popular choice for modern distributed applications.

Software architecture styles are described in more detail in the book “Fundamentals of Software Architecture: An Engineering Approach” by Mark Richards and Neal Ford.

Anti-patterns

  • Big ball of mud - an architecture (or the lack of it), where components are not divided into layers or services and promiscuously depend on each other.

The literature listed below offers a real introduction to software engineering. The choice is somewhat arbitrary, but many of the books contain references to other works. Only after reading it all can one approach the beginnings of an understanding of system design essentials. Some book titles may seem misleading, but every book here is dedicated, to some degree, to important foundational concepts in software design.

  • Andrew S. Tanenbaum, Todd Austin - Structured Computer Organization
  • Andrew S. Tanenbaum, Herbert Bos - Modern Operating Systems
  • Andrew S. Tanenbaum, Nick Feamster, David J. Wetherall - Computer Networks
  • Steven S. Skiena - The Algorithm Design Manual
  • Andriy Burkov - The Hundred-Page Machine Learning Book
  • Robert C. Martin - Clean Code: A Handbook of Agile Software Craftsmanship
  • David Thomas, Andrew Hunt - The Pragmatic Programmer: Your Journey to Mastery
  • John Ousterhout - A Philosophy of Software Design
  • David Farley - Modern Software Engineering: Doing What Works to Build Better Software Faster
  • Vladimir Khorikov - Unit Testing: Principles, Practices, and Patterns
  • Michael Nygard - Release It!: Design and Deploy Production-Ready Software
  • Kenneth Rubin - Essential Scrum: A Practical Guide to the Most Popular Agile Process
  • Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides - Design Patterns
  • Eric Freeman, Kathy Sierra, Bert Bates, Elisabeth Robson - Head First Design Patterns
  • Joshua Kerievsky - Refactoring to Patterns
  • Robert C. Martin - Clean Architecture: A Craftsman’s Guide to Software Structure and Design
  • Craig Larman - Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development
  • Hassan Gomaa - Software Modeling and Design: Uml, Use Cases, Patterns, and Software Architectures
  • Mark Richards, Neal Ford - Fundamentals of Software Architecture: An Engineering Approach
  • Harry Percival, Bob Gregory - Architecture Patterns with Python: Enabling Test-Driven Development, Domain-Driven Design, and Event-Driven Microservices
  • Humberto Cervantes - Designing Software Architectures: A Practical Approach
  • Andrew S. Tanenbaum, Maarten van Steen - Distributed Systems: Principles and Paradigms
  • Andrei Alexandrescu - Modern C++ Design: Generic Programming and Design Patterns Applied
  • Eric Evans - Domain-Driven Design: Tackling Complexity in the Heart of Software
  • Harold Abelson, Gerald Jay Sussman, Julie Sussman - Structure and Interpretation of Computer Programs
  • Michael Swaine - Functional Programming: A PragPub Anthology: Exploring Clojure, Elixir, Haskell, Scala, and Swift
  • Michael Fogus, Chris Houser - The Joy of Clojure
  • Scott Wlaschin - Domain Modeling Made Functional: Tackle Software Complexity with Domain-Driven Design and F#
  • Martin Fowler - Domain-Specific Languages
  • Ryan D. Kelker - Clojure for Domain-Specific Languages
  • Markus Voelter - DSL Engineering: Designing, Implementing and Using Domain-Specific Languages
  • Tomasz Nurkiewicz, Ben Christensen - Reactive Programming with RxJava
  • Riichiro Inagaki - Dr. Stone

The devil is in the details. Happy studying!
