Software Architecture and Decision-Making Book notes
Book Core concepts
It explains principles and concepts I believe a senior architect must understand deeply and discusses how to employ those principles to manage uncertainty
Books for leadership
The Hard Things About Hard Things by Ben Horowitz,
Trillion Dollar Coach by Eric Schmidt et al.,
Team of Teams:
New Rules of Engagement for a Complex World by Stanley McChrystal
Good Strategy, Bad Strategy by Richard Rumel
Three layers of architecture of The Open Group Architecture Framework (TOGAF)
Two prominent approaches to system architecture:
Waterfall ->identify the system’s requirements in full detail beforehand and start building
Agile -> Iterative way (collaborating with users to refine requirements and construct a system that can genuinely benefit the user)
Understanding Systems, Design, and Architecture
When writing a cloud app, we have two choices: We can choose a single cloud, taking advantage of its unique strengths for our application, or we can make the application portable across several cloud providers
How to Design a System
Five questions and 7 Principles to understand the context of system we building
When is the best time to market?
- If the feature need to go urgent becasue of market urgent. we can design the way simple and fast. and we also ready to re-write it.
What is the skill level of the team?
What is our system’s performance sensitivity?
When can we rewrite the system?
What are the hard problems?
Seven principles
Drive everything from the user’s journey.
Use an iterative thin slice strategy
- Unless you have a specific reason, always start with simple architectural choices. Measure the system, find the bottlenecks, and improve the system later.
On each iteration, add the most value for the least effort to support more users
Make decisions and absorb the risks
Design deeply things that are hard to change but implement them slowly
Eliminate the unknowns and learn from the evidence by working on hard problems early and in parallel
Understand the trade-offs between cohesion and flexibility in the software architecture
Mental Models for Understanding and Explaining System Performance
Eight mental models that help us think about and understand performance
Cost of Switching to the Kernel Mode from the User Mode
- Every time anapplication enters kernel mode, a context switch occurs, which adds nonessential costs to the system, such as time to save the stack and to rest the cache. To improve performance, we need to reduce the number of system calls.
Operations Hierarchy
Operation | Time/Speed (Example) | Description |
L1 Cache Reference | 0.5 ns | Accessing data from Level 1 cache in the CPU. |
L2 Cache Reference | 7 ns | Accessing data from Level 2 cache in the CPU. |
L3 Cache Reference | 20 ns | Accessing data from Level 3 cache in the CPU. |
Main Memory (RAM) Access | 100 ns | Retrieving data from the system's RAM. |
SSD Storage Access | 1,000,000 ns (1 ms) | Accessing data from Solid State Drive (SSD). |
HDD Storage Access | 10,000,000 ns (10 ms) | Accessing data from Hard Disk Drive (HDD). |
Network Packet Round Trip (US to India) | 150,000,000 ns (150 ms) | Sending a packet from the US to India and receiving the acknowledgment. |
CPU Processing (Instruction) | 0.2 ns per instruction | Executing a single CPU instruction. |
GPU Processing (Parallel) | 50 ns per parallel task | Executing parallel tasks on a Graphics Processing Unit (GPU). |
PCIe Data Transfer (Device to CPU) | 2,000 ns (2 us) | Transferring data between a peripheral device and the CPU via PCIe. |
Context Switch (Kernel) | 1,000 ns (1 us) | Switching between different processes in the kernel. |
RAM-to-Cache Transfer | 20 ns | Copying data from RAM to cache in the CPU. |
Database Query (Local) | 10,000,000 ns (10 ms) | Executing a database query on a local server. |
Database Query (Remote) | 50,000,000 ns (50 ms) | Executing a database query on a remote server. |
System Call (Linux) | 1,000 ns (1 us) | Initiating a system call in a Linux environment. |
I/O Operation (Disk Write) | 10,000 ns (10 us) | Writing data to a disk. |
I/O Operation (Network Send) | 1,000,000 ns (1 ms) | Sending data over a network. |
I/O Operation (Network Receive) | 1,000,000 ns (1 ms) | Receiving data over a network. |
Context Switching Overhead
- Switching processes adds an overhead cost of about 5–7 microseconds
Amdahl’s Law
is used to predict speed up of a task execution time when it’s scaled to run on multiple processors. It simply states that the maximum speed up will be limited by the serial fraction of the task execution as it will create resource contention.
one woman nine months to make one baby, “nine women can’t make a baby in one month
Assume if have 3 thread that doing a single task it take 2 sec but the most of the time will spend on maintining the 3 thread and resources sharing
Parellel process are efficient when they are independ
Universal Scalability Law
- says that actual speedup is even worse than Amdahl’s law due to shared variables. USL defines a new parameter coherency, which is the overhead added by communication between multiple processes, threads, or nodes.
Latency and Utilization Trade-offs
- Only a single thread can use the most resources at a given time, which forces threads to wait and take turns
Designing for Throughput with the Maximal Useful Utilization (MUU) Model
Adding Latency Limits
Optimization Techniques
To optimize, we need to decide where the bottlenecks are. Bottlenecks usually come in one of three forms:
One of the resources (e.g., CPU, I/O, or memory) is the bottleneck.
Thread models are causing critical resources to be idle.
Resources are wasted on nonessential tasks (e.g., context switches, GC).
CPU Optimization Techniques
Optimize Individual Tasks
Optimize Memory
Maximize CPU Utilization
I/O Optimization Techniques
Avoid I/O (use a cache)
Buffering
Send Early, Receive Late, Don’t Ask but Tell
Prefetching
Append-Only Processing (Kafka)
Memory Optimization Techniques
- Too Many Cache Misses
Latency Optimization Techniques
Do Work in Parallel
Reduce I/O
CPU Utilization is Wrong
CPU utilization is a metric that measures the percentage of time the CPU spends executing non-idle tasks. It doesn't necessarily mean the CPU is busy with computation; it's more about the time the CPU is not running the idle thread.
The idle thread is a special task that runs when the CPU has no other tasks to perform. The operating system kernel tracks CPU utilization during context switches.
So high %CPU to mean that the processing unit is the bottleneck, which is wrong because CPU is capable of doing the process it may wait for I/O or something .
Source:
https://opensource.com/article/18/4/cpu-utilization-wrong
The USE Method
The Utilization Saturation and Errors (USE) Method is a methodology for analyzing the performance of any system
For every resource, check utilization, saturation, and errors.
resource: all physical server functional components (CPUs, disks, busses, ...)
utilization: the average time that the resource was busy servicing work
saturation: the degree to which the resource has extra work which it can't service, often queued
errors: the count of error events
Understanding User Experience (UX)
Few concepts or principles that help us to design a good UX.
Understand the Users
Do as Little as Possible
Good Products Do Not Need a Manual: Its Use Is Self-Evident
Think in Terms of Information Exchange
- Users come to our system to get something done. The faster they can find what they need to do and do it, the happier they will be. If we can provide that UX without asking them anything, that’s even better.
Make Simple Things Simple
Design UX Before Implementation
Note:
having UX expertise on the team is a must
Macro Architecture
Spliting for serivce is macro Architecture.
Macro Architectural Building Blocks are
Data Management (DB,)
Routers and Messaging (API Gateway,loadbalancer,message broker)
Executors (Actual server)
Security
Communication (Distributed hash tables,Gossip architectures,Tree of responsibility patterns)
Macro Architecture: Coordination
Drive flow from the client
Call All API calls to service from client
Another service
Have seprate service which will cordinate with other and return the result.
Implement Choreography
Event driven system, where each participant in the process listens to different events and carries out their individual part. Each action generates asynchronous events, which trigger participants downstream
Macro Architecture: Preserving Consistency of State
Two-phase commit protocol
Approaches to going beyond transactions
Redefining the Problem to Require Lesser Guarantees
Figure out a way to resolve complex situations.
Provide a button for users to forcefully refresh the page if they can tell that it is outdated
Using Compensations
Starbucks
Does Not Use Two-Phase Commit
The key idea is that if an action fails, you can compensate.
Use compenstations if below 3 have statisfy
Each individual operation can be verified.
The operation is idempotent. If we repeat the operation with the same data, no additional side effects occur.
We can handle the failure or take compensative actions
Macro Architecture: Handling High Availability and Scale
Before going to scale the system to N Do the POC of single system capacity how much they can handle and if possible can tweak that to improve performance.
There are four tactics (techniques) to keep communication low, and they help us scale.
Share Nothing
Distribution
Caching
Async Processing
Macro Architecture: Microservices Considerations
Microservices let us split the structure into many loosely connected parts that can be developed, released, improved, and managed independently.
The decisions to make before going microservice
Handling Shared Database(s)
Each microservice should have its own database, and two microservices must not share data via the same database. This rule removes a common cause that leads to tight coupling between services
If we want to share then make sure only one Service do update to avoid data reduntency
Use transaction if we want both service to update
Securing Microservices
Coordinating Microservices
Avoiding Dependency Hell
One service depolyment not to break other service if we have dependency use Backward Compatibility or Forward Compatibility
Backward Compatibility: If we have updated the API to v2 which will accpet query params but make sure that API support without query params that's how before it work's.
Forward Compatibility: If V2 not working try V1 (it just temporary solution)
Avoid dependency hell by
Be conservative in what you do, be liberal in what you accept from others
Server Architectures
some guidelines for writing efficient and simple services
Do not reinvent the wheel
Use pools to reuse complex objects (need to see how moongose have pool of connections what they doing)
Make service operations idempotent whenever possible
Service Architecture
Thread-per-Request Architecture
Event-Driven (Nonblocking) Architecture
Staged Event-Driven Architecture (SEDA)
classification of applications based on their resource usage
CPU-bound applications (CPU >> Memory and no I/O)
Memory-bound applications (CPU + Bound Memory and no I/O)
Balanced applications (CPU + Memory + I/O)
I/O-bound applications (I/O > CPU)
Building Stable Systems
scenarios that affect stability:
Unexpected workloads
Resource and dependency failures, plus service-level agreement violations
Software bugs in both ours and borrowed code
Handling Unexpected Load
understand the load and set up a system that can keep up with the load most of the time. This is called capacity planning
Autoscaling
Admission Control: set of policies, procedures, and checks that regulate the admission or acceptance of incoming entities or requests into a system example: Assume server is already overloaded instead of accepting new connection we through error.
Noncritical Functionality: Turn off the Noncritical Functionality
Disaster recovery (have a plan for what to when X go down)
Handling Human Changes
Blue and green Deployment (Having two system one with new updated one and one with old one if new one fails redirect traffic to old one)
Canary deployments (small percentage of users to the new system and gradually increase the load to that system)
Handle Unknown Errors
- Observability (Application Performance Monitoring (APM) tool, Open telementry)
Building and Evolving the Systems
Get the Basics Right
Understand the Design Process
Conway law: organizations design systems that mirror their own communication structure rather than fighting it
Make Decisions and Absorb Risks
Create checklists of questions that are useful for different situations and use them.
Communicating the Design
Growth hacking funnel (find where we stuck does in user side or in development are we not pusing not much feature etc..)
External links given in the book
http://highscalability.com/blog/2012/5/16/big-list-of-20-common-bottlenecks.html
https://www.brendangregg.com/usemethod.html [bottleneck]
https://medium.freecodecamp.org/what-makes-apache-kafka-so-fast-a8d4f94ab145
An Analysis of Web Servers Architectures Performances on Commodity Multicore https://hal.inria.fr/hal-00674475/document
https://www.thoughtworks.com/insights/blog/scaling-microservices-event-stream
Scalability! But at What COST? (https://www.usenix.org/system/files/conference/hotos15/hotos15-paper-mcsherry.pdf)
http://martinfowler.com/articles/microservice-trade-offs.html
https://stackoverflow.com/questions/3570610/what-is-seda-staged-event-driven-architecture.
https://mechanical-sympathy.blogspot.com/2011/09/single-writer-principle.html.
https://mechanical-sympathy.blogspot.com/2011/07/memory-barriersfences.html.
Note : this article originaly published on here follow me for more insights like this