These are the best posts from ByteByteGo.

36 viral posts with 86,334 likes, 1,195 comments, and 10,293 shares.
36 image posts, 0 carousel posts, 0 video posts, 0 text posts.


Best Posts by ByteByteGo on LinkedIn

Top 12 Tips for API Security
- Use HTTPS
- Use OAuth2
- Use WebAuthn
- Use Leveled API Keys
- Authorization
- Rate Limiting
- API Versioning
- Whitelisting
- Check OWASP API Security Risks
- Use API Gateway
- Error Handling
- Input Validation
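Rate limiting, one of the tips above, is often implemented as a token bucket. Here is a minimal sketch in Python — illustrative only; the class name and parameters are mine, not from any particular gateway:

```python
import time

class TokenBucket:
    """Toy token-bucket rate limiter (illustrative, not production code)."""
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=3, refill_per_sec=1.0)
results = [bucket.allow() for _ in range(5)]  # first 3 pass, next 2 are throttled
```

A real gateway would keep one bucket per client key (API key or IP), usually in a shared store like Redis.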


Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://bit.ly/3KCnWXq

#systemdesign #coding #interviewtips
.
Netflix Tech Stack - (CI/CD Pipeline)

Planning: Netflix Engineering uses JIRA for planning and Confluence for documentation.

Coding: Java is the primary programming language for the backend service, while other languages are used for different use cases.

Build: Gradle is mainly used for building, and Gradle plugins are built to support various use cases.

Packaging: Package and dependencies are packed into an Amazon Machine Image (AMI) for release.

Testing: Netflix emphasizes testing in production, with a strong culture of building chaos engineering tools.

Deployment: Netflix uses its self-built Spinnaker for canary rollout deployment.

Monitoring: The monitoring metrics are centralized in Atlas, and Kayenta is used to detect anomalies.

Incident report: Incidents are dispatched according to priority, and PagerDuty is used for incident handling.


.
I recently discovered a cheat sheet that covers many design patterns; it is intended to jog your memory about how the different patterns work.


.
IBM MQ -> RabbitMQ -> Kafka -> Pulsar: how do message queue architectures evolve? 
 
🔹 IBM MQ 
IBM MQ was launched in 1993. Originally called MQSeries, it was renamed WebSphere MQ in 2002 and IBM MQ in 2014. It is a very successful product, widely used in the financial sector; its revenue still reached 1 billion dollars in 2020. 
 
🔹 RabbitMQ 
RabbitMQ's architecture differs from IBM MQ's and is closer to Kafka's concepts. The producer publishes a message to an exchange with a specified exchange type, which can be direct, topic, or fanout. The exchange then routes the message into queues based on the message attributes and the exchange type, and the consumers pick up the messages accordingly. 
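The exchange-routing idea can be mimicked in a few lines of plain Python — a toy model of direct/fanout routing, not the AMQP protocol itself:

```python
from collections import defaultdict

class Exchange:
    """Toy model of AMQP-style routing: 'direct' matches the routing key,
    'fanout' copies the message to every bound queue."""
    def __init__(self, kind: str):
        self.kind = kind
        self.bindings = defaultdict(list)   # routing key -> bound queues
        self.queues = []                    # every bound queue (for fanout)

    def bind(self, queue: list, routing_key: str = ""):
        self.bindings[routing_key].append(queue)
        self.queues.append(queue)

    def publish(self, message, routing_key: str = ""):
        targets = self.queues if self.kind == "fanout" else self.bindings[routing_key]
        for q in targets:
            q.append(message)

orders, audit = [], []
ex = Exchange("direct")
ex.bind(orders, "order.created")
ex.bind(audit, "order.created")
ex.publish({"id": 1}, "order.created")  # both bound queues receive a copy
```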
 
🔹 Kafka 
In early 2011, LinkedIn open-sourced Kafka, a distributed event streaming platform named after Franz Kafka. As the name suggests, Kafka is optimized for writing. It offers a high-throughput, low-latency platform for handling real-time data feeds. It provides a unified event log to enable event streaming and is widely used at internet companies. 
 
Kafka defines producer, broker, topic, partition, and consumer. Its simplicity and fault tolerance allow it to replace previous products like AMQP-based message queues. 
 
🔹 Pulsar 
Pulsar, developed originally by Yahoo, is an all-in-one messaging and streaming platform. Compared with Kafka, Pulsar incorporates many useful features from other products and supports a wide range of capabilities. Also, Pulsar architecture is more cloud-native, providing better support for cluster scaling and partition migration, etc. 
 
There are two layers in Pulsar architecture: the serving layer and the persistent layer. Pulsar natively supports tiered storage, where we can leverage cheaper object storage like AWS S3 to persist messages for a longer term. 
 
Over to you: which message queues have you used? 
 
.
Things Every Developer Should Know: Concurrency is 𝐍𝐎𝐓 parallelism.
In system design, it is important to understand the difference between concurrency and parallelism.

As Rob Pike (one of the creators of Go) stated: “Concurrency is about 𝐝𝐞𝐚𝐥𝐢𝐧𝐠 𝐰𝐢𝐭𝐡 lots of things at once. Parallelism is about 𝐝𝐨𝐢𝐧𝐠 lots of things at once.” This distinction emphasizes that concurrency is more about the 𝐝𝐞𝐬𝐢𝐠𝐧 of a program, while parallelism is about the 𝐞𝐱𝐞𝐜𝐮𝐭𝐢𝐨𝐧.

Concurrency is about dealing with multiple things at once. It involves structuring a program to handle multiple tasks simultaneously, where the tasks can start, run, and complete in overlapping time periods, but not necessarily at the same instant.

Concurrency is about the composition of independently executing processes and describes a program's ability to manage multiple tasks by making progress on them without necessarily completing one before it starts another.

Parallelism, on the other hand, refers to the simultaneous execution of multiple computations. It is the technique of running two or more tasks or computations at the same time, utilizing multiple processors or cores within a computer to perform several operations concurrently. Parallelism requires hardware with multiple processing units, and its primary goal is to increase the throughput and computational speed of a system.

In practical terms, concurrency enables a program to remain responsive to input, perform background tasks, and handle multiple operations in a seemingly simultaneous manner, even on a single-core processor. It's particularly useful in I/O-bound and high-latency operations where programs need to wait for external events, such as file, network, or user interactions.

Parallelism, with its ability to perform multiple operations at the same time, is crucial in CPU-bound tasks where computational speed and throughput are the bottlenecks. Applications that require heavy mathematical computations, data analysis, image processing, and real-time processing can significantly benefit from parallel execution.
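A minimal sketch of concurrency on a single thread with Python's asyncio — the two tasks overlap in time without running in parallel (the delays and names are illustrative):

```python
import asyncio

async def fetch(name: str, delay: float, log: list):
    log.append(f"{name} start")
    await asyncio.sleep(delay)   # simulated I/O wait; the event loop runs other tasks
    log.append(f"{name} done")

async def main():
    log = []
    # Both tasks are "dealt with at once": they interleave on one thread.
    await asyncio.gather(fetch("a", 0.1, log), fetch("b", 0.05, log))
    return log

log = asyncio.run(main())
print(log)  # ['a start', 'b start', 'b done', 'a done']
```

Task "b" finishes first even though it starts second: that overlap is concurrency. True parallelism would require multiple cores, e.g. via the multiprocessing module.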
 
.
What does API gateway do?
The diagram below shows the detail.

Step 1 - The client sends an HTTP request to the API gateway.

Step 2 - The API gateway parses and validates the attributes in the HTTP request.

Step 3 - The API gateway performs allow-list/deny-list checks.

Step 4 - The API gateway talks to an identity provider for authentication and authorization.

Step 5 - The rate limiting rules are applied to the request. If it is over the limit, the request is rejected.

Steps 6 and 7 - Now that the request has passed basic checks, the API gateway finds the relevant service to route to by path matching.

Step 8 - The API gateway transforms the request into the appropriate protocol and sends it to backend microservices.

Steps 9-12: The API gateway handles errors properly and, when an error takes a long time to recover, applies a circuit breaker to stop calling the failing service. It can also leverage the ELK (Elasticsearch-Logstash-Kibana) stack for logging and monitoring. We sometimes cache data in the API gateway.
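The steps above can be sketched as a chain of checks. This is a toy pipeline with made-up function names; real gateways implement each step as a configurable plugin:

```python
def handle(request: dict, deny_list: set, rate_ok, routes: dict):
    """Toy API-gateway pipeline mirroring steps 2-7: validate, deny-list,
    authenticate, rate-limit, then route by path."""
    if "path" not in request or "client_ip" not in request:   # step 2: validate
        return 400, "bad request"
    if request["client_ip"] in deny_list:                     # step 3: deny-list
        return 403, "forbidden"
    if not request.get("token"):                              # step 4: authn (stubbed)
        return 401, "unauthorized"
    if not rate_ok(request["client_ip"]):                     # step 5: rate limit
        return 429, "too many requests"
    service = routes.get(request["path"])                     # steps 6-7: path matching
    if service is None:
        return 404, "no route"
    return 200, f"routed to {service}"

routes = {"/orders": "order-service"}
ok = handle({"path": "/orders", "client_ip": "1.2.3.4", "token": "t"},
            deny_list=set(), rate_ok=lambda ip: True, routes=routes)
print(ok)  # (200, 'routed to order-service')
```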

Over to you: 1) What’s the difference between a load balancer and an API gateway?
2) Do we need to use different API gateways for PC, mobile and browser separately?

.
Wouldn't it be nice if the code we wrote automatically turned into architecture diagrams?

I recently discovered a GitHub repo that does exactly this: Diagram as Code for prototyping cloud system architectures.

𝐖𝐡𝐚𝐭 𝐝𝐨𝐞𝐬 𝐢𝐭 𝐝𝐨?
- Draw the cloud system architecture in Python code.
- Diagrams can also be rendered directly inside Jupyter notebooks.
- No design tools are needed.
- Supports the following providers: AWS, Azure, GCP, Kubernetes, Alibaba Cloud, Oracle Cloud, etc.
 
GitHub repo: mingrammer/diagrams

.
Visualizing a SQL query
SQL statements are executed by the database system in several steps, including:

- Parsing the SQL statement and checking its validity
- Transforming the SQL into an internal representation, such as relational algebra
- Optimizing the internal representation and creating an execution plan that utilizes index information
- Executing the plan and returning the results

The execution of SQL is highly complex and involves many considerations, such as:

- The use of indexes and caches
- The order of table joins
- Concurrency control
- Transaction management
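You can peek at an execution plan yourself with SQLite's EXPLAIN QUERY PLAN (illustrative; every database exposes its own plan format, and the table here is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users(email)")

# The planner reports whether it will scan the table or search the index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM users WHERE email = ?", ("a@b.c",)
).fetchall()
print(plan)  # the plan detail mentions idx_users_email: an index search was chosen
```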

Over to you: what is your favorite SQL statement?


.
REST API Vs. GraphQL 
 
When it comes to API design, REST and GraphQL each have their own strengths and weaknesses. 
 
REST 
- Uses standard HTTP methods like GET, POST, PUT, DELETE for CRUD operations. 
- Works well when you need simple, uniform interfaces between separate services/applications. 
- Caching strategies are straightforward to implement. 
- The downside is it may require multiple roundtrips to assemble related data from separate endpoints. 
 
GraphQL 
- Provides a single endpoint for clients to query for precisely the data they need. 
- Clients specify the exact fields required in nested queries, and the server returns optimized payloads containing just those fields. 
- Supports Mutations for modifying data and Subscriptions for real-time notifications. 
- Great for aggregating data from multiple sources and works well with rapidly evolving frontend requirements. 
- However, it shifts complexity to the client side and can allow abusive queries if not properly safeguarded. 
- Caching strategies can be more complicated than REST. 
 
The best choice between REST and GraphQL depends on the specific requirements of the application and development team. GraphQL is a good fit for complex or frequently changing frontend needs, while REST suits applications where simple and consistent contracts are preferred.

.
Which HTTP status codes are most common?
The response codes for HTTP are divided into five categories:

Informational (100-199)
Success (200-299)
Redirection (300-399)
Client Error (400-499)
Server Error (500-599)

These codes are defined in RFC 9110. To save you from reading the entire document (which is about 200 pages), here is a summary of the most common ones:
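Python's standard library ships these codes in `http.HTTPStatus`, which is a quick way to look them up; the `category` helper below is my own illustration of the five classes:

```python
from http import HTTPStatus

# Each member carries the number and the RFC reason phrase.
print(HTTPStatus.NOT_FOUND.value, HTTPStatus.NOT_FOUND.phrase)  # 404 Not Found

def category(code: int) -> str:
    # Map a code to its RFC 9110 class by the hundreds digit.
    return {1: "informational", 2: "success", 3: "redirection",
            4: "client error", 5: "server error"}[code // 100]

assert category(HTTPStatus.OK) == "success"
assert category(HTTPStatus.UNAUTHORIZED) == "client error"
```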

Over to you: HTTP status code 401 is for Unauthorized. Can you explain the difference between authentication and authorization, and which one does code 401 check for?


.
Git merge vs. Git rebase. The forever debate. 
What are the differences?

When we merge changes from one Git branch to another, we can use ‘git merge’ or ‘git rebase’. The diagram below shows how the two commands work.

Git merge
This creates a new commit G’ in the main branch. G’ ties the histories of both main and feature branches.

Git merge is non-destructive. Neither the main nor the feature branch is changed.

Git rebase
Git rebase moves the feature branch histories to the head of the main branch. It creates new commits E’, F’, and G’ for each commit in the feature branch.

The benefit of rebase is a linear commit history.

Rebase can be dangerous if “the golden rule of git rebase” is not followed.

The golden rule of Git rebase
Never use it on public branches!
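The linear-history claim is easy to verify in a throwaway repository. This sketch drives git from Python and assumes `git` is on PATH; the identity is a throwaway passed per command:

```python
import os, subprocess, tempfile

def git(*args, cwd):
    # Run a git command with a throwaway identity; return stdout.
    return subprocess.run(
        ["git", "-c", "user.email=demo@example.com", "-c", "user.name=demo", *args],
        cwd=cwd, check=True, capture_output=True, text=True).stdout

def commit(repo, fname, msg):
    with open(os.path.join(repo, fname), "w") as f:
        f.write(msg)
    git("add", fname, cwd=repo)
    git("commit", "-q", "-m", msg, cwd=repo)

repo = tempfile.mkdtemp()
git("init", "-q", "-b", "main", cwd=repo)
commit(repo, "a.txt", "A")                      # main: A
git("checkout", "-q", "-b", "feature", cwd=repo)
commit(repo, "e.txt", "E")                      # feature: E on top of A
git("checkout", "-q", "main", cwd=repo)
commit(repo, "b.txt", "B")                      # main moves ahead: B
git("checkout", "-q", "feature", cwd=repo)
git("rebase", "-q", "main", cwd=repo)           # replay E on top of B

messages = git("log", "--format=%s", cwd=repo).split()
print(messages)  # ['E', 'B', 'A'] -- a straight line, no merge commit
```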

Over to you: Git merge or rebase? Which one do you prefer?

.
What is DevSecOps?
DevSecOps emerged as a natural evolution of DevOps practices with a focus on integrating security into the software development and deployment process. The term “DevSecOps” represents the convergence of Development (Dev), Security (Sec), and Operations (Ops) practices, emphasizing the importance of security throughout the software development lifecycle.

The diagram below shows the important concepts in DevSecOps.

1. Automated Security Checks
2. Continuous Monitoring
3. CI/CD Automation
4. Infrastructure as Code (IaC)
5. Container Security
6. Secret Management
7. Threat Modeling
8. Quality Assurance (QA) Integration
9. Collaboration and Communication
10. Vulnerability Management

.
What is a webhook?

The diagram below shows a comparison between polling and webhook.

Assume we run an eCommerce website. The clients send orders to the order service via the API gateway, which goes to the payment service for payment transactions. The payment service then talks to an external payment service provider (PSP) to complete the transactions.

There are two ways to handle communications with the external PSP.

🔹 1. Short polling
After sending the payment request to the PSP, the payment service keeps asking the PSP about the payment status. After several rounds, the PSP finally returns with the status.

Short polling has two drawbacks:
1) Constant polling of the status requires resources from the payment service.
2) The external service communicates directly with the payment service, creating security vulnerabilities.

🔹 2. Webhook
We can register a webhook with the external service. It means: call me back at a certain URL when you have updates on the request. When the PSP has completed the processing, it will invoke an HTTP request to update the payment status.

In this way, the programming paradigm is changed, and the payment service doesn’t need to waste resources to poll the payment status anymore.

What if the PSP never calls back? We can set up a housekeeping job to check payment status every hour.

Webhooks are often referred to as reverse APIs or push APIs because the server sends HTTP requests to the client. We need to pay attention to 3 things when using a webhook:
1) We need to design a proper API for the external service to call.
2) We need to set up proper rules in the API gateway for security reasons.
3) We need to register the correct URL at the external service.
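A webhook receiver is just an HTTP endpoint. Here is a self-contained sketch using Python's standard library, with a made-up payload standing in for a PSP callback:

```python
import http.server, json, threading, urllib.request

received = []

class WebhookHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        # The external service (PSP) POSTs us the status update.
        length = int(self.headers["Content-Length"])
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()
    def log_message(self, *args):
        pass  # silence request logging for the demo

server = http.server.HTTPServer(("127.0.0.1", 0), WebhookHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Simulate the PSP calling us back at the registered URL.
url = f"http://127.0.0.1:{server.server_address[1]}/payment-status"
body = json.dumps({"order_id": "123", "status": "paid"}).encode()
req = urllib.request.Request(url, data=body,
                             headers={"Content-Type": "application/json"})
urllib.request.urlopen(req)
server.shutdown()
print(received)  # [{'order_id': '123', 'status': 'paid'}]
```

In production this endpoint would sit behind the API gateway, verify a signature from the PSP, and be idempotent, since callbacks can be retried.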

.
Top 6 Load Balancing Algorithms

🔹 Static Algorithms
1. Round robin
The client requests are sent to different service instances in sequential order. The services are usually required to be stateless.

2. Sticky round-robin
This is an improvement of the round-robin algorithm. If Alice’s first request goes to service A, the following requests go to service A as well.

3. Weighted round-robin
The admin can specify the weight for each service. The ones with a higher weight handle more requests than others.

4. Hash
This algorithm applies a hash function on the incoming requests’ IP or URL. The requests are routed to relevant instances based on the hash function result.

🔹 Dynamic Algorithms
5. Least connections
A new request is sent to the service instance with the least concurrent connections.

6. Least response time
A new request is sent to the service instance with the fastest response time.
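Round robin and least connections, the bookends of the list above, fit in a few lines each. A toy sketch, not a real balancer:

```python
import itertools

class RoundRobin:
    """Static: hand out instances in a fixed rotation."""
    def __init__(self, instances):
        self._cycle = itertools.cycle(instances)
    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Dynamic: pick the instance with the fewest in-flight requests."""
    def __init__(self, instances):
        self.active = {i: 0 for i in instances}
    def pick(self):
        choice = min(self.active, key=self.active.get)
        self.active[choice] += 1
        return choice
    def release(self, instance):
        self.active[instance] -= 1  # call when the request completes

rr = RoundRobin(["a", "b", "c"])
print([rr.pick() for _ in range(4)])  # ['a', 'b', 'c', 'a']

lc = LeastConnections(["a", "b"])
first, second = lc.pick(), lc.pick()  # spreads across both instances
```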

.
Domain-Driven Design (DDD)
DDD was introduced in Eric Evans’ classic book “Domain-Driven Design: Tackling Complexity in the Heart of Software”. It explained a methodology to model a complex business. There is a lot of content in this book, so I'll summarize the basics.

The composition of domain objects:
- Entity: a domain object that has an ID and a life cycle. 
- Value Object: a domain object without an ID, used to describe a property of an Entity.
- Aggregate: a collection of Entities bound together by an Aggregate Root (which is also an Entity). It is the unit of storage.

The life cycle of domain objects:
- Repository: storing and loading the Aggregate.
- Factory: handling the creation of the Aggregate.

Behavior of domain objects:
- Domain Service: orchestrates multiple Aggregates.
- Domain Event: a description of what has happened to the Aggregate. Events are published so that other components can consume them and reconstruct state.
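The building blocks above can be sketched with dataclasses. This is a hypothetical Order domain of my own invention, not an example from the book:

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4

@dataclass(frozen=True)
class Money:
    """Value Object: no identity, compared by value, immutable."""
    amount: int
    currency: str

@dataclass
class OrderLine:
    """Entity inside the aggregate: has its own ID."""
    line_id: UUID
    price: Money

@dataclass
class Order:
    """Aggregate Root: the unit of storage and consistency."""
    order_id: UUID = field(default_factory=uuid4)
    lines: list = field(default_factory=list)

    def add_line(self, price: Money):
        # All changes to OrderLines go through the root.
        self.lines.append(OrderLine(uuid4(), price))

order = Order()
order.add_line(Money(999, "USD"))
assert Money(999, "USD") == Money(999, "USD")  # value objects compare by value
```

A Repository would then load and store whole `Order` aggregates, never bare `OrderLine`s.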

Congratulations on getting this far. Now you know the basics of DDD. If you want to learn more, I highly recommend the book. It might help to simplify the complexity of software modeling.


.
7 must-know strategies to scale your database. 
 
1 - Indexing: 
Check the query patterns of your application and create the right indexes. 
 
2 - Materialized Views: 
Pre-compute complex query results and store them for faster access. 
 
3 - Denormalization: 
Reduce complex joins to improve query performance. 
 
4 - Vertical Scaling 
Boost your database server by adding more CPU, RAM, or storage. 
 
5 - Caching 
Store frequently accessed data in a faster storage layer to reduce database load. 
 
6 - Replication 
Create replicas of your primary database on different servers for scaling the reads. 
 
7 - Sharding 
Split your database tables into smaller pieces and spread them across servers. Used for scaling the writes as well as the reads. 
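Sharding (strategy 7) boils down to a deterministic key-to-shard mapping. A common sketch hashes the key — a toy example with invented shard names, ignoring resharding and consistent hashing:

```python
import hashlib

SHARDS = ["db-0", "db-1", "db-2"]  # hypothetical shard names

def shard_for(user_id: str) -> str:
    # Stable hash -> shard index; the same user always lands on the same shard.
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

assert shard_for("alice") == shard_for("alice")  # deterministic routing
placement = {u: shard_for(u) for u in ["alice", "bob", "carol"]}
print(placement)
```

Note the trade-off: adding a shard changes `len(SHARDS)` and remaps most keys, which is why production systems use consistent hashing or directory-based lookup instead.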
 
Over to you: What other strategies do you use for scaling your databases?

.
Where do we cache data?
This diagram illustrates where we cache data in a typical architecture.

There are multiple layers along the flow.

Client apps: HTTP responses can be cached by the browser. We request data over HTTP for the first time, and it is returned with an expiry policy in the HTTP header; we request data again, and the client app tries to retrieve the data from the browser cache first.

CDN: CDN caches static web resources. The clients can retrieve data from a CDN node nearby.

Load Balancer: The load balancer can cache resources as well.

Messaging infra: Message brokers store messages on disk first, and then consumers retrieve them at their own pace. Depending on the retention policy, the data is cached in Kafka clusters for a period of time.

Services: There are multiple layers of cache in a service. If the data is not cached in the CPU cache, the service will try to retrieve the data from memory. Sometimes the service has a second-level cache to store data on disk.

Distributed Cache: Distributed caches like Redis hold key-value pairs for multiple services in memory. They provide much better read/write performance than the database.

Full-text Search: We sometimes need a full-text search engine like Elasticsearch for document search or log search. A copy of the data is indexed in the search engine as well.

Database: Even in the database, we have different levels of caches:

- WAL (Write-ahead Log): data is written to the WAL first before the B-tree index is built
- Bufferpool: a memory area allocated to cache query results
- Materialized View: pre-computed query results stored in database tables for better query performance
- Transaction log: records all the transactions and database updates
- Replication log: records the replication state in a database cluster

Over to you: With the data cached at so many levels, how can we guarantee the sensitive user data is completely erased from the systems?

.
How will you design the Stack Overflow website?

If your answer is on-premise servers and a monolith, you would likely fail the interview, but that's how it is built in reality!

𝐖𝐡𝐚𝐭 𝐩𝐞𝐨𝐩𝐥𝐞 𝐭𝐡𝐢𝐧𝐤 𝐢𝐭 𝐬𝐡𝐨𝐮𝐥𝐝 𝐥𝐨𝐨𝐤 𝐥𝐢𝐤𝐞:
The interviewer is probably expecting something on the left side.
1. Microservices are used to decompose the system into small components.
2. Each service has its own database. Use cache heavily.
3. The service is sharded.
4. The services talk to each other asynchronously through message queues.
5. The service is implemented using Event Sourcing with CQRS.
6. Demonstrate knowledge of distributed systems: eventual consistency, the CAP theorem, etc.

𝐖𝐡𝐚𝐭 𝐢𝐭 𝐚𝐜𝐭𝐮𝐚𝐥𝐥𝐲 𝐢𝐬:
Stack Overflow serves all of its traffic with only 9 on-premise web servers, and it's a monolith! It runs on its own servers and not in the cloud.

This is contrary to all our popular beliefs these days.

.
Types of Memory and Storage

- The fundamental duo: RAM and ROM
- DDR4 and DDR5
- Firmware and BIOS
- SRAM and DRAM
- HDD, SSD, USB Drive, SD Card

.
How does VISA work when we 𝐬𝐰𝐢𝐩𝐞 𝐚 𝐜𝐫𝐞𝐝𝐢𝐭 𝐜𝐚𝐫𝐝 at a merchant’s shop?
VISA, Mastercard, and American Express act as card networks for the clearing and settling of funds. The card acquiring bank and the card issuing bank can be – and often are – different. If banks were to settle transactions one by one without an intermediary, each bank would have to settle the transactions with all the other banks. This is quite inefficient.  
 
The diagram below shows VISA’s role in the credit card payment process. There are two flows involved. Authorization flow happens when the customer swipes the credit card. Capture and settlement flow happens when the merchant wants to get the money at the end of the day.
 
🔹Authorization Flow
Step 0: The card issuing bank issues credit cards to its customers. 
 
Step 1: The cardholder wants to buy a product and swipes the credit card at the Point of Sale (POS) terminal in the merchant’s shop.
 
Step 2: The POS terminal sends the transaction to the acquiring bank, which has provided the POS terminal.
 
Steps 3 and 4: The acquiring bank sends the transaction to the card network, also called the card scheme. The card network sends the transaction to the issuing bank for approval.
 
Steps 4.1, 4.2 and 4.3: The issuing bank freezes the money if the transaction is approved. The approval or rejection is sent back to the acquirer, as well as the POS terminal. 
 
🔹Capture and Settlement Flow
Steps 1 and 2: The merchant wants to collect the money at the end of the day, so they hit “capture” on the POS terminal. The transactions are sent to the acquirer in batch. The acquirer sends the batch file with transactions to the card network.
 
Step 3: The card network performs clearing for the transactions collected from different acquirers, and sends the clearing files to different issuing banks.
 
Step 4: The issuing banks confirm the correctness of the clearing files, and transfer money to the relevant acquiring banks.
 
Step 5: The acquiring bank then transfers money to the merchant’s bank. 
 
A note on Step 3: clearing is a process in which mutual offset transactions are netted, so the total number of transactions is reduced.
 
In the process, the card network takes on the burden of talking to each bank and receives service fees in return.
 
Over to you: Do you think this flow is way too complicated? What will be the future of payments in your opinion?
 

.
What are the top 𝐜𝐚𝐜𝐡𝐞 strategies?

Read data from the system:
🔹 Cache aside
🔹 Read through

Write data to the system:
🔹 Write around
🔹 Write back
🔹 Write through

The diagram below illustrates how those 5 strategies work. Some can be used together.
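Cache aside, the most common of the five, looks like this in application code. A minimal sketch where a plain dict stands in for Redis and another for the database (both are assumptions for illustration):

```python
cache = {}                                   # stands in for Redis
database = {"user:1": {"name": "Alice"}}     # stands in for the database
calls = {"db_reads": 0}

def get_user(key: str):
    # Cache aside: check the cache first; on a miss, read the DB
    # and populate the cache for the next reader.
    if key in cache:
        return cache[key]
    calls["db_reads"] += 1
    value = database.get(key)
    cache[key] = value
    return value

get_user("user:1")   # miss -> hits the database
get_user("user:1")   # hit  -> served from cache
print(calls)  # {'db_reads': 1}
```

Read through moves this logic into the cache layer itself; the write strategies differ mainly in when the database is updated relative to the cache.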
 
.
How is data sent over the internet? What does that have to do with the OSI model? How does TCP/IP fit into this? 
 
7 Layers in the OSI model are: 
1. Physical Layer 
2. Data Link Layer 
3. Network Layer 
4. Transport Layer 
5. Session Layer 
6. Presentation Layer 
7. Application Layer

.
How does Docker work?

The diagram below shows the architecture of Docker and how it works when we run “docker build”, “docker pull” and “docker run”.

There are 3 components in Docker architecture:

🔹 Docker client
The docker client talks to the Docker daemon.

🔹 Docker host
The Docker daemon listens for Docker API requests and manages Docker objects such as images, containers, networks, and volumes.

🔹 Docker registry
A Docker registry stores Docker images. Docker Hub is a public registry that anyone can use.

Let’s take the “docker run” command as an example.
1. Docker pulls the image from the registry.
2. Docker creates a new container.
3. Docker allocates a read-write filesystem to the container.
4. Docker creates a network interface to connect the container to the default network.
5. Docker starts the container.


.
How many API architecture styles do you know?
 
Architecture styles define how different components of an application programming interface (API) interact with one another. As a result, they ensure efficiency, reliability, and ease of integration with other systems by providing a standard approach to designing and building APIs. Here are the most used styles: 
 
🔹SOAP: 
Mature, comprehensive, XML-based 
Best for enterprise applications 
 
🔹RESTful: 
Popular, easy-to-implement, HTTP methods 
Ideal for web services 
 
🔹GraphQL: 
Query language, request specific data 
Reduces network overhead, faster responses 
 
🔹gRPC: 
Modern, high-performance, Protocol Buffers 
Suitable for microservices architectures 
 
🔹WebSocket: 
Real-time, bidirectional, persistent connections 
Perfect for low-latency data exchange 
 
🔹Webhook: 
Event-driven, HTTP callbacks, asynchronous 
Notifies systems when events occur 
 
Over to you: Are there any other famous styles we missed?

.
What distinguishes MVC, MVP, MVVM, MVVM-C, and VIPER architecture patterns from each other?

These architecture patterns are among the most commonly used in app development, whether on iOS or Android platforms. Developers have introduced them to overcome the limitations of earlier patterns. So, how do they differ?

- MVC, the oldest pattern, dates back almost 50 years
- Every pattern has a “view” (V) responsible for displaying content and receiving user input
- Most patterns include a “model” (M) to manage business data
- “Controller,” “presenter,” and “view-model” are translators that mediate between the view and the model (the “entity” in the VIPER pattern)
- These translators can be quite complex to write, so various patterns have been proposed to make them more maintainable

.
Authentication in REST APIs acts as the crucial gateway, ensuring that only authorized users or applications gain access to the API's resources. 
 
Some popular authentication methods for REST APIs include: 
 
1. Basic Authentication: 
Involves sending a username and password with each request, but can be less secure without encryption. 
 
When to use: 
Suitable for simple applications where security and encryption aren’t the primary concern or when used over secured connections. 
 
2. Token Authentication: 
Uses generated tokens, like JSON Web Tokens (JWT), exchanged between client and server, offering enhanced security without sending login credentials with each request. 
 
When to use: 
Ideal for more secure and scalable systems, especially when avoiding sending login credentials with each request is a priority. 
 
3. OAuth Authentication: 
Enables third-party limited access to user resources without revealing credentials by issuing access tokens after user authentication. 
 
When to use: 
Ideal for scenarios requiring controlled access to user resources by third-party applications or services. 
 
4. API Key Authentication: 
Assigns unique keys to users or applications, sent in headers or parameters; while simple, it might lack the security features of token-based or OAuth methods. 
 
When to use: 
Convenient for straightforward access control in less sensitive environments or for granting access to certain functionalities without the need for user-specific permissions. 
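For example, the Basic scheme (method 1) is just a base64-encoded `user:password` in the Authorization header, which is why it must only travel over HTTPS. A sketch with made-up credentials:

```python
import base64

def basic_auth_header(user: str, password: str) -> str:
    # RFC 7617: "Basic " + base64(user:password). This is encoding, not
    # encryption -- anyone who sees the header can decode the password.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"

header = basic_auth_header("alice", "secret")
print(header)  # Basic YWxpY2U6c2VjcmV0
```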
 
Over to you: 
Which REST API authentication method do you find most effective in ensuring both security and usability for your applications?

--
Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://bit.ly/bbg-social

#systemdesign #coding #interviewtips 
.
Post image by ByteByteGo
Popular interview question: What is the difference between Process and Thread? 
 
To better understand this question, let’s first look at what a program is. A program is an executable file containing a set of instructions, stored passively on disk. One program can have multiple processes. For example, the Chrome browser creates a separate process for every single tab. 
 
A process is a program in execution. When a program is loaded into memory and becomes active, it becomes a process. A process requires essential resources such as registers, a program counter, and a stack. 
 
A Thread is the smallest unit of execution within a process. 
 
The following process explains the relationship between program, process, and thread. 
 
1. The program contains a set of instructions. 
2. The program is loaded into memory. It becomes one or more running processes. 
3. When a process starts, it is assigned memory and resources. A process can have one or more threads. For example, in the Microsoft Word app, one thread might be responsible for spell checking while another inserts text into the document. 
 
Main differences between process and thread: 
 
🔹 Processes are usually independent, while threads exist as subsets of a process. 
🔹 Each process has its own memory space. Threads that belong to the same process share the same memory. 
🔹 A process is a heavyweight operation. It takes more time to create and terminate. 
🔹 Context switching is more expensive between processes. 
🔹 Inter-thread communication is faster than inter-process communication. 
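
The "threads share the same memory" point above can be shown in a few lines of Python: two threads append to the same list because they live in one process's address space (the thread names echo the Word example and are otherwise arbitrary).

```python
import threading

# Two threads in one process writing to one shared structure.

counter = []
lock = threading.Lock()

def worker(name, n):
    for _ in range(n):
        with lock:          # protect the shared list from interleaved writes
            counter.append(name)

t1 = threading.Thread(target=worker, args=("spellcheck", 3))
t2 = threading.Thread(target=worker, args=("typing", 3))
t1.start(); t2.start()
t1.join(); t2.join()

print(len(counter))  # 6 -- both threads wrote to the same list
```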
 
Over to you: 
1). Some programming languages support coroutines. What is the difference between a coroutine and a thread? 
2). How do you list running processes in Linux?

– 
Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://bit.ly/3KCnWXq 
 
#systemdesign #coding #interviewtips 
.
Post image by ByteByteGo
Session, cookie, JWT, token, SSO, and OAuth 2.0 - what are they?

These terms are all related to user identity management. When you log into a website, you declare who you are (identification). Your identity is verified (authentication), and you are granted the necessary permissions (authorization). Many solutions have been proposed in the past, and the list keeps growing.

From simple to complex, here is my understanding of user identity management:

WWW-Authenticate is the most basic method: the browser itself asks you for a username and password. Because it gives no control over the login life cycle, it is seldom used today.

Session-cookie authentication offers finer control over the login life cycle. The server maintains session storage, and the browser keeps the session's ID in a cookie. Cookies usually only work with browsers and are not mobile-app friendly.

To address the compatibility issue, tokens can be used instead. The client sends the token with each request, and the server validates it. The downside is that the token must be validated on every request, which adds overhead.

JWT (JSON Web Token) is a standard way of representing tokens. The information in a JWT can be verified and trusted because it is digitally signed. Since the JWT carries its own signature, there is no need to store session information on the server side.
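
As a sketch of why no server-side session storage is needed, here is a minimal HS256-signed JWT built with only the standard library. Real systems should use a vetted library; this only illustrates the header.payload.signature structure.

```python
import base64, hashlib, hmac, json

def b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign(payload: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    msg = f"{header}.{body}".encode()
    sig = b64url(hmac.new(secret, msg, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify(token: str, secret: bytes) -> bool:
    header, body, sig = token.split(".")
    msg = f"{header}.{body}".encode()
    expected = b64url(hmac.new(secret, msg, hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)  # constant-time comparison

token = sign({"sub": "alice"}, b"server-secret")
print(verify(token, b"server-secret"))  # True
print(verify(token, b"wrong-secret"))   # False
```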

By using SSO (single sign-on), you sign on once and are logged in to multiple websites. It uses a CAS (central authentication service) to maintain cross-site information.

By using OAuth 2.0, you can authorize one website to access your information on another website.

--
Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://bit.ly/3KCnWXq

#systemdesign #coding #interviewtips
.
Post image by ByteByteGo
18 Most-used Linux Commands You Should Know 
 
Linux commands are instructions for interacting with the operating system. They help manage files, directories, system processes, and many other aspects of the system. You need to become familiar with these commands in order to navigate and maintain Linux-based systems efficiently and effectively. The following are some popular Linux commands: 
 
🔹ls - List files and directories 
🔹cd - Change the current directory 
🔹mkdir - Create a new directory 
🔹rm - Remove files or directories 
🔹cp - Copy files or directories 
🔹mv - Move or rename files or directories 
🔹chmod - Change file or directory permissions 
🔹grep - Search for a pattern in files 
🔹find - Search for files and directories 
🔹tar - Manipulate tarball archive files 
🔹vi - Edit files with the vi text editor 
🔹cat - Display the contents of files 
🔹top - Display processes and resource usage 
🔹ps - Display processes information 
🔹kill - Terminate a process by sending a signal 
🔹du - Estimate file space usage 
🔹ifconfig - Configure network interfaces 
🔹ping - Test network connectivity between hosts 
 
Over to you: What is your favorite Linux command?

-- 
Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://bit.ly/3KCnWXq

#systemdesign #coding #interviewtips 
.
Post image by ByteByteGo
Do you believe that Google, Meta, Uber, and Airbnb put almost all of their code in one repository? 
 
This practice is called a monorepo. 
 
Monorepo vs. Microrepo. Which is the best? Why do different companies choose different options? 
 
Monorepo isn't new; Linux and Windows were both created using a monorepo. To improve scalability and build speed, Google developed a dedicated internal toolchain to build it faster and enforced strict code-quality standards to keep it consistent. 
 
Amazon and Netflix are major ambassadors of the Microservice philosophy. This approach naturally separates the service code into separate repositories. It scales faster but can lead to governance pain points later on. 
 
Within Monorepo, each service is a folder, and every folder has a BUILD config and OWNERS permission control. Every service member is responsible for their own folder. 
 
On the other hand, in Microrepo, each service is responsible for its repository, with the build config and permissions typically set for the entire repository. 
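
To make the "each service is a folder with a BUILD config" idea concrete, here is a hypothetical Bazel-style BUILD file for one service folder in a monorepo. The target names, labels, and visibility are all made up for illustration, not taken from any real repository.

```python
# Hypothetical BUILD file at //payments/BUILD (Bazel's Starlark syntax).
java_library(
    name = "payments",
    srcs = glob(["src/main/java/**/*.java"]),
    deps = [
        "//common/logging",      # shared code lives elsewhere in the same repo
        "//third_party:guava",   # one repo-wide version of each dependency
    ],
    visibility = ["//billing:__subpackages__"],  # who may depend on this service
)
```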
 
In Monorepo, dependencies are shared across the entire codebase regardless of the business, so when there's a version upgrade, every codebase upgrades its version. 
 
In Microrepo, dependencies are controlled within each repository. Businesses choose when to upgrade their versions based on their own schedules. 
 
Monorepo has a standard for check-ins. Google's code review process is famously known for setting a high bar, ensuring a coherent quality standard for Monorepo, regardless of the business. 
 
Microrepo teams can either set their own standards or adopt a shared standard by incorporating best practices. It can scale faster for the business, but code quality may vary between repositories. 
 
Google engineers built Bazel, and Meta built Buck. There are other open-source tools available, including Nix, Lerna, and others. 
 
Over the years, Microrepo has accumulated more supporting tools, including Maven and Gradle for Java, NPM for NodeJS, and CMake for C/C++, among others. 
 
Over to you: Which option do you think is better? Which code repository strategy does your company use?

– 
Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://bit.ly/3KCnWXq 
 
#systemdesign #coding #interviewtips 
.
Post image by ByteByteGo
Top 6 Load Balancing Algorithms.

🔹 Static Algorithms
1. Round robin
The client requests are sent to different service instances in sequential order. The services are usually required to be stateless.

2. Sticky round-robin
This is an improvement of the round-robin algorithm. If Alice’s first request goes to service A, the following requests go to service A as well.

3. Weighted round-robin
The admin can specify the weight for each service. The ones with a higher weight handle more requests than others.

4. Hash
This algorithm applies a hash function on the incoming requests’ IP or URL. The requests are routed to relevant instances based on the hash function result.

🔹 Dynamic Algorithms
5. Least connections
A new request is sent to the service instance with the least concurrent connections.

6. Least response time
A new request is sent to the service instance with the fastest response time.
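
Two of the static algorithms above fit in a few lines of Python (the instance names and client IP are made up):

```python
import itertools
import hashlib

instances = ["svc-a", "svc-b", "svc-c"]

# 1. Round robin: cycle through instances in sequential order.
rr = itertools.cycle(instances)
print([next(rr) for _ in range(5)])  # ['svc-a', 'svc-b', 'svc-c', 'svc-a', 'svc-b']

# 4. Hash: the same client IP always maps to the same instance.
def pick_by_hash(client_ip: str) -> str:
    digest = hashlib.sha256(client_ip.encode()).digest()
    return instances[digest[0] % len(instances)]

print(pick_by_hash("203.0.113.7") == pick_by_hash("203.0.113.7"))  # True
```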

--
Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://bit.ly/3KCnWXq

#systemdesign #coding #interviewtips
.
Post image by ByteByteGo
DevOps vs. SRE vs. Platform Engineering. What is the difference?
 
The concepts of DevOps, SRE, and Platform Engineering have emerged at different times and have been developed by various individuals and organizations. 
 
DevOps as a concept was introduced in 2009 by Patrick Debois and Andrew Shafer at the Agile conference. They sought to bridge the gap between software development and operations by promoting a collaborative culture and shared responsibility for the entire software development lifecycle. 
 
SRE, or Site Reliability Engineering, was pioneered by Google in the early 2000s to address operational challenges in managing large-scale, complex systems. Google developed SRE practices and tools, such as the Borg cluster management system and the Monarch monitoring system, to improve the reliability and efficiency of their services. 
 
Platform Engineering is a more recent concept, building on the foundation of SRE. The precise origins of Platform Engineering are less clear, but it is generally understood to be an extension of DevOps and SRE practices, with a focus on delivering a comprehensive platform for product development that supports the entire business perspective. 
 
It's worth noting that while these concepts emerged at different times, they are all related to the broader trend of improving collaboration, automation, and efficiency in software development and operations.

Over to you: Which topics would you like us to address in our next discussion?

-- 
Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://bit.ly/3KCnWXq

#systemdesign #coding #interviewtips 
.
Post image by ByteByteGo
Have you heard of Domain-Driven Design (DDD), a major software design approach?

DDD was introduced in Eric Evans’ classic book “Domain-Driven Design: Tackling Complexity in the Heart of Software”. It describes a methodology for modeling a complex business. The book covers a lot of ground, so I'll summarize the basics.

𝐓𝐡𝐞 𝐜𝐨𝐦𝐩𝐨𝐬𝐢𝐭𝐢𝐨𝐧 𝐨𝐟 𝐝𝐨𝐦𝐚𝐢𝐧 𝐨𝐛𝐣𝐞𝐜𝐭𝐬:
🔹Entity: a domain object that has an ID and a life cycle. 
🔹Value Object: a domain object without an ID, used to describe the properties of an Entity.
🔹Aggregate: a collection of Entities bound together by an Aggregate Root (which is itself an Entity). It is the unit of storage.

𝐓𝐡𝐞 𝐥𝐢𝐟𝐞 𝐜𝐲𝐜𝐥𝐞 𝐨𝐟 𝐝𝐨𝐦𝐚𝐢𝐧 𝐨𝐛𝐣𝐞𝐜𝐭𝐬:
🔹Repository: storing and loading the Aggregate.
🔹Factory: handling the creation of the Aggregate.

𝐁𝐞𝐡𝐚𝐯𝐢𝐨𝐫 𝐨𝐟 𝐝𝐨𝐦𝐚𝐢𝐧 𝐨𝐛𝐣𝐞𝐜𝐭𝐬:
🔹Domain Service: orchestrates multiple Aggregates.
🔹Domain Event: a description of what has happened to an Aggregate. Events are published so that others can consume them and reconstruct what happened.
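
The Entity vs. Value Object distinction above can be sketched in Python; the Money and Order classes are invented for illustration and are not from Evans' book. A Value Object has no ID, so equality is by value; an Entity has an ID, so equality is by identity.

```python
from dataclasses import dataclass, field
import uuid

# A Value Object: no ID, equality is structural, and it is immutable.
@dataclass(frozen=True)
class Money:
    amount: int
    currency: str

# An Entity: has an ID and a life cycle; equality is by identity.
@dataclass(eq=False)
class Order:
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    total: Money = Money(0, "USD")

    def __eq__(self, other):
        return isinstance(other, Order) and self.id == other.id

    def __hash__(self):
        return hash(self.id)

print(Money(10, "USD") == Money(10, "USD"))  # True: same value, equal
a = Order(total=Money(10, "USD"))
b = Order(total=Money(10, "USD"))
print(a == b)                                # False: different identities
```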

Congratulations on getting this far. Now you know the basics of DDD. If you want to learn more, I highly recommend the book. It might help to simplify the complexity of software modeling.

Over to you: do you know how to check the equality of two Value Objects? How about two Entities?

– 
Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://bit.ly/3KCnWXq 
 
#systemdesign #coding #interviewtips
.
Post image by ByteByteGo
8 Key Data Structures That Power Modern Databases
.
.
🔹Skiplist: a common in-memory index type. Used in Redis
🔹Hash index: a very common implementation of the “Map” data structure (or “Collection”)
🔹SSTable: immutable on-disk “Map” implementation
🔹LSM tree: Skiplist + SSTable. High write throughput
🔹B-tree: disk-based solution. Consistent read/write performance
🔹Inverted index: used for document indexing. Used in Lucene
🔹Suffix tree: for string pattern search
🔹R-tree: multi-dimension search, such as finding the nearest neighbor
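
As a toy sketch of the LSM idea (memtable + SSTable) above: writes land in an in-memory memtable, which is periodically flushed as an immutable sorted run; reads check the memtable first, then the runs from newest to oldest. The flush threshold and keys are arbitrary, and a real LSM tree would also do compaction.

```python
import bisect

MEMTABLE_LIMIT = 2
memtable = {}
sstables = []  # each run: an immutable, sorted list of (key, value)

def put(key, value):
    memtable[key] = value
    if len(memtable) >= MEMTABLE_LIMIT:
        sstables.append(sorted(memtable.items()))  # flush as a sorted run
        memtable.clear()

def get(key):
    if key in memtable:                 # freshest data lives in memory
        return memtable[key]
    for run in reversed(sstables):      # newest run wins
        i = bisect.bisect_left(run, (key,))
        if i < len(run) and run[i][0] == key:
            return run[i][1]
    return None

put("a", 1); put("b", 2); put("a", 3)
print(get("a"))  # 3 -- the newer value shadows the flushed one
```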

– 
Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://bit.ly/3KCnWXq 
 
#systemdesign #coding #interviewtips 
.
Post image by ByteByteGo
