Scaling at “Container-level” and “Workload-level”

When more and more requests hit a server, one response is to run more instances of it; another situation is when a single instance of your application is consuming high CPU or memory on a particular server. In such cases we need to increase resources like CPU, memory, and so on.
But when we talk about scaling in the container world, people are often confused between “Workload-level scaling” and “Container-level scaling”. Let’s try to understand both with the help of some examples.
CONTAINER LEVEL SCALING
Scaling at the container level means increasing or decreasing the number of containers that are running within a system or a cluster. For example, if we have a web application that is running in a container, we might scale the container by adding more instances of the container to handle increased traffic. This would be scaling at the container level, as we are changing the number of containers that are running.
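To make this concrete, here is a minimal sketch of container-level scaling using the Kubernetes Python client. It assumes a hypothetical Deployment named "web-app" in the "default" namespace and a working kubeconfig; the names are placeholders, not part of any real setup described above.

```python
# Container-level scaling: change how many replicas (containers) of the
# web application are running. Assumes a Deployment named "web-app" in
# the "default" namespace and a reachable cluster via kubeconfig.
from kubernetes import client, config

config.load_kube_config()          # load local kubeconfig for cluster access
apps = client.AppsV1Api()

# Raise the replica count to 5 instances to absorb the extra traffic.
apps.patch_namespaced_deployment_scale(
    name="web-app",
    namespace="default",
    body={"spec": {"replicas": 5}},
)
```

Note that the application itself is untouched here; only the number of running containers changes.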

WORKLOAD LEVEL SCALING
Scaling at the workload level means increasing or decreasing the amount of resources (e.g. CPU, memory) that are allocated to a specific workload.
If we wanted to handle increased traffic by allocating more CPU or memory to the web application, we would be scaling at the workload level. This might involve adjusting the resource limits or reservations for the workload, or adding more resources to the system as a whole.
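As a rough sketch of workload-level scaling, the snippet below patches the resource requests and limits of the same hypothetical "web-app" Deployment, assuming its container is named "web"; the specific CPU and memory values are just illustrative.

```python
# Workload-level scaling: keep the same number of containers, but give
# the workload more CPU and memory. "web-app" and "web" are placeholder
# names for a hypothetical Deployment and its container.
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

# Raise the resource requests (reservations) and limits for the container.
apps.patch_namespaced_deployment(
    name="web-app",
    namespace="default",
    body={
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": "web",
                            "resources": {
                                "requests": {"cpu": "500m", "memory": "512Mi"},
                                "limits": {"cpu": "1", "memory": "1Gi"},
                            },
                        }
                    ]
                }
            }
        }
    },
)
```

Here the replica count stays the same; each container simply gets a bigger share of CPU and memory to work with.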

NOTE: Scaling at the container level can be useful when you want to quickly and easily increase or decrease the capacity of your system to handle changes in demand. Scaling at the workload level can be useful when you want to fine-tune the resources that are available to a specific workload to optimize its performance.
Let us understand this with an example. Consider an application that is running in a container and we want to scale it up to handle more traffic. We can simply do this by adding more containers to our cluster (container-level scaling). But suppose a database is running inside a container and we want to improve its performance; that can be done by increasing the CPU or memory resources allocated to the container (workload-level scaling).