When multiple instances of the same application run concurrently on a single system, they must share limited CPU resources. This is known as CPU sharing, and it becomes particularly critical when CPU capacity is saturated, leading to performance degradation and increased latency.
Scenario: Multiple instances of the same application
Test infrastructure
System: 1 core, 2 logical processors
OS Type: Windows
Application Type: Single-threaded
Data: Independent data sets
Operation Type: Read operations (likely a mix of I/O and CPU)
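Before running the tests, the CPU topology can be confirmed. Below is a minimal sketch, assuming the psutil package is installed; the exact output format is illustrative.

```python
# Minimal sketch (assumes psutil is installed) to confirm the topology used in
# these tests: 1 physical core exposing 2 logical processors via SMT.
import platform
import psutil

physical = psutil.cpu_count(logical=False)   # physical cores
logical = psutil.cpu_count(logical=True)     # logical processors (SMT siblings)

print(f"OS:                 {platform.system()} {platform.release()}")
print(f"Physical cores:     {physical}")
print(f"Logical processors: {logical}")
print(f"SMT siblings/core:  {logical // physical if physical else 'n/a'}")
```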
Test-1 (Single instance - Standalone)
Instance-1 performed a read operation, taking 42 seconds and utilizing 100% of a single logical processor. This indicates that the application is single-threaded, as it fully consumed one CPU to process the data.
In the image below, the blue line indicates Instance-1's CPU usage and the red line indicates overall system CPU usage.
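The read operation can be thought of as a single-threaded loop that mixes file I/O with CPU work. The sketch below is illustrative only, assuming a hypothetical data file dataset_1.bin and a checksum as the CPU-bound portion; it is not the actual application code.

```python
# Minimal sketch of a single-threaded read-plus-CPU workload similar to Test-1.
# The file path "dataset_1.bin" and the hashing are illustrative stand-ins for
# the real application's read operation.
import hashlib
import time

def read_operation(path, chunk_size=1 << 20):
    """Read the file in chunks and hash it, mixing I/O with CPU work."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)          # CPU-bound part of the loop
    return digest.hexdigest()

if __name__ == "__main__":
    start = time.perf_counter()
    read_operation("dataset_1.bin")
    print(f"Standalone run: {time.perf_counter() - start:.1f} s")
```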
Test-2 (Two instances - Parallel)
Instance-1 was initiated first, followed by Instance-2, both performing read operations. Although Instance-1 started earlier, it completed last, taking 92 seconds, while Instance-2, which was initiated later, completed first in 72 seconds.
During this activity, the system's single CPU core was fully occupied.
In the image below, the blue line indicates Instance-1's CPU usage, the green line indicates Instance-2's CPU usage, and the red line indicates overall system CPU usage.
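One way to reproduce this staggered launch is to start two copies of the same single-threaded script a few seconds apart and record each instance's wall time. The script name read_app.py and the data files in the sketch below are hypothetical; the timing harness is the point.

```python
# Minimal sketch of the Test-2 setup: two copies of the same single-threaded
# script (hypothetical "read_app.py") started one after the other, with each
# instance's wall time reported as it exits.
import subprocess
import time

def launch(data_file):
    """Start one copy of the app and record its start time."""
    return time.perf_counter(), subprocess.Popen(["python", "read_app.py", data_file])

if __name__ == "__main__":
    pending = {1: launch("dataset_1.bin")}       # Instance-1 starts first
    time.sleep(5)                                # Instance-2 starts a bit later
    pending[2] = launch("dataset_2.bin")

    while pending:                               # report each instance as it exits
        for instance_id, (start, proc) in list(pending.items()):
            if proc.poll() is not None:
                print(f"Instance-{instance_id}: {time.perf_counter() - start:.1f} s")
                del pending[instance_id]
        time.sleep(0.5)
```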
Reason for performance degradation between Test-1 and Test-2:
Standalone Instance-1: 42 seconds
Parallel execution (Instance-1 and Instance-2): 92 seconds and 72 seconds
Even though Windows shows 2 CPUs in Task Manager, both are logical siblings on the same physical core. They share execution units, caches, and queues, so true parallelism is not possible. The Windows scheduler tries to balance them, but:
- Instance-1 loses priority when Instance-2 starts
- Context switching and cache thrashing increase
- Thermal or power throttling may occur under load
Hence, sufficient cores should be available to achieve performance equivalent to the standalone run.
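The sibling relationship can be made visible by pinning a process to a specific logical processor: even with each instance pinned to its own logical processor, both still map to the same physical core and share its execution units and caches. A minimal sketch using psutil's CPU-affinity API (an assumption, not part of the original test) is shown below.

```python
# Minimal sketch (assumes psutil) that pins the current process to one logical
# processor. Pinning two instances to different logical processors on a single
# physical core still leaves them sharing that core's execution units and
# caches, which is why Test-2 degrades despite "2 CPUs" in Task Manager.
import os
import psutil

proc = psutil.Process(os.getpid())
print("Allowed logical processors before:", proc.cpu_affinity())

proc.cpu_affinity([0])          # restrict this process to logical processor 0
print("Allowed logical processors after: ", proc.cpu_affinity())
```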
Test-3 (Three instances - Parallel)
Instance-1 was initiated first, followed by Instance-2 and Instance-3, all performing read operations. During this test, CPU sharing occurred due to insufficient CPU resources, which led to performance degradation.
In the image below, the blue line indicates Instance-1's CPU usage, the green line Instance-2's, the purple line Instance-3's, and the red line overall system CPU usage.
Instance-1 was initiated first (blue line) and consumed 100% of one logical processor, while one logical processor remained available. When Instance-2 was initiated (green line), it fully utilized the second logical processor, leaving no additional logical processors available. At this point, only CPU sharing is possible. When Instance-3 was initiated (purple line), it had to share CPU resources with Instance-1. This resource sharing significantly degraded performance.
Instance-1 took 121 seconds
Instance-2 took 60 seconds
Instance-3 took 94 seconds
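Charts like the ones above can be produced by sampling per-process and system-wide CPU utilization once per second. The sketch below assumes psutil and uses placeholder PIDs for the three instances.

```python
# Minimal sketch (assumes psutil) of per-process sampling similar to the
# charts: each instance's CPU% plus overall system CPU%, once per second.
# The PIDs below are placeholders for the three running instances.
import psutil

instance_pids = {"instance-1": 1111, "instance-2": 2222, "instance-3": 3333}
procs = {name: psutil.Process(pid) for name, pid in instance_pids.items()}

for _ in range(10):                               # ten one-second samples
    system = psutil.cpu_percent(interval=1)       # overall system CPU%
    per_instance = {name: p.cpu_percent(interval=None)   # 100% = one logical CPU
                    for name, p in procs.items()}
    print(f"system={system:5.1f}%  " +
          "  ".join(f"{name}={value:5.1f}%" for name, value in per_instance.items()))
```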
Reasons for performance degradation
1. CPU overcommitment
a. Only 2 logical processors available.
b. Three single-threaded processes running concurrently.
c. The OS must context-switch frequently to keep all three moving.
d. High CPU queue length leads to waiting time → longer execution (see the context-switch sketch after this list).
2. Shared physical core
All three instances compete for:
a. Execution units (ALUs, FPUs)
b. CPU caches
c. Memory bus access
This causes:
- Inter-process interference
- Increased latency per operation
3. Unfair scheduling effects
- Instance-1 gets penalized more heavily as it started first and ran the longest.
- Instance-2 may have benefited from temporarily better access to CPU resources.
- Instance-3 likely overlapped more with Instance-1’s tail, causing moderate slowdown.
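The extra context switching mentioned in point 1 can be quantified by comparing system-wide context-switch counters before and after the concurrent run. The sketch below assumes psutil and an arbitrary 30-second measurement window.

```python
# Minimal sketch (assumes psutil) comparing system-wide context-switch counts
# across a window that covers the concurrent run, to illustrate the extra
# switching when three single-threaded instances share two logical processors.
import time
import psutil

WINDOW_S = 30                                  # arbitrary measurement window
before = psutil.cpu_stats().ctx_switches
time.sleep(WINDOW_S)
after = psutil.cpu_stats().ctx_switches

print(f"Context switches in {WINDOW_S} s: {after - before}"
      f" (~{(after - before) / WINDOW_S:,.0f}/s)")
```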
Summary
| Test | Ins-1 (s) | Ins-2 (s) | Ins-3 (s) | Performance |
| --- | --- | --- | --- | --- |
| Test-1 | 42 | X | X | Best performance |
| Test-2 | 92 | 72 | X | Both logical processors in use, moderate degradation |
| Test-3 | 121 | 60 | 94 | CPU sharing occurs as the third instance competes for the limited logical CPUs |
For real-world usage, the number of concurrent single-threaded processes should not exceed the number of physical or logical CPUs available, depending on how CPU-bound the workload is.
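One way to enforce this in practice is to size a process pool to the physical (or logical) CPU count, so extra work queues up instead of oversubscribing the CPU. The sketch below assumes psutil for the core count; the worker function and data-set names are placeholders.

```python
# Minimal sketch of the recommendation above: cap concurrent single-threaded
# workers at the physical core count (use the logical count for workloads that
# are less CPU-bound). Worker and data-set names are placeholders.
from concurrent.futures import ProcessPoolExecutor
import psutil

def read_operation(data_file):
    """Placeholder for the application's single-threaded read operation."""
    ...

if __name__ == "__main__":
    data_sets = ["dataset_1.bin", "dataset_2.bin", "dataset_3.bin"]   # placeholders
    max_workers = psutil.cpu_count(logical=False) or 1                # physical cores
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # Extra data sets wait in the queue instead of oversubscribing the CPU.
        list(pool.map(read_operation, data_sets))
```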