T2 Credits
I’ve been using an Amazon “t2.nano” EC2 instance for my web hosting. These instances are designed to allow “Burstable Performance”. Amazon offers a few different descriptions, ranging from the simplistic:
T2 instances accrue CPU Credits when they are idle, and use CPU credits when they are active.
to a much more detailed explanation.
The latter document says that the credits are processed at millisecond resolution. However, the free monitoring tools only sample at 5-minute granularity, so it is difficult to see anything finer than that.
The essence is that each T2 instance is given a specific share of CPU it is allowed to use (5% for nano, 10% for micro, and so on). Without the credit system, or once the credits have run out, the instance is limited to that baseline performance (5% is quite slow).
The Credit system allows the instance to use its 5% CPU averaged out over a longer period (24 hours) instead of being continual. This works well for uses where the CPU is mostly idle, but is occasionally needed for burst behavior. This helps with upgrades, and compilation of utilities and even web apps.
The benchmark
The somewhat simplistic description above suggests that in a given 24-hour period, a nano instance would be allowed 5% of 24 hours of full CPU, or 72 minutes. I won’t be considering the baseline 5% CPU it gets when the credits run out, mostly because I didn’t want to run my instance down to zero credits to see what happens (I’ve done it before, and it gets very slow).
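As a quick check of that arithmetic, here is a minimal sketch in Python. It assumes one credit equals one minute of a full CPU, and the function and table names are mine, made up for illustration.

```python
# One CPU credit is taken to be one CPU at 100% for one minute.
MINUTES_PER_DAY = 24 * 60

# Baseline percentages quoted above, kept as integers to avoid float rounding.
BASELINE_PERCENT = {"t2.nano": 5, "t2.micro": 10}


def daily_credits(instance_type):
    """Full-CPU minutes (credits) earned over 24 hours at the baseline rate."""
    return BASELINE_PERCENT[instance_type] * MINUTES_PER_DAY / 100


for name in BASELINE_PERCENT:
    print(name, daily_credits(name))
```

For the nano this gives the 72 minutes mentioned above.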
For a benchmark, I decided to compile my rsure program. Once the dependencies are downloaded, a --release build of rsure takes a little over 5 minutes of CPU on one of these instances. Since 5 minutes is about the granularity of the monitoring, this helps to watch the behavior of the CPU credits.
When I started, my instance had about 92 credits. Interestingly, this credit balance had been gradually declining (from a high of about 96). I’ll get to what is going on here in a minute. Each compilation would consume a little over 5 credits.
I ran the compilation a number of times, and was able to observe the credits dropping each time. Ultimately, I brought the credits down to about 42, and stopped the benchmark. I then watched the graph over the next few hours, expecting the CPU Credits to gradually increase as I earned back the credit.
However, it didn’t go back up. It just stayed right down at 42. This caused me to do some digging and to try to understand the various explanations. I was torn between wondering if the accounting method was just broken, and considering whether I needed to file a ticket.
The next day, however, it started to be clearer what was happening, which I’ll attempt to explain below.
CPU Credits
The CPU Credit system appears to be designed to allow the instance to use up to a certain percentage of the CPU, when looking at any given 24-hour window. The limit applies no matter which window you look at.
As far as I can tell, it works something like this. My explanation assumes some granularity at which the credits are applied. I’ll use 5 minutes as an example, and 5% utilization, to match my benchmark. The real system uses finer granularity, but as mentioned above, I don’t have visibility into this, and the end result doesn’t really matter.
- The account starts with a certain positive balance of credits, in this case 30. These are described as “used first” and “never expiring”.
- For a given utilization, we can determine the number of credits that would apply to our window. For example, if we try this with a 5 minute window, we would get 5% of 5 minutes, or 0.25 credits every 5 minute window.
- During each window, we also monitor the CPU to determine the real utilization. This could be as high as 100% times the number of CPUs in the instance. If the instance ran out of credits, and was running at baseline, it would just be the baseline, which in this case would be 5%. This value can also be computed as a credit.
- Adjust the credit balance by adding the earned credits (0.25 in this case), and subtracting the used credits. Remember this value, as we will need it in 24 hours to compute the expiration. If we still have a remaining initial credit balance, subtract from those, instead of subtracting from the day’s credit balance.
- In addition to the above, if the window 24 hours before the current window had a positive credit adjustment balance, subtract those credits from the current balance.
I will also speculate that the credits are limited to non-negative values, and will just remain at zero if the above would cause them to go negative.
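The steps above can be sketched as a small simulation. This is only my reading of the model, not Amazon's actual accounting: the 5-minute window, the function name `simulate`, and the choice to remember the full 0.25 for a window whose usage was covered by initial credits are all assumptions.

```python
from collections import deque

WINDOW_MIN = 5                             # assumed accounting window (minutes)
WINDOWS_PER_DAY = 24 * 60 // WINDOW_MIN    # 288 windows in 24 hours
EARN_PER_WINDOW = 0.25                     # 5% of a 5-minute window


def simulate(utilization, initial=30.0):
    """Return the total credit balance after each window.

    utilization: per-window CPU use as a fraction of one CPU (0.0 to 1.0).
    """
    initial_left = initial   # initial credits: used first, never expire
    earned = 0.0             # credits earned at the baseline rate
    history = deque()        # adjustments remembered for the last 24 hours
    balances = []
    for u in utilization:
        used = u * WINDOW_MIN
        from_initial = min(initial_left, used)   # spend initial credits first
        initial_left -= from_initial
        adjustment = EARN_PER_WINDOW - (used - from_initial)
        earned += adjustment
        history.append(adjustment)
        if len(history) > WINDOWS_PER_DAY:
            expired = history.popleft()          # the window from 24 hours ago
            if expired > 0:
                earned -= expired                # retire only positive adjustments
        earned = max(earned, 0.0)                # speculation: floor at zero
        balances.append(initial_left + earned)
    return balances


# Two perfectly idle days, ten windows at 100% CPU, then one more idle day.
idle_day = [0.0] * WINDOWS_PER_DAY
trace = simulate(idle_day * 2 + [1.0] * 10 + idle_day)
print(trace[575], trace[585], trace[785], trace[-1])
```

In this run the idle balance settles at 102 (72 earned plus the 30 initial credits), falls to 52 after the burst, and then sits at 52 for most of the following day, because each new 0.25 is cancelled by the 0.25 retired from 24 hours earlier. It only starts to tick upward once the burst windows themselves are more than 24 hours old.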
The above does explain the behavior I was seeing. When my instance had sat idle for a few days, the balance earned in each window was slightly less than 0.25 (due to a small amount of CPU use on an idle system), and matched the retired balance of nearly 0.25 from the previous day’s idle period.
The initial balance at my benchmark (in the high 90s) was from the nearly 72 credits from 24 hours of accumulation, plus the remaining initial credits from creating the instance. This gradually declined as the near-idle usage consumed them.
When I ran the compilations, the credit balance went down. However, it did not climb immediately, because the incoming credits still matched the earned credits being retired from 24 hours earlier. It didn’t start to climb again until the windows being retired were ones that had earned less than the full amount, so that less expired than came in. In fact, while there were still initial credits remaining, the expiration would always exceed the utilization, and the amount would continue to drop gradually.
My credits should continue to climb to nearly the maximum of 72 credits earnable in 24 hours. They will never quite reach this value though, because any CPU usage during that window lowers the maximum possible.
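To put a rough number on that ceiling, here is a small sketch. The function name and the 0.5% background load are made up for illustration, and I did not measure the real idle usage this precisely.

```python
MINUTES_PER_DAY = 24 * 60


def max_earned_balance(baseline_percent, average_use_percent):
    """Ceiling on earned credits over a 24-hour window at a steady load."""
    net_percent = max(baseline_percent - average_use_percent, 0)
    return net_percent * MINUTES_PER_DAY / 100


print(max_earned_balance(5, 0))     # a perfectly idle nano
print(max_earned_balance(5, 0.5))   # hypothetical 0.5% background load
```

Even a small steady load takes several credits off the 72 that a perfectly idle instance could hold.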
The end result is a little confusing still, but does seem to match the “utilization averaged over a 24-hour window”.