Last week, we rolled out a new set of powerful monitoring items to our PowerCloud Dashboard. Our objective in designing them was to give both Cloudways and our customers much better insight into application performance and to help uncover weaknesses in the caching approach an application uses (proper caching being the pillar of optimal performance in a cloud environment).
In this post, we will go over the new items in detail, so we can have a good understanding of them and how to use them to gauge your application caching performance.
Here you can see the new server view on the Dashboard:
We have rearranged the server information (under the server name). At a quick glance, you can now check the health of the server, its uptime, public IP, the list of websites on the server, cloud and location details, OS, processor, and memory.
Under the server information, we have the Services Control area, which I hope you are already familiar with. Thanks to the network orchestration layer that we recently deployed, we can now detect which services on each server can be acted upon (Start, Stop, Restart, …). In the screenshot above, the system has detected Nginx and Memcached, which can be stopped or restarted, and Varnish, which can be stopped, restarted and purged. The system automatically decides which actions are available depending on the service and its state.
Under Services Control, you have the Backup Status Information, where you can see the last backup time for each available type of backup.
Finally, on the left side of the panel (under Monitoring), you have the complete list of monitoring items, including the all-new ones.
As this article focuses precisely on these new items, let’s review them in detail.
Below the usual items that have always been present in our Dashboard Monitoring area (Incoming/Outgoing Traffic, Free Memory, Free Disk, and Idle CPU), Aggregated Bandwidth is the first new item that you will find. It is also the only one not directly related to caching.
It tracks aggregated bandwidth on a monthly basis, so you can use it to follow the monthly trend of bandwidth usage. Let’s see an example:
Here you can see that total bandwidth last month reached 853 GB and that this month we are already over 500 GB.
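To make the idea of a monthly aggregate concrete, here is a minimal Python sketch. It assumes we already have per-day transfer figures in GB; the function name, the sample dates and the values are all hypothetical, and this is not how the Dashboard itself computes the item.

```python
from collections import defaultdict
from datetime import date


def aggregate_monthly(samples):
    """Sum (day, gigabytes) samples into per-month totals keyed by 'YYYY-MM'."""
    totals = defaultdict(float)
    for day, gigabytes in samples:
        totals[day.strftime("%Y-%m")] += gigabytes
    return dict(totals)


# Hypothetical daily transfer figures, in GB (not real Dashboard data).
samples = [
    (date(2013, 5, 30), 28.0),
    (date(2013, 5, 31), 27.5),
    (date(2013, 6, 1), 30.0),
    (date(2013, 6, 2), 31.5),
]

monthly = aggregate_monthly(samples)
for month in sorted(monthly):
    print(month, monthly[month], "GB")
```

Comparing the current month's running total with the previous month's final figure is what lets you spot a trend early.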
The next two new items relate to the caching domain and focus on APC. Alternative PHP Cache (APC) is a PHP caching mechanism that caches the PHP bytecode compiler output, which greatly increases page generation speed (up to 3x or 4x). It is CRITICAL for your application performance that APC works smoothly and caches as much PHP code as possible.
We monitor two items here:
- APC Fill Ratio: How full is your APC cache? The only thing to watch here is whether we are consistently filling 100% of the available memory (typically 128MB). If so, we will need to increase the memory available to APC.
- APC Hits: This is the most important metric. It monitors the percentage of successful hits to the cache (that is, pages requested for which the bytecode was already present in the cache). It should always be well above 90% on a healthy server.
Let’s see screenshots of our server:
So, here we can see that we have room in our APC Cache (average 78.51% full over the last 24 hours).
And here we can see that the hit ratio is optimal (99.07% average over the last 24 hours).
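The two APC checks described above come down to simple arithmetic. The following Python sketch shows one way to express them, assuming we have raw hit/miss counters and memory figures at hand. All function names, the 99% "full" threshold and the sample numbers are illustrative assumptions, not part of the Dashboard.

```python
def hit_ratio(hits, misses):
    """Percentage of cache lookups that were served from the cache."""
    lookups = hits + misses
    return 100.0 * hits / lookups if lookups else 0.0


def fill_ratio(used_bytes, total_bytes):
    """Percentage of the cache memory currently in use."""
    return 100.0 * used_bytes / total_bytes


def consistently_full(fill_samples, threshold=99.0):
    """True when every sample in the window sits at (or very near) 100%.

    The threshold is an assumed tolerance, chosen for illustration only.
    """
    return bool(fill_samples) and all(s >= threshold for s in fill_samples)


# Hypothetical figures: a 128MB cache with 96MB in use.
total = 128 * 1024 * 1024
used = 96 * 1024 * 1024

print(fill_ratio(used, total))                   # 75.0
print(hit_ratio(9900, 100))                      # 99.0
print(consistently_full([100.0, 99.5, 100.0]))   # True
print(consistently_full([100.0, 78.5]))          # False
```

Checking a window of fill samples rather than a single reading mirrors the advice above: one spike to 100% is not a problem, but staying there is.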
Like APC, Memcached is a cache layer that stores data in RAM. While APC focuses on PHP bytecode, Memcached (in the Cloudways context) is used mainly to cache database calls (although it can be used to cache other types of data too). Thus, a healthy Memcached greatly offloads the database, consequently increasing overall application performance.
As with APC, and with the same meaning, the items we specifically monitor are Fill Ratio and Number of Hits. With APC, acceptable hit rates are always above 90%; with Memcached, the achievable rate depends greatly on the application and how Memcached has been integrated into it. But as always, the higher the better.
Again, an example from our server:
The fill ratio has remained constant at around 70% for the last 24 hours, which is perfectly fine.
And the hit rate is 97%, which is absolutely great.
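For readers who want to derive similar figures themselves, Memcached's text protocol answers the `stats` command with `STAT <name> <value>` lines. The sketch below parses such output and computes both ratios from the `get_hits`, `get_misses`, `bytes` and `limit_maxbytes` counters. Treating `bytes / limit_maxbytes` as the fill ratio is our assumption about a reasonable definition, not necessarily how the Dashboard calculates it, and the sample values are made up.

```python
def parse_stats(text):
    """Parse 'STAT <name> <value>' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def memcached_ratios(stats):
    """Return (hit percentage, fill percentage) from parsed stats."""
    hits = int(stats["get_hits"])
    misses = int(stats["get_misses"])
    lookups = hits + misses
    hit_pct = 100.0 * hits / lookups if lookups else 0.0
    # Assumed definition of "fill": bytes stored over the configured limit.
    fill_pct = 100.0 * int(stats["bytes"]) / int(stats["limit_maxbytes"])
    return hit_pct, fill_pct


# Hypothetical output of the `stats` command (values invented for the example).
sample = """STAT get_hits 9700
STAT get_misses 300
STAT bytes 47185920
STAT limit_maxbytes 67108864
END"""

hit_pct, fill_pct = memcached_ratios(parse_stats(sample))
print(hit_pct, fill_pct)
```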
The last set of new metrics we have added refers to the last layer of caching we typically use: Varnish. Varnish is an HTTP accelerator (also known as a caching “HTTP Reverse Proxy”). It sits in front of the web server (e.g. Apache) and caches its responses (images, CSS, PHP-generated pages, etc.). So basically, Varnish offloads the web server and therefore increases overall application performance.
We again focus on Hits at this level as well as on a new item called Nuked.
As with all other metrics, Hits work on “the higher the better” principle, and with Varnish, we should again aim at constantly getting values above 90%. Whether we can achieve this depends on the server configuration, though.
The Nuked value refers to objects evicted from the cache to make room for others. If the value is consistently above zero, then we need to increase the size of our Varnish cache.
Again, some examples:
Here we can see that the average hit rate for the last 24 hours is 13.73%, which is very low. This means we need to take a closer look at this server to check how Varnish is integrated and whether there is any way to improve this result, as Varnish is under-utilized.
The Nuked value for the last 24 hours is zero, which means we have no issue with Varnish cache size.
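As a rough illustration of how these two Varnish figures can be derived, the sketch below parses the output of `varnishstat -1` and compares two readings. It relies on the `cache_hit`, `cache_miss` and `n_lru_nuked` counters (prefixed with `MAIN.` in newer Varnish releases). The counters are cumulative, so we work with the difference between readings. The helper names and sample numbers are hypothetical, and the simple hits / (hits + misses) formula is an assumption that ignores passed requests.

```python
def parse_varnishstat(text):
    """Parse `varnishstat -1` style lines into {counter_name: int}."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[1].isdigit():
            name = parts[0]
            if name.startswith("MAIN."):
                name = name[len("MAIN."):]  # newer releases prefix counters
            counters[name] = int(parts[1])
    return counters


def varnish_summary(previous, current):
    """Return (hit percentage, objects nuked) between two readings."""
    hits = current["cache_hit"] - previous["cache_hit"]
    misses = current["cache_miss"] - previous["cache_miss"]
    nuked = current["n_lru_nuked"] - previous["n_lru_nuked"]
    lookups = hits + misses
    hit_pct = 100.0 * hits / lookups if lookups else 0.0
    return hit_pct, nuked


# Two hypothetical readings taken some time apart (values invented).
before = parse_varnishstat("""MAIN.cache_hit 1000 0.10 Cache hits
MAIN.cache_miss 5000 0.50 Cache misses
MAIN.n_lru_nuked 0 . Number of LRU nuked objects""")
after = parse_varnishstat("""MAIN.cache_hit 1275 0.12 Cache hits
MAIN.cache_miss 6725 0.60 Cache misses
MAIN.n_lru_nuked 0 . Number of LRU nuked objects""")

hit_pct, nuked = varnish_summary(before, after)
print(hit_pct, nuked)
```

A low hit percentage with zero nuked objects, as in this made-up sample, would point to an integration problem rather than a cache that is too small.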
So, with our new monitoring items, we have given you a quick and powerful way to assess the health of your application at a glance. This, along with the ability to control individual services from the Dashboard and the additional information we provide, gets us one step closer to delivering the full-featured Power Dashboard that we envision.
Let us know if there are any other items you would like to have monitored. After all, we are building this for you!
Pere Hospital (CISSP & OSCP) is the CTO and co-founder of Cloudways Ltd. He has over two decades of experience in IT Security, Risk Analysis and Virtualization Technologies. You can follow Pere on Twitter at @phospital and read his blog at www.perehospital.cat