Tuesday, April 30, 2024

FSxN - ONTAP on AWS limited by network bandwidth of EC2 Instance



In the DBA world, the stakes are high for a business-critical database that demands high levels of performance. Robust, high-performing storage is expected to meet its low-latency requirements, and ONTAP, which has been delivering on the storage front for decades, extends its services to the AWS Cloud as FSxN.


Here, FSxN storage is used as an NFS mount on a Linux server (EC2 instance) hosting an Oracle database that is part of a Data Guard configuration. A lag caused the standby database to fall out of sync. Once the underlying issue was fixed by extending the mount, shipping and reading of archive logs was painstakingly slow.



BlueXP and the AWS Console showed no spike in the key metrics of latency, IOPS or throughput. However, both the copy of archive logs from the remote primary database server to the standby server and the apply of those logs were very slow.


Even though CPU and memory utilization were low, the copy and apply of archive logs were slow because the network throughput limit of the x1e.xlarge EC2 instance had been reached. To resolve this issue, the instance type was changed from x1e.xlarge to r6id.2xlarge, which has a throughput of 3125 Mbps. After the instance type change, the throughput of archive log copy and apply increased, which allowed the standby to get back in sync.
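One way to spot this kind of bottleneck from the server itself is to sample the interface byte counters in /proc/net/dev and compare the rate with the instance's network limit. The Python sketch below is only an illustration and is not what was used in this incident. The function names and the default interface name `eth0` are assumptions (on many EC2 instances the interface is named differently, for example `ens5`).

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def throughput_mbps(before, after, seconds):
    """Megabits per second (rx, tx) between two (rx_bytes, tx_bytes) samples."""
    rx = (after[0] - before[0]) * 8 / seconds / 1_000_000
    tx = (after[1] - before[1]) * 8 / seconds / 1_000_000
    return rx, tx


def sample(interface="eth0", interval=5.0, path="/proc/net/dev"):
    """Read the counters twice, `interval` seconds apart, and return (rx, tx) in Mbps."""
    with open(path) as f:
        first = parse_net_dev(f.read())[interface]
    time.sleep(interval)
    with open(path) as f:
        second = parse_net_dev(f.read())[interface]
    return throughput_mbps(first, second, interval)
```

Calling `sample()` with the right interface name while archive logs are being copied gives the receive and transmit rates in Mbps. If the rate stays flat near the limit of the instance type while CPU and memory are mostly idle, the network is the likely bottleneck.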