After migrating from SQL Server on an Azure VM to Azure SQL Managed Instance, I am seeing high IO latency on a few of our larger databases (400 GB, 300 GB, etc.).
Current production environment (SQL Server on an Azure VM)
Current stats on the SQL VM:
Disk Read Bytes/sec – 9 MB
Disk Write Bytes/sec – 7 MB
Read and write latencies: 4 – 6 ms
Azure SQL managed instance
On the “General-Purpose” tier with 8 cores, the maximum storage limit is 8 TB. Because the size of each database file and transaction log file directly determines the IOPS and throughput that file gets, we generally end up with files larger than the data needs: 1 TB each, across multiple database files and one log file per large database. With this layout I see IO latencies ranging from 7 to 250 ms on the database files and 3 ms on the transaction log files.
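To make the file-size reasoning concrete, here is a minimal Python sketch of a size-to-limit lookup. The tier boundaries and the IOPS and throughput figures in `ASSUMED_TIERS` are assumptions used for illustration only, and the function names are hypothetical; the real values should be taken from the current Azure SQL Managed Instance resource limits documentation.

```python
from bisect import bisect_left

# ASSUMED values for illustration only; verify against the current
# Azure SQL Managed Instance resource limits before relying on them.
# Each entry: (max file size in GiB, IOPS per file, throughput per file in MiB/s)
ASSUMED_TIERS = [
    (128, 500, 100),
    (512, 2300, 150),
    (1024, 5000, 200),
    (2048, 7500, 250),
    (4096, 7500, 250),
    (8192, 12500, 480),
]


def per_file_limits(file_size_gib, tiers=ASSUMED_TIERS):
    """Return (iops, mib_per_s) for the smallest tier that fits the file."""
    bounds = [upper for upper, _, _ in tiers]
    idx = bisect_left(bounds, file_size_gib)
    if idx == len(tiers):
        raise ValueError("file is larger than the largest assumed tier")
    _, iops, throughput = tiers[idx]
    return iops, throughput


def total_data_file_iops(file_sizes_gib, tiers=ASSUMED_TIERS):
    """Sum the per-file IOPS limits over all data files of one database."""
    return sum(per_file_limits(size, tiers)[0] for size in file_sizes_gib)


# Example: three data files of 1024 GiB each, under the assumed table.
print(per_file_limits(1024))
print(total_data_file_iops([1024, 1024, 1024]))
```

Under these assumed figures, growing a file only helps when it crosses a tier boundary, so resizing within the same tier would not be expected to change the per-file limit.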
Questions,
The database has 3 data files of 1 TB each and 1 log file of 1 TB, while the actual size of the database is less than 500 GB.
- Why am I not getting the desired IO numbers? Any thoughts?
- I have already used up all 8 TB of storage space on the managed instance (multiple databases). I know that our databases will not outgrow the 8 TB of allocated space in the next 2 years (we delete old data by switching partitions). Is it safe to assume that the IO latency will not worsen over time and force us to move to the next configuration (16 cores, 16 TB storage space)? If that happens, our costs will double immediately, which is a huge concern. Any thoughts on this?
- Any thoughts on using the “Next-gen General Purpose (preview)” service tier? It is labelled Preview, but the Azure documentation,
“https://learn.microsoft.com/en-us/azure/azure-sql/managed-instance/service-tiers-next-gen-general-purpose-use?view=azuresql&WT.mc_id=Portal-SqlAzureExtension”
does not seem to carry any warning. Can it be used for a production environment? “Next-gen General Purpose (preview)” appears to be much better designed and should support far better storage IO, since this tier uses disks instead of blobs. With this option, CPU cores, storage space and IOPS can be configured as needed without moving to a higher-cost tier. Can anyone please confirm?
- If I move some databases to another instance, say mi-2, can I fail over mi-2 to “mi-sec” while “mi-sec” is already configured in a failover group for “mi-primary”? That way I could have two smaller instances configured to fail over to a single large instance. I am wondering whether vNet peering becomes an impediment. Is this allowed?
Things I tried,
- Looked at Query Store for:
Regressed Queries – none found
Query Wait Statistics/Top Resource Consuming Queries – found issues (indexes, type conversion issues, out-of-date statistics), which have been addressed.
- Ran the Azure Diagnostics tool and took care of the storage space throttling complaint by resizing the files as recommended.
- Tried a few variations of resizing the database files after checking the results from “dm_io_virtual_file_stats”.
The proportion of unused space to allocated space is high (meaning a large share of each database file is unused).
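For anyone comparing numbers, this is a minimal sketch of how average latency can be derived from “dm_io_virtual_file_stats”. Its counters are cumulative, so the calculation uses the difference between two snapshots of the same file. The `FileStats` class and `avg_latency_ms` function are hypothetical helpers, and the sample values are made up for illustration; only the four column names come from the view.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Cumulative counters for one file, named after the view's columns."""
    num_of_reads: int
    io_stall_read_ms: int
    num_of_writes: int
    io_stall_write_ms: int


def avg_latency_ms(before, after):
    """Average read and write latency (ms per IO) between two snapshots."""
    reads = after.num_of_reads - before.num_of_reads
    writes = after.num_of_writes - before.num_of_writes
    read_stall = after.io_stall_read_ms - before.io_stall_read_ms
    write_stall = after.io_stall_write_ms - before.io_stall_write_ms
    # Guard against intervals with no IO to avoid dividing by zero.
    read_ms = read_stall / reads if reads else 0.0
    write_ms = write_stall / writes if writes else 0.0
    return read_ms, write_ms


# Made-up sample snapshots of a single data file.
before = FileStats(num_of_reads=1000, io_stall_read_ms=5000,
                   num_of_writes=500, io_stall_write_ms=1500)
after = FileStats(num_of_reads=3000, io_stall_read_ms=55000,
                  num_of_writes=1500, io_stall_write_ms=4500)
print(avg_latency_ms(before, after))  # (25.0, 3.0)
```

Averaging over a short interval under load shows current behaviour; dividing the raw cumulative counters instead would blend in everything since the counters were last reset.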
Expectation
Even with these things in place, the IO latency is still high, varying from 7 ms to 250 ms, whereas I am expecting it to stay within 0 – 10 ms.
Thanks for your time and help
SB