I/O calibration [message #617686] |
Wed, 02 July 2014 12:23 |
John Watson
Messages: 8944 Registered: January 2010 Location: Global Village
Senior Member
Hello - I'm dealing with a performance issue, and I think the disc system is part of it. These are Windows servers, running in VMware virtual machines. Could anyone run the Resource Manager calibration and tell me the results, with a bit of information about the environment, so that I can build up a list of what might be considered "normal"? I'd like to compare virtualized with non-virtualized environments.
This routine will do:

var al number
var mi number
var mm number
exec dbms_resource_manager.calibrate_io(max_iops=>:mi,max_mbps=>:mm,actual_latency=>:al)
print al
print mi
print mm

For example, the Windows system I'm worried about gives me this:
AL 19
MI 561
MM 40
An Amazon EC2 small instance (which is a Xen VM) was,
AL 4
MI 134
MM 37
And my Sony laptop (SSD disc) is,
AL 0
MI 18682
MM 364
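Incidentally, if you have run the calibration before and just want to report the figures, I believe the results of the last run are persisted in the DBA_RSRC_IO_CALIBRATE dictionary view (11g onwards), so something like this should show them without recalibrating:

```sql
-- Results of the most recent calibrate_io run (11g+ dictionary view);
-- MAX_PMBPS is the large-I/O throughput a single process sustained
select max_iops, max_mbps, max_pmbps, latency, num_physical_disks
from   dba_rsrc_io_calibrate;
```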
Thank you for any assistance.
Re: I/O calibration [message #617780 is a reply to message #617686] |
Thu, 03 July 2014 09:53 |
John Watson
One more result, on my old Sun v20z server with 10k SCSI discs:
AL 4
MI 598
MM 1080
This is an example of what worries me: an ancient but real machine astronomically outperforms a new but virtualized machine. If anyone can do me another comparison, it would be helpful.
Incidentally, I bought my v20z a couple of years ago on eBay for fifty quid, you can buy some amazing stuff second hand really cheap.
Re: I/O calibration [message #617857 is a reply to message #617784] |
Fri, 04 July 2014 01:38 |
John Watson
Thank you for this. It does not confirm my hypothesis! Well, that is the point of an experiment. I am committed to the scientific method.
Re: I/O calibration [message #617879 is a reply to message #617858] |
Fri, 04 July 2014 03:19 |
Roachcoach
Messages: 1576 Registered: May 2010 Location: UK
Senior Member
I'm fairly sure I've got nothing here that will be that low (wow, that sounds boastful ), so I'm not sure of the merit of what I can post.
I'll run one and see what numbers come out.
Edit:
SET SERVEROUTPUT ON
DECLARE
lat INTEGER;
iops INTEGER;
mbps INTEGER;
BEGIN
-- DBMS_RESOURCE_MANAGER.CALIBRATE_IO (<DISKS>, <MAX_LATENCY>, iops, mbps, lat);
DBMS_RESOURCE_MANAGER.CALIBRATE_IO (2, 10, iops, mbps, lat);
DBMS_OUTPUT.PUT_LINE ('max_iops = ' || iops);
DBMS_OUTPUT.PUT_LINE ('latency = ' || lat);
DBMS_OUTPUT.PUT_LINE ('max_mbps = ' || mbps);
END;
/
max_iops = 38922
latency = 0
max_mbps = 702
SPARC T4-4, VMAX 20K under the hood. The actual SAN design is poor, though; we should be getting more out of this. There are lots of shared components between the host and the arrays.
max_mbps is usually nearer 1800; I don't know what happened there.
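For anyone else trying this: as I understand it, calibrate_io wants timed statistics on and asynchronous I/O enabled on the datafiles, which could also explain odd numbers. A quick sanity check before running it (a sketch using the v$iostat_file view, 11g+):

```sql
-- calibrate_io prerequisites: timed statistics, async I/O on the datafiles
show parameter timed_statistics

select name, asynch_io
from   v$datafile f, v$iostat_file i
where  f.file# = i.file_no
and    filetype_name = 'Data File';
```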
[Updated on: Fri, 04 July 2014 03:38]
Re: I/O calibration [message #617896 is a reply to message #617879] |
Fri, 04 July 2014 04:57 |
John Watson
Thank you, RC. Your comment that your SAN may be poorly designed makes the performance of my Windows VMs look even more suspiciously bad.
Re: I/O calibration [message #617900 is a reply to message #617896] |
Fri, 04 July 2014 05:14 |
Roachcoach
It's more that it's non-live, i.e. dev/test, infrastructure. The prod one is correctly done, with no sharing of parts. For dev/test it's not "may": it very much is poorly designed.
Someone can run something heavy in a given environment and hammer shared infrastructure across a number of different areas without even knowing.
We see it a lot: random, apparently inexplicable storage degradation in test environments. It drove us nuts trying to run it down, at least until we managed to get the real state of play from the SAN/storage design boys and girls and realised it was mainly futile to try to diagnose.
For example, we have a stage environment for production exports and data masking/cleansing etc., and that shares components deep in the stack with the performance-test environment, so if we're hammering through some data we can hit their performance and they have no idea why.
It might be something to consider if you're seeing erratic, otherwise inexplicable performance.
As a side note, it drives me absolutely insane when I ask SAN people to check it and get the reply "array is fine", and only after much kicking, screaming and tantrums do I get "well, yeah, the FA's totally maxed out, but the array is fine". ARGH, it drives me nuts. It's like me saying "yeah, the database is fine" when the archivers are hung; sure, it's technically fine, but it is of use to no-one!
But I'm digressing (and ranting)
[Updated on: Fri, 04 July 2014 05:16]
Re: I/O calibration [message #617910 is a reply to message #617900] |
Fri, 04 July 2014 05:51 |
John Watson
No digression as far as I am concerned: very instructive. I am way out of touch with hardware nowadays (my knowledge of disc arrays is from balancing I/O across 2GB discs in RS/6000 boxes, very twentieth century) and I have been accepting whatever the SAs say as correct.
Re: I/O calibration [message #617914 is a reply to message #617910] |
Fri, 04 July 2014 06:34 |
Roachcoach
Well, my personal experience has been that questions like "I'm seeing slow responses/poor throughput from the storage, can you check?" don't lead to much more than "array is fine". I've had to be extremely pushy/insistent in the past to get anywhere, and whilst yes, the array was fine, the bits in the middle were not. Really, after the DB fires the OS call, it's out of my playpen to handle/diagnose and over to them. Perhaps that is a naive assumption on my part, however.
Various exchanges tend to take place which eventually boil down to "the database is too dumb to lie, this is what it is experiencing, can you check EVERYTHING please".
Thankfully, I now know the names of a few guys who will actually check the stack and believe me when I say the DB is suffering slow responses.
Most of my experience is with *nix machines, though; we've nothing in Windows here, so I don't know how much of the tech or terminology is applicable. The areas where I've seen issues are the FAs, the HBAs, and the OS disk queue management settings being gubbed. The latter only occasionally presented symptoms, as at low load it behaved.
What is of most use (to state the obvious) is a test case: something where you can initiate a good, replicable storage hit and have someone watch the thing whilst it is happening, then start working up the stack to find the time losses. In my experience it rarely is the array itself, but the bits in the middle. The only other thing to watch for, muddying the waters, is the SAN cache (if there is one).
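For the replicable hit, something as blunt as a timed full scan can do; a sketch in SQL*Plus (BIG_TABLE is a placeholder, pick something larger than the SAN cache or you mostly measure the cache, not the discs):

```sql
-- Hypothetical replicable read load: time a full scan of a large table.
-- BIG_TABLE is a placeholder name; use a segment bigger than the SAN cache.
set timing on
select /*+ full(t) */ count(*) from big_table t;
```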
I'm sure others here can add more detail to this, my knowledge of SANs etc is pretty rudimentary and based on past issues only.