PGIO Update

PGIO is the SLOB Method for PostgreSQL. An old friend of mine, Bart Sjerps, has implemented a modified PGIO in Python and posted it on GitHub: https://github.com/bsjerps/pypgio

For more on PGIO:

https://dev.to/franckpachot/postgresql-double-buffering-understand-the-cache-size-in-a-managed-service-oci-2oci

https://dev.to/yugabyte/slob-on-yugabytedb-1a32

https://franckpachot.medium.com/do-you-know-what-you-are-measuring-with-pgbench-d8692a33e3d6

VMware Is Serious About Oracle Database Performance Characterization

This is just a quick blog entry to alert readers to a list of very good performance characterization articles, based on SLOB testing, published by VMware’s own Sudhir Balasubramanian. In addition to listing these links here, I have also updated the SLOB Experts Page.

Little Things Doth Crabby Make – Part XXIII. IEC 60027 Still Makes Me Crabby!

I know the base-10 versus base-2 IEC 60027-2 thing probably only makes me crabby. But since it does, I had to make a Little Things post. This sort of stuff, where the same “gigabyte” is 10^9 bytes in one place and 2^30 bytes (1,073,741,824, roughly 7.4% more) in another, well, makes me crabby.

OK, now that’s off my chest. Have a good day and thanks for stopping by.

Microsoft Shares Oracle Database Direct NFS Capabilities with SLOB Testing Results

This is a quick post to announce that I have added Microsoft Azure to the SLOB industry experts page. I’ve put a link on the SLOB experts page to direct you to the Microsoft testing results web page. The results cover Oracle Direct NFS with Netapp storage in the Azure cloud. It is worth a read!

Announcing SLOB 2.5.4

SLOB 2.5.4 has important fixes to support very large session count testing. Testing with a large number of sessions is generally done in conjunction with enabling SLOB think time. Please see the documentation for more information about testing with SLOB think time. SLOB 2.5.4 has been tested with 12,288 sessions on 8-node RAC clusters (1,536 sessions per RAC instance) with varying think time.
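For reference, think time is controlled by slob.conf parameters. The following is a minimal sketch; the parameter names reflect my reading of the SLOB documentation and the values are purely illustrative, so please confirm both against the documentation shipped with SLOB 2.5.4:

# slob.conf fragment: think time for large session counts
THINK_TM_FREQUENCY=1    # non-zero enables think time between SLOB operations
THINK_TM_MIN=.1         # minimum think time, in seconds
THINK_TM_MAX=.5         # maximum think time, in seconds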

When testing with very large session counts, please consider taking advantage of the SLOB 2.5.3 feature of improved SLOB session placement, as explained here.

Announcing SLOB 2.5.3

This is just a quick blog post to inform readers that SLOB 2.5.3 is now available at the following webpage: click here.

SLOB 2.5.3 is a bug fix release. One of the fixed bugs has to do with how SLOB sessions get connected to RAC instances. SLOB users can surely connect to the SCAN service, but for more repeatable testing I advise SLOB 2.5.3 and SQL*Net services configured one per RAC node. This manner of connectivity establishes affinity between schemas and RAC nodes. For example, repeatability improves if the sessions performing SLOB Operations against, say, user7’s schema connect to the same RAC node on each iteration of your testing.

The following is cut and pasted from SLOB/misc/sql_net/README:

The tnsnames.ora in this directory offers an example of
service names that will allow the user to test RAC with
repeatable results. Connecting SLOB sessions to the round
robin SCAN listener will result in SLOB sessions connecting
to random RAC nodes. This is acceptable but not optimal and
can result in varying run results due to slight variations
in sessions per RAC node from one test to another.
As of SLOB 2.5.3, runit.sh uses the SQLNET_SERVICE_BASE and
SQLNET_SERVICE_MAX slob.conf parameters to sequentially
affinitize SLOB threads (Oracle sessions) to numbered
service names. For example:
SQLNET_SERVICE_BASE=rac
SQLNET_SERVICE_MAX=8
With these assigned values, runit.sh will connect the first
SLOB thread to rac1 then rac2 and so forth until rac8 after
which the connection rotor loops back to rac1. This manner
of RAC affinity testing requires either a single SLOB
schema (see SLOB Single Schema Model in the documentation)
or 8 SLOB schemas to align properly with the value assigned
to SQLNET_SERVICE_MAX. The following command will connect
32 SLOB threads (Oracle sessions) to each RAC node in an
8-node RAC configuration given the tnsnames.ora example
file in this directory:
$ sh ./runit.sh -s 8 -t 32
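For illustration, tnsnames.ora entries along the lines the README describes might look like the following sketch. The host names are hypothetical, and only the first and last of the eight entries are shown (rac2 through rac7 follow the same pattern):

rac1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = racnode1.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = rac1))
  )

rac8 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = racnode8.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = rac8))
  )

With SQLNET_SERVICE_BASE=rac and SQLNET_SERVICE_MAX=8 in slob.conf, runit.sh appends 1 through 8 to the base name and rotates sessions through these entries as described above.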

This is just a quick blog entry to direct readers to an Amazon Web Services blog post regarding Oracle Licensing options when deploying Oracle Database in AWS.

Licensing Options for Oracle Database Deployments in Amazon Web Services


Following Various Legal Action Regarding “Oracle Cloud Revenue”

As a courtesy, I would like to provide a copy of the latest legal document filed in an action being brought against Oracle leadership resulting from alleged, um, improprieties in revenue reporting over the last few years. There is nothing sacred about this document. Google can find it for you just as easily.

Click on the following link to download the PDF: click here.


Announcing SLOB 2.5.2.4

SLOB 2.5.2.4 is a bug-fix release available via the SLOB Resources Page. The bug fixes in this release have to do with refinements to the undocumented Obfuscated Column feature which first appeared in SLOB 2.5.2.


Announcing SLOB 2.5.2.3

SLOB 2.5.2.3 is a bug-fix release available via the SLOB Resources Page. The bug fixes in this release have to do with refinements to the undocumented Obfuscated Column feature which first appeared in SLOB 2.5.2.

I’d like to give special thanks to some very skilled SLOB experts at Netapp for their feedback on the Obfuscated Column feature. Your testing helped find bugs and resulted in improved functionality.

Thank you Rodrigo Nascimento, Joe Carter and Scott Lane!


Announcing SLOB 2.5.2.2

SLOB 2.5.2.2 is available via the SLOB Resources Page.

SLOB 2.5.2.2 is a bug-fix release. After I announced the undocumented Obfuscated Column Data Feature, a few SLOB users reported bugs. Those bugs have been fixed in this release.

Announcing Obfuscated Column Data Loading with SLOB

BLOG UPDATE: SLOB 2.5.2.2 is available for download as per this blog post.

I recently announced availability of SLOB 2.5.2. It is primarily a small bug-fix release, but it also has an undocumented new feature. I want to say a few words about the feature in this post and will add full support (e.g., documentation and slob.conf parameter support) in SLOB 2.5.3. Some SLOB users might want this new undocumented feature as soon as possible, thus this blog post.

The new feature allows SLOB users to load unique data in SLOB tables (the single table in the Single Schema Model, the multiple tables in the Multiple Schema Model). Default SLOB data consists of simple, repeated character strings because column data is not really that important in the SLOB Method. However, some users find that their storage compression features (e.g., enterprise-class storage arrays such as those by Dell/EMC and Netapp) achieve artificially high compression ratios with default SLOB data. For certain testing purposes, it is less desirable to test with artificially high compression ratios. Starting with SLOB 2.5.2, users can avoid this level of compression by setting OBFUSCATE_COLUMNS to TRUE in the shell environment prior to executing the SLOB data loading script (setup.sh). Again, this feature will be documented in SLOB 2.5.3.

Overview of Loading Obfuscated Data

The following is a set of screenshots that will help me explain the OBFUSCATE_COLUMNS feature. To use the undocumented feature, simply set the shell environment variable OBFUSCATE_COLUMNS to TRUE. Doing so before executing the data loader (setup.sh) will result in all column data being loaded with random data. Similarly, setting the variable before executing the test driver (runit.sh) will result in all UPDATE statements changing columns to random values.

An Example

The best way to describe the feature is to show it in use. Figure 1 shows how one can set the OBFUSCATE_COLUMNS shell environment variable to TRUE and load data as normal with the setup.sh script.

Figure 1
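The commands captured in Figure 1 boil down to something like the following sketch. The tablespace name (IOPS) and the schema count are hypothetical; substitute the setup.sh arguments appropriate for your environment:

$ export OBFUSCATE_COLUMNS=TRUE    # enable the undocumented feature
$ sh ./setup.sh IOPS 8             # load 8 schemas into tablespace IOPS as usual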

Figure 2 shows how the data loader filled the Single Schema Model table with two distinct strings. As explained in the documentation, the slob.conf->LOAD_PARALLEL_DEGREE parameter controls how many concurrent data loading streams are used to load the data. In Figure 2, I used the default number of concurrent data loading streams. With the OBFUSCATE_COLUMNS feature, each data loading stream loads uniquely generated obfuscated strings. To show a sample of the data, I randomly chose the 13th ordinal column and queried its distinct values. As explained in the documentation, the SLOB Method loads only a single row per block, and each row has 19 128-character VARCHAR2 columns. Obfuscated data is loaded into each column. Note: each block loaded by a given data loading stream will have the same row content, so your compression ratio might vary based on the chunk size the compressor operates on.

Figure 2
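The distinct-value check shown in Figure 2 can be reproduced with a query along these lines. The schema, table, and column names here assume the Single Schema Model conventions (user1’s cf1 table, in which c13 is the 13th ordinal column); treat them as illustrative and adjust for your configuration:

$ sqlplus user1/user1 <<'EOF'
SELECT COUNT(DISTINCT c13) FROM cf1;
EOF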

Figure 3 shows how setting slob.conf->LOAD_PARALLEL_DEGREE to 64 results in 64 distinct obfuscated strings being loaded.

Figure 3

Summary

This post offers a preview of an undocumented SLOB feature. Maybe some users will post their experiences with before and after compression levels and how that changes their testing.

Announcing SLOB 2.5.2

SLOB 2.5.2 is available for download from Github. Please visit the SLOB Resources Page for the download URLs.

SLOB 2.5.2 is a bug fix release.

So pgio Does Not Accurately Report Physical I/O In Test Results? Buffering Buffers, and Baffles.

A new user to pgio (The SLOB Method for PostgreSQL) reached out to me with the following comment:

“I’ve been testing with pgio, but when I compare the I/O monitored in iostat output, it does not match the pgio output for physical reads.”

The user is correct, but that’s not the fault of pgio. Please allow me to explain.

Buffering Buffers, and Baffles

PostgreSQL does not open files with the O_DIRECT flag, which means I/O performed by PostgreSQL is buffered I/O. The buffering uses physical memory in the Linux page cache. For this reason, the pgio runit.sh script produces output that accounts for read IOPS (RIOPS) as opposed to RPIOPS (Read Physical IOPS). The following is an example of what the user reported and how to change the behavior.
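One quick way to see this buffering in action is to watch the buff/cache column reported by free(1) grow while a read-intensive pgio run is in flight. A minimal sketch:

$ free -g        # note the buff/cache figure before the run
$ sh ./runit.sh
$ free -g        # buff/cache grows as PostgreSQL reads are absorbed by the page cache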

The output in Figure 1 shows how I set up a 48-schema (32GB each) test with a single pgio thread accessing each schema. The output of runit.sh shows that the PostgreSQL primary cache (shared memory) is 4GB and that, at the end of the 120-second test, the internal PostgreSQL counters had tallied 436,964 reads per second (RIOPS).

Figure 1

The test configuration has NVMe drives assembled into a single logical volume, /dev/md0, with mdadm(8). The pgio runit.sh driver script saves iostat(1) output. Figure 2 shows the iostat report for physical read requests issued to the device. A simple glance reveals the values cannot average up to 436,964 as reported by runit.sh. That’s because runit.sh isn’t reporting physical reads from storage.

Figure 2

As shown in Figure 3, pgio saves the output of /proc/diskstats. The most accurate way to calculate physical accesses to storage by pgio is to calculate the delta between the before and after data in the pgio_diskstats.out file. The diskstats data shows that the true physical read IOPS rate was 301,599, not the 436,964 figure reported by runit.sh. Again, this is because runit.sh reports reads, not physical reads, and that is because PostgreSQL doesn’t really know whether a read operation is satisfied by a page cache buffer hit or by a physical access to the storage device. Also shown in Figure 3 is that the iostat.out and diskstats data agree to within 0.1%.

Figure 3
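For reference, the arithmetic behind Figure 3 is simple. The fourth field of a /proc/diskstats line is the count of reads completed for that device, so subtracting the before value from the after value and dividing by the run duration yields physical read IOPS. A sketch, using hypothetical snapshot file names to stand in for the before and after sections of pgio_diskstats.out:

# reads completed is field 4 of /proc/diskstats; field 3 is the device name
before=$(awk '$3 == "md0" { print $4 }' diskstats_before.txt)
after=$(awk '$3 == "md0" { print $4 }' diskstats_after.txt)
echo "physical read IOPS: $(( (after - before) / 120 ))"   # 120-second run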

So, let’s see how we can change this behavior.

Figure 4 shows that the 436,964 RIOPS result was achieved on a system with enough free memory to buffer 526GB of the pgio active data set.

Figure 4

If You Don’t Want To Test Buffered I/O, Don’t

Figure 5 shows an example of using the pgio_reduce_free_memory.sh script to limit the amount of memory available to the page cache. For no particular reason, I chose to limit the page cache to less than 16GB and then executed the pgio test again. As Figure 5 shows, neutering page cache buffering brought the RIOPS figure to within 1% of both the iostat and diskstats data.

Figure 5
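For example, the page cache limiting step shown in Figure 5 comes down to a single invocation of the helper script. The argument shown here is illustrative of limiting the page cache to roughly 16GB; consult the script itself for its authoritative usage:

$ sudo sh ./pgio_reduce_free_memory.sh 16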

Summary

This post shows that PostgreSQL performs buffered I/O and that the pgio runit.sh driver script reports read IOPS (RIOPS) as tallied by PostgreSQL internal statistics. Since pgio includes a helper script to eliminate page cache buffering from your testing, you too can test physical I/O with pgio and get accurate accounting of that physical I/O by analyzing the /proc/diskstats and iostat(1) data saved by the runit.sh script.


Announcing pgio (The SLOB Method for PostgreSQL) Is Released Under Apache 2.0 and Available at GitHub

This is just a quick post to advise readers that I have released pgio (The SLOB Method for PostgreSQL) under Apache 2.0. The bits are available at the following link: https://github.com/therealkevinc/pgio/releases/tag/1.0. The README is quite informative.

My last testing before the release showed “out of the box” data loading into Amazon Aurora with PostgreSQL compatibility at a rate of 1.69 TB/h. I only modified the pgio.conf file to specify the connection string and to set scale to 128 GB per schema.
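The edit amounted to something like the following pgio.conf fragment. The connection string value is a placeholder, and the SCALE syntax should be confirmed against the pgio README; all other parameters were left at their defaults:

# pgio.conf fragment (CONNECT_STRING value is a placeholder)
CONNECT_STRING="host=<aurora-endpoint> user=<user> dbname=pgio"
SCALE=128G    # 128 GB per schema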

After loading data I edited pgio.conf to increase the number of threads per schema to 16 and then easily drove IOPS to the current, advertised IOPS limit of 120,000 for Amazon Aurora with PostgreSQL compatibility.

Testing PostgreSQL physical I/O on any platform could not be easier, more repeatable, or more understandable.

Enjoy pgio!

DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.


Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.