A question came in about whether it is possible to measure how much data is filtered out when running a query serviced by a Smart Scan in the Exadata Storage Server grid. The following is the long answer.
An Example of Offload Processing Effectiveness Accounting
I’d like to answer this question by taking real information from a test configuration consisting of 4 Real Application Clusters nodes attached to 6 Exadata Storage Server cells with the SAS disk option (12 x 300GB 3.5″ 15K RPM).
Test Workload – The Affinity Card Program Test Database
The Affinity Card Program Test Database (ACPTD) is a test schema and set of queries that mimics the type of queries performed by a marketing group responsible for promoting the use of a retail club card and affinity credit card for a fictitious company. In this example deployment of the ACPTD, the group responsible for promoting club card and affinity credit card activity for Acme Inc has built a data mart with data from several sources, including the main corporate ERP and CRM systems, partner merchant ERP systems, and outside syndicated data.
For this blog post I’ll focus on the main syndicated card transaction table called all_card_trans. However, there are other tables in the schema, and for reference’s sake the following text box shows the table sizes in my current deployment of the test kit. As the query output shows, the all_card_trans table is roughly 613 GB. Yes, I know that’s small, but this is a test system and I don’t like watching 6 Exadata cells hammering 1 GB/s each for tens of minutes when a couple of minutes will do. I hope you’ll understand.
SQL> set echo on
SQL> @tab_sizes
SQL> col segment_name for a32
SQL> select segment_name, sum(bytes)/1024/1024 mb from user_segments
2 where segment_name not like 'BIN%' and segment_name not like 'SYS%'
3 group by segment_name;
SEGMENT_NAME                             MB
-------------------------------- ----------
PARTNER_MERCHANTS                         8
PARTNER_MERCHANT_SALES                67780
ZIPCODES                                  8
OUR_SALES                             21956
ALL_CARD_TRANS                       629032
MCC                                       8
OUR_STORES                                8
CUST_SERVICE                             76
TM                                   .03125
CUST_FACT                              4708

10 rows selected.
The table definition for the all_card_trans table is shown in the following text box:
SQL> desc all_card_trans
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 CARD_NO                                   NOT NULL VARCHAR2(20)
 CARD_TYPE                                          CHAR(20)
 MCC                                       NOT NULL NUMBER(6)
 PURCHASE_AMT                              NOT NULL NUMBER(6)
 PURCHASE_DT                               NOT NULL DATE
 MERCHANT_CODE                             NOT NULL NUMBER(7)
 MERCHANT_CITY                             NOT NULL VARCHAR2(40)
 MERCHANT_STATE                            NOT NULL CHAR(3)
 MERCHANT_ZIP                              NOT NULL NUMBER(6)
The following queries show the total row count in the all_card_trans table as well as the number of rows representing transactions on credit cards with leading 4 digits of 3333 and 4777, which represent the card block granted to our fictitious company, Acme Inc, for their affinity card program. This is a needle-in-a-haystack query, since only about 0.3% of the data matches the WHERE predicate. Now, before you look at the text box, please note that a single Real Application Clusters instance is driving this query. And, yes, the RAC node is a small 8-core DL360. Both scans drive storage at roughly 6 GB/s (613 GB / 102 s). Folks, remember that to do what I just did in the following text box with Fibre Channel, I’d need a system with 15 FC HBAs attached to several huge SAN arrays. Oh, and I should point out that it would also take about 85 Netezza Snippet Processing Units to match this throughput.
SQL> select count(*) from all_card_trans where card_no like '4777%' or card_no like '3333%';
COUNT(*)
----------
22465830
Elapsed: 00:01:42.71
SQL> select count(*) from all_card_trans;
COUNT(*)
----------
7897611641
Elapsed: 00:01:42.47
I Can’t Believe I Ate the Whole Thing
The following query uses a method for measuring cell activity by forcing the database to ingest all rows and all columns. Don’t try to make sense of the query because it isn’t supposed to make sense. It is only supposed to drive Exadata to return all rows and all columns, which it does.
$ more demo_offload_bytes.sql
drop view ingest;
create view ingest as
select avg(length(CARD_NO)) c1,
avg(length(CARD_TYPE)) c2,
max(MCC + PURCHASE_AMT) c3,
max(length(PURCHASE_DT)) c4,
max(MERCHANT_CODE) c5,
avg(length(MERCHANT_CITY)) c6,
avg(length(MERCHANT_STATE)) c7,
max(MERCHANT_ZIP) c8 from all_card_trans;
col MB format 99999999.9
select NAME,VALUE/(1024*1024*1024) GB from gv$sysstat where STATISTIC# in (196, 44, 43 );
select c1+c2+c3+c4+c5+c6+c7 from ingest;
select NAME,VALUE/(1024*1024*1024) GB from gv$sysstat where STATISTIC# in (196, 44, 43);
The following text box shows that when I ran the query, Exadata (in aggregate) scanned 613 GB of disk and returned all but about 6% of it to the RDBMS instance (the delta in cell physical IO interconnect bytes). Also recorded by v$sysstat (the bytes-eligible statistic) is the fact that there was nothing peculiar about the data being scanned, nothing that would have interfered with offload processing (e.g., chained rows). Since I asked for all the data, that is what I got. It is nice to know, however, that the entirety of the data was a candidate for offload processing.
Notice I didn’t time the query. I’ll offer more on that later in this blog post.
SQL> @demo_offload_bytes
View dropped.
View created.
NAME GB
---------------------------------------------------------------- ----------
physical IO disk bytes 1227.80814
cell physical IO interconnect bytes 1151.52453
cell physical IO bytes eligible for predicate offload 1227.31451
C1+C2+C3+C4+C5+C6+C7
--------------------
9010106.59
NAME GB
---------------------------------------------------------------- ----------
physical IO disk bytes 1841.5774
cell physical IO interconnect bytes 1727.15227
cell physical IO bytes eligible for predicate offload 1840.97177
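As an aside, my scripts select from gv$sysstat by STATISTIC#, which was convenient on this particular system but isn’t guaranteed to be stable across releases. A sketch of the same bookkeeping keyed by statistic name, plus the arithmetic behind the roughly-6% figure (the deltas come straight from the before/after values in the text box above), would look like this:

-- Snapshot the three statistics by name; run once before and once after
-- the query of interest, then subtract to get the per-query deltas.
select name, value/1024/1024/1024 gb
  from gv$sysstat
 where name in ('physical IO disk bytes',
                'cell physical IO interconnect bytes',
                'cell physical IO bytes eligible for predicate offload');

-- From the run above:
--   eligible for offload : 1840.97 - 1227.31 = ~613.7 GB scanned by the cells
--   interconnect bytes   : 1727.15 - 1151.52 = ~575.6 GB shipped to the RDBMS
--   returned fraction    : 575.6 / 613.7     = ~94%, i.e., all but ~6%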
So let’s take a look at some offload processing. The following change to the “ingest” query reduces the amount of data ingested by the database tier by selecting only rows where the credit card number starts with 4777 or 3333. We know from the scan of the table that this whittles the dataset down by 99.7%.
$ more demo_offload_bytes1.sql
create view ingest_less as
select avg(length(CARD_NO)) c1,
avg(length(CARD_TYPE)) c2,
max(MCC + PURCHASE_AMT) c3,
max(length(PURCHASE_DT)) c4,
max(MERCHANT_CODE) c5,
avg(length(MERCHANT_CITY)) c6,
avg(length(MERCHANT_STATE)) c7,
max(MERCHANT_ZIP) c8 from all_card_trans act
where card_no like '4777%' or card_no like '3333%';
col MB format 99999999.9
select NAME,VALUE/(1024*1024*1024) GB from gv$sysstat where STATISTIC# in (196, 44, 43 );
set timing on
select c1+c2+c3+c4+c5+c6+c7 from ingest_less;
select NAME,VALUE/(1024*1024*1024) GB from gv$sysstat where STATISTIC# in (196, 44, 43);
The following text box shows that Exadata scanned the table and filtered out the uninteresting rows. The projected columns accounted for only 1.7 GB across the iDB intelligent I/O fabric. That is, the database tier only had to ingest 1.7 GB, or about 17 MB/s, since the query completed in 103 seconds.
SQL> @demo_offload_bytes_1
View dropped.
Elapsed: 00:00:00.21
View created.
Elapsed: 00:00:00.00
NAME GB
---------------------------------------------------------------- ----------
physical IO disk bytes 1841.13743
cell physical IO interconnect bytes 43.1782733
cell physical IO bytes eligible for predicate offload 1840.97177
Elapsed: 00:00:00.01
C1+C2+C3+C4+C5+C6+C7
--------------------
9010106.9
Elapsed: 00:01:43.31
NAME GB
---------------------------------------------------------------- ----------
physical IO disk bytes 2454.80203
cell physical IO interconnect bytes 44.8776852
cell physical IO bytes eligible for predicate offload 2454.62903
Elapsed: 00:00:00.00
SQL>
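For the record, the 99.7% figure and the 17 MB/s ingest rate fall straight out of the before/after values in that text box:

eligible for offload : 2454.63 - 1840.97 = ~613.7 GB scanned by the cells
interconnect bytes   :   44.88 -   43.18 =   ~1.7 GB shipped to the RDBMS
filtered in storage  : 1 - (1.7 / 613.7) = ~99.7%
database ingest rate : 1.7 GB / 103 s    = ~17 MB/s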
So, yes, Exadata does let you see how effectively Smart Scans are working. In this example we saw a single-table scan on a small Oracle Exadata Storage Server grid weeding out 99.7% of the data.
Sneaky, Sneaky.
You’ll notice that when I executed demo_offload_bytes.sql (the non-filtering example) I was not tracking query completion time. That’s because up to this point I’ve only been showing examples driven by a single Real Application Clusters node. Now, let’s think about this. I have 8 Intel processor cores and a single inbound InfiniBand path on the database server. From an inbound I/O bandwidth perspective the host can “only” ingest 1.6 GB/s, but the CPUs may throttle that further, since the database is doing its own projection in this case.
I’ve shown, in this post, that this test Exadata Storage grid configuration can scan disk at 6 GB/s. The question you might ask (and I’m about to answer) is: how much does this single database host throttle storage, and why does that matter? Well, it matters because having ample storage bandwidth with limited database server bandwidth is the classic imbalance addressed by the HP Oracle Database Machine. So, let me run it again, this time with timing turned on. As you’ll see, bottlenecks are bottlenecks. This single database host cannot keep up with storage, so storage is throttled, and the result is a roughly 17X increase in query completion time compared to the heavily filtered case. Both queries had to read the same amount of data, but in this case there was an imbalance in the upstream ability to ingest the data, from both an I/O bandwidth and a CPU perspective.
SQL> @demo_offload_bytes
View dropped.
Elapsed: 00:00:00.02
View created.
Elapsed: 00:00:00.01
NAME GB
---------------------------------------------------------------- ----------
physical IO disk bytes 2454.81084
cell physical IO interconnect bytes 44.886501
cell physical IO bytes eligible for predicate offload 2454.62903
Elapsed: 00:00:00.00
C1+C2+C3+C4+C5+C6+C7
--------------------
9010106.59
Elapsed: 00:29:26.85
NAME GB
---------------------------------------------------------------- ----------
physical IO disk bytes 3068.55792
cell physical IO interconnect bytes 620.492259
cell physical IO bytes eligible for predicate offload 3068.28629
Elapsed: 00:00:00.00
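For what it’s worth, the deltas in that text box also quantify the throttling:

interconnect delta : 620.49 - 44.89 = ~575.6 GB shipped to the RDBMS
elapsed time       : 00:29:26.85    = ~1767 s
effective ingest   : 575.6 / 1767   = ~0.33 GB/s, a small fraction of the 6 GB/s the cells can scan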
Need More RDBMS Bandwidth? OK, Use It.
So, let’s see what the full-ingestion case looks like with 4 Real Application Clusters nodes. And please forgive that I forgot to aggregate the gv$sysstat output; I’ll show a sketch of an aggregating query after the text box. I’ve quadrupled the database grid resources, so I should complete the full-ingestion query in roughly 75% less time. The text box starts out by showing that the number of RAC instances was increased from 1 to 4.
SQL> host date
Fri Oct 3 16:27:52 PDT 2008
SQL> select instance_name from gv$instance;
INSTANCE_NAME
----------------
test1
Elapsed: 00:00:00.02
SQL> select instance_name from gv$instance;
INSTANCE_NAME
----------------
test1
test4
test3
test2
Elapsed: 00:00:00.08
SQL> host date
Fri Oct 3 16:32:06 PDT 2008
SQL>
SQL> @demo_offload_bytes
View dropped.
Elapsed: 00:00:00.03
View created.
Elapsed: 00:00:00.01
NAME GB
---------------------------------------------------------------- ----------
physical IO disk bytes 3068.69843
cell physical IO interconnect bytes 620.632769
cell physical IO bytes eligible for predicate offload 3068.28629
physical IO disk bytes .013385296
cell physical IO interconnect bytes .013385296
cell physical IO bytes eligible for predicate offload 0
physical IO disk bytes .010355473
cell physical IO interconnect bytes .010355473
cell physical IO bytes eligible for predicate offload 0
physical IO disk bytes .031275272
cell physical IO interconnect bytes .031275272
NAME GB
---------------------------------------------------------------- ----------
cell physical IO bytes eligible for predicate offload 0
12 rows selected.
Elapsed: 00:00:00.02
C1+C2+C3+C4+C5+C6+C7
--------------------
9010106.59
Elapsed: 00:07:25.01
NAME GB
---------------------------------------------------------------- ----------
physical IO disk bytes 3222.13893
cell physical IO interconnect bytes 764.535856
cell physical IO bytes eligible for predicate offload 3221.70425
physical IO disk bytes 154.31714
cell physical IO interconnect bytes 144.732708
cell physical IO bytes eligible for predicate offload 154.280945
physical IO disk bytes 152.860648
cell physical IO interconnect bytes 143.367385
cell physical IO bytes eligible for predicate offload 152.828033
physical IO disk bytes 153.195674
cell physical IO interconnect bytes 143.6831
NAME GB
---------------------------------------------------------------- ----------
cell physical IO bytes eligible for predicate offload 153.13031
12 rows selected.
Elapsed: 00:00:00.02
SQL>
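Before moving on: since I didn’t aggregate the gv$sysstat output above, here is a sketch of the query that would sum the statistics across all four instances (using the statistic names exactly as they appear in the output):

select name, sum(value)/1024/1024/1024 gb
  from gv$sysstat
 where name in ('physical IO disk bytes',
                'cell physical IO interconnect bytes',
                'cell physical IO bytes eligible for predicate offload')
 group by name;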
Well, the roughly 4X speedup was predictable, but cool nonetheless. What about the filtered ingest query? Should there be much of a speedup, given that storage bandwidth remains constant and the ingest rate of the filtered query was only roughly 17 MB/s with a single Real Application Clusters node? I could save the cut-and-paste effort and just tell you that adding Real Application Clusters nodes will, of course, not reduce the query time in this case, but since I’ve bored you this long, here it is:
SQL> @demo_offload_bytes_1
View dropped.
Elapsed: 00:00:00.01
View created.
Elapsed: 00:00:00.01
NAME GB
---------------------------------------------------------------- ----------
physical IO disk bytes 3222.15758
cell physical IO interconnect bytes 764.554509
cell physical IO bytes eligible for predicate offload 3221.70425
physical IO disk bytes 154.334342
cell physical IO interconnect bytes 144.74991
cell physical IO bytes eligible for predicate offload 154.280945
physical IO disk bytes 153.213285
cell physical IO interconnect bytes 143.70071
cell physical IO bytes eligible for predicate offload 153.13031
physical IO disk bytes 152.878347
cell physical IO interconnect bytes 143.385084
NAME GB
---------------------------------------------------------------- ----------
cell physical IO bytes eligible for predicate offload 152.828033
12 rows selected.
Elapsed: 00:00:00.01
C1+C2+C3+C4+C5+C6+C7
--------------------
9010106.9
Elapsed: 00:01:42.79
NAME GB
---------------------------------------------------------------- ----------
physical IO disk bytes 3375.27835
cell physical IO interconnect bytes 764.974159
cell physical IO bytes eligible for predicate offload 3374.81995
physical IO disk bytes 308.0746
cell physical IO interconnect bytes 144.133529
cell physical IO bytes eligible for predicate offload 307.986572
physical IO disk bytes 310.634206
cell physical IO interconnect bytes 145.189768
cell physical IO bytes eligible for predicate offload 310.5755
physical IO disk bytes 302.274069
cell physical IO interconnect bytes 143.805239
NAME GB
---------------------------------------------------------------- ----------
cell physical IO bytes eligible for predicate offload 302.218781
12 rows selected.
Elapsed: 00:00:00.01
SQL> host date
Fri Oct 3 16:49:32 PDT 2008
Summary
Exadata works, and it has statistics, so the solution doesn’t feel so much like a black box. In the end, however, none of the statistics stuff really matters. What matters is whether your queries complete within the desired service times.