This is installment number three in my series on Oracle Exadata Storage Server and HP Oracle Database Machine frequently asked questions. I recommend you also visit:
Exadata Storage Server Frequently Asked Questions Part I.
Exadata Storage Server Frequently Asked Questions Part II.
I’m mostly cutting and pasting questions from the comment threads of my blog posts about Exadata, mixing in some assertions I’ve seen on the web and rephrasing them as questions. If they already read as questions when I see them, I cut and paste them without modification.
Q. Is there a coming Upgrading Guide kind of document or a step-by-step installation metalink note planned for the database machine?
A. HP Oracle Database Machine and Oracle Exadata Storage Servers are installed at the factory by HP.
Q. The ODM spec sheet says there are four 24-port InfiniBand switches (96 total ports) in each DB machine. If each of the 8 RDBMS hosts has 2 links to the switches and each Exadata server (14) also has two, then it is just 2 x 8 + 2 x 14 = 44 links.
A. Since the HP Oracle Database Machine is an appliance installed at the factory, by HP, I’m hesitant to go too deep in this area. The short answer is that the extra switches are there to address the loss of an entire HP Oracle Database Machine rack in a multi-rack scale-out configuration.
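The arithmetic in the question can be checked with a short sketch. The figures (four 24-port switches, 8 database hosts, 14 Exadata servers, two links per node) come from the question itself. The variable names are only illustrative, and the count deliberately ignores inter-switch links, which the quoted spec sheet figures do not describe.

```python
# Port budget using only the figures quoted in the question above.
# Inter-switch links are NOT modeled here (an assumption of this sketch).
SWITCHES = 4
PORTS_PER_SWITCH = 24
DB_HOSTS = 8
EXADATA_SERVERS = 14
LINKS_PER_NODE = 2

total_ports = SWITCHES * PORTS_PER_SWITCH
node_links = LINKS_PER_NODE * (DB_HOSTS + EXADATA_SERVERS)
spare_ports = total_ports - node_links

print(f"total switch ports: {total_ports}")
print(f"host and cell links: {node_links}")
print(f"ports not used by hosts or cells: {spare_ports}")
```

So roughly half the ports are not consumed by host and cell links in a single rack, which is the headroom the question is asking about.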
Q. How does this architecture deal with data distribution and redistribution? It seems like that’s still going to be a problem with joining data that isn’t distributed the same way. Does all the data then go back to the RAC?
A. Data distribution is a multifaceted topic. There is partition-wise data distribution and ASM extent distribution. Nonetheless, the answer is the same for both types of distribution: no change. ASM treats what we refer to as “grid disks” in Exadata Storage Cells no differently than disks in a SAN when it comes to laying out extents. Likewise, partitioning does not change. In fact, nothing about partitioning changes with Exadata.
If, for instance, you have data with poor data distribution (e.g., partitioning skew) with ASM on a SAN, it would be the same with Exadata, but at least the I/O would be extremely fast :-)
Exadata changes how data comes from disk, not how it is placed on disk.
Q. If I do a big query and sort, will that bottleneck one of the RAC nodes?
A. Exadata changes nothing about Oracle in this regard. Nonetheless, sorts are parallelized with Inter-node Parallel Query in a Real Application Clusters environment, so I’m at a loss for what you are referring to.
Q. Is temp space managed at the storage layer or on the RAC nodes?
A. Exadata changes nothing about Oracle in this regard. Temporary segments are a logical collection of extents in a file. It’s the same with or without Exadata.
Q. Not sure this is exactly in your field, but what does the cell do when it hits a block that was flushed to disk before being committed (ie needs an UNDO check)? Can it return a mix of rowset and block data so the DB server checks UNDO ?
A. Data consistency and transaction concurrency are not offloaded to Exadata Storage Servers. The integrity of the scan is controlled by the RDBMS. I think it is counterproductive to discuss the edge-cases where a Smart Scan will not be possible. If you are using a database as a data warehouse, you will get Smart Scans. If you are doing reporting against an active OLTP database, you will see queries that are not serviced by Smart Scans.
Smart Scans are optimized for data warehousing workloads and, just as is the case in non-Exadata environments, it is not good practice to be modifying the active query data set while running queries. Adding partitions and loading data while queries are running, sure, but changing a row here and there in a data warehouse doesn’t make much sense (at least to me).
Q. Suppose I have a server (e.g. linux) or a number of RAC nodes, how do I connect them to the Exadata and how do I access the disk space?
A. If you wish to adopt Exadata into an existing Oracle Database 11.1.0.7 environment, there are SKUs for that. Talk to your sales representative and make room for InfiniBand switches and HCAs.
Q. Do I need fibre or ethernet connections, switches, special hardware between my server and the Exadata?
A. Of course! Exadata is InfiniBand based. At a minimum you’ll need InfiniBand HCAs and switches to get to the data stored in Exadata. Once you are up to the correct Oracle version, you can run with non-Exadata and Exadata tablespaces side by side. This fact will aid migrations.
Q. Do I still need OS multipath software?
A. No. Well, not for Exadata.
Q. Do I see raw LUNs that I present to an ASM instance running on my own machine(s), or does my database communicate directly with the ASM on the Exadata?
A. ASM will have visibility to Exadata Storage Server “grid disks.” There happens to be a command line tool that makes it easy for me to illustrate the point. In the following text box I’ve cut and pasted session output from an xterm where I used the kfod tool to list all known Exadata Storage Server grid disks and grep’ed for ones I named “data1” on cell number 6 of the configuration. To further illustrate the point I then changed directories to list the DDL I used to incorporate all “data1” grid disks in the configuration into an ASM disk group called “DATA1.” Other than the fact that DATA1 is a candidate for Smart Scan, there is really nothing different between this disk group and any other Oracle Database 11g ASM disk group.
    $ kfod -disk all | grep 'data1.*cell06'
    241: 117760 Mb o/192.168.50.32:5042/data1_CD_10_cell06 <unknown> <unknown>
    242: 117760 Mb o/192.168.50.32:5042/data1_CD_11_cell06 <unknown> <unknown>
    243: 117760 Mb o/192.168.50.32:5042/data1_CD_12_cell06 <unknown> <unknown>
    244: 117760 Mb o/192.168.50.32:5042/data1_CD_1_cell06 <unknown> <unknown>
    245: 117760 Mb o/192.168.50.32:5042/data1_CD_2_cell06 <unknown> <unknown>
    246: 117760 Mb o/192.168.50.32:5042/data1_CD_3_cell06 <unknown> <unknown>
    247: 117760 Mb o/192.168.50.32:5042/data1_CD_4_cell06 <unknown> <unknown>
    248: 117760 Mb o/192.168.50.32:5042/data1_CD_5_cell06 <unknown> <unknown>
    249: 117760 Mb o/192.168.50.32:5042/data1_CD_6_cell06 <unknown> <unknown>
    250: 117760 Mb o/192.168.50.32:5042/data1_CD_7_cell06 <unknown> <unknown>
    251: 117760 Mb o/192.168.50.32:5042/data1_CD_8_cell06 <unknown> <unknown>
    252: 117760 Mb o/192.168.50.32:5042/data1_CD_9_cell06 <unknown> <unknown>
    $ cd $ORACLE_HOME/dbs
    $ cat cr_data1_dg.sql
    create diskgroup DATA1 normal redundancy
    DISK 'o/*/*data1*'
    ATTRIBUTE 'AU_SIZE' = '4M',
    'CELL.SMART_SCAN_CAPABLE'='TRUE',
    'compatible.rdbms'='11.1.0.7',
    'compatible.asm'='11.1.0.7'
    /
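If you want to summarize a listing like the one above rather than eyeball it, a small script will do. The following is only a sketch: the line layout it expects is inferred from the sample output shown here, not from any documented kfod format, and the function name `summarize` and the three-line sample are my own illustration.

```python
import re
from collections import defaultdict

# Three lines copied from the listing above, used as sample input.
SAMPLE = """\
241: 117760 Mb o/192.168.50.32:5042/data1_CD_10_cell06 <unknown> <unknown>
244: 117760 Mb o/192.168.50.32:5042/data1_CD_1_cell06 <unknown> <unknown>
245: 117760 Mb o/192.168.50.32:5042/data1_CD_2_cell06 <unknown> <unknown>
"""

# Assumed layout: "<n>: <size> Mb o/<ip>:<port>/<grid disk name> ..."
LINE_RE = re.compile(
    r"^\s*(?P<num>\d+):\s+(?P<size_mb>\d+)\s+Mb\s+"
    r"o/(?P<ip>[\d.]+):(?P<port>\d+)/(?P<name>\S+)"
)

def summarize(listing):
    """Return {cell IP: (grid disk count, total size in MB)}."""
    totals = defaultdict(lambda: [0, 0])
    for line in listing.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip prompts, blank lines, anything unrecognized
        entry = totals[match["ip"]]
        entry[0] += 1
        entry[1] += int(match["size_mb"])
    return {ip: (count, size) for ip, (count, size) in totals.items()}

summary = summarize(SAMPLE)
for ip, (count, size_mb) in summary.items():
    print(f"{ip}: {count} grid disks, {size_mb} MB")
```

Feeding it the full `kfod -disk all` output instead of the sample would give one line per cell, which is a quick way to confirm every cell contributes the same number of grid disks.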
Q. Do I still have to struggle with raw devices on os level?
A. No.
Q. Can I create multiple databases in the available space?
A. Absolutely. I haven’t even started blogging about I/O Resource Management features of Exadata. This is the only platform that can prevent multiple applications from stealing resources from each other, all the way down to physical I/O.
Q. Do I still need to create asm disks or diskgroups, or do I just see one large asm disk of e.g. 168Tb?
A. Physical disks in Exadata Storage Server cells are “carved” up into what we refer to as grid disks. Each grid disk becomes an ASM disk. The fewest ASM disks you could end up with in a full-rack HP Oracle Database Machine is 168.
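The 168 figure is easy to reconstruct. The sketch below assumes 14 cells per full rack (the count quoted in the InfiniBand question above) and 12 physical disks per cell (the kfod listing shows CD_1 through CD_12 on cell06). It also assumes every physical disk is carved the same way, which need not be true of a real configuration. The function name is hypothetical.

```python
# Assumptions: 14 cells per full rack, 12 physical disks per cell,
# and uniform carving of every physical disk into grid disks.
CELLS_PER_FULL_RACK = 14
DISKS_PER_CELL = 12

def asm_disk_count(grid_disks_per_physical_disk=1):
    """Each grid disk becomes one ASM disk, so count the grid disks."""
    if grid_disks_per_physical_disk < 1:
        raise ValueError("each physical disk needs at least one grid disk")
    return CELLS_PER_FULL_RACK * DISKS_PER_CELL * grid_disks_per_physical_disk

print(asm_disk_count())   # one grid disk per physical disk
print(asm_disk_count(2))  # two grid disks per physical disk
```

One grid disk per physical disk gives the minimum of 168. Carving each physical disk into two grid disks doubles the ASM disk count to 336.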
Q. […] don’t some of the DW vendors split the data up in a shared nothing method. Thus when the data has to be repartitioned it gets expensive. Whereas here you just add another cell and ASM goes to work in the background. (depending upon the ASM power level you set.)
A. All the DW Appliance vendors implement shared-nothing so, yes, the data is chopped up into physical partitions. If you add hardware to increase performance of queries against your current dataset, the data will have to be reloaded into the new partitioning scheme. As has always been the case with ASM, adding new disks (and therefore Exadata Storage Server cells) will cause the existing data to be redistributed automatically over all drives, including the new ones. This ASM data redistribution is an online function.
Q. [regarding] Supportability – Oracle software support has always been spotty. Now with a combination of Oracle Linux, Oracle database and HP hardware, it is going to be interesting to see how it all comes together – especially upgrades, patches etc.
A. Support is provided via a single phone number.
Q. How easy or difficult is it to maintain? Do we need to build specialized skills inhouse or is it hands-off like Teradata?
A. In my reckoning, you need the same Oracle data warehousing skills you need today, plus a primer on Exadata.
Q. [regarding] Ease of use – Can I simply move an existing oracle warehouse instance to the new database machine and can use it day 1? How easy or difficult is it? Do I need to spend significant time like with a RAC instance – partitioning etc?
A. Data from an existing data warehouse will have to be physically moved into an Exadata environment. You will be either moving entirely from one environment to another (e.g., 10g on Unix to Exadata with Linux) or adding Exadata Storage to your existing environment and copying the data into Exadata storage. The former would be done in the same manner as any cross-platform migration, while the latter would require that the warehouse be upgraded to Oracle Database 11g Release 11.1.0.7. Once the upgrade to 11.1.0.7 is complete and InfiniBand connectivity is sorted out, the data can then be copied with the simplicity of a CTAS operation or other such operation.
Q. The Exadata storage concept is excellent – more storage comes with additional CPU and Cache – Can we use it for non-oracle applications – such as Log processing etc?
A. Anything that can go into an Oracle Database can go into Exadata. So such features as SecureFiles are supported. Exadata is not scalable general-purpose storage.
Q. Why would I want to use Oracle rather than Teradata or Netezza which is proven?
A. Because, perhaps, the data you are extracting to load into Netezza is coming from an Oracle Database? There are a lot of answers to this question I suppose. In the end, I should think the choice would be based foremost on performance. Most of Netezza’s customers are either Oracle customers as well, or have migrated from Oracle. I think in Netezza’s “early days” the question was likely reversed. We aim to reverse the question.
Q. Backup using RMAN – RMAN backups are not really geared for big databases, so is there any other off host alternatives available?
A. The data stored in Exadata is under ASM control. The same backup restrictions apply for Exadata as any other ASM deployment.
Hi Kevin
Thank you for all the interesting information you provide through your blog. Nevertheless I have some questions for you.
1) Where can I find the official documentation for the Exadata Product Family? I could only find datasheet and white paper.
2) How does the optimizer decide whether to use an index or a Smart Scan for predicate filtering? I guess it will choose the faster one, but what’s faster and when?
3) What is meant in the Exadata white paper about Smart Scan Predicate Filtering by “…only rows where the employees hire date is after the specified date are sent from Exadata to the database instance..”? Does it really only return the rows matching the predicate or does it return all blocks containing rows which match the predicate? If the former is correct, how is this handled in the db block buffer?
I’m looking forward to blog entries about the inter-database resource management.
Cheers
Daniel
A technical question: for joining large datasets, such as a fact table and a very large dimension, we currently encourage the use of equipartitioning on the join key to reduce messaging (CPU) and the volume of data distributed to PQ slaves (memory/storage).
Would there be (a) a benefit and (b) a mechanism for ensuring that matching partitions of commonly joined tables are colocated on the same cell, so that the join can be performed entirely at the storage level?
This is very impressive.
What is the latency to read cached and uncached blocks from Exadata?
What type of throughput do you get when using multiple racks (cells?) and processors must access data from cells in different racks?
What does the comment that “RMAN backups are not really geared for big databases” mean? This is the first I’ve heard anything like that.
Hi Jason,
It’s Q&A. I have no idea why some people ask the questions they ask!