Linux Thinks It’s a CPU, But What Is It Really – Part III. How Do Intel Xeon 7500 (Nehalem EX) Processors Map To Linux OS Processors?

Last year I posted a blog entry entitled Linux Thinks It’s a CPU, But What Is It Really – Part I. Mapping Xeon 5500 (Nehalem) Processor Threads to Linux OS CPUs where I discussed the Intel CPU Topology Tool. The topology tool is most helpful when trying to quickly map Linux OS processors to physical processor cores or threads. That post gets read, on average, close to 20 times per day since it was posted (10,000+views) so I thought it deserves a follow-up pertaining to more recent Intel processors and, more importantly, more recent Linux releases.

I’m happy to point out that the tool still functions just fine for Intel Xeon 7500 series processors (a.k.a. Nehalem EX see also Sun Oracle’s Sun Fire X4800), however, with recent Linux releases the tool is not quite as necessary. With both Enterprise Linux Enterprise Linux Server release 5.5 (Oracle Enterprise Linux 5.5 ) and Red Hat Enterprise Linux Server release 5.5 the numactl(8) command now renders output that makes it quite clear which sockets associate with which OS processors.

The following output was captured from an 8-socket Nehalem EX machine:

$ numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 1 2 3 4 5 6 7 64 65 66 67 68 69 70 71
node 0 size: 131062 MB
node 0 free: 122879 MB
node 1 cpus: 8 9 10 11 12 13 14 15 72 73 74 75 76 77 78 79
node 1 size: 131072 MB
node 1 free: 125546 MB
node 2 cpus: 16 17 18 19 20 21 22 23 80 81 82 83 84 85 86 87
node 2 size: 131072 MB
node 2 free: 125312 MB
node 3 cpus: 24 25 26 27 28 29 30 31 88 89 90 91 92 93 94 95
node 3 size: 131072 MB
node 3 free: 126543 MB
node 4 cpus: 32 33 34 35 36 37 38 39 96 97 98 99 100 101 102 103
node 4 size: 131072 MB
node 4 free: 125454 MB
node 5 cpus: 40 41 42 43 44 45 46 47 104 105 106 107 108 109 110 111
node 5 size: 131072 MB
node 5 free: 124881 MB
node 6 cpus: 48 49 50 51 52 53 54 55 112 113 114 115 116 117 118 119
node 6 size: 131072 MB
node 6 free: 123862 MB
node 7 cpus: 56 57 58 59 60 61 62 63 120 121 122 123 124 125 126 127
node 7 size: 131072 MB
node 7 free: 126054 MB
node distances:
node   0   1   2   3   4   5   6   7 
  0:  10  15  20  15  15  20  20  20 
  1:  15  10  15  20  20  15  20  20 
  2:  20  15  10  15  20  20  15  20 
  3:  15  20  15  10  20  20  20  15 
  4:  15  20  20  20  10  15  15  20 
  5:  20  15  20  20  15  10  20  15 
  6:  20  20  15  20  15  20  10  15 
  7:  20  20  20  15  20  15  15  10 

A node is synonymous with a socket in this case. So, as the output shows, socket 0 maps to OS processors 0-7 and 64-71, the latter range being processor threads. Let’s see how similar this output is to the Intel CPU Topology Tool (NOTE – hover over the box and click view source for best presentation):

Package 0 Cache and Thread details


Box Description:
Cache  is cache level designator
Size   is cache size
OScpu# is cpu # as seen by OS
Core   is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
Extended Hex replaces trailing zeroes with 'z#'
       where # is number of zeroes (so '8z5' is '0x800000')
L1D is Level 1 Data cache, size(KBytes)= 32,  Cores/cache= 2, Caches/package= 8
L1I is Level 1 Instruction cache, size(KBytes)= 32,  Cores/cache= 2, Caches/package= 8
L2 is Level 2 Unified cache, size(KBytes)= 256,  Cores/cache= 2, Caches/package= 8
L3 is Level 3 Unified cache, size(KBytes)= 24576,  Cores/cache= 16, Caches/package= 1
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+
Cache |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
OScpu#|       0       64|       1       65|       2       66|       3       67|       4       68|       5       69|       6       70|       7       71|
Core  |   c0_t0    c0_t1|   c1_t0    c1_t1|   c2_t0    c2_t1|   c3_t0    c3_t1|   c4_t0    c4_t1|   c5_t0    c5_t1|   c6_t0    c6_t1|   c7_t0    c7_t1|
AffMsk|       1     1z16|       2     2z16|       4     4z16|       8     8z16|      10     1z17|      20     2z17|      40     4z17|      80     8z17|
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |
Size  |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L3                                                                                                                                       |
Size  |     24M                                                                                                                                       |
      +-----------------------------------------------------------------------------------------------------------------------------------------------+

Combined socket AffinityMask= 0xff00000000000000ff


Package 1 Cache and Thread details


Box Description:
Cache  is cache level designator
Size   is cache size
OScpu# is cpu # as seen by OS
Core   is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
Extended Hex replaces trailing zeroes with 'z#'
       where # is number of zeroes (so '8z5' is '0x800000')
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+
Cache |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
OScpu#|       8       72|       9       73|      10       74|      11       75|      12       76|      13       77|      14       78|      15       79|
Core  |   c0_t0    c0_t1|   c1_t0    c1_t1|   c2_t0    c2_t1|   c3_t0    c3_t1|   c4_t0    c4_t1|   c5_t0    c5_t1|   c6_t0    c6_t1|   c7_t0    c7_t1|
AffMsk|     100     1z18|     200     2z18|     400     4z18|     800     8z18|     1z3     1z19|     2z3     2z19|     4z3     4z19|     8z3     8z19|
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |
Size  |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L3                                                                                                                                       |
Size  |     24M                                                                                                                                       |
      +-----------------------------------------------------------------------------------------------------------------------------------------------+

Combined socket AffinityMask= 0xff00000000000000ff00


Package 2 Cache and Thread details


Box Description:
Cache  is cache level designator
Size   is cache size
OScpu# is cpu # as seen by OS
Core   is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
Extended Hex replaces trailing zeroes with 'z#'
       where # is number of zeroes (so '8z5' is '0x800000')
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+
Cache |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
OScpu#|      16       80|      17       81|      18       82|      19       83|      20       84|      21       85|      22       86|      23       87|
Core  |   c0_t0    c0_t1|   c1_t0    c1_t1|   c2_t0    c2_t1|   c3_t0    c3_t1|   c4_t0    c4_t1|   c5_t0    c5_t1|   c6_t0    c6_t1|   c7_t0    c7_t1|
AffMsk|     1z4     1z20|     2z4     2z20|     4z4     4z20|     8z4     8z20|     1z5     1z21|     2z5     2z21|     4z5     4z21|     8z5     8z21|
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |
Size  |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L3                                                                                                                                       |
Size  |     24M                                                                                                                                       |
      +-----------------------------------------------------------------------------------------------------------------------------------------------+

Combined socket AffinityMask= 0xff00000000000000ffz4


Package 3 Cache and Thread details


Box Description:
Cache  is cache level designator
Size   is cache size
OScpu# is cpu # as seen by OS
Core   is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
Extended Hex replaces trailing zeroes with 'z#'
       where # is number of zeroes (so '8z5' is '0x800000')
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+
Cache |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
OScpu#|      24       88|      25       89|      26       90|      27       91|      28       92|      29       93|      30       94|      31       95|
Core  |   c0_t0    c0_t1|   c1_t0    c1_t1|   c2_t0    c2_t1|   c3_t0    c3_t1|   c4_t0    c4_t1|   c5_t0    c5_t1|   c6_t0    c6_t1|   c7_t0    c7_t1|
AffMsk|     1z6     1z22|     2z6     2z22|     4z6     4z22|     8z6     8z22|     1z7     1z23|     2z7     2z23|     4z7     4z23|     8z7     8z23|
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |
Size  |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L3                                                                                                                                       |
Size  |     24M                                                                                                                                       |
      +-----------------------------------------------------------------------------------------------------------------------------------------------+

Combined socket AffinityMask= 0xff00000000000000ffz6


Package 4 Cache and Thread details


Box Description:
Cache  is cache level designator
Size   is cache size
OScpu# is cpu # as seen by OS
Core   is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
Extended Hex replaces trailing zeroes with 'z#'
       where # is number of zeroes (so '8z5' is '0x800000')
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+
Cache |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
OScpu#|      32       96|      33       97|      34       98|      35       99|      36      100|      37      101|      38      102|      39      103|
Core  |   c0_t0    c0_t1|   c1_t0    c1_t1|   c2_t0    c2_t1|   c3_t0    c3_t1|   c4_t0    c4_t1|   c5_t0    c5_t1|   c6_t0    c6_t1|   c7_t0    c7_t1|
AffMsk|     1z8     1z24|     2z8     2z24|     4z8     4z24|     8z8     8z24|     1z9     1z25|     2z9     2z25|     4z9     4z25|     8z9     8z25|
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |
Size  |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L3                                                                                                                                       |
Size  |     24M                                                                                                                                       |
      +-----------------------------------------------------------------------------------------------------------------------------------------------+

Combined socket AffinityMask= 0xff00000000000000ffz8


Package 5 Cache and Thread details


Box Description:
Cache  is cache level designator
Size   is cache size
OScpu# is cpu # as seen by OS
Core   is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
Extended Hex replaces trailing zeroes with 'z#'
       where # is number of zeroes (so '8z5' is '0x800000')
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+
Cache |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
OScpu#|      40      104|      41      105|      42      106|      43      107|      44      108|      45      109|      46      110|      47      111|
Core  |   c0_t0    c0_t1|   c1_t0    c1_t1|   c2_t0    c2_t1|   c3_t0    c3_t1|   c4_t0    c4_t1|   c5_t0    c5_t1|   c6_t0    c6_t1|   c7_t0    c7_t1|
AffMsk|    1z10     1z26|    2z10     2z26|    4z10     4z26|    8z10     8z26|    1z11     1z27|    2z11     2z27|    4z11     4z27|    8z11     8z27|
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |
Size  |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L3                                                                                                                                       |
Size  |     24M                                                                                                                                       |
      +-----------------------------------------------------------------------------------------------------------------------------------------------+

Combined socket AffinityMask= 0xff00000000000000ffz10


Package 6 Cache and Thread details


Box Description:
Cache  is cache level designator
Size   is cache size
OScpu# is cpu # as seen by OS
Core   is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
Extended Hex replaces trailing zeroes with 'z#'
       where # is number of zeroes (so '8z5' is '0x800000')
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+
Cache |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
OScpu#|      48      112|      49      113|      50      114|      51      115|      52      116|      53      117|      54      118|      55      119|
Core  |   c0_t0    c0_t1|   c1_t0    c1_t1|   c2_t0    c2_t1|   c3_t0    c3_t1|   c4_t0    c4_t1|   c5_t0    c5_t1|   c6_t0    c6_t1|   c7_t0    c7_t1|
AffMsk|    1z12     1z28|    2z12     2z28|    4z12     4z28|    8z12     8z28|    1z13     1z29|    2z13     2z29|    4z13     4z29|    8z13     8z29|
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |
Size  |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L3                                                                                                                                       |
Size  |     24M                                                                                                                                       |
      +-----------------------------------------------------------------------------------------------------------------------------------------------+

Combined socket AffinityMask= 0xff00000000000000ffz12


Package 7 Cache and Thread details


Box Description:
Cache  is cache level designator
Size   is cache size
OScpu# is cpu # as seen by OS
Core   is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
Extended Hex replaces trailing zeroes with 'z#'
       where # is number of zeroes (so '8z5' is '0x800000')
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+
Cache |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |     L1D         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
OScpu#|      56      120|      57      121|      58      122|      59      123|      60      124|      61      125|      62      126|      63      127|
Core  |   c0_t0    c0_t1|   c1_t0    c1_t1|   c2_t0    c2_t1|   c3_t0    c3_t1|   c4_t0    c4_t1|   c5_t0    c5_t1|   c6_t0    c6_t1|   c7_t0    c7_t1|
AffMsk|    1z14     1z30|    2z14     2z30|    4z14     4z30|    8z14     8z30|    1z15     1z31|    2z15     2z31|    4z15     4z31|    8z15     8z31|
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |     L1I         |
Size  |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |     32K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |      L2         |
Size  |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |    256K         |
      +-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+-----------------+

Cache |      L3                                                                                                                                       |
Size  |     24M                                                                                                                                       |
      +-----------------------------------------------------------------------------------------------------------------------------------------------+

I’m quite happy to see this enhancement to numactl(8). I’ll try to blog soon on why you should care about this topic.

0 Responses to “Linux Thinks It’s a CPU, But What Is It Really – Part III. How Do Intel Xeon 7500 (Nehalem EX) Processors Map To Linux OS Processors?”



  1. Leave a Comment

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




DISCLAIMER

I work for Amazon Web Services. The opinions I share in this blog are my own. I'm *not* communicating as a spokesperson for Amazon. In other words, I work at Amazon, but this is my own opinion.

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 2,961 other followers

Oracle ACE Program Status

Click It

website metrics

Fond Memories

Copyright

All content is © Kevin Closson and "Kevin Closson's Blog: Platforms, Databases, and Storage", 2006-2015. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Kevin Closson and Kevin Closson's Blog: Platforms, Databases, and Storage with appropriate and specific direction to the original content.

%d bloggers like this: